<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body>
LAN/NCSI experts,<br>
<br>
I'm looking for some guidance on how to correctly report LAN state
when RJ45 cables are inserted into or removed from the NICs monitored
by the BMC.<br>
<br>
Background:<br>
<ol>
<li>Intel uses one NIC that is dedicated to the BMC (1 Gb/s PHY,
eth0)<br>
</li>
<li>There is another LAN channel managed by the BMC via NCSI
(100 Mb/s NCSI, eth1)<br>
</li>
<li>Prior generations of Intel servers correctly report the
presence of an active LAN link upon cable insertion/removal</li>
<li>Prior generations correctly update
/sys/class/net/eth(x)/carrier based on cable insertion</li>
<li>LAN carrier state is propagated to the IPMI system, and is
logged via IPMI events.<br>
</li>
</ol>
<p>OpenBMC:</p>
<ol>
<li>The dedicated NIC (aka eth0) correctly propagates its link
state to the kernel, which updates /sys/class/net/eth0/carrier</li>
<li>The eth0 carrier state does not propagate to Redfish/IPMI. I
believe this is a problem with the DBus update.</li>
<li>The NCSI NIC (aka eth1) does not propagate the cable presence
to /sys/class/net/eth1/carrier, which I believe is incorrect
behavior.</li>
<li>The eth1 carrier state does not propagate to Redfish/IPMI. I
believe this is a two-fold problem: the DBus update and the kernel
update.</li>
<li>LAN carrier state is not propagated to IPMI or Redfish, and is
not logged.<br>
</li>
</ol>
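<p>For anyone reproducing the observations above, here is a minimal
sketch of how the carrier state can be sampled from sysfs. It is an
illustration only, not OpenBMC code: the function name
read_link_state and its sysfs_root parameter are hypothetical. It
assumes the standard /sys/class/net/(ifname)/carrier and operstate
files, and that reading carrier fails while the interface is
administratively down.</p>
<pre>
```python
from pathlib import Path


def read_link_state(ifname, sysfs_root="/sys/class/net"):
    """Return (carrier, operstate); carrier is None when it cannot be read."""
    base = Path(sysfs_root) / ifname
    try:
        # carrier holds "1" (link detected) or "0" (no link)
        carrier = int((base / "carrier").read_text().strip())
    except (OSError, ValueError):
        # Missing interface, or the read was rejected because the
        # interface is administratively down
        carrier = None
    try:
        operstate = (base / "operstate").read_text().strip()
    except OSError:
        operstate = "unknown"
    return carrier, operstate
```
</pre>
<p>Calling read_link_state("eth0") and read_link_state("eth1") before
and after pulling each cable should show whether the kernel side
sees the change at all, independent of DBus.</p>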
<p>I've already pushed a change to handle synchronous enable/disable
actions via DBus. Unfortunately, that is only half the solution. I've
been working my way through the FTGMAC100 driver code and the
phosphor-networkd code in an effort to tie the asynchronous
link-change event into the DBus space, so far without success. <br>
</p>
Desired results:
<ol>
<li>Redfish GET /redfish/v1/Managers/bmc/EthernetInterfaces/eth0
(and eth1) correctly reports the InterfaceEnabled state when a
cable is pulled/inserted.</li>
<li>Redfish receives an event that permits the state change to be
logged.</li>
<li>IPMI correctly reports the link status (up/down).</li>
<li>ip link correctly reports the link status (up/down).<br>
</li>
</ol>
<p>Does anyone on the list have some recommendations?</p>
<p><br>
</p>
<p><br>
</p>
<p>Details of my investigation for the strong of heart:</p>
<ul>
<li>Synchronous NIC Enable/Disable change</li>
</ul>
<ol>
<ol>
<li><a class="moz-txt-link-freetext" href="https://gerrit.openbmc-project.xyz/c/openbmc/phosphor-networkd/+/26696">https://gerrit.openbmc-project.xyz/c/openbmc/phosphor-networkd/+/26696</a></li>
<li><a class="moz-txt-link-freetext" href="https://gerrit.openbmc-project.xyz/c/openbmc/phosphor-dbus-interfaces/+/26694">https://gerrit.openbmc-project.xyz/c/openbmc/phosphor-dbus-interfaces/+/26694</a></li>
<li><a class="moz-txt-link-freetext" href="https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/26693">https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/26693</a></li>
</ol>
</ol>
<ul>
<li>FTGMAC driver:
(kernel-source/drivers/net/ethernet/faraday/ftgmac100.c)</li>
</ul>
    This code looks suspect because it uses "speed" to determine
whether the link is up or down. On the Intel system the NCSI channel
always runs at 100 Mb/s, so does this check really work?<br>
<blockquote><font face="Courier New, Courier, monospace"> /* Link
is down, do nothing else */</font><br>
<font face="Courier New, Courier, monospace"> if (!new_speed)</font><br>
<font face="Courier New, Courier, monospace"> return;</font><br>
</blockquote>
<p>    Likewise, this code reports that the link is "down", which, it
seems, should cause the kernel to change the state of
/sys/class/net/eth1/carrier. Inserting the cable does not
produce a corresponding "link is up" message:</p>
<p><font face="Courier New, Courier, monospace"> static void
ftgmac100_ncsi_handler(struct ncsi_dev *nd)</font><br>
<font face="Courier New, Courier, monospace"> {</font><br>
<font face="Courier New, Courier, monospace"> if
(unlikely(nd->state != ncsi_dev_state_functional))</font><br>
<font face="Courier New, Courier, monospace"> return;</font><br>
<br>
<font face="Courier New, Courier, monospace">
netdev_dbg(nd->dev, "NCSI interface %s\n",</font><br>
<font face="Courier New, Courier, monospace">
nd->link_up ? "up" : "down");</font><br>
<font face="Courier New, Courier, monospace"> }</font><br>
</p>
<ul>
<li>PHY driver (kernel-source/drivers/net/phy/phy.c and phylink.c)<br>
</li>
</ul>
<p>    The PHY driver for eth0 seems to be working correctly.
Removing the cable shows the link going down, and the
/sys/class/net/eth0/carrier file changes state from 1->0.
Inserting the cable correctly changes the state from 0->1. What
is missing is any notification to phosphor-networkd or DBus
indicating a link state change.</p>
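<p>One possible way for a userspace daemon to learn about such a
carrier change asynchronously is the kernel's rtnetlink link
notifications (RTM_NEWLINK messages on the RTMGRP_LINK multicast
group). The sketch below is an illustration under that assumption,
not the phosphor-networkd implementation: parse_link_events and
watch_links are hypothetical names, and it only helps if the driver
actually changes the carrier state in the kernel.</p>
<pre>
```python
import socket
import struct

RTMGRP_LINK = 1          # rtnetlink multicast group for link changes
RTM_NEWLINK = 16         # message type carrying an ifinfomsg
IFF_LOWER_UP = 0x10000   # set while the kernel considers the carrier present

NLMSG_HDR = struct.Struct("=IHHII")   # len, type, flags, seq, pid
IFINFOMSG = struct.Struct("=BxHiII")  # family, pad, type, index, flags, change


def parse_link_events(data):
    """Return [(ifindex, carrier_present), ...] for each RTM_NEWLINK in data."""
    events = []
    offset = 0
    while len(data) - offset >= NLMSG_HDR.size:
        msg_len, msg_type, _f, _seq, _pid = NLMSG_HDR.unpack_from(data, offset)
        remaining = len(data) - offset
        if NLMSG_HDR.size > msg_len or msg_len > remaining:
            break  # truncated or malformed message, stop parsing
        if msg_type == RTM_NEWLINK and msg_len >= NLMSG_HDR.size + IFINFOMSG.size:
            _fam, _typ, index, flags, _chg = IFINFOMSG.unpack_from(
                data, offset + NLMSG_HDR.size)
            events.append((index, bool(flags & IFF_LOWER_UP)))
        offset += (msg_len + 3) & ~3  # netlink messages are 4-byte aligned
    return events


def watch_links():
    """Block forever, printing every link notification (Linux only)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    sock.bind((0, RTMGRP_LINK))
    while True:
        for index, carrier in parse_link_events(sock.recv(65536)):
            print(socket.if_indextoname(index), "carrier" if carrier else "no carrier")
```
</pre>
<p>Running watch_links() on the BMC while pulling the eth0 cable
should print a line per transition. If nothing is printed for eth1,
that would suggest the missing piece for NCSI is on the kernel side
rather than in DBus.</p>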
<ul>
<li>ip link</li>
</ul>
<p>    The "ip link" command always reports
"&lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt;" for the eth1 (NCSI) channel.<br>
</p>
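<p>To make that flag string easier to interpret, here is a small
sketch that decodes an interface flags word the way I understand
"ip link" to present it. The function name describe_flags is
hypothetical and only the four flags seen above are handled. The bit
values are the IFF_* constants from linux/if.h. My understanding is
that LOWER_UP mirrors the kernel's carrier state, so a permanent
LOWER_UP on eth1 is consistent with the carrier never being turned
off for the NCSI channel.</p>
<pre>
```python
IFF_UP = 0x1
IFF_RUNNING = 0x40

# (bit, name) pairs in the order "ip link" prints them; subset only
IFF_NAMES = (
    (0x2, "BROADCAST"),
    (0x1000, "MULTICAST"),
    (IFF_UP, "UP"),
    (0x10000, "LOWER_UP"),
)


def describe_flags(flags):
    """Render a flags word as a comma-separated list of flag names."""
    names = [name for bit, name in IFF_NAMES if flags & bit]
    # "ip link" shows NO-CARRIER when the interface is up but not running
    if flags & IFF_UP and not flags & IFF_RUNNING:
        names.insert(0, "NO-CARRIER")
    return ",".join(names)
```
</pre>
<p>With a cable attached, a value such as 0x11043 decodes to the
string quoted above. After a pull that the kernel notices, the
expectation would be a value such as 0x1003, which decodes to
NO-CARRIER,BROADCAST,MULTICAST,UP.</p>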
-- <br>
<div class="moz-signature">
<font color="#1F497D"><font face="Century Gothic">Johnathan Mantey<br>
<small>Senior Software Engineer</small><br>
<big><font color="#555555"><small><b>azad te</b><b>chnology
partners</b></small><br>
<small><font color="#1F497D"><small>Contributing to
Technology Innovation since 1992</small></font><small><br>
<font color="#1F497D">Phone: (503) 712-6764<br>
Email: <a href="mailto:johnathanx.mantey@intel.com">johnathanx.mantey@intel.com</a></font></small><br>
<br>
</small></font></big></font></font> </div>
</body>
</html>