Checking for network online

Johnathan Mantey johnathanx.mantey at intel.com
Sat Feb 19 06:39:03 AEDT 2022



On 2/18/22 11:04, Doman, Jonathan wrote:
> On Thu, 2022-02-17 at 14:54 -0800, Johnathan Mantey wrote:
>> Intel has recently run into an issue regarding a systemd service, and
>> we're interested in soliciting feedback from the community.
>>
>> Issue: systemd-networkd-wait-online.service stalls for 120 seconds when
>> the managed NICs do not have a network connection.
>>
>> TLDR: Should OpenBMC remove systemd-networkd-wait-online.service
>> universally?
>>
>> System Config: None of the NICs in the system is connected to an active
>> network.
>>
>> Test Process: The system under test (SUT) has AC removed, and some time
>> later AC is reapplied. Wait for BMC/BIOS to boot.
>>
>> Behavior: U-Boot hands control to the Linux boot process, and the
>> systemd services are started. When systemd-networkd-wait-online.service
>> starts it stalls waiting for the NICs to enter a fully functional state.
>> This never happens during the default 120 second timeout period for this
>> service. When the timeout elapses, an error message is logged to the
>> journal reporting the service exited unsuccessfully.
>>
>> Issues: This service blocks entry to multi-user.target.
>> phosphor-state-manager uses multi-user.target to report the BMC is ready
>> to use.
>> This is reported via IPMI Get Device ID.
>> The Intel BIOS is blocked from booting until
>> systemd-networkd-wait-online times out.
>> BMC entry to multi-user.target is delayed. Journal entries are created.
>>
>> Question for the community: Given the negative side effects caused by
>> running this service, does the community want to have this service
>> removed from the global build image?
> 
> I think the initial discussion in #general got to the root of the
> issue: multi-user.target Wants rsyslog.service, which in turn is
> ordered After network-online.target. rsyslog seems to be the only thing
> tying multi-user to network-online.

I assume you mean the OpenBMC Discord #general channel?

> 
> Did you try removing the Wants/After=network-online.target from
> rsyslog.service to see if the situation improves? If it does, then we
> can discuss removing that dependency or making it configurable.

No, I had not tried that. My concern is that doing so will be like
playing whack-a-mole: some other service may later decide to rely on
systemd-networkd-wait-online, and the issue is then compounded.
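
To gauge how many moles there are, one could scan the unit files in an
image for references to network-online.target. The following is a minimal
Python sketch, not an OpenBMC tool. The function names and the
/lib/systemd/system path are assumptions made for illustration, and the
parser ignores drop-in directories and backslash line continuations. On a
running BMC, "systemctl list-dependencies --reverse network-online.target"
should give a similar view.

```python
from pathlib import Path

TARGET = "network-online.target"
DEP_KEYS = ("Wants", "Requires", "After")


def network_online_deps(unit_text, target=TARGET):
    """Return the sorted [Unit] dependency keys whose value names target."""
    found = set()
    section = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        # Track which section we are in, e.g. [Unit] or [Service].
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section != "Unit" or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        # Dependency values are whitespace-separated unit names.
        if key in DEP_KEYS and target in value.split():
            found.add(key)
    return sorted(found)


def scan(unit_dir):
    """Map service file name to the keys that reference network-online.target."""
    hits = {}
    for path in sorted(Path(unit_dir).glob("*.service")):
        deps = network_online_deps(path.read_text(errors="replace"))
        if deps:
            hits[path.name] = deps
    return hits


if __name__ == "__main__":
    # Assumed unit directory; adjust for the image being inspected.
    for name, deps in scan("/lib/systemd/system").items():
        print(name, ",".join(deps))
```

Any service listed with both Wants and After would pull in
network-online.target and wait for it, which is the pattern described
above for rsyslog.service.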

I basically took it on faith that rsyslog needed this service. I did not 
investigate what issues arise in rsyslog when no network is present.

Ultimately there will be times when the BMC will have to operate sans
network. It's unfortunate that the wait-online service doesn't seem to
behave as expected with the --ignore option or with the min/max
operational state settings. This may be a mismatch between my
expectations and the actual implementation of wait-online.

-- 
Johnathan Mantey
Senior Software Engineer
*azad technology partners*
Contributing to Technology Innovation since 1992
Phone: (503) 712-6764
Email: johnathanx.mantey at intel.com
