The ipmid failed to start

Andrew Jeffery andrew at aj.id.au
Tue Dec 11 11:35:23 AEDT 2018


Hello,

On Mon, 10 Dec 2018, at 14:33, P. K. Lee (李柏寬) wrote:
> Hi Brad,
> 
> I saw the issue, which is "recipes-phosphor: Fix systemd unit 
> dependences of ipmid, mboxd" appending 
> "Requires=org.openbmc.HostIpmi.service" and 
> "After=org.openbmc.HostIpmi.service", you merged on Nov 29.
> 
> After that, the ipmid failed to start.
> 
> Because the file meta-phosphor/recipes-phosphor/ipmi/phosphor-ipmi-kcs/
> org.openbmc.HostIpmi.service already included "After=phosphor-ipmi-
> host.service" before, it caused two services waiting for each other.

Ah, KCS. This is why I didn't hit the problem in my testing.

> 
> If the "Requires=org.openbmc.HostIpmi.service" and 
> "After=org.openbmc.HostIpmi.service" are removed in the meta-phosphor/
> recipes-phosphor/ipmi/phosphor-ipmi-host/phosphor-ipmi-host.service or 
> the "After=phosphor-ipmi-host.service" is removed in the meta-phosphor/
> recipes-phosphor/ipmi/phosphor-ipmi-kcs/org.openbmc.HostIpmi.service, 
> the ipmid will start normally.
> 
> I wonder if there is a better way to go about it.

It's a bit dicey. Something to consider here is that systemd does its tear-down in stack order with respect to the Before=/After= relationships defined in the service files. With org.openbmc.HostIpmi.service having an After= relationship with phosphor-ipmi-host.service, it is torn down before phosphor-ipmi-host. This impacts processes that want to send events to the host during the BMC shutdown path: these applications need to depend on phosphor-ipmi-host to access its APIs, and it's a layering violation for the applications to also depend on the transport that phosphor-ipmi-host is using (KCS, BT, etc.). In order to propagate the event to the host, phosphor-ipmi-host needs its transport, and therefore must have an After= relationship with the transport process due to systemd's stack ordering on tear-down.
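To make the ordering argument concrete, here is a rough Python sketch. It is only a toy model of After= ordering, not how systemd itself is implemented. It assumes start order is a topological sort of the After= edges and stop order is simply the reverse. The unit example-event-sender.service is hypothetical and stands in for any application that sends events to the host during shutdown.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+

HOST = "phosphor-ipmi-host.service"
TRANSPORT = "org.openbmc.HostIpmi.service"
APP = "example-event-sender.service"  # hypothetical application unit

# Each key has an After= relationship on every unit in its value set.
AFTER_FIXED = {HOST: {TRANSPORT}, APP: {HOST}}

# What happens when the KCS unit also keeps After=phosphor-ipmi-host.service.
AFTER_CYCLE = {HOST: {TRANSPORT}, TRANSPORT: {HOST}}


def start_order(after):
    """Return units in an order that satisfies every After= edge."""
    return list(TopologicalSorter(after).static_order())


def stop_order(after):
    """Toy model of stack-ordered tear-down: reverse of the start order."""
    return list(reversed(start_order(after)))


if __name__ == "__main__":
    print("start:", start_order(AFTER_FIXED))
    print("stop: ", stop_order(AFTER_FIXED))
    try:
        start_order(AFTER_CYCLE)
    except CycleError as err:
        print("ordering cycle:", err.args[1])
```

In the first graph the transport is stopped last, so phosphor-ipmi-host can still reach the host while the application is shutting down. The second graph has no valid order at all, which is the situation P. K. Lee ran into with the KCS unit.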

Having said that, requiring that After= relationship means that phosphor-ipmi-host.service is started after org.openbmc.HostIpmi.service, which means messages that arrive in the period between the two starting may get lost. I think it's worth keeping in mind that org.openbmc.HostIpmi simply emits a signal when a message is received, and this provides no guarantee of anyone actually caring. This is in contrast to sending events the other way, where org.openbmc.HostIpmi's interface provides a DBus method, which guarantees someone cares or the caller receives an error. This design fits the After= relationship that I made phosphor-ipmi-host have on org.openbmc.HostIpmi.service, so I'd suggest it's worth changing the relationship for the KCS implementation as well.
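The difference between the two directions can be sketched with a toy in-process bus in Python. This is not the real D-Bus API: ToyBus and the names "received" and "send" are made up for illustration. It only shows the property I'm relying on, namely that a signal with no listener is silently dropped while a method call with no provider fails visibly.

```python
class ToyBus:
    """Minimal stand-in for a message bus (hypothetical, not D-Bus)."""

    def __init__(self):
        self._handlers = {}  # signal name -> list of callbacks
        self._methods = {}   # method name -> callable

    def subscribe(self, signal, handler):
        self._handlers.setdefault(signal, []).append(handler)

    def emit(self, signal, payload):
        """Deliver to whoever listens; return how many handlers ran."""
        handlers = self._handlers.get(signal, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)

    def export(self, name, func):
        self._methods[name] = func

    def call(self, name, payload):
        """Invoke a method; raise if nobody provides it."""
        if name not in self._methods:
            raise LookupError(f"no provider for method {name!r}")
        return self._methods[name](payload)


if __name__ == "__main__":
    bus = ToyBus()
    # Transport is up, but the host daemon has not subscribed yet.
    print("handlers reached:", bus.emit("received", b"request"))
    try:
        bus.call("send", b"event")
    except LookupError as err:
        print("caller sees:", err)
```

A message that arrives before the subscriber exists reaches zero handlers and nobody is told, whereas the failed method call gives the sender something to act on.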

However, maybe that's a decision to be made per-platform rather than generally. I guess the unit files can be overridden in the packaging if this is a concern?

> 
> By the way, after I pushed the code to the Gerrit, "Jenkins Patch Set 1: 
> User not approved, see admin, no CI" it showed. How can I solve this 
> problem?

You need to convince someone that your patch isn't going to exploit our CI infrastructure :) Once you've earned some trust in the community with your contributions, someone will generally add you to the list that allows CI to test your patches straight away.

Cheers,

Andrew

> 
> Sincerely,
> P. K. Lee

