How to deal with failing services in the boot targets
Andrew Geissler
geissonator at gmail.com
Thu Jan 26 11:33:33 AEDT 2017
Hey Xo,
Pretty perfect timing on this email: my new story for this sprint is
https://github.com/openbmc/openbmc/issues/1033 (handle service
failures in openbmc), precisely for the reasons you mentioned above.
We've had a few hallway talks but not much in the way of design yet.
One thought was to use the "OnFailure=" tag to start up a target that
conflicts with everything (to stop everything) and puts the system
into some sort of "termination" state.
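
A rough sketch of how that could look in unit files. The unit name
obmc-example-terminate@.target is made up for illustration and does not
exist in the tree, and the Conflicts= list is only an example:

```ini
# File 1: vcs-on@.service (fragment).
# If this service enters the "failed" state, activate the termination target.
[Unit]
OnFailure=obmc-example-terminate@%i.target

# File 2: obmc-example-terminate@.target (hypothetical unit).
# Starting this target stops every unit listed in Conflicts=.
[Unit]
Description=Example termination state for instance %i
Conflicts=obmc-chassis-start@%i.target
Conflicts=obmc-host-start@%i.target
```

Conflicts= only stops the units it names, so services that the targets
merely pulled in via "wants" would probably need their own Conflicts= or
PartOf= relationship to be stopped as well.
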
We have two different types of targets in openbmc: the ones that
have "wants" relationships (i.e. run these services) and targets that
are more for synchronization among those services. I recently added a
new chassis power "wants" target, and I think I'll need to add some of
the dependencies you mention above there, so that if someone starts the
"boot host" target, it will automatically run the "turn on power"
target.
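
Something along these lines, as a sketch only. It assumes the "boot
host" target is obmc-host-start@.target and the "turn on power" target is
obmc-chassis-start@.target, which is my reading of the names in your
mail rather than a settled design:

```ini
# obmc-host-start@.target (fragment, assumed to be the "boot host" target)
[Unit]
# Starting this target also starts the chassis power-on target...
Requires=obmc-chassis-start@%i.target
# ...and waits for it; with Requires= plus After=, this target is not
# started if the chassis target fails to activate.
After=obmc-chassis-start@%i.target
```

With Wants= instead of Requires=, the chassis target would still be
pulled in, but a failure there would not block the host target.
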
Anyway, hope to have a few more thoughts on this out by the end of
this week. Will do some experimenting with your notes.
Andrew
On Wed, Jan 25, 2017 at 5:29 PM, Xo Wang <xow at google.com> wrote:
> Hi folks,
>
> I'm seeing vcs-on@0.service failing occasionally. I know the cause of
> it (i2c errors) but I'd like to know how to deal with failing services
> in the context of OpenBMC boot sequencing.
>
> For example, the service failure isn't reflected by any subsequent
> target failures (it reaches obmc-chassis-start@0.target with no
> command line errors, only a journal error for vcs-on@0.service
> itself), nor did it prevent the boot from proceeding to pdbg host
> control.
>
> This is expected behavior given the systemd Unit relationships I used,
> but I don't see a clean way to make a unit like vcs-on@.service block
> the boot.
>
> I tried setting RequiredBy=obmc-chassis-start@%i.target in the
> [Install] section of vcs-on@.service (and modifying the service
> install similarly), but this only prints out a message that
> obmc-chassis-start@0.target couldn't be reached due to its failed
> dependency. It did not stop the pdbg start IPL.
>
> I also tried RequiredBy=obmc-host-start-pre@%i.target. This turned out
> even worse because our targets don't require their precedent targets,
> so obmc-host-start@0.target is still reachable even with a failure in
> obmc-host-start-pre@0.target. Likewise for
> obmc-chassis-start@0.target, which now prints no console error at all.
>
> Finally I could add RequiredBy=start_host@%i.service to
> vcs-on@0.service, but this seems fragile compared to using the targets
> as synchronization points.
>
> 1) How should I make a host boot service be a blocking step in the chain?
>
> 2) Will this require a structural change in the OpenBMC targets?
> Making targets require their precedent targets comes to mind. This
> would make targets useful not only for sequencing but also for
> dependency checking.
>
> 3) Do other people also want this? To me it seems obvious that failure
> to power on should always block starting IPL, but maybe somebody else
> has a good reason to use weaker relationships.
>
> thanks
> xo