Services to stop on quiesce target start

Andrew Geissler geissonator at gmail.com
Thu Sep 28 07:29:03 AEST 2017


The goal of the quiesce target was to provide a point from which the
bmc firmware can do 2 things:
o Initiate a recovery policy (i.e. power off and back on again)
o Sit indefinitely and leave the system in the fail state for debug

With these goals in mind, the initial design was to basically do
nothing to other targets and services when we got into the quiesce
state.

This causes an issue with checkstops and host timeouts.  If the system
hits a checkstop (processor suddenly halts), bmc firmware detects the
checkstop, logs an error, and puts the bmc into the quiesce state.
But, since the host was running, the host watchdog was enabled so the
bmc ends up also hitting the host watchdog timeout and logging an
error for that as well.  Doesn’t hurt anything, but the bmc has
generated 2 errors when there really should have just been one.

The simple solution is to just have the watchdog service conflict with
the quiesce target.  If the bmc starts the quiesce target, the host
watchdog service is stopped.

Some other options:
o Run the obmc-host-stop target, this would potentially violate our
goal of staying as close to the fail state as possible in quiesce
o Create a new target that quiesce runs where we could stuff things
like this - would give us a single “thing” to look for but currently
we really only have this one use case so a bit overkill.

My thoughts - Keep it simple, let's just have the
phosphor-watchdog@poweron.service conflict with the quiesce target.
If we find more of these use cases in the future we can revisit a
special target to keep them all straight.
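
For illustration only, the conflict could be expressed as a systemd
drop-in along the lines of the sketch below.  The drop-in path and the
quiesce target instance name (obmc-host-quiesce@0.target) are
assumptions and should be checked against the unit names actually
installed on the bmc.  Conflicts= is the standard systemd directive:
starting one of the two units stops the other.

```ini
# Hypothetical drop-in location (assumption, not from the issue):
#   /etc/systemd/system/phosphor-watchdog@poweron.service.d/quiesce.conf
[Unit]
# Assumed quiesce target name; starting it stops this watchdog instance,
# so a checkstop does not also produce a host watchdog timeout error.
Conflicts=obmc-host-quiesce@0.target
```

The same line could instead go directly in the [Unit] section of the
watchdog service file; a drop-in just keeps the change separate.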

Relevant issue is https://github.com/openbmc/openbmc/issues/2282

Thoughts or comments appreciated,
Andrew

