Handling BMC Reboots when Host is Running
joel at jms.id.au
Thu Mar 2 12:01:43 AEDT 2017
On Wed, Mar 1, 2017 at 3:33 AM, Andrew Geissler <geissonator at gmail.com> wrote:
> My story this sprint, https://github.com/openbmc/openbmc/issues/1094,
> is to allow the BMC to be rebooted while the host is up and running.
Cool. In addition to this work, we need to make sure the required bits
of the Aspeed hardware are not reset to a default state, either by the
SoC's reset mechanism, or by the loading of drivers.
GPIOs are a good example of this. When we know that the host is
already up, any access should read the current state of the GPIO
before it does any toggling. Another is the operation of flash (which
will now need to be handled by mboxd).
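To illustrate the read-before-toggle idea, here is a minimal sketch. It
assumes a sysfs-style GPIO value file holding "0" or "1"; the path and the
function name are hypothetical and not taken from any OpenBMC code.

```python
from pathlib import Path


def set_gpio_if_needed(value_path, desired):
    """Drive a GPIO to `desired` (0 or 1) only if it is not already there.

    `value_path` is assumed to be a sysfs-style value file; it is a
    placeholder, not a real pin on any board.
    Returns True if a write was performed, False if the line was left alone.
    """
    path = Path(value_path)
    # Read the current state first so a running host never sees a glitch.
    current = int(path.read_text().strip())
    if current == desired:
        return False
    path.write_text(str(desired))
    return True
```

The point is only that the write is skipped when the line is already in
the wanted state, so a BMC reboot does not disturb a host that is up.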
I realise this is outside the scope of your "let's get the targets
sorted" work. I thought I'd mention it now so we can be on the look
out for strange behaviour as you do your rework.
> After the BMC reboot, we need to keep the host running and also get to
> the appropriate systemd target states to represent this. The
> challenge here is that if we just re-ran the existing targets and
> services, we would do things like run P9 vcs workarounds, bit bang the
> FSI bus, and even potentially toggle pgood. We obviously need to
> avoid this in order to keep the host up and running. This divides our
> services started during an obmc-host-start.target into two categories:
> services required to boot the system, and services which are required
> to support the host running. We only want to run the latter in a
> situation where the BMC is reset while the host is up and running.
> - The applications should have no knowledge of Host state
>   - i.e. the service starting or not starting is where we control what runs
> - Must handle being able to start and not start any arbitrary service
>   within the host power on targets
> - Lots of services have dependencies on each other and on
>   synchronization targets, so this design has to handle starting services
>   that depend on other services or targets that may not be required when
>   the host is running
> - The obmc-host-start.target needs to get to the running state when
>   the host is already running after a BMC reset
>   - This will ensure that any re-starts of this target do not harm the
>     system and that the power off targets will work as expected
> Use the ConditionPathExists= systemd unit feature.
> From the man page: "Before starting a unit, verify that the specified
> condition is true. If it is not true, the starting of the unit will be
> (mostly silently) skipped, however all ordering dependencies of it are
> still respected. A failing condition will not result in the unit being
> moved into a failure state. The condition is checked at the time the
> queued start job is to be executed. Use condition expressions in order
> to silently skip units that do not apply to the local running system,
> for example because the kernel or runtime environment doesn't require
> its functionality."
> This will be put in the service files that we do not want to run in
> this reset scenario (services required to boot the system). The first
> service we will run on a power on, is a service that detects whether
> the host is already running. If the host is running, then this
> service will create a file which will then be used to determine
> whether the boot services are run or not.
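A rough sketch of the file-creating half of such a detection service
follows. How the host state is actually detected is not covered here, so
it is passed in as a plain boolean; the function name and the flag path
used by any caller are hypothetical.

```python
from pathlib import Path


def record_host_state(host_is_running, flag_path):
    """Create the flag file when the host is up, remove it otherwise.

    `host_is_running` is assumed to come from some separate detection
    step. `flag_path` is a hypothetical location chosen by the caller.
    Returns True if the flag file exists afterwards.
    """
    path = Path(flag_path)
    if host_is_running:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    else:
        # Clear a stale flag so a normal power on runs the boot services.
        try:
            path.unlink()
        except FileNotFoundError:
            pass
    return path.exists()
```

Removing a stale file in the "not running" case matters as much as
creating it, otherwise a later cold boot would skip the boot services.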
> The nice part about ConditionPathExists= is that it doesn’t execute
> the application in the services, but it allows dependencies on that
> service to be satisfied so systemd will still start the dependent
> services and reach the dependent synchronization targets.
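For illustration, a boot-only service using this could look roughly like
the fragment below. The unit names, the flag path and the ExecStart binary
are all made up for the example. The leading "!" is systemd's documented
way to negate a condition, so the unit is skipped when the file exists.

```ini
# Hypothetical boot-only unit; names and paths are placeholders.
[Unit]
Description=Example host boot step (hypothetical)
# Order after the (hypothetical) detection service, because the condition
# is evaluated when this unit's start job is executed.
After=example-host-detect.service
# Skip this unit if the flag file exists, i.e. the host is already running.
ConditionPathExists=!/run/example/host-running

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/example-boot-step
```

With the flag file present the ExecStart command is never run, yet units
ordered after this one are still started.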
> This proposal has not gone off as well as I would have hoped
> internally here :) There’s definitely a desire to not have this at
> the service level, but rather at the target level. I have not found a
> solution in this area though that satisfies the above requirements.
> Thoughts/ideas are definitely appreciated.