Power Restore Policy

Thu Aug 31 11:49:44 AEST 2017

On Sun, Aug 27, 2017 at 6:49 PM, Andrew Jeffery <andrew at aj.id.au> wrote:
> Hi Andrew,
>
> On Fri, 2017-08-25 at 14:32 -0500, Andrew Geissler wrote:
>> The BMC has the concept of a Power Restore Policy.  It’s defined
>> within https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/xyz/openbmc_project/Control/Power/RestorePolicy.interface.yaml
>>
>> The point of this policy is to define the host boot behavior by the
>> BMC if it is rebooted.  The policies are:
>> o AlwaysOn
>> o AlwaysOff
>> o Restore
>>
>> What complicates things a bit is the scenario where the BMC is
>> rebooted, but the host is booted.  Rule number one in BMC management
>> firmware is don’t mess with the host if it’s up and running.  So
>> basically, if the host is up and running, and the BMC is rebooted, the
>> BMC should take no action other than getting its chassis and host state
>> objects to reflect that of the chassis and host (on and running).
>>
>> If the host is not running, then here’s what the policies mean:
>>
>> o AlwaysOn -> Power on
>> o AlwaysOff -> Leave system in off state
>> o Restore -> Read last requested host state and re-request it
>>
>> OBMC has an application, somewhat misnamed, discover_system_state
>> which enforces the AlwaysOn logic.
>
> Out of curiosity, why do you think it is misnamed? Is it the ambiguity of
> 'system' with respect to BMC vs host? Or something else? Is there a better
> name?
>

discover_system_state implies discovery of a state, but what this
application is really doing is applying_system_restore_policy.
I may rename it some day, but not with this set of changes.

>> The Restore logic was put into
>> phosphor-host-state-manager but I don’t believe this is the correct
>> place for it.
>
> Naive question as I'm not intimately familiar with the details: why not (on
> reflection, this is probably answered below)?
>
>> It does it when it reads its
>
> Mate.
>
> Can you rephrase that for clarity?

The host-state application currently reads the last requested state
from the BMC filesystem and then blindly tries to enforce that state.
The main goal I'm going for here is to take that enforcement out of
host-state and put it in discover_system_state so this "what do I do
after a BMC reboot" logic is all in one place.
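
A minimal sketch of that single decision point, written in Python purely
for illustration. The function name, the enum, and the "On"/"Off" state
strings are hypothetical and not taken from the OpenBMC source; only the
three policy names and the "don't touch a running host" rule come from
the description above.

```python
from enum import Enum
from typing import Optional


class RestorePolicy(Enum):
    """The three policies named in the RestorePolicy interface."""
    ALWAYS_ON = "AlwaysOn"
    ALWAYS_OFF = "AlwaysOff"
    RESTORE = "Restore"


def action_after_bmc_reboot(policy: RestorePolicy,
                            host_running: bool,
                            last_requested: Optional[str]) -> Optional[str]:
    """Return the host state to request, or None to take no action.

    "On"/"Off" are placeholder state names (an assumption for this sketch).
    """
    # Rule number one: never mess with a host that is up and running.
    if host_running:
        return None
    if policy is RestorePolicy.ALWAYS_ON:
        return "On"
    if policy is RestorePolicy.ALWAYS_OFF:
        # Leave the system in the off state.
        return None
    # Restore: re-request whatever was last requested (may be None if
    # nothing was ever persisted).
    return last_requested
```

Keeping this as one pure function is what makes the policy easy to test:
every combination of policy, host status, and persisted request can be
checked without touching real hardware.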

>
>> persisted value for the
>> last requested state when it starts.  I’d like to move the Restore
>> logic in with the discover_system_state application.
>>
>> The reasons are the following:
>> o discover_system_state’s service can easily be configured to not run
>> when the host is already running
>
> Is this also part of your suggestion to rename the service? Because to me, from
> the name, it seems crucial to run it precisely because we want to know the
> system's (host's) state. But it seems we know the state before running the
> service? What does the actual discovery?
>

The application rename would be good, but not something I'm looking to
tackle here.  The discovery of the system state (i.e. is the host up)
is actually done within the chassis and host reset targets.  That
drives the discovery of pgood and host status and sets up the
host-state applications appropriately.

>> o phosphor-host-state-manager has to be started always and early, so
>> that it can monitor for any state change requests (like in the case
>> where the bmc is rebooted while the host is still up).
>
> This isn't a state change for the host though, if it remains up?
>
> I guess the idea is to start phosphor-host-state-manager seeded with the
> discovered state, and then let a service (started after
> phosphor-host-state-manager) poke phosphor-host-state-manager to enforce the
> policy?
>

Yep!  Except phosphor-host-state-manager initially just uses the value
written to the BMC filesystem (it does no real discovery).  The
discover_system_state service will then enforce the policy.

>> o Having all the policy in a single application is more testable and obvious
>
> Sounds ideal on the surface of things.
>
>>
>> https://github.com/openbmc/openbmc/issues/2210 is tracking this.
>>
>> Thoughts or questions?
>
> Yeah - well a comment more than anything: From reading your email and then
> reading #2210, it was not at all clear in the email that the *host reboot* case
> is what you're trying to resolve. It seemed like you were just trying to clean
> up a case of misplaced code ("[have] all policy in a single application").
>
> Andrew
>
>>
>> Andrew