[Skiboot] [PATCH] Run pollers in time_wait() when not booting

Stewart Smith stewart at linux.ibm.com
Wed Oct 31 17:00:35 AEDT 2018


Stewart Smith <stewart at linux.ibm.com> writes:
> This only bit us hard with hioma in one scenario.
>
> Our OPAL API has been OPAL_POLL_EVENTS may be needed to make forward
> progress on ongoing operations, and the internal to skiboot API has been
> that time_wait() of a suitable time will run pollers (on at least one
> CPU) to help ensure forward progress can be made.
>
> In a perfect world, interrupts are used but they may a) be disabled, or
> b) the thing we're doing can't use interrupts because computers are
> generally terrible.
>
> Back in 3db397ea5892a (circa 2015), we changed skiboot so that we'd run
> pollers only on the boot CPU, and not if we held any locks. This was to
> reduce the chance of programming code that could deadlock, as well as to
> ensure that we didn't just thrash all the cachelines for running pollers
> all over a large system during boot, or hard spin on the same locks on
> all secondary CPUs.
>
> The problem arises if the OS we're booting makes an OPAL call early on,
> with interrupts disabled, that requires a poller to run to make forward
> progress. An example of this would be OPAL_WRITE_NVRAM early in Linux
> boot (where Linux sets up the partitions it wants) - something that
> occurs iff we've had to reformat NVRAM this boot (i.e. first boot or
> corrupted NVRAM).
>
> The hiomap implementation should arguably *not* rely on synchronous IPMI
> messages, but this is a future improvement (as was for mbox before it).
> The mbox-flash code solved this problem by spinning on check_timers().
>
> More generically though, the approach of running the pollers when no
> longer booting means we behave more in line with what the API is meant
> to be, rather than have this odd case of "time_wait() for a condition
> that could also be tripped by an interrupt works fine unless the OS is
> up and running but hasn't set interrupts up yet".
>
> Fixes: 529bdca0bc546a7ae3ecbd2c3134b7260072d8b0
> Fixes: 3db397ea5892a8b348cf412739996731884561b3
> Signed-off-by: Stewart Smith <stewart at linux.ibm.com>
> ---
>  core/timebase.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

FIxed up Andrew's comments and merged to master as of d10862e40cd8ef36fe49b298f3c615ac73dd3c4e

-- 
Stewart Smith
OPAL Architect, IBM.



More information about the Skiboot mailing list