[Skiboot] [PATCH] Run pollers in time_wait() when not booting

Stewart Smith stewart at linux.ibm.com
Tue Oct 30 18:13:47 AEDT 2018

This only bit us hard with hioma in one scenario.

Our OPAL API has been OPAL_POLL_EVENTS may be needed to make forward
progress on ongoing operations, and the internal to skiboot API has been
that time_wait() of a suitable time will run pollers (on at least one
CPU) to help ensure forward progress can be made.

In a perfect world, interrupts are used but they may a) be disabled, or
b) the thing we're doing can't use interrupts because computers are
generally terrible.

Back in 3db397ea5892a (circa 2015), we changed skiboot so that we'd run
pollers only on the boot CPU, and not if we held any locks. This was to
reduce the chance of programming code that could deadlock, as well as to
ensure that we didn't just thrash all the cachelines for running pollers
all over a large system during boot, or hard spin on the same locks on
all secondary CPUs.

The problem arises if the OS we're booting makes an OPAL call early on,
with interrupts disabled, that requires a poller to run to make forward
progress. An example of this would be OPAL_WRITE_NVRAM early in Linux
boot (where Linux sets up the partitions it wants) - something that
occurs iff we've had to reformat NVRAM this boot (i.e. first boot or
corrupted NVRAM).

The hiomap implementation should arguably *not* rely on synchronous IPMI
messages, but this is a future improvement (as was for mbox before it).
The mbox-flash code solved this problem by spinning on check_timers().

More generically though, the approach of running the pollers when no
longer booting means we behave more in line with what the API is meant
to be, rather than have this odd case of "time_wait() for a condition
that could also be tripped by an interrupt works fine unless the OS is
up and running but hasn't set interrupts up yet".

Fixes: 529bdca0bc546a7ae3ecbd2c3134b7260072d8b0
Fixes: 3db397ea5892a8b348cf412739996731884561b3
Signed-off-by: Stewart Smith <stewart at linux.ibm.com>
 core/timebase.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/core/timebase.c b/core/timebase.c
index 24e72959f542..f9f1a55b8dbb 100644
--- a/core/timebase.c
+++ b/core/timebase.c
@@ -57,7 +57,7 @@ void time_wait(unsigned long duration)
-	if (c != boot_cpu)
+	if (c != boot_cpu && opal_booting())

More information about the Skiboot mailing list