[Skiboot] [PATCH 0/3] improve ability for OPAL to cope with re-entry due

Fri Mar 23 16:54:52 AEDT 2018

Nicholas Piggin <npiggin at gmail.com> writes:
> The main patch of interest here is 2. This allows through a subset of
> opal calls that are used in the xmon/crash path even when we have
> interrupted another opal context. Idea is that stack is already trashed,
> so we might as well let some messages get to the console etc.
>
> Some of the Linux-side patches and results are posted here
>
> https://marc.info/?l=linuxppc-embedded&m=152119458117643&w=2
>
> Linux work is quite a bit more involved, so I would like to hopefully
> get the skiboot part merged sooner, and I've just sent the Linux patches
> for RFC until we agree on skiboot.
>
> I'd like to keep chipping away at skiboot robustness and debuggability
> vs these kinds of situations. It's slow going but I think doing it
> incrementally is working okay. 
>
> One important patch I recently posted was this one to make a quiesced
> CPU safe vs reentrancy.
>
> https://lists.ozlabs.org/pipermail/skiboot/2018-March/010729.html
>
> Future direction would be to detect reentrant call before we touch
> the stack, and then flip over to an emergency stack. Then we could
> have an OPAL call to print the OPAL stack and do other debuggy stuff
> which Linux crash dumps and xmon could use.
>
> Comments?

This set noticeably improves some situations that I was hitting with
some tests, so that's good... Merged to master as of
1090f346713ae43319cfeb3eb65ca24f9864b628

I'm thinking we actually want these in 5.10.x too, so merged there as of
c09197f74e2fe42034ecdc862cbea06f71767947. That way those on 5.10.x will
end up with a bit nicer situation rather than a flood of OPAL re-entry
warnings (I was looking at a bug like that today, you got 16 chars of
kernel backtrace at a time between dozens of cpu threads trying to
re-enter OPAL)
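
For anyone skimming the thread, here is a rough sketch of the idea in
patch 2: refuse most calls on re-entry, but let a small crash-path
subset through. It is not the actual skiboot code. Every name, token
value and the per-CPU flag below is hypothetical, and treating console
write/flush as the allowed subset is an assumption based on the cover
letter's "let some messages get to the console".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical token numbers for illustration; not skiboot's real values. */
enum demo_token {
	DEMO_CONSOLE_WRITE   = 1,
	DEMO_CONSOLE_FLUSH   = 2,
	DEMO_PCI_CONFIG_READ = 3,
};

/* Hypothetical per-CPU state; the real firmware state will differ. */
struct demo_cpu {
	bool in_opal_call;          /* set while this CPU is inside firmware */
	uint32_t reentry_warnings;  /* count of rejected re-entrant calls */
};

enum demo_entry_result {
	DEMO_ENTER,            /* normal, non-nested entry */
	DEMO_ENTER_REENTRANT,  /* nested entry, allowed for the crash path */
	DEMO_REJECT_BUSY,      /* nested entry, refused */
};

/* Assumed allow-list: only calls that get messages out to the console. */
static bool demo_token_ok_on_reentry(enum demo_token token)
{
	switch (token) {
	case DEMO_CONSOLE_WRITE:
	case DEMO_CONSOLE_FLUSH:
		return true;
	default:
		return false;
	}
}

/* Decide what to do with an incoming call on this CPU. */
static enum demo_entry_result demo_entry_check(struct demo_cpu *cpu,
					       enum demo_token token)
{
	if (!cpu->in_opal_call) {
		cpu->in_opal_call = true;
		return DEMO_ENTER;
	}

	/* We interrupted another firmware context on this CPU. */
	if (demo_token_ok_on_reentry(token))
		return DEMO_ENTER_REENTRANT;

	cpu->reentry_warnings++;
	return DEMO_REJECT_BUSY;
}
```

With something along these lines, a thread that crashes while inside
firmware can still push its backtrace out through the console calls,
while any other re-entrant call is refused and counted.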

I've been meaning to go poke at the quiesced cpu vs reentrancy patch,
but keep running out of hours in the day :(

-- 
Stewart Smith
OPAL Architect, IBM.