a3b2cb30 broken panic reporting for qemu guests

David Gibson david at gibson.dropbear.id.au
Mon Dec 4 16:45:43 AEDT 2017


On Wed, Nov 29, 2017 at 02:23:43PM +1000, Nicholas Piggin wrote:
> On Wed, 29 Nov 2017 15:06:52 +1100
> David Gibson <david at gibson.dropbear.id.au> wrote:
> 
> > a3b2cb30 "powerpc: Do not call ppc_md.panic in fadump panic notifier"
> > purports to fix a problem when the kernel panics with fadump not
> > registered, but it breaks something else instead.  I _think_ it was
> > working on the incorrect assumption that ppc_md.panic was (or should
> > be) only used with fadump, but I'm not really sure.
> > 
> > Panic works with kdump enabled, and (I think) with fadump enabled.
> > However, with neither of these enabled, we always go to the generic
> > panic logic.
> 
> Yeah thanks, I can't remember what assumption I was working on tbh.
>  
> > That's incorrect for PAPR guests - they should call ibm,os-term via
> > RTAS.  Under qemu this leads to a "GUEST_PANICKED" event notification
> > which higher-level management pays attention to.  Since a3b2cb30 we
> > now reboot instead of reporting that.
> > 
> > I believe it will also break panic for PS3 machines, but since that
> > platform basically no longer exists, we probably don't care.
> 
> I (hope) it should just go down to the normal panic path and not do
> much worse than it already does -- although it won't print out that
> message.

Sounds plausible.

> > I'm not entirely sure how to fix this.  I _think_ what we want is to
> > call ppc_md.panic from a late panic notifier, the way this patch does
> > for fadump_panic_event() if fadump is registered.
> 
> The problem I had there is that some of the printk and console stuff
> wasn't getting flushed out, so I was getting a blank screen. This was
> probably in conjunction with panicking from NMI context that we're now
> starting to introduce.

Ok.  Which exact bit of panic() needed to run but wasn't getting
invoked?

AFAICT ppc_md.panic was already being called at the end of the panic
notifiers, by using INT_MIN priority.  Note that this is the same way
that the pvpanic device (used on x86 for similar panic notification
functionality) does it.  Well, pvpanic uses priority 1, which seems
less thorough than INT_MIN.
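
To illustrate the ordering point, here is a rough user-space model of a
priority-sorted notifier chain.  It is only a sketch, not kernel code:
struct notifier, record_tag(), chain_register() and chain_call() are
names made up for this example, and the only behaviour it assumes is
that entries with a higher priority value run earlier, so an entry
registered with INT_MIN runs after everything else.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* User-space model of a priority-sorted notifier chain. NOT kernel code. */
struct notifier {
    int (*call)(struct notifier *nb, void *data);
    struct notifier *next;
    int priority;   /* higher values run earlier */
    char tag;       /* hypothetical label, only used to record call order */
};

static char order[8];     /* tags in the order the callbacks ran */
static size_t order_len;

/* Callback shared by all entries: append this entry's tag to order[]. */
static int record_tag(struct notifier *nb, void *data)
{
    (void)data;
    assert(order_len < sizeof(order) - 1);
    order[order_len++] = nb->tag;
    return 0;
}

/* Insert in descending priority; equal priorities keep registration order. */
static void chain_register(struct notifier **head, struct notifier *nb)
{
    while (*head != NULL && (*head)->priority >= nb->priority)
        head = &(*head)->next;
    nb->next = *head;
    *head = nb;
}

/* Walk the chain from the front and return how many callbacks ran. */
static int chain_call(struct notifier *head, void *data)
{
    int n = 0;
    for (; head != NULL; head = head->next, n++)
        head->call(head, data);
    return n;
}
```

In this model, registering three entries with priorities INT_MIN, 1 and
0 (in that order) and then calling chain_call() runs the priority 1
entry first, the priority 0 entry second, and the INT_MIN entry last,
regardless of registration order.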

> So it's a bit annoying. There's other ugliness we have for being unable
> to control panic code well enough from arch code
> (arch/powerpc/platforms/powernv/opal.c)
> 
> I guess a really minimal fix is to put an #ifdef powerpc down the bottom
> there (/me *cries*).

That would work for the PAPR os-term thing, but wouldn't if we ever
had a specific device that worked like pvpanic.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson