a3b2cb30 broken panic reporting for qemu guests

David Gibson david at gibson.dropbear.id.au
Thu Nov 30 16:11:26 AEDT 2017


On Wed, Nov 29, 2017 at 02:23:43PM +1000, Nicholas Piggin wrote:
> On Wed, 29 Nov 2017 15:06:52 +1100
> David Gibson <david at gibson.dropbear.id.au> wrote:
> 
> > a3b2cb30 "powerpc: Do not call ppc_md.panic in fadump panic notifier"
> > purports to fix a problem when the kernel panics with fadump not
> > registered, but it breaks something else instead.  I _think_ it was
> > working on the incorrect assumption that ppc_md.panic was (or should
> > be) only used with fadump, but I'm not really sure.
> > 
> > Panic works with kdump enabled, and (I think) with fadump enabled.
> > However, with neither of these enabled, we always go to the generic
> > panic logic.
> 
> Yeah thanks, I can't remember what assumption I was working on tbh.
>  
> > That's incorrect for PAPR guests - they should call ibm,os-term via
> > RTAS.  Under qemu this leads to a "GUEST_PANICKED" event notification
> > which higher-level management pays attention to.  Since a3b2cb30 we
> > now reboot instead of reporting that.
> > 
> > I believe it will also break panic for PS3 machines, but since that
> > platform basically no longer exists, we probably don't care.
> 
> I (hope) it should just go down to the normal panic path and not do
> much worse than it already does -- although it won't print out that
> message.
> 
> > I'm not entirely sure how to fix this.  I _think_ what we want is to
> > call ppc_md.panic from a late panic notifier, the way this patch does
> > for fadump_panic_event() if fadump is registered.
> 
> The problem I had there is that some of the printk and console stuff
> wasn't getting flushed out, so I was getting a blank screen. This was
> probably in conjunction with panicking from NMI context that we're now
> starting to introduce.
> 
> So it's a bit annoying. There's other ugliness we have from being unable
> to control panic code well enough from arch code
> (arch/powerpc/platforms/powernv/opal.c).
> 
> I guess a really minimal fix is to put an #ifdef powerpc down the bottom
> there (/me *cries*).

Um.. right.  I'm not really sure from that how to go forward.  We want
to fix this for RHEL7.5, which doesn't give us a lot of time.

Adding the #ifdef at the bottom of the generic panic code is gross,
but there's already a bunch of that, so maybe adequate until a better
solution can be found?
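
For the sake of discussion, here is a rough userspace model of the
late-notifier idea mentioned above.  It is not kernel code and not a
proposed patch: every name in it (register_notifier, flush_console,
platform_panic, model_panic) is made up for illustration.  The only
thing it assumes about the real kernel is that notifier chains run
higher-priority entries first, so a platform hook registered at the
lowest priority runs after everything else, including whatever flushes
the console.  platform_panic stands in for ppc_md.panic (for a PAPR
guest, the ibm,os-term call) and is modelled as not returning, so the
generic reboot path is never reached.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Userspace model only; all names here are hypothetical. */
#define MAX_NOTIFIERS 8
#define MAX_LOG 8

/* A notifier returns 1 if it "does not return" (machine stopped). */
typedef int (*notifier_fn)(void);

struct notifier {
	notifier_fn fn;
	int priority;	/* higher runs earlier */
};

struct notifier chain[MAX_NOTIFIERS];
size_t chain_len;

const char *call_log[MAX_LOG];
size_t call_log_len;

void log_call(const char *name)
{
	assert(call_log_len < MAX_LOG);
	call_log[call_log_len++] = name;
}

/* Insert so the chain stays sorted by descending priority. */
void register_notifier(notifier_fn fn, int priority)
{
	size_t i = chain_len;

	assert(chain_len < MAX_NOTIFIERS);
	while (i > 0 && chain[i - 1].priority < priority) {
		chain[i] = chain[i - 1];
		i--;
	}
	chain[i].fn = fn;
	chain[i].priority = priority;
	chain_len++;
}

/* Stand-in for whatever gets printk/console output out. */
int flush_console(void)
{
	log_call("flush_console");
	return 0;
}

/* Stand-in for ppc_md.panic; modelled as terminal. */
int platform_panic(void)
{
	log_call("platform_panic");
	return 1;
}

/* Walk the chain; fall through to the generic path only if
 * no notifier stopped the machine. */
void model_panic(void)
{
	size_t i;

	for (i = 0; i < chain_len; i++)
		if (chain[i].fn())
			return;
	log_call("generic_reboot_path");
}
```

Registering platform_panic at INT_MIN and flush_console at 0, then
calling model_panic(), records flush_console followed by
platform_panic and never reaches the generic path, regardless of
registration order.  Whether the real console flushing can be made to
happen early enough, particularly from NMI context, is exactly the
part this model glosses over.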

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson