a3b2cb30 broken panic reporting for qemu guests

Mon Dec 4 16:49:14 AEDT 2017

On Fri, Dec 01, 2017 at 09:40:38PM +1000, Nicholas Piggin wrote:
> On Fri, 01 Dec 2017 22:11:50 +1100
> Michael Ellerman <mpe at ellerman.id.au> wrote:
> 
> > David Gibson <david at gibson.dropbear.id.au> writes:
> > 
> > > On Wed, Nov 29, 2017 at 02:23:43PM +1000, Nicholas Piggin wrote:  
> > >> On Wed, 29 Nov 2017 15:06:52 +1100
> > >> David Gibson <david at gibson.dropbear.id.au> wrote:
> > >>   
> > >> > a3b2cb30 "powerpc: Do not call ppc_md.panic in fadump panic notifier"
> > >> > purports to fix a problem when the kernel panics with fadump not
> > >> > registered, but it breaks something else instead.  I _think_ it was
> > >> > working on the incorrect assumption that ppc_md.panic was (or should
> > >> > be) only used with fadump, but I'm not really sure.
> > >> > 
> > >> > Panic works with kdump enabled, and (I think) with fadump enabled).
> > >> > However, with neither of these enabled, we always go to the generic
> > >> > panic logic.  
> > >> 
> > >> Yeah thanks, I can't remember what assumption I was working on tbh.
> > >>    
> > >> > That's incorrect for PAPR guests - they should call ibm,os-term via
> > >> > RTAS.  Under qemu this leads to a "GUEST_PANICKED" event notification
> > >> > which higher-level management pays attention to.  Since a3b2cb30 we
> > >> > now reboot instead of reporting that.
> > >> > 
> > >> > I believe it will also break panic for PS3 machines, but since that
> > >> > platform basically no longer exists, we probably don't care.  
> > >> 
> > >> I (hope) it should just go down to the normal panic path and not do
> > >> much worse than it already does -- although it won't print out that
> > >> message.
> > >>   
> > >> > I'm not entirely sure how to fix this.  I _think_ what we want is to
> > >> > call ppc_md.panic from a late panic notifier, the way this patch does
> > >> > for fadump_panic_event() if fadump is registered.  
> > >> 
> > >> The problem I had there is that some of the printk and console stuff
> > >> wasn't getting flushed out, so I was getting a blank screen. This was
> > >> probably in conjunction with panicing from NMI context that we're now
> > >> starting to introduce.
> > >> 
> > >> So it's a bit annoying. There's other ugliness we have for being unable
> > >> to control panic code well enough from arch code
> > >> (arch/powerpc/platforms/powernv/opal.c)
> > >> 
> > >> I guess a really minimal fix is to put an #ifdef powerpc down the bottom
> > >> there (/me *cries*).  
> > >
> > > Um.. right.  I'm not really sure from that how to go forward from
> > > here.  We want to fix this for RHEL7.5, which doesn't give us a lot of
> > > time.
> > >
> > > Adding the #ifdef at the bottom of the generic panic code is gross,
> > > but there's already a bunch of that, so maybe adequate until a better
> > > solution can be found?  
> > 
> > I think you mean put an #ifdef at the bottom of panic(). If so that
> > won't work. Our default panic_timeout is 180 so we never get to the
> > bottom of panic(), we call emergency_restart().
> > 
> > You *could* put an #ifdef powerpc before that, but that's even more
> > gross because it's in a different place to the sparc/s390 #ifdefs.
> > 
> > I notice we don't implement machine_emergency_restart(), it just
> > becomes machine_restart(NULL).
> > 
> > So it seems like that's the place we should be hooking,
> > machine_emergency_restart(). That's what x86 does.
> > 
> > But panic() is not the only caller of emergency_restart(), so it's not
> > an entirely straight forward conversion.
> > 
> > So I think for 4.15 and 4.14 I'm inclined to revert. Then we can do a
> > bigger rework for 4.16.
> > 
> > Nick that shouldn't break your original aim too badly I think? ie. worst
> > case is we panic() but don't see the output if we came from NMI?
> 
> If the fix is not pretty obvious, then I guess revert. We actually
> do have a bit of a regression though, since we've started marking
> system reset interrupts as NMI. Previously a system reset would have
> a better chance of printing something there.
> 
> So I wonder is an ifdef really all that much worse just because it's not
> in the same exact place as the others? We do get bug reports that were
> triggered by a system reset from hypervisor console. Hopefully they would
> be running with crash dumps usually.

I don't think an ifdef there is really correct though.  Sounds like we
might instead need to move some console flushes before calling of all
the notifiers, so that things like platforms/pseries or pvpanic can
insert non-returning notifiers.

Or else we need a different mechanism from the existing panic
notifiers for hooking in a "how to die" function from platform or
device.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20171204/28117535/attachment-0001.sig>