[PATCH 05/14] powerpc/eeh: Defer printing stack trace

Oliver O'Halloran oohall at gmail.com
Tue Sep 17 11:45:14 AEST 2019


On Tue, Sep 17, 2019 at 11:04 AM Sam Bobroff <sbobroff at linux.ibm.com> wrote:
>
> On Tue, Sep 03, 2019 at 08:15:56PM +1000, Oliver O'Halloran wrote:
> > Currently we print a stack trace in the event handler to help with
> > debugging EEH issues. In the case of suprise hot-unplug this is unneeded,
> > so we want to prevent printing the stack trace unless we know it's due to
> > an actual device error. To accomplish this, we can save a stack trace at
> > the point of detection and only print it once the EEH recovery handler has
> > determined the freeze was due to an actual error.
> >
> > Since the whole point of this is to prevent spurious EEH output we also
> > move a few prints out of the detection thread, or mark them as pr_debug
> > so anyone interested can get output from the eeh_check_dev_failure()
> > if they want.
> >
> > Signed-off-by: Oliver O'Halloran <oohall at gmail.com>
>
> I think this is a good change, and even in the normal case it will place
> the stacktrace closer to the rest of the recovery information.
>
> But, I think it would make more sense to put the stacktrace into the
> struct eeh_event, rather than the struct eeh_pe. Is there some reason
> we can't do that? (It would save a fair bit of memory!)

Two reasons:

1) the eeh_event structures are allocated with GFP_ATOMIC since
eeh_dev_check_failure() can be called from any context. Minimising the
number of atomic allocations we do is a good idea as a matter of
course.
2) We don't pass the eeh_event structure to the event handler
function. I guess we could, but... eh

I don't see the memory saving as hugely significant either. There's
always fewer eeh_pe structures than there are PCI devices since some
will share PEs (e.g. switches, multifunction cards) so you'd be saving
a dozen KB at most.

root at zaius1:~# lspci | wc -l
59
root at zaius1:~# echo $(( $(lspci | wc -l) * 64 * 8))
30208

I think we'll live...

>
> Cheers,
> Sam.


More information about the Linuxppc-dev mailing list