[PATCH] powerpc/eeh: Only dump stack once if an MMIO loop is detected

Michael Ellerman patch-notifications at ellerman.id.au
Wed Jan 29 16:17:22 AEDT 2020


On Wed, 2019-10-16 at 01:25:36 UTC, Oliver O'Halloran wrote:
> Many drivers don't check for errors when they get a 0xFFs response from an
> MMIO load. As a result, after an EEH event occurs a driver can get stuck in
> a polling loop unless it has some kind of internal timeout logic.
> 
> Currently EEH tries to detect and report stuck drivers by dumping a stack
> trace after eeh_dev_check_failure() is called EEH_MAX_FAILS times on an
> already frozen PE. The value of EEH_MAX_FAILS was chosen so that a dump
> would occur every few seconds if the driver was spinning in a loop. This
> results in a lot of spurious stack traces in the kernel log.
> 
> Fix this by limiting it to printing one stack trace for each PE freeze. If
> the driver is truly stuck the kernel's hung task detector is better suited
> to reporting the problem anyway.
> 
> Cc: Sam Bobroff <sbobroff at linux.ibm.com>
> Signed-off-by: Oliver O'Halloran <oohall at gmail.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/4e0942c0302b5ad76b228b1a7b8c09f658a1d58a
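
For readers skimming the archive, here is a rough userspace model of the
behaviour change described in the quoted commit message. It is only a sketch
and not the kernel code: struct pe_model, pe_check_failure(),
simulate_stuck_driver() and the MODEL_MAX_FAILS value are all hypothetical
names and numbers chosen for illustration. It assumes the old behaviour can be
modelled as "dump every MODEL_MAX_FAILS checks" and the new one as "dump only
when the count first reaches MODEL_MAX_FAILS".

```c
#include <stdbool.h>

/* Illustrative threshold only; not the kernel's EEH_MAX_FAILS value. */
#define MODEL_MAX_FAILS 5

/* Hypothetical stand-in for the per-PE state tracked while a PE is frozen. */
struct pe_model {
	bool frozen;              /* PE is currently frozen */
	unsigned int check_count; /* failure checks seen since the freeze */
	unsigned int dumps;       /* stack traces that would have been printed */
};

/* Start a new freeze: the check counter is reset for this freeze. */
static void pe_freeze(struct pe_model *pe)
{
	pe->frozen = true;
	pe->check_count = 0;
}

/*
 * Model of one failure check on an already frozen PE.
 * once_per_freeze == false: dump every MODEL_MAX_FAILS checks (old model).
 * once_per_freeze == true:  dump only the first time the threshold is hit.
 */
static void pe_check_failure(struct pe_model *pe, bool once_per_freeze)
{
	bool dump;

	if (!pe->frozen)
		return;

	pe->check_count++;
	if (once_per_freeze)
		dump = (pe->check_count == MODEL_MAX_FAILS);
	else
		dump = (pe->check_count % MODEL_MAX_FAILS == 0);

	if (dump)
		pe->dumps++; /* the real code would print a stack trace here */
}

/* Simulate a driver polling a frozen PE `polls` times; return dump count. */
static unsigned int simulate_stuck_driver(unsigned int polls,
					  bool once_per_freeze)
{
	struct pe_model pe = { 0 };
	unsigned int i;

	pe_freeze(&pe);
	for (i = 0; i < polls; i++)
		pe_check_failure(&pe, once_per_freeze);

	return pe.dumps;
}
```

In this model a driver that polls 23 times during one freeze would trigger
four dumps under the periodic scheme and one under the once-per-freeze scheme.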

cheers

