[PATCH] Don't panic when EEH_MAX_FAILS is exceeded

Nathan Lynch ntl at pobox.com
Mon Jul 21 06:33:33 EST 2008


Mike Mason wrote:
>
> This patch changes the EEH_MAX_FAILS action from panic to printing
> an error message.  Panicking under this condition is too
> harsh.  Although performance will be affected and the device may not
> recover, the system is still running, which, at the very least,
> should allow for a more graceful shutdown.  The panic() is now
> wrapped in a DEBUG statement for development purposes.  The patch
> also removes the msleep() within a spinlock, which is not allowed.

> @@ -509,18 +510,24 @@

For ease of review, please try to use diff -p to generate patches.

> 	rc = 1;
> 	if (pdn->eeh_mode & EEH_MODE_ISOLATED) {
> 		pdn->eeh_check_count ++;
> -		if (pdn->eeh_check_count >= EEH_MAX_FAILS) {
> -			printk (KERN_ERR "EEH: Device driver ignored %d bad reads, panicing\n",
> -			        pdn->eeh_check_count);
> +		if (pdn->eeh_check_count % EEH_MAX_FAILS == 0) {
> +			location = (char *) of_get_property(dn, "ibm,loc-code", NULL);

Unneeded cast here, I think.
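
To illustrate the point about the cast, here is a minimal userspace
sketch. It assumes of_get_property() returns const void *, and that
"location" can be declared const char *. get_property_stub(),
lookup_location() and the "example-slot" value are hypothetical
stand-ins, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for of_get_property(): returns an untyped,
 * read-only pointer to the property value, or NULL if it is absent. */
static const void *get_property_stub(const char *name)
{
	static const char loc[] = "example-slot";	/* invented value */

	return strcmp(name, "ibm,loc-code") == 0 ? loc : NULL;
}

/* No cast needed: in C, const void * converts implicitly to
 * const char *.  A (char *) cast would only hide the dropped const. */
static const char *lookup_location(const char *name)
{
	const char *location;

	assert(name != NULL);
	location = get_property_stub(name);
	return location ? location : "?";
}
```

Falling back to "?" also avoids handing a NULL pointer to a %s format
when the property is missing; whether the patch needs that is a
separate question.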

> +			printk (KERN_ERR "EEH: %d reads ignored for recovering device at "
> +				"location=%s driver=%s pci addr=%s\n",
> +				pdn->eeh_check_count, location,
> +				dev->driver->name, pci_name(dev));
> +			printk (KERN_ERR "EEH: Might be infinite loop in %s driver\n",
> +				dev->driver->name);
> +#ifdef DEBUG
> 			dump_stack();
> -			msleep(5000);
> -			
> +
> 			/* re-read the slot reset state */
> 			if (read_slot_reset_state(pdn, rets) != 0)
> 				rets[0] = -1;	/* reset state unknown */
>
> 			/* If we are here, then we hit an infinite loop. Stop. */
> 			panic("EEH: MMIO halt (%d) on device:%s\n", rets[0], pci_name(dev));
> +#endif

While I tend to agree that panic() is unnecessary, don't we want a
stack dump unconditionally (i.e. not bracketed in #ifdef DEBUG)?

I'd prefer just removing the code instead of adding #ifdef's in the
middle of this function.  eeh.c needs less #ifdef DEBUG, not more :)


