[PATCH] powerpc/eeh: crash caused by null eeh_dev

Anton Blanchard anton at samba.org
Wed Apr 18 11:16:00 EST 2012


Hi Gavin,

> The problem was reported by Anton Blanchard. While EEH error
> happened to the PCI device without the corresponding device
> driver, kernel crash was seen. Eventually, I successfully
> reproduced the problem on Firebird-L machine with utility
> "errinjct". Initially, the device driver for Emulex ethernet
> MAC has been disabled from .config and force data parity on
> the Emulex ethernet MAC with help of "errinjct". Eventually,
> I saw the kernel crash after issueing couple of "lspci -v"
> command.
> 
> The root cause behind is that the PCI device, including the
> reference to the corresponding eeh device, will be removed
> from the system while EEH does recovery. Afterwards, the
> PCI device will be probed again and added into the system
> accordingly. So it's not safe to retrieve the eeh device from
> the corresponding PCI device after the PCI device has been removed
> and not added again.
> 
> The patch fixes the issue and retrieve the eeh device from OF node
> instead of PCI device after the PCI device has been removed.

Thanks, this does fix the oops I see.

Tested-by: Anton Blanchard <anton at samba.org>

Anton

> Signed-off-by: Gavin Shan <shangw at linux.vnet.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/eeh.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/eeh.c
> b/arch/powerpc/platforms/pseries/eeh.c index 309d38e..a75e37d 100644
> --- a/arch/powerpc/platforms/pseries/eeh.c
> +++ b/arch/powerpc/platforms/pseries/eeh.c
> @@ -1076,7 +1076,7 @@ static void eeh_add_device_late(struct pci_dev
> *dev) pr_debug("EEH: Adding device %s\n", pci_name(dev));
>  
>  	dn = pci_device_to_OF_node(dev);
> -	edev = pci_dev_to_eeh_dev(dev);
> +	edev = of_node_to_eeh_dev(dn);
>  	if (edev->pdev == dev) {
>  		pr_debug("EEH: Already referenced !\n");
>  		return;



More information about the Linuxppc-dev mailing list