[PATCH] powerpc/eeh: crash caused by null eeh_dev
Anton Blanchard
anton at samba.org
Wed Apr 18 11:16:00 EST 2012
Hi Gavin,
> The problem was reported by Anton Blanchard. When an EEH error
> occurred on a PCI device that has no bound device driver, the
> kernel crashed. I reproduced the problem on a Firebird-L machine
> with the "errinjct" utility: the device driver for the Emulex
> ethernet MAC was disabled in .config, and a data parity error
> was forced on the Emulex ethernet MAC with "errinjct". The
> kernel then crashed after issuing a couple of "lspci -v"
> commands.
>
> The root cause is that the PCI device, including its
> reference to the corresponding eeh device, is removed
> from the system while EEH does recovery. Afterwards, the
> PCI device is probed again and added back into the system.
> So it is not safe to retrieve the eeh device from
> the PCI device after the PCI device has been removed
> and before it has been added again.
>
> The patch fixes the issue by retrieving the eeh device from the OF
> node instead of the PCI device after the PCI device has been removed.
Thanks, this does fix the oops I see.
Tested-by: Anton Blanchard <anton at samba.org>
Anton
> Signed-off-by: Gavin Shan <shangw at linux.vnet.ibm.com>
> ---
> arch/powerpc/platforms/pseries/eeh.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/eeh.c b/arch/powerpc/platforms/pseries/eeh.c
> index 309d38e..a75e37d 100644
> --- a/arch/powerpc/platforms/pseries/eeh.c
> +++ b/arch/powerpc/platforms/pseries/eeh.c
> @@ -1076,7 +1076,7 @@ static void eeh_add_device_late(struct pci_dev *dev)
>  	pr_debug("EEH: Adding device %s\n", pci_name(dev));
>
>  	dn = pci_device_to_OF_node(dev);
> -	edev = pci_dev_to_eeh_dev(dev);
> +	edev = of_node_to_eeh_dev(dn);
>  	if (edev->pdev == dev) {
>  		pr_debug("EEH: Already referenced !\n");
>  		return;
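
For readers following the root cause described in the commit message, the lookup problem can be sketched in a small user-space model. This is only an illustration under stated assumptions: the structs below are simplified stand-ins rather than the kernel's real definitions, and every model_* name is hypothetical. It assumes that a freshly re-probed PCI device has no eeh device back-reference yet, while the OF node keeps its reference across the removal.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; the real kernel structures differ. */
struct pci_dev;

struct eeh_dev {
	struct pci_dev *pdev;	/* PCI device currently attached, or NULL */
};

struct device_node {
	struct eeh_dev *edev;	/* assumed to survive PCI device removal */
};

struct pci_dev {
	struct device_node *dn;
	struct eeh_dev *edev;	/* back-reference, NULL until EEH adds the device */
};

/* Hypothetical model of the lookup used before the patch. */
static struct eeh_dev *model_pci_dev_to_eeh_dev(struct pci_dev *dev)
{
	return dev->edev;
}

/* Hypothetical model of the lookup used after the patch. */
static struct eeh_dev *model_of_node_to_eeh_dev(struct device_node *dn)
{
	return dn->edev;
}

/*
 * Model of the late-add step with the patched lookup.
 * Returns 0 when the device was attached, 1 when it was already
 * referenced, and -1 when the OF node carries no eeh device.
 */
static int model_add_device_late(struct pci_dev *dev)
{
	struct eeh_dev *edev;

	assert(dev != NULL && dev->dn != NULL);

	edev = model_of_node_to_eeh_dev(dev->dn);
	if (!edev)
		return -1;
	if (edev->pdev == dev)
		return 1;

	edev->pdev = dev;	/* attach the re-probed PCI device */
	dev->edev = edev;	/* only now is the back-reference valid */
	return 0;
}
```

In this model, calling model_pci_dev_to_eeh_dev() on a re-probed device returns NULL, so a subsequent edev->pdev access would dereference a null pointer, whereas the OF node lookup still returns the eeh device.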