[PATCH] powerpc/eeh: Probe after unbalanced kref check

Gavin Shan gwshan at linux.vnet.ibm.com
Fri Aug 14 17:30:45 AEST 2015

On Fri, Aug 14, 2015 at 04:03:19PM +1000, Daniel Axtens wrote:
>In the complete hotplug case, EEH PEs are supposed to be released
>and set to NULL. Normally, this is done by eeh_remove_device(),
>which is called from pcibios_release_device().
>
>However, if something is holding a kref to the device, it will not
>be released, and the PE will remain. eeh_add_device_late() has
>a check for this which will explicitly destroy the PE in this case.
>
>This check in eeh_add_device_late() occurs after a call to
>eeh_ops->probe(). On PowerNV, probe is a pointer to pnv_eeh_probe(),
>which will exit without probing if there is an existing PE.
>
>This means that on PowerNV, devices with outstanding krefs will not
>be rediscovered by EEH correctly after a complete hotplug. This is
>affecting CXL (CAPI) devices in the field.
>
>Put the probe after the kref check so that the PE is destroyed
>and affected devices are correctly rediscovered by EEH.
>
>Fixes: d91dafc02f42 ("powerpc/eeh: Delay probing EEH device during hotplug")
>Cc: stable at vger.kernel.org
>Cc: Gavin Shan <gwshan at linux.vnet.ibm.com>
>Signed-off-by: Daniel Axtens <dja at axtens.net>

Acked-by: Gavin Shan <gwshan at linux.vnet.ibm.com>


> arch/powerpc/kernel/eeh.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
>index af9b597b10af..8e61d717915e 100644
>--- a/arch/powerpc/kernel/eeh.c
>+++ b/arch/powerpc/kernel/eeh.c
>@@ -1116,9 +1116,6 @@ void eeh_add_device_late(struct pci_dev *dev)
> 		return;
> 	}
>
>-	if (eeh_has_flag(EEH_PROBE_MODE_DEV))
>-		eeh_ops->probe(pdn, NULL);
>-
> 	/*
> 	 * The EEH cache might not be removed correctly because of
> 	 * unbalanced kref to the device during unplug time, which
>@@ -1142,6 +1139,9 @@ void eeh_add_device_late(struct pci_dev *dev)
> 		dev->dev.archdata.edev = NULL;
> 	}
>
>+	if (eeh_has_flag(EEH_PROBE_MODE_DEV))
>+		eeh_ops->probe(pdn, NULL);
>+
> 	edev->pdev = dev;
> 	dev->dev.archdata.edev = edev;
