[PATCH] powerpc/eeh: Probe after unbalanced kref check
gwshan at linux.vnet.ibm.com
Fri Aug 14 17:30:45 AEST 2015
On Fri, Aug 14, 2015 at 04:03:19PM +1000, Daniel Axtens wrote:
>In the complete hotplug case, EEH PEs are supposed to be released
>and set to NULL. Normally, this is done by eeh_remove_device(),
>which is called from pcibios_release_device().
>
>However, if something is holding a kref to the device, it will not
>be released, and the PE will remain. eeh_add_device_late() has
>a check for this which will explicitly destroy the PE in this case.
>
>This check in eeh_add_device_late() occurs after a call to
>eeh_ops->probe(). On PowerNV, probe is a pointer to pnv_eeh_probe(),
>which will exit without probing if there is an existing PE.
>
>This means that on PowerNV, devices with outstanding krefs will not
>be rediscovered by EEH correctly after a complete hotplug. This is
>affecting CXL (CAPI) devices in the field.
>
>Put the probe after the kref check so that the PE is destroyed
>and affected devices are correctly rediscovered by EEH.
>
>Fixes: d91dafc02f42 ("powerpc/eeh: Delay probing EEH device during hotplug")
>Cc: stable at vger.kernel.org
>Cc: Gavin Shan <gwshan at linux.vnet.ibm.com>
>Signed-off-by: Daniel Axtens <dja at axtens.net>
Acked-by: Gavin Shan <gwshan at linux.vnet.ibm.com>
> arch/powerpc/kernel/eeh.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
>index af9b597b10af..8e61d717915e 100644
>@@ -1116,9 +1116,6 @@ void eeh_add_device_late(struct pci_dev *dev)
>-	if (eeh_has_flag(EEH_PROBE_MODE_DEV))
>-		eeh_ops->probe(pdn, NULL);
> 	 * The EEH cache might not be removed correctly because of
> 	 * unbalanced kref to the device during unplug time, which
>@@ -1142,6 +1139,9 @@ void eeh_add_device_late(struct pci_dev *dev)
> 		dev->dev.archdata.edev = NULL;
>+	if (eeh_has_flag(EEH_PROBE_MODE_DEV))
>+		eeh_ops->probe(pdn, NULL);
> 	edev->pdev = dev;
> 	dev->dev.archdata.edev = edev;
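
For anyone following the ordering argument, here is a minimal userspace sketch of why the move matters. It is not the kernel code: every struct and function name below (model_pe, model_edev, model_probe, model_kref_cleanup, model_add_device_late) is a hypothetical stand-in, and the only behaviour modelled is what the commit message describes, namely that the probe exits early when a PE already exists and that the kref check tears down the stale PE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not the real definitions. */
struct model_pe { int id; };

struct model_edev {
	struct model_pe *pe;	/* PE attached to the EEH device, or NULL */
	void *pdev;		/* stale device pointer left behind by a held kref */
};

static struct model_pe stale_pe = { 1 };
static struct model_pe fresh_pe = { 2 };

/* Models the described probe behaviour: do nothing if a PE already exists. */
static void model_probe(struct model_edev *edev)
{
	assert(edev != NULL);
	if (edev->pe)
		return;
	edev->pe = &fresh_pe;
}

/* Models the unbalanced-kref check: drop the stale PE and device pointer. */
static void model_kref_cleanup(struct model_edev *edev)
{
	assert(edev != NULL);
	if (edev->pdev) {
		edev->pe = NULL;
		edev->pdev = NULL;
	}
}

/*
 * Replays the late-add path after a hotplug in which a kref kept the old
 * device alive. Returns true if the device ends up with a freshly probed PE.
 */
static bool model_add_device_late(bool probe_before_cleanup)
{
	int old_dev, new_dev;	/* only their addresses are used */
	struct model_edev edev = { &stale_pe, &old_dev };

	if (probe_before_cleanup)
		model_probe(&edev);	/* old order: skipped, stale PE present */

	model_kref_cleanup(&edev);

	if (!probe_before_cleanup)
		model_probe(&edev);	/* new order: runs against a clean edev */

	edev.pdev = &new_dev;
	return edev.pe == &fresh_pe;
}
```

In this model, model_add_device_late(true) returns false because the probe is skipped and the cleanup then leaves the device with no PE at all, while model_add_device_late(false) returns true, which matches the behaviour the patch is after.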