[PATCH] powerpc/powernv/npu: Fix reference leak

Alexey Kardashevskiy aik at ozlabs.ru
Mon Apr 29 16:01:29 AEST 2019



On 20/04/2019 01:34, Greg Kurz wrote:
> Since 902bdc57451c, get_pci_dev() calls pci_get_domain_bus_and_slot(). This
> has the effect of incrementing the reference count of the PCI device, as
> explained in drivers/pci/search.c:
> 
>  * Given a PCI domain, bus, and slot/function number, the desired PCI
>  * device is located in the list of PCI devices. If the device is
>  * found, its reference count is increased and this function returns a
>  * pointer to its data structure.  The caller must decrement the
>  * reference count by calling pci_dev_put().  If no device is found,
>  * %NULL is returned.
> 
> Nothing was done to call pci_dev_put() and the reference count of GPU and
> NPU PCI devices rockets up.
> 
> A natural way to fix this would be to teach the callers about the change,
> so that they call pci_dev_put() when done with the pointer. This turns
> out to be quite intrusive, as it affects many paths in npu-dma.c,
> pci-ioda.c and vfio_pci_nvlink2.c.


AFAICT this reference is only taken to protect the current traverser,
so what you've done is actually a natural way to fix it (the generic
pci_get_dev_by_id() does exactly the same), although it looks a bit weird.
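
For illustration only, here is a minimal userspace sketch of the pattern
being discussed: a lookup that takes a reference, wrapped by a helper that
drops it again and hands back a borrowed pointer. This is not kernel code.
The toy_* names and the struct are hypothetical, and it assumes, as the
patch comment does for the PE, that some other owner keeps a long-lived
reference so the pointer stays valid after the put.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct pci_dev; NOT the kernel structure. */
struct toy_dev {
	int id;
	int refcount;
};

/*
 * One device whose owner already holds a long-lived reference
 * (the role the PE plays in the patch comment), hence refcount 1.
 */
static struct toy_dev toy_devices[] = {
	{ .id = 7, .refcount = 1 },
};

/* Drop one reference; plays the role of pci_dev_put(). NULL is a no-op. */
static void toy_dev_put(struct toy_dev *dev)
{
	if (!dev)
		return;
	assert(dev->refcount > 0);
	dev->refcount--;
}

/*
 * Look up a device and take a reference on success; plays the role of
 * pci_get_domain_bus_and_slot(). Returns NULL if nothing matches.
 */
static struct toy_dev *toy_dev_get(int id)
{
	size_t i;

	for (i = 0; i < sizeof(toy_devices) / sizeof(toy_devices[0]); i++) {
		if (toy_devices[i].id == id) {
			toy_devices[i].refcount++;
			return &toy_devices[i];
		}
	}
	return NULL;
}

/*
 * Look up a device and return a borrowed pointer: the reference taken
 * by the lookup is dropped before returning, so repeated calls do not
 * inflate the count. Only safe while the owner's reference exists.
 */
static struct toy_dev *toy_dev_peek(int id)
{
	struct toy_dev *dev = toy_dev_get(id);

	toy_dev_put(dev);
	return dev;
}
```

In this model, calling toy_dev_peek() any number of times leaves the count
at the owner's single reference, whereas each bare toy_dev_get() without a
matching toy_dev_put() would raise it by one.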


> Also, the issue appeared in 4.16 and
> some affected code got moved around since then: it would be problematic
> to backport the fix to stable releases.
> 
> All that code never cared for reference counting anyway. Call pci_dev_put()
> from get_pci_dev() to revert to the previous behavior.
> 
> Fixes: 902bdc57451c ("powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn")
> Cc: stable at vger.kernel.org # v4.16
> Signed-off-by: Greg Kurz <groug at kaod.org>
> ---
>  arch/powerpc/platforms/powernv/npu-dma.c |   15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> index e713ade30087..d8f3647e8fb2 100644
> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> @@ -31,9 +31,22 @@ static DEFINE_SPINLOCK(npu_context_lock);
>  static struct pci_dev *get_pci_dev(struct device_node *dn)
>  {
>  	struct pci_dn *pdn = PCI_DN(dn);
> +	struct pci_dev *pdev;
>  
> -	return pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
> +	pdev = pci_get_domain_bus_and_slot(pci_domain_nr(pdn->phb->bus),
>  					   pdn->busno, pdn->devfn);
> +
> +	/*
> +	 * pci_get_domain_bus_and_slot() increased the reference count of
> +	 * the PCI device, but callers don't need that actually as the PE
> +	 * already holds a reference to the device.

IMHO the comment up to this point would be just enough.

Anyway,

Reviewed-by: Alexey Kardashevskiy <aik at ozlabs.ru>


How did you find it? :)


> Since callers aren't
> +	 * aware of the reference count change, call pci_dev_put() now to
> +	 * avoid leaks.
> +	 */
> +	if (pdev)
> +		pci_dev_put(pdev);
> +
> +	return pdev;
>  }
>  
>  /* Given a NPU device get the associated PCI device. */
> 

-- 
Alexey

