[Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs

Greg Kurz groug at kaod.org
Wed Nov 27 19:24:08 AEDT 2019


On Wed, 27 Nov 2019 18:09:40 +1100
Alexey Kardashevskiy <aik at ozlabs.ru> wrote:

> 
> 
> On 20/11/2019 12:28, Oliver O'Halloran wrote:
> > The comment here implies that we don't need to take a ref to the pci_dev
> > because the ioda_pe will always have one. This implies that the current
> > expection is that the pci_dev for an NPU device will *never* be torn
> > down since the ioda_pe having a ref to the device will prevent the
> > release function from being called.
> > 
> > In other words, the desired behaviour here appears to be leaking a ref.
> > 
> > Nice!
> 
> 
> There is a history: https://patchwork.ozlabs.org/patch/1088078/
> 
> We did not fix anything in particular then, we do not seem to be fixing
> anything now (in other words - we cannot test it in a normal natural
> way). I'd drop this one.
> 

Yeah, I didn't fix anything at the time. Just reverted to the ref
count behavior we had before:

https://patchwork.ozlabs.org/patch/829172/

Frederic recently posted his take on the same topic from the OpenCAPI
point of view:

http://patchwork.ozlabs.org/patch/1198947/

He seems to indicate the NPU devices as the real culprit because
nobody ever cared for them to be removable. Fixing that seems be
a chore nobody really wants to address obviously... :-\

> 
> 
> > 
> > Signed-off-by: Oliver O'Halloran <oohall at gmail.com>
> > ---
> >  arch/powerpc/platforms/powernv/npu-dma.c | 11 +++--------
> >  1 file changed, 3 insertions(+), 8 deletions(-)
> > 
> > diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> > index 72d3749da02c..2eb6e6d45a98 100644
> > --- a/arch/powerpc/platforms/powernv/npu-dma.c
> > +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> > @@ -28,15 +28,10 @@ static struct pci_dev *get_pci_dev(struct device_node *dn)
> >  			break;
> >  
> >  	/*
> > -	 * pci_get_domain_bus_and_slot() increased the reference count of
> > -	 * the PCI device, but callers don't need that actually as the PE
> > -	 * already holds a reference to the device. Since callers aren't
> > -	 * aware of the reference count change, call pci_dev_put() now to
> > -	 * avoid leaks.
> > +	 * NB: for_each_pci_dev() elevates the pci_dev refcount.
> > +	 * Caller is responsible for dropping the ref when it's
> > +	 * finished with it.
> >  	 */
> > -	if (pdev)
> > -		pci_dev_put(pdev);
> > -
> >  	return pdev;
> >  }
> >  
> > 
> 



More information about the Linuxppc-dev mailing list