[PATCH] cxl: Fix unbalanced pci_dev_get in cxl_probe

Daniel Axtens dja at axtens.net
Wed Sep 9 16:43:29 AEST 2015


mpe asked me to clarify that this was correct. Turns out it's not.

Currently, cxl_probe(pdev):
 1) calls pci_dev_get(pdev)
 2) calls cxl_adapter_init
    a) init calls cxl_adapter_alloc, which creates a struct cxl, 
       conventionally called adapter. This struct contains a 
       device entry, adapter->dev.

    b) init calls cxl_configure_adapter, where we set 
       adapter->dev.parent = &dev->dev (here dev is the pci dev)

So at this point, the cxl adapter's device's parent is the pci device
that I want to be refcounted.

    c) init calls cxl_register_adapter (which inexplicably is in file.c)

       *) cxl_register_adapter calls device_register(&adapter->dev) 

So now we're in device_register, where dev is the adapter device, and we
want to know if the PCI device is safe after we return.

device_register(&adapter->dev) calls device_initialize() and then
device_add().

device_add() does a get_device(). That ends up being a kobject_get() on
the adapter device kobj, which will increment the kref on the adapter
device. The PCI device doesn't get it's kref incremented explicitly.

It looks like we're not protected against that disappearing randomly.

So thanks, mpe, that was a good catch. I'll do a v2 that keeps the get
and adds a matching put.

Regards,
Daniel

On Wed, 2015-09-09 at 15:17 +1000, Daniel Axtens wrote:
> Currently the first thing we do in cxl_probe is to grab a reference
> on the pci device. Later on, we call device_register on our adapter,
> which also holds the PCI device.
> 
> In our remove path, we call device_unregister, but we never call
> pci_dev_put. We therefore leak the device every time we do a
> reflash.
> 
> device_register/unregister is sufficient to hold the reference.
> Drop the call to pci_dev_get.
> 
> Fixes: f204e0b8cedd ("cxl: Driver code for powernv PCIe based cards for userspace access")
> Cc: stable at vger.kernel.org
> Signed-off-by: Daniel Axtens <dja at axtens.net>
> 
> ---
> 
> This is the cxl bug that caused me to catch this a few weeks back:
> e642d11bdbfe ("powerpc/eeh: Probe after unbalanced kref check")
> 
> I put an printk in the unbalanced kref path and confirmed that it
> was printed with the pci_dev_get in and went away with the
> pci_dev_get out.
> ---
>  drivers/misc/cxl/pci.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
> index 02c85160bfe9..a5e977192b61 100644
> --- a/drivers/misc/cxl/pci.c
> +++ b/drivers/misc/cxl/pci.c
> @@ -1249,8 +1249,6 @@ static int cxl_probe(struct pci_dev *dev, const struct pci_device_id *id)
>  	int slice;
>  	int rc;
>  
> -	pci_dev_get(dev);
> -
>  	if (cxl_verbose)
>  		dump_cxl_config_space(dev);
>  

-- 
Regards,
Daniel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 860 bytes
Desc: This is a digitally signed message part
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20150909/cc086bb1/attachment.sig>


More information about the Linuxppc-dev mailing list