[PATCH V9 08/18] powrepc/pci: Refactor pci_dn

Wei Yang weiyang at linux.vnet.ibm.com
Thu Nov 20 18:25:08 AEDT 2014


On Thu, Nov 20, 2014 at 12:02:13PM +1100, Gavin Shan wrote:
>On Wed, Nov 19, 2014 at 04:30:24PM -0700, Bjorn Helgaas wrote:
>>On Sun, Nov 02, 2014 at 11:41:24PM +0800, Wei Yang wrote:
>>> From: Gavin Shan <gwshan at linux.vnet.ibm.com>
>>> 
>>> pci_dn is the extension of PCI device node and it's created from
>>> device node. Unfortunately, VFs that are enabled dynamically by
>>> PF's driver and they don't have corresponding device nodes, and
>>> pci_dn. The patch refactors pci_dn to support VFs:
>>> 
>>>    * pci_dn is organized as a hierarchy tree. VF's pci_dn is put
>>>      to the child list of pci_dn of PF's bridge. pci_dn of other
>>>      device put to the child list of pci_dn of its upstream bridge.
>>> 
>>>    * VF's pci_dn is expected to be created dynamically when applying
>>>      final fixup to PF. VF's pci_dn will be destroyed when releasing
>>>      PF's pci_dev instance. pci_dn of other device is still created
>>>      from device node as before.
>>> 
>>>    * For one particular PCI device (VF or not), its pci_dn can be
>>>      found from pdev->dev.archdata.firmware_data, PCI_DN(devnode),
>>>      or parent's list. The fast path (fetching pci_dn through PCI
>>>      device instance) is populated during early fixup time.
>>> 
>>> Signed-off-by: Gavin Shan <gwshan at linux.vnet.ibm.com>
>>> ---
>>> ...
>>
>>> +struct pci_dn *add_dev_pci_info(struct pci_dev *pdev)
>>> +{
>>> +#ifdef CONFIG_PCI_IOV
>>> +	struct pci_dn *parent, *pdn;
>>> +	int i;
>>> +
>>> +	/* Only support IOV for now */
>>> +	if (!pdev->is_physfn)
>>> +		return pci_get_pdn(pdev);
>>> +
>>> +	/* Check if VFs have been populated */
>>> +	pdn = pci_get_pdn(pdev);
>>> +	if (!pdn || (pdn->flags & PCI_DN_FLAG_IOV_VF))
>>> +		return NULL;
>>> +
>>> +	pdn->flags |= PCI_DN_FLAG_IOV_VF;
>>> +	parent = pci_bus_to_pdn(pdev->bus);
>>> +	if (!parent)
>>> +		return NULL;
>>> +
>>> +	for (i = 0; i < pci_sriov_get_totalvfs(pdev); i++) {
>>> +		pdn = add_one_dev_pci_info(parent, NULL,
>>> +					   pci_iov_virtfn_bus(pdev, i),
>>> +					   pci_iov_virtfn_devfn(pdev, i));
>>
>>I'm not sure this makes sense, but I certainly don't know this code, so
>>maybe I'm missing something.
>>
>
>For ARI, Richard had some patches to fix the issue from firmware side.
>
>>pci_iov_virtfn_bus() and pci_iov_virtfn_devfn() depend on
>>pdev->sriov->stride and pdev->sriov->offset.  These are read from VF Stride
>>and First VF Offset in the SR-IOV capability by sriov_init(), which is
>>called before add_dev_pci_info():
>>
>>  pci_scan_child_bus
>>    pci_scan_slot
>>      pci_scan_single_device
>>	pci_device_add
>>	  pci_init_capabilities
>>	    pci_iov_init(PF)
>>	      sriov_init(PF, pos)
>>		pci_write_config_word(dev, pos + PCI_SRIOV_NUM_VF, 0)
>>		pci_read_config_word(dev, pos + PCI_SRIOV_VF_OFFSET, &offset)
>>		pci_read_config_word(dev, pos + PCI_SRIOV_VF_STRIDE, &stride)
>>		iov->offset = offset
>>		iov->stride = stride
>>
>>  pci_bus_add_devices
>>    pci_bus_add_device
>>      pci_fixup_device(pci_fixup_final)
>>	add_dev_pci_info
>>	  pci_iov_virtfn_bus
>>	    return ... + sriov->offset + (sriov->stride * id) ...
>>
>>But both First VF Offset and VF Stride change when ARI Capable Hierarchy or
>>NumVFs changes (SR-IOV spec sec 3.3.9, 3.3.10).  We set NumVFs to zero in
>>sriov_init() above.  We will change NumVFs to something different when a
>>driver calls pci_enable_sriov():
>>
>>  pci_enable_sriov
>>    sriov_enable
>>      pci_write_config_word(dev, iov->pos + PCI_SRIOV_NUM_VF, nr_virtfn)
>>
>>Now First VF Offset and VF Stride have changed from what they were when we
>>called pci_iov_virtfn_bus() above.
>>
>
>It's the case we missed: First VF Offset and VF Stride can change when
>PF's number of VFs is changed. It means the BDFN (Bus/Device/Function
>number) for one VF can't be determined until PF's number of VFs is
>populated and updated to HW (before calling to virtfn_add()).
>
>The dynamically created pci_dn is used in PCI config accessors currently.
>That means we have to get it ready before first PCI config request to the
>VF in pci_setup_device(). In the code of old revision, we had some weak
>function called in pci_alloc_dev(), which gave platform chance to create
>pci_dn. I think we have to switch back to the old way in order to fix
>the problem you catched. However, the old way is implemented with cost
>of more weak function, which you're probably unhappy to see.
>
>  sriov_enable()
>    virtfn_add()
>      virtfn_add_bus()
>      pci_alloc_dev()
>      pci_setup_device()

Ok, sounds my solution in previous reply can't work. We need the pci_dn ready
before access the configuration space of VFs.


-- 
Richard Yang
Help you, Help me



More information about the Linuxppc-dev mailing list