[PATCH 1/3] powernv/iov: Ensure the pdn for VFs always contains a valid PE number
Bjorn Helgaas
helgaas at kernel.org
Tue Oct 1 03:09:48 AEST 2019
On Mon, Sep 30, 2019 at 12:08:46PM +1000, Oliver O'Halloran wrote:
This is all powerpc, so I assume Michael will handle this. Just
random things I noticed; ignore if they don't make sense:
> On PowerNV we use the pcibios_sriov_enable() hook to do two things:
>
> 1. Create a pci_dn structure for each of the VFs, and
> 2. Configure the PHB's internal BARs that map MMIO ranges to PEs
> so that each VF has it's own PE. Note that the PE also determines
s/it's/its/
> the IOMMU table the HW uses for the device.
>
> Currently we do not set the pe_number field of the pci_dn immediately after
> assigning the PE number for the VF that it represents. Instead, we do that
> in a fixup (see pnv_pci_dma_dev_setup) which is run inside the
> pcibios_add_device() hook which is run prior to adding the device to the
> bus.
>
> On PowerNV we add the device to it's IOMMU group using a bus notifier and
s/it's/its/
> in order for this to work the PE number needs to be known when the bus
> notifier is run. This works today since the PE number is set in the fixup
> which runs before adding the device to the bus. However, if we want to move
> the fixup to a later stage this will break.
>
> We can fix this by setting the pdn->pe_number inside of
> pcibios_sriov_enable(). There's no good to avoid this since we already have
s/no good/no good reason/ ?
Not quite sure what "this" refers to ... "no good reason to avoid
setting pdn->pe_number in pcibios_sriov_enable()"? The double
negative makes it a little hard to parse.
> all the required information at that point, so... do that. Moving this
> earlier does cause two problems:
>
> 1. We trip the WARN_ON() in the fixup code, and
> 2. The EEH core will clear pdn->pe_number while recovering VFs.
>
> The only justification for either of these is a comment in eeh_rmv_device()
> suggesting that pdn->pe_number *must* be set to IODA_INVALID_PE in order
> for the VF to be scanned. However, this comment appears to have no basis in
> reality so just delete it.
>
> Signed-off-by: Oliver O'Halloran <oohall at gmail.com>
> ---
> Can't get rid of the fixup entirely since we need it to set the
> ioda_pe->pdev back-pointer. I'll look at killing that another time.
> ---
> arch/powerpc/kernel/eeh_driver.c | 6 ------
> arch/powerpc/platforms/powernv/pci-ioda.c | 19 +++++++++++++++----
> arch/powerpc/platforms/powernv/pci.c | 4 ----
> 3 files changed, 15 insertions(+), 14 deletions(-)
>
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index d9279d0..7955fba 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -541,12 +541,6 @@ static void eeh_rmv_device(struct eeh_dev *edev, void *userdata)
>
> pci_iov_remove_virtfn(edev->physfn, pdn->vf_index);
> edev->pdev = NULL;
> -
> - /*
> - * We have to set the VF PE number to invalid one, which is
> - * required to plug the VF successfully.
> - */
> - pdn->pe_number = IODA_INVALID_PE;
> #endif
> if (rmv_data)
> list_add(&edev->rmv_entry, &rmv_data->removed_vf_list);
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 5e3172d..70508b3 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1558,6 +1558,10 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
>
> /* Reserve PE for each VF */
> for (vf_index = 0; vf_index < num_vfs; vf_index++) {
> + int vf_devfn = pci_iov_virtfn_devfn(pdev, vf_index);
> + int vf_bus = pci_iov_virtfn_bus(pdev, vf_index);
> + struct pci_dn *vf_pdn;
> +
> if (pdn->m64_single_mode)
> pe_num = pdn->pe_num_map[vf_index];
> else
> @@ -1570,13 +1574,11 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
> pe->pbus = NULL;
> pe->parent_dev = pdev;
> pe->mve_number = -1;
> - pe->rid = (pci_iov_virtfn_bus(pdev, vf_index) << 8) |
> - pci_iov_virtfn_devfn(pdev, vf_index);
> + pe->rid = (vf_bus << 8) | vf_devfn;
>
> pe_info(pe, "VF %04d:%02d:%02d.%d associated with PE#%x\n",
Not related to *this* patch, but this looks like maybe it's supposed
to match the pci_name(), e.g., "%04x:%02x:%02x.%d" from
pci_setup_device()? If so, the "%04d:%02d:%02d" here will be
confusing since the decimal & hex won't always match.
> hose->global_number, pdev->bus->number,
Consider pci_domain_nr(bus) instead of hose->global_number? It would
be nice if you had the pci_dev * for each VF so you could just use
pci_name(vf) instead of all this domain/bus/PCI_SLOT/FUNC.
> - PCI_SLOT(pci_iov_virtfn_devfn(pdev, vf_index)),
> - PCI_FUNC(pci_iov_virtfn_devfn(pdev, vf_index)), pe_num);
> + PCI_SLOT(vf_devfn), PCI_FUNC(vf_devfn), pe_num);
>
> if (pnv_ioda_configure_pe(phb, pe)) {
> /* XXX What do we do here ? */
> @@ -1590,6 +1592,15 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
> list_add_tail(&pe->list, &phb->ioda.pe_list);
> mutex_unlock(&phb->ioda.pe_list_mutex);
>
> + /* associate this pe to it's pdn */
> + list_for_each_entry(vf_pdn, &pdn->parent->child_list, list) {
> + if (vf_pdn->busno == vf_bus &&
> + vf_pdn->devfn == vf_devfn) {
> + vf_pdn->pe_number = pe_num;
> + break;
> + }
> + }
> +
> pnv_pci_ioda2_setup_dma_pe(phb, pe);
> #ifdef CONFIG_IOMMU_API
> iommu_register_group(&pe->table_group,
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index 2825d00..b7761e2 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -816,16 +816,12 @@ void pnv_pci_dma_dev_setup(struct pci_dev *pdev)
> struct pnv_phb *phb = hose->private_data;
> #ifdef CONFIG_PCI_IOV
> struct pnv_ioda_pe *pe;
> - struct pci_dn *pdn;
>
> /* Fix the VF pdn PE number */
> if (pdev->is_virtfn) {
> - pdn = pci_get_pdn(pdev);
> - WARN_ON(pdn->pe_number != IODA_INVALID_PE);
> list_for_each_entry(pe, &phb->ioda.pe_list, list) {
> if (pe->rid == ((pdev->bus->number << 8) |
> (pdev->devfn & 0xff))) {
> - pdn->pe_number = pe->pe_number;
> pe->pdev = pdev;
> break;
> }
> --
> 2.9.5
>
More information about the Linuxppc-dev
mailing list