[PATCH kernel] powerpc/powernv/ioda2: Update iommu table base on ownership change
David Gibson
david at gibson.dropbear.id.au
Wed Feb 22 13:23:28 AEDT 2017
On Tue, Feb 21, 2017 at 01:41:31PM +1100, Alexey Kardashevskiy wrote:
> On the POWERNV platform, in order to do DMA via the IOMMU (i.e. 32bit
> DMA in our case), a device needs an iommu_table pointer set via
> set_iommu_table_base().
>
> The codeflow is:
> - pnv_pci_ioda2_setup_dma_pe()
> - pnv_pci_ioda2_setup_default_config()
> - pnv_ioda_setup_bus_dma() [1]
>
> pnv_pci_ioda2_setup_dma_pe() creates IOMMU groups,
> pnv_pci_ioda2_setup_default_config() does default DMA setup,
> pnv_ioda_setup_bus_dma() takes a bus PE (on IODA2, all physical function
> PEs are bus PEs except NPU), walks through all underlying buses and
> devices, adds all devices to an IOMMU group and sets iommu_table.
>
> On IODA2, when VFIO is used, it takes ownership over a PE which means it
> removes all tables and creates new ones (with a possibility of sharing
> them among PEs). So when the ownership is returned from VFIO to
> the kernel, the iommu_table pointer written to a device at [1] is
> stale and needs an update.
>
> This adds an "add_to_group" parameter to pnv_ioda_setup_bus_dma()
> (in fact re-adds as it used to be there a while ago for different
> reasons) to tell the helper if a device needs to be added to
> an IOMMU group with an iommu_table update or just the latter.
>
> This calls pnv_ioda_setup_bus_dma(..., false) from
> pnv_ioda2_release_ownership() so when the ownership is restored,
> 32bit DMA can work again for a device. This does the same thing
> on obtaining ownership as the iommu_table pointer is stale at this
> point anyway and it is safer to have NULL there.
>
> We did not hit this earlier as all tested devices in recent years were
> only using 64bit DMA; the rare exception for this is MPT3 SAS adapter
> which uses both 32bit and 64bit DMA access and it has not been tested
> with VFIO much.
>
> Cc: Gavin Shan <gwshan at linux.vnet.ibm.com>
> Signed-off-by: Alexey Kardashevskiy <aik at ozlabs.ru>
Reviewed-by: David Gibson <david at gibson.dropbear.id.au>
> ---
>
> If this is applied before "powerpc/powernv/npu: Remove dead iommu code",
> there will be a minor conflict.
> ---
> arch/powerpc/platforms/powernv/pci-ioda.c | 17 ++++++++++++-----
> 1 file changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 51ec0dc1dfde..f5a2421bf164 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1774,17 +1774,20 @@ static u64 pnv_pci_ioda_dma_get_required_mask(struct pci_dev *pdev)
>  }
>
>  static void pnv_ioda_setup_bus_dma(struct pnv_ioda_pe *pe,
> -				   struct pci_bus *bus)
> +				   struct pci_bus *bus,
> +				   bool add_to_group)
>  {
>  	struct pci_dev *dev;
>
>  	list_for_each_entry(dev, &bus->devices, bus_list) {
>  		set_iommu_table_base(&dev->dev, pe->table_group.tables[0]);
>  		set_dma_offset(&dev->dev, pe->tce_bypass_base);
> -		iommu_add_device(&dev->dev);
> +		if (add_to_group)
> +			iommu_add_device(&dev->dev);
>
>  		if ((pe->flags & PNV_IODA_PE_BUS_ALL) && dev->subordinate)
> -			pnv_ioda_setup_bus_dma(pe, dev->subordinate);
> +			pnv_ioda_setup_bus_dma(pe, dev->subordinate,
> +					       add_to_group);
>  	}
>  }
>
> @@ -2190,7 +2193,7 @@ static void pnv_pci_ioda1_setup_dma_pe(struct pnv_phb *phb,
>  		set_iommu_table_base(&pe->pdev->dev, tbl);
>  		iommu_add_device(&pe->pdev->dev);
>  	} else if (pe->flags & (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL))
> -		pnv_ioda_setup_bus_dma(pe, pe->pbus);
> +		pnv_ioda_setup_bus_dma(pe, pe->pbus, true);
>
>  	return;
>  fail:
> @@ -2425,6 +2428,8 @@ static void pnv_ioda2_take_ownership(struct iommu_table_group *table_group)
>
>  	pnv_pci_ioda2_set_bypass(pe, false);
>  	pnv_pci_ioda2_unset_window(&pe->table_group, 0);
> +	if (pe->pbus)
> +		pnv_ioda_setup_bus_dma(pe, pe->pbus, false);
>  	pnv_ioda2_table_free(tbl);
>  }
>
> @@ -2434,6 +2439,8 @@ static void pnv_ioda2_release_ownership(struct iommu_table_group *table_group)
>  						table_group);
>
>  	pnv_pci_ioda2_setup_default_config(pe);
> +	if (pe->pbus)
> +		pnv_ioda_setup_bus_dma(pe, pe->pbus, false);
>  }
>
>  static struct iommu_table_group_ops pnv_pci_ioda2_ops = {
> @@ -2725,7 +2732,7 @@ static void pnv_pci_ioda2_setup_dma_pe(struct pnv_phb *phb,
>  		return;
>
>  	if (pe->flags & (PNV_IODA_PE_BUS | PNV_IODA_PE_BUS_ALL))
> -		pnv_ioda_setup_bus_dma(pe, pe->pbus);
> +		pnv_ioda_setup_bus_dma(pe, pe->pbus, true);
>  }
>
>
> #ifdef CONFIG_PCI_MSI
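
For readers following the pointer lifecycle described in the commit
message, here is a minimal userspace sketch of it. Everything named
toy_* is a hypothetical stand-in, not a kernel type or function: the
table pointer models what set_iommu_table_base() stores in a device,
in_group models the effect of iommu_add_device(), and the bus walk is
simplified to a single optional child bus instead of per-device
subordinates gated on PNV_IODA_PE_BUS_ALL. It also assumes that
unsetting the window leaves the PE's table slot NULL, which is what the
"safer to have NULL there" remark suggests.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins only; none of these are the kernel's real structures. */
struct toy_table { int id; };

struct toy_dev {
	struct toy_table *table_base;	/* models set_iommu_table_base() */
	bool in_group;			/* models iommu_add_device() */
};

struct toy_bus {
	struct toy_dev *devs;
	size_t ndevs;
	struct toy_bus *subordinate;	/* at most one child bus, for brevity */
};

struct toy_pe {
	struct toy_table *table;	/* models pe->table_group.tables[0] */
	struct toy_bus *pbus;
};

/* Models pnv_ioda_setup_bus_dma(pe, bus, add_to_group) from the patch. */
static void toy_setup_bus_dma(struct toy_pe *pe, struct toy_bus *bus,
			      bool add_to_group)
{
	for (size_t i = 0; i < bus->ndevs; i++) {
		/* Always refresh the per-device table pointer. */
		bus->devs[i].table_base = pe->table;
		/* Group membership is only established on first setup. */
		if (add_to_group)
			bus->devs[i].in_group = true;
	}
	if (bus->subordinate)
		toy_setup_bus_dma(pe, bus->subordinate, add_to_group);
}

/* Models taking ownership: drop the table, then clear device pointers. */
static void toy_take_ownership(struct toy_pe *pe)
{
	struct toy_table *old = pe->table;

	pe->table = NULL;	/* assumption: the window unset clears the slot */
	if (pe->pbus)
		toy_setup_bus_dma(pe, pe->pbus, false);
	free(old);		/* no device points at the freed table now */
}

/* Models releasing ownership: new default table, then refresh pointers. */
static int toy_release_ownership(struct toy_pe *pe, int new_id)
{
	struct toy_table *tbl = malloc(sizeof(*tbl));

	if (!tbl)
		return -1;
	tbl->id = new_id;
	pe->table = tbl;
	if (pe->pbus)
		toy_setup_bus_dma(pe, pe->pbus, false);
	return 0;
}
```

Without the two toy_setup_bus_dma(..., false) calls, every device would
keep pointing at the table freed in toy_take_ownership(), which is the
stale-pointer case the patch addresses. Passing false keeps the walk
from adding devices to the group a second time.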
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson