[PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device()

Gavin Shan gwshan at linux.vnet.ibm.com
Wed Apr 30 10:28:12 EST 2014


On Tue, Apr 29, 2014 at 02:49:55PM +0800, Wei Yang wrote:
>On Mon, Apr 28, 2014 at 11:35:32PM +1000, Alexey Kardashevskiy wrote:
>>On 04/23/2014 12:26 PM, Wei Yang wrote:

.../...

>Generally, when kernel enumerate on the pci device, following functions will
>be invoked.
>
>     pci_device_add
>        pcibios_setup_bus_device
>        ...
>        set_iommu_table_base_and_group   
>     device_add
>        ...
>        tce_iommu_bus_notifier           
>     pcibios_fixup_bus
>        pcibios_add_pci_devices
>        ...
>        pcibios_setup_bus_devices        
>
>From the call flow, we see for a normall pci device, the
>pcibios_setup_bus_device() will be invoked twice.
>
>At the bootup time, none of them succeed to setup the dma, since the PE is not
>assigned or the tbl is not set. The iommu tbl and group is setup in
>pnv_pci_ioda_setup_DMA().
>

Yes, we don't assign PE# for PCI devices until ppc_md.pcibios_fixup().
We gets IOMMU group and IOMMU group device registered in ppc_md.pcibios_fixup().

As Alexy already pointed out, "tce_iommu_bus_notifier" doesn't take effect
during system boot stage.

>This call flow maintains the same when EEH error happens on Bus PE, while the
>behavior is a little different. 
>
>     pci_device_add
>        pcibios_setup_bus_device
>        ...
>        set_iommu_table_base_and_group   <- fail, kobj->sd is not initialized
>     device_add
>        ...
>        tce_iommu_bus_notifier           <- succeed
>     pcibios_fixp_bus
>        pcibios_add_pci_devices
>        ...
>        pcibios_setup_bus_devices        <- warning, re-attach
>
>While this call flow will change a little on a VF. For a VF,
>pcibios_fixp_bus() will not be invoked. Current behavior is this.
>
>     pci_device_add
>        pcibios_setup_bus_device
>        ...
>        set_iommu_table_base_and_group   <- fail, kobj->sd is not initialized
>     device_add
>        ...
>        tce_iommu_bus_notifier           <- succeed
>

It seems that we have 2 problems here:

- For non-SRIOV case, pcibios_setup_device() is called for towice. That
  seems incorrect. We could simply remove pcibios_setup_bus_devices()
  from pcibios_fixup_bus().

- It's too early to register IOMMU group/device in pnv_pci_ioda_dma_dev_setup()
  because the sysfs entries of the PCI device aren't finalized yet. So we could
  remove all logic we have in pnv_pci_ioda_dma_dev_setup() and just purely rely
  on "tce_iommu_bus_notifier".

By the way, I never tried EEH on SRIOV PF/VFs. However, I never hit similar
issue in non-SRIOV cases.

Thanks,
Gavin



More information about the Linuxppc-dev mailing list