[PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device()
Gavin Shan
gwshan at linux.vnet.ibm.com
Wed Apr 30 10:28:12 EST 2014
On Tue, Apr 29, 2014 at 02:49:55PM +0800, Wei Yang wrote:
>On Mon, Apr 28, 2014 at 11:35:32PM +1000, Alexey Kardashevskiy wrote:
>>On 04/23/2014 12:26 PM, Wei Yang wrote:
.../...
>Generally, when the kernel enumerates a PCI device, the following functions
>are invoked.
>
> pci_device_add
> pcibios_setup_bus_device
> ...
> set_iommu_table_base_and_group
> device_add
> ...
> tce_iommu_bus_notifier
> pcibios_fixup_bus
> pcibios_add_pci_devices
> ...
> pcibios_setup_bus_devices
>
>From the call flow, we see that for a normal PCI device,
>pcibios_setup_bus_device() is invoked twice.
>
>At boot time, neither call succeeds in setting up DMA, since the PE is not
>assigned or the tbl is not set. The iommu tbl and group are set up in
>pnv_pci_ioda_setup_DMA().
>
Yes, we don't assign PE# for PCI devices until ppc_md.pcibios_fixup().
The IOMMU group and IOMMU group device are registered in ppc_md.pcibios_fixup() as well.
As Alexey already pointed out, "tce_iommu_bus_notifier" doesn't take effect
during the system boot stage.
>The call flow stays the same when an EEH error happens on a Bus PE, but the
>behavior is a little different.
>
> pci_device_add
> pcibios_setup_bus_device
> ...
> set_iommu_table_base_and_group <- fail, kobj->sd is not initialized
> device_add
> ...
> tce_iommu_bus_notifier <- succeed
> pcibios_fixup_bus
> pcibios_add_pci_devices
> ...
> pcibios_setup_bus_devices <- warning, re-attach
>
>The call flow changes a little for a VF. For a VF,
>pcibios_fixup_bus() will not be invoked. The current behavior is this.
>
> pci_device_add
> pcibios_setup_bus_device
> ...
> set_iommu_table_base_and_group <- fail, kobj->sd is not initialized
> device_add
> ...
> tce_iommu_bus_notifier <- succeed
>
It seems that we have 2 problems here:
- For the non-SRIOV case, pcibios_setup_device() is called twice. That
seems incorrect. We could simply remove pcibios_setup_bus_devices()
from pcibios_fixup_bus().
- It's too early to register the IOMMU group/device in pnv_pci_ioda_dma_dev_setup()
because the sysfs entries of the PCI device aren't finalized yet. So we could
remove all the logic we have in pnv_pci_ioda_dma_dev_setup() and rely purely
on "tce_iommu_bus_notifier".
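To make the two points above concrete, here is a toy user-space model in C. It
is only a sketch: struct toy_dev and every toy_* function are hypothetical
names made up for illustration and are not kernel code. It assumes, as
described in this thread, that an attach attempted before device_add() fails
because the sysfs entries are not ready, that the attach from the bus notifier
succeeds, and that a later attach on the same device triggers a re-attach
warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a PCI device; these are not real kernel fields. */
struct toy_dev {
    bool sysfs_ready;   /* models kobj->sd being set up by device_add() */
    bool in_group;      /* models membership in an IOMMU group */
    int  attach_calls;  /* number of attach attempts */
    int  warnings;      /* number of re-attach warnings */
};

/* Hypothetical model of the attach step: fails early, warns on re-attach. */
static int toy_iommu_add_device(struct toy_dev *d)
{
    assert(d != NULL);
    d->attach_calls++;
    if (!d->sysfs_ready)
        return -1;      /* too early: sysfs entries not finalized */
    if (d->in_group) {
        d->warnings++;  /* device is already in a group */
        return -1;
    }
    d->in_group = true;
    return 0;
}

/* Flow on a Bus PE after an EEH error, as quoted above. */
static void toy_flow_current(struct toy_dev *d)
{
    toy_iommu_add_device(d);  /* pci_device_add path: fails */
    d->sysfs_ready = true;    /* device_add() */
    toy_iommu_add_device(d);  /* bus notifier: succeeds */
    toy_iommu_add_device(d);  /* pcibios_fixup_bus path: warning */
}

/* Suggested flow: only the bus notifier attaches the device. */
static void toy_flow_proposed(struct toy_dev *d)
{
    d->sysfs_ready = true;    /* device_add() */
    toy_iommu_add_device(d);  /* bus notifier: succeeds */
}
```

Running toy_flow_current() on a zeroed struct toy_dev ends with three attach
attempts and one warning, while toy_flow_proposed() ends with a single
successful attach and no warning. In both cases the device ends up in a group.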
By the way, I have never tried EEH on SRIOV PF/VFs. However, I have never hit
a similar issue in non-SRIOV cases.
Thanks,
Gavin