[PATCH 1/2] powerpc/powernv: reduce multi-hit of iommu_add_device()

Alexey Kardashevskiy aik at au1.ibm.com
Tue Apr 29 17:55:48 EST 2014


On 04/29/2014 04:49 PM, Wei Yang wrote:
> On Mon, Apr 28, 2014 at 11:35:32PM +1000, Alexey Kardashevskiy wrote:
>> On 04/23/2014 12:26 PM, Wei Yang wrote:
>>> During the EEH hotplug event, iommu_add_device() will be invoked three times
>>> and two of them will trigger warning or error.
>>>
>>> The three times to invoke the iommu_add_device() are:
>>>
>>>     pci_device_add
>>>        ...
>>>        set_iommu_table_base_and_group   <- 1st time, fail
>>>     device_add
>>>        ...
>>>        tce_iommu_bus_notifier           <- 2nd time, succees
>>>     pcibios_add_pci_devices
>>>        ...
>>>        pcibios_setup_bus_devices        <- 3rd time, re-attach
>>>
>>> The first time fails, since the dev->kobj->sd is not initialized. The
>>> dev->kobj->sd is initialized in device_add().
>>> The third time's warning is triggered by the re-attach of the iommu_group.
>>>
>>> After applying this patch, the error
>>
>> Nack.
>>
>> The patch still seems incorrect and we actually need to remove
>> tce_iommu_bus_notifier completely as pcibios_setup_bus_devices is called
>>from another notifier anyway. Could you please test it?
>>
>>
> 
> Hi, Alexey,
> 
> Nice to see your comment. Let me show what I got fist.
> 
> Generally, when kernel enumerate on the pci device, following functions will
> be invoked.
> 
>      pci_device_add
>         pcibios_setup_bus_device
>         ...
>         set_iommu_table_base_and_group   
>      device_add
>         ...
>         tce_iommu_bus_notifier           
>      pcibios_fixp_bus
>         pcibios_add_pci_devices
>         ...
>         pcibios_setup_bus_devices        
> 
>>From the call flow, we see for a normall pci device, the
> pcibios_setup_bus_device() will be invoked twice.


tce_iommu_bus_notifier should not be called for devices during boot-time
PCI discovery as the discovery is done from subsys_initcall handler and
tce_iommu_bus_notifier is registered as subsys_initcall_sync. Are you sure
this is happening? You should see warnings as for PF's EEH but you do not
say that.


> At the bootup time, none of them succeed to setup the dma, since the PE is not
> assigned or the tbl is not set. The iommu tbl and group is setup in
> pnv_pci_ioda_setup_DMA().
> 
> This call flow maintains the same when EEH error happens on Bus PE, while the
> behavior is a little different. 
> 
>      pci_device_add
>         pcibios_setup_bus_device
>         ...
>         set_iommu_table_base_and_group   <- fail, kobj->sd is not initialized


Sorry, what is uninitialized? The only kobj here is the one in iommu_group
and it must have been taken care of before adding a device.


>      device_add
>         ...
>         tce_iommu_bus_notifier           <- succeed
>      pcibios_fixp_bus
>         pcibios_add_pci_devices
>         ...
>         pcibios_setup_bus_devices        <- warning, re-attach


This is why I am suggesting getting rid of tce_iommu_bus_notifier - we will
avoid the warning.


> While this call flow will change a little on a VF. For a VF,
> pcibios_fixp_bus() will not be invoked. Current behavior is this.


You meant pcibios_fixup_bus? I would expect it to get called (from
pci_rescan_bus() or something like that?) and this seems to be the problem
here.

How are VFs added in terms of code flow? Is there any git tree to look at?



>      pci_device_add
>         pcibios_setup_bus_device
>         ...
>         set_iommu_table_base_and_group   <- fail, kobj->sd is not initialized
>      device_add
>         ...
>         tce_iommu_bus_notifier           <- succeed
> 
> And if an EEH error happens just on a VF, I believe the same call flow should
> go for recovery. (This is not set down, still under discussion with Gavin.)
> 
> My conclusion is:
> 1. remove the tce_iommu_bus_notifier completely will make the VF not work.
>    So I choose to revert the code and attach make the iommu group attachment
>    just happens in one place.
> 
> BTW, I know my patch is not a perfect one. For a PF, the tbl will still be set
> twice. I am not sure why we need to invoke pcibios_setup_device() twice on a
> PF, maybe some platform need to fixup some thing after the pci_bus is added.
> So I don't remove one of them to solve the problem.
> 
> If you have a better idea, I am glad to take it.
> 


-- 
Alexey Kardashevskiy
IBM OzLabs, LTC Team

e-mail: aik at au1.ibm.com
notes: Alexey Kardashevskiy/Australia/IBM



More information about the Linuxppc-dev mailing list