pci_pcie_cap invalid on AER/EEH enabled PPC?
Richard A Lary
rlary at linux.vnet.ibm.com
Wed Jul 6 10:14:40 EST 2011
On 7/5/2011 1:34 PM, Richard A Lary wrote:
> On 7/5/2011 10:22 AM, Richard A Lary wrote:
>> On 7/5/2011 9:18 AM, Jon Mason wrote:
>>> On Tue, Jul 5, 2011 at 10:41 AM, Richard A Lary
>>> <rlary at linux.vnet.ibm.com> wrote:
>>>> On 7/1/2011 1:00 PM, Richard A Lary wrote:
>>>>>
>>>>> On 7/1/2011 12:02 PM, Jon Mason wrote:
>>>>>>
>>>>>> On Fri, Jul 1, 2011 at 1:30 PM, Richard A Lary<rlary at linux.vnet.ibm.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> On 7/1/2011 8:24 AM, Jon Mason wrote:
>>>>>>>>
>>>>>>>> I recently sent out a number of patches to migrate drivers calling
>>>>>>>> `pci_find_capability(pdef, PCI_CAP_ID_EXP)` to pci_pcie_cap. This
>>>>>>>> function takes uses a PCI-E capability offset that was determined by
>>>>>>>> calling pci_find_capability during the PCI bus walking. In response
>>>>>>>> to one of the patches, James Smart posted:
>>>>>>>>
>>>>>>>> "The reason is due to an issue on PPC platforms whereby use of
>>>>>>>> "pdev->is_pcie" and pci_is_pcie() will erroneously fail under some
>>>>>>>> conditions, but explicit search for the capability struct via
>>>>>>>> pci_find_capability() is always successful. I expect this to be due
>>>>>>>> a shadowing of pci config space in the hal/platform that isn't
>>>>>>>> sufficiently built up. We detected this issue while testing AER/EEH,
>>>>>>>> and are functional only if the pci_find_capability() option is used."
>>>>>>>>
>>>>>>>> See http://marc.info/?l=linux-scsi&m=130946649427828&w=2 for the whole
>>>>>>>> post.
>>>>>>>>
>>>>>>>> Based on his description above pci_pcie_cap
>>>>>>>> andpci_find_capability(pdef, PCI_CAP_ID_EXP) should be functionally
>>>>>>>> equivalent. If this is not safe, then the PCI bus walking code is
>>>>>>>> most likely busted on EEH enabled PPC systems (and that is a BIG
>>>>>>>> problem). Can anyone confirm this is still an issue?
>>>>>>>
>>>>>>> Jon,
>>>>>>>
>>>>>>> I applied the following debug patch to lpfc driver in a 2.6.32 distro
>>>>>>> kernel ( I had this one handy, I can try with mainline later today )
>>>>>>>
>>>>>>> ---
>>>>>>> drivers/scsi/lpfc/lpfc_init.c | 10 10 + 0 - 0 !
>>>>>>> 1 file changed, 10 insertions(+)
>>>>>>>
>>>>>>> Index: b/drivers/scsi/lpfc/lpfc_init.c
>>>>>>> ===================================================================
>>>>>>> --- a/drivers/scsi/lpfc/lpfc_init.c
>>>>>>> +++ b/drivers/scsi/lpfc/lpfc_init.c
>>>>>>> @@ -3958,6 +3958,16 @@ lpfc_enable_pci_dev(struct lpfc_hba *phb
>>>>>>> pci_try_set_mwi(pdev);
>>>>>>> pci_save_state(pdev);
>>>>>>>
>>>>>>> + printk(KERN_WARNING "pcicap: is_pcie=%x pci_cap=%x pcie_type=%x\n",
>>>>>>> + pdev->is_pcie,
>>>>>>> + pdev->pcie_cap,
>>>>>>> + pdev->pcie_type);
>>>>>>> +
>>>>>>> + if (pci_is_pcie(pdev))
>>>>>>> + printk(KERN_WARNING "pcicap: true\n");
>>>>>>> + else
>>>>>>> + printk(KERN_WARNING "pcicap: false\n");
>>>>>>> +
>>>>>>> /* PCIe EEH recovery on powerpc platforms needs fundamental reset */
>>>>>>> if (pci_find_capability(pdev, PCI_CAP_ID_EXP))
>>>>>>> pdev->needs_freset = 1;
>>>>>>>
>>>>>>> This is output upon driver load on an IBM Power 7 model 8233-E8B server.
>>>>>>>
>>>>>>> dmesg | grep pcicap
>>>>>>> Linux version 2.6.32.42-pcicap-ppc64 (geeko at buildhost) (gcc version
>>>>>>> 4.3.4
>>>>>>> [gcc-4_3-branch revision 152973] (SUSE Linux) ) #1 SMP Fri Jul 1
>>>>>>> 09:31:27
>>>>>>> PDT 2011
>>>>>>> pcicap: is_pcie=0 pci_cap=0 pcie_type=0
>>>>>>> pcicap: false
>>>>>>> pcicap: is_pcie=0 pci_cap=0 pcie_type=0
>>>>>>> pcicap: false
>>>>>>> pcicap: is_pcie=0 pci_cap=0 pcie_type=0
>>>>>>> pcicap: false
>>>>>>> pcicap: is_pcie=0 pci_cap=0 pcie_type=0
>>>>>>> pcicap: false
>>>>>>>
>>>>>>> It would appear that the pcie information is not set in pci_dev
>>>>>>> structure
>>>>>>> for
>>>>>>> this device at the time the driver is being initialized during boot.
>>>>>>
>>>>>> Thanks for trying this. Can you confirm that the other devices in the
>>>>>> system have this issue as well (or show that it is isolated to the lpr
>>>>>> device)? You can add printks in set_pcie_port_type() to verify what
>>>>>> is being set on bus walking and to see when it is being called with
>>>>>> respect to when it is being populated by firmware.
>>>>>
>>>>> Jon,
>>>>>
>>>>> I will give this suggestion a try and post results
>>>>
>>>> On Power PC platforms, set_pcie_port_type() is not called. On Power PC,
>>>> pci_dev structure is initialized by of_create_pci_dev(). However, the
>>>> structure member pcie_cap is NOT computed nor set in this function.
>>>
>>> Yes, it is. of_create_pci_dev() calls set_pcie_port_type()
>>> http://lxr.linux.no/linux+v2.6.39/arch/powerpc/kernel/pci_of_scan.c#L144
>>>
>>> That function sets pdev->pcie_cap
>>> http://lxr.linux.no/linux+v2.6.39/drivers/pci/probe.c#L896
>>>
>>> So, it should be set. It looks like there is a bug in
>>> of_create_pci_dev, as set_pcie_port_type is being called BEFORE the
>>> BARs are setup. If you move set_pcie_port_type prior to
>>> pci_device_add (perhaps even after), then I bet the issue is resolved.
>>>
>>
>> The claim above was based upon observation that with this patch applied
>> to a 2.6.32 kernel, the printk output did not appear in dmesg upon boot.
>>
>> static void set_pcie_hotplug_bridge(struct pci_dev *pdev)
>> @@ -774,6 +780,8 @@ int pci_setup_device(struct pci_dev *dev
>> u8 hdr_type;
>> struct pci_slot *slot;
>>
>> + printk(KERN_WARNING "pcicap: setup_device %p\n", dev);
>> +
>> if (pci_read_config_byte(dev, PCI_HEADER_TYPE, &hdr_type))
>> return -EIO;
>>
>> I can make no claim about my understanding of pci device initialization
>> on Power PC, so I have added Anton Blanchard to the cc list.
>>
>> Perhaps Anton can explain why pcie_cap is always 0 on Power PC.
>
> I pulled down 3.0-rc6 kernel, I will build and test the pci_is_pcie(pdev)
> patch in lpfc driver now that set_pcie_port_type() is called from
> of_create_pci_dev().
Jon,
I applied the debug patches mentioned above along with the lpfc patch
http://marc.info/?l=linux-scsi&m=130919648513685&w=2
to linux 3.0-rc6 kernel.
The debug patches show that pci_dev members "is_pcie, pci_cap and pcie_type" are
now all being set to correct values now that 'of_create_pci_dev()' calls
'set_pcie_port_type()'. I was not able to determine when this patch went in.
My tests show that lpfc driver now recovers from injected PCIe bus errors
using test the 'if (pci_is_pcie(pdev))' for PCIe adapter type.
I did not apply the test patch which changed the location of
set_pcie_port_type(dev) in of_create_pci_dev(). I can apply and test
this change if you think it is necessary?
Based upon these results, I will ACK the change to the lpfc driver for
"[PATCH 03/19] lpfc: remove unnecessary read of PCI_CAP_ID_EXP".
I suspect that other drivers which are modified to use 'if (pci_is_pcie(pdev))'
will work on Power PC as well, but I did not test any drivers other than lpfc.
-rich
More information about the Linuxppc-dev
mailing list