[PATCH 14/14] powerpc/eeh: Move PE tree setup into the platform

Alexey Kardashevskiy aik at ozlabs.ru
Tue Jul 14 19:10:36 AEST 2020



On 14/07/2020 13:08, Oliver O'Halloran wrote:
> On Tue, Jul 14, 2020 at 11:50 AM Alexey Kardashevskiy <aik at ozlabs.ru> wrote:
>>
>> On 06/07/2020 11:36, Oliver O'Halloran wrote:
>>>  /**
>>>   * eeh_pe_tree_insert - Add EEH device to parent PE
>>>   * @edev: EEH device
>>> + * @new_pe_parent: PE to create additional PEs under
>>>   *
>>> - * Add EEH device to the parent PE. If the parent PE already
>>> - * exists, the PE type will be changed to EEH_PE_BUS. Otherwise,
>>> - * we have to create new PE to hold the EEH device and the new
>>> - * PE will be linked to its parent PE as well.
>>> + * Add EEH device to the PE in edev->pe_config_addr. If a PE already
>>> + * exists with that address then @edev is added to that PE. Otherwise
>>> + * a new PE is created and inserted into the PE tree as a child of
>>> + * @new_pe_parent.
>>> + *
>>> + * If @new_pe_parent is NULL then the new PE will be inserted under
>>> + * directly under the the PHB.
>>>   */
>>> -int eeh_pe_tree_insert(struct eeh_dev *edev)
>>> +int eeh_pe_tree_insert(struct eeh_dev *edev, struct eeh_pe *new_pe_parent)
>>>  {
>>>       struct pci_controller *hose = edev->controller;
>>>       struct eeh_pe *pe, *parent;
>>
>>
>> We can ditch this "parent" in that single place now and use "pe"
>> instead. And name the new parameter simply "parent". Dunno if it
>> improves things though.
> 
> I did this at some point and then decided not to. It added a bunch of
> noise to the diff and calling the parameter "parent" ended up being
> pretty unreadable. The parameter is "the parent of the PE that will be
> created to contain edev", or "parent of the parent PE". It's pretty
> unwieldy.

Ok fine but we still do not need both pe and parent in that function
(may be one day...).


> 
>>> @@ -399,7 +366,7 @@ int eeh_pe_tree_insert(struct eeh_dev *edev)
>>>                       }
>>>
>>>                       eeh_edev_dbg(edev,
>>> -                                  "Added to device PE (parent: PE#%x)\n",
>>> +                                  "Added to existing PE (parent: PE#%x)\n",
>>>                                    pe->parent->addr);
>>>               } else {
>>>                       /* Mark the PE as type of PCI bus */
>>> @@ -431,10 +398,9 @@ int eeh_pe_tree_insert(struct eeh_dev *edev)
>>>        * to PHB directly. Otherwise, we have to associate the
>>>        * PE with its parent.
>>>        */
>>> -     parent = eeh_pe_get_parent(edev);
>>> -     if (!parent) {
>>> -             parent = eeh_phb_pe_get(hose);
>>> -             if (!parent) {
>>> +     if (!new_pe_parent) {
>>> +             new_pe_parent = eeh_phb_pe_get(hose);
>>> +             if (!new_pe_parent) {
>>
>>
>>
>> afaict only pseries can realisticly pass new_pe_parent==NULL so this
>> chunk could go to pseries_eeh_pe_get_parent.
> 
> pnv_eeh_get_upstream_pe() will never return the PHB PE so
> new_pe_parent will be NULL for the first PE created under a PowerNV
> PHB. I guess we could move the PHB PE handling into the platform too,
> but I think that just results in having to special case PHB PEs in two
> places rather than one.
> 
>>> +static struct eeh_pe *pseries_eeh_pe_get_parent(struct eeh_dev *edev)
>>> +{
>>> +     struct eeh_dev *parent;
>>> +     struct pci_dn *pdn = eeh_dev_to_pdn(edev);
>>> +
>>> +     /*
>>> +      * It might have the case for the indirect parent
>>> +      * EEH device already having associated PE, but
>>> +      * the direct parent EEH device doesn't have yet.
>>> +      */
>>> +     if (edev->physfn)
>>> +             pdn = pci_get_pdn(edev->physfn);
>>> +     else
>>> +             pdn = pdn ? pdn->parent : NULL;
>>> +     while (pdn) {
>>> +             /* We're poking out of PCI territory */
>>
>>
>> We are traversing up PCI hierarchy here - pci_dn->parent, how is this
>> out of PCI territory? Or I understand "out of" incorrectly?
>>
>>
>>> +             parent = pdn_to_eeh_dev(pdn);
>>> +             if (!parent)
>>> +                     return NULL;
> 
> If there's no eeh dev then the node we're looking at is a PHB rather
> than an actual PCI device so it stops looking. I think. The comment
> was copied over from the existing code and I haven't spent a whole lot
> of time parsing it's meaning.


I noticed cut-n-paste. May be just ditch it if nobody can parse it?

> 
> 
> 
>>> @@ -301,6 +343,8 @@ void pseries_eeh_init_edev(struct pci_dn *pdn)
>>>       if (ret) {
>>>               eeh_edev_dbg(edev, "EEH failed to enable on device (code %d)\n", ret);
>>>       } else {
>>> +             struct eeh_pe *parent;
>>> +
>>>               /* Retrieve PE address */
>>>               edev->pe_config_addr = pseries_eeh_get_pe_addr(pdn);
>>>               pe.addr = edev->pe_config_addr;
>>> @@ -313,16 +357,23 @@ void pseries_eeh_init_edev(struct pci_dn *pdn)
>>>               if (ret > 0 && ret != EEH_STATE_NOT_SUPPORT)
>>>                       enable = 1;
>>>
>>> -             if (enable) {
>>> +             /* This device doesn't support EEH, but it may have an
>>> +              * EEH parent, in which case we mark it as supported.
>>> +              */
>>> +             parent = pseries_eeh_pe_get_parent(edev);
>>> +             if (parent && !enable)
>>> +                     edev->pe_config_addr = parent->addr;
>>
>>
>> What if pseries_eeh_pe_get_parent() returned NULL - we won't write
>> edev->pe_config_addr so it remains 0 which is fine just by accident? :)
> 
> edev->pe_config_addr is set above when we call
> pseries_eeh_get_pe_addr(). The check there is mainly to cover the case
> where pseries_eeh_get_pe_addr() fails because the device is on a
> subordinate bus rather than the root bus of the PE. PAPR says the
> get-pe-addr-info RTAS call can fail in that situation and that you're
> supposed to traverse up the DT to find the pe_config_addr, which is
> what pe_get_parent() does. Yeah it's confusing, but that's what it
> does today too.
> 
>> Reviewed-by: Alexey Kardashevskiy <aik at ozlabs.ru>
>>
>>> +
>>> +             if (enable || parent) {
>>>                       eeh_add_flag(EEH_ENABLED);
>>> -                     eeh_pe_tree_insert(edev);
>>> +                     eeh_pe_tree_insert(edev, parent);
> 
>>>               } else if (pdn->parent && pdn_to_eeh_dev(pdn->parent) &&
>>>                          (pdn_to_eeh_dev(pdn->parent))->pe) {
>>>                       /* This device doesn't support EEH, but it may have an
>>>                        * EEH parent, in which case we mark it as supported.
>>>                        */
>>>                       edev->pe_config_addr = pdn_to_eeh_dev(pdn->parent)->pe_config_addr;
>>> -                     eeh_pe_tree_insert(edev);
>>> +                     eeh_pe_tree_insert(edev, parent);
> 
> I think I was supposed to delete this hunk and then forgot to since it
> handles the same case mentioned above.

A-ha!


> 
>>>               }
>>>               eeh_edev_dbg(edev, "EEH is %s on device (code %d)\n",
>>>                            (enable ? "enabled" : "unsupported"), ret);
>>>
>>
>> --
>> Alexey

-- 
Alexey


More information about the Linuxppc-dev mailing list