[PATCH] powerpc/pci: Fix PHB removal/rescan on PowerNV
Cédric Le Goater
clg at kaod.org
Thu Oct 8 15:37:02 AEDT 2020
On 10/8/20 4:23 AM, Oliver O'Halloran wrote:
> On Fri, Sep 25, 2020 at 7:23 PM Cédric Le Goater <clg at kaod.org> wrote:
>>
>> To fix an issue with PHB hotplug on pSeries machine (HPT/XIVE), commit
>> 3a3181e16fbd introduced a PPC specific pcibios_remove_bus() routine to
>> clear all interrupt mappings when the bus is removed. This routine
>> frees an array allocated in pcibios_scan_phb().
>>
>> This broke PHB hotplug on PowerNV because, when a PHB is removed and
>> re-scanned through sysfs, the PCI layer un-assigns and re-assigns
>> resources to the PHB but does not destroy and recreate the PCI
>> controller structure. Since pcibios_remove_bus() does not clear the
>> 'irq_map' array pointer, a second removal of the PHB will try to free
>> the array a second time and corrupt memory.
>
> "PHB hotplug" and "hot-plugging devices under a PHB" are different
> things. What you're saying here doesn't make a whole lot of sense to
> me unless you're conflating the two. The distinction is important
> since on pseries we can use DLPAR to add and remove actual PHBs (i.e.
> the pci_controller) at runtime, but there's no corresponding mechanism
> on PowerNV.
And it's even different on QEMU.
>> Free the 'irq_map' array in pcibios_free_controller() to fix
>> corruption and clear interrupt mapping after it has been
>> disposed. This to avoid filling up the array with successive
>> remove/rescan of a bus.
>
> Even with this patch I think we're still broken. With this patch
> applied the init path is something like:
>
> per-phb init:
> allocate phb->irq_map array
> per-bus init:
> nothing
> per-device init:
> pcibios_bus_add_device()
> pci_read_irq_line()
> pci_irq_map_register(pci_dev, virq)
> *record the device's interrupt in phb->irq_map*
>
> And the teardown path:
>
> per-device teardown:
> nothing
> per-bus teardown:
> pcibios_remove_bus()
> pci_irq_map_dispose()
> *walk phb->irq_map and dispose of each mapped interrupt*
> per-phb teardown:
> free(phb->irq_map)
>
> There's a lot of asymmetry here, which is a problem in itself, but the
> real issue is that when removing *any* pci_bus under a PHB we dispose
> of the LSI\ for *every* device under that PHB. Not good.
>
> Ideally we should be fixing this by having the per-device teardown
> handle disposing the mapping. Unfortunately, there's no pcibios hook
> that's called when removing a pci_dev. However, we can register a bus
> notifier which will be called when the pci_dev is removed from its bus
> and we already do that for the per-device EEH teardown and to handle
> IOMMU TCE invalidation when the device is removed.
I lack the knowledge here and I think some else should take over,
as I am not doing a good job.
Michael, can you drop the initial patch again :/ It is better not
to merge anything.
Thanks,
C.
More information about the Linuxppc-dev
mailing list