[PATCH] powerpc/eeh: avoid possible crash when edev->pdev changes

Ganesh G R ganeshgr at linux.ibm.com
Thu Jun 13 23:48:56 AEST 2024


On 6/11/24 8:18 AM, Michael Ellerman wrote:

> Hi Ganesh,
>
> Ganesh Goudar <ganeshgr at linux.ibm.com> writes:
>> If a PCI device is removed during eeh_pe_report_edev(), edev->pdev
>> will change and can cause a crash, hold the PCI rescan/remove lock
>> while taking a copy of edev->pdev.
>>
>> Signed-off-by: Ganesh Goudar <ganeshgr at linux.ibm.com>
>> ---
>>   arch/powerpc/kernel/eeh_pe.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c
>> index d1030bc52564..49f968733912 100644
>> --- a/arch/powerpc/kernel/eeh_pe.c
>> +++ b/arch/powerpc/kernel/eeh_pe.c
>> @@ -859,7 +859,9 @@ struct pci_bus *eeh_pe_bus_get(struct eeh_pe *pe)
>>   
>>   	/* Retrieve the parent PCI bus of first (top) PCI device */
>>   	edev = list_first_entry_or_null(&pe->edevs, struct eeh_dev, entry);
>> +	pci_lock_rescan_remove();
>>   	pdev = eeh_dev_to_pci_dev(edev);
>> +	pci_unlock_rescan_remove();
>>   	if (pdev)
>>   		return pdev->bus;
> What prevents pdev being freed/reused immediately after you drop the
> rescan/remove lock?

Yeah, I should have released the lock only after reading the bus address. I will send a v2.
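
To make the intended ordering concrete, here is a rough userspace model of it, not the actual v2. The struct definitions are stripped-down stand-ins for the kernel types, the rescan/remove lock is modelled as a plain pthread mutex, and edev_bus_get_locked() is a hypothetical helper name used only for illustration.

```c
#include <pthread.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct pci_bus { int number; };
struct pci_dev { struct pci_bus *bus; };
struct eeh_dev { struct pci_dev *pdev; };

/* Models the global PCI rescan/remove lock as a plain mutex (assumption). */
static pthread_mutex_t rescan_remove_lock = PTHREAD_MUTEX_INITIALIZER;

static void pci_lock_rescan_remove(void)
{
	pthread_mutex_lock(&rescan_remove_lock);
}

static void pci_unlock_rescan_remove(void)
{
	pthread_mutex_unlock(&rescan_remove_lock);
}

/*
 * Hypothetical helper: both the edev->pdev read and the pdev->bus
 * dereference happen before the lock is dropped.
 */
static struct pci_bus *edev_bus_get_locked(struct eeh_dev *edev)
{
	struct pci_bus *bus = NULL;

	pci_lock_rescan_remove();
	if (edev && edev->pdev)
		bus = edev->pdev->bus;
	pci_unlock_rescan_remove();

	return bus;
}
```

This only keeps pdev stable while it is dereferenced. The returned bus pointer is still used by the caller without a reference, so the lifetime question raised above applies to it as well.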

> AFAICS eeh_dev_to_pci_dev() doesn't take an additional reference to the
> pdev or anything.

Yes, I think we need to evaluate what can go wrong in each of the cases where no
reference is taken.
But we do need the lock here: if a PCI error is hit in the hotplug remove path, taking the
PCI rescan/remove lock avoids a race between the hotplug remove path and the bottom half of
EEH recovery. Since the remove path is already holding the lock, it runs to completion, and
the recovery process is then dropped because the device is no longer present.


