[PATCH v5 01/23] PCI: Fix race condition in pci_enable/disable_device()

Sergey Miroshnichenko s.miroshnichenko at yadro.com
Mon Sep 30 18:53:16 AEST 2019


Hello Bjorn,

On 9/28/19 12:59 AM, Bjorn Helgaas wrote:
> On Fri, Aug 16, 2019 at 07:50:39PM +0300, Sergey Miroshnichenko wrote:
>> This is yet another approach to fix an old [1-2] concurrency issue, when:
>>   - two or more devices are being hot-added into a bridge which was
>>     initially empty;
>>   - a bridge with two or more devices is being hot-added;
>>   - during boot, if BIOS/bootloader/firmware doesn't pre-enable bridges.
>>
>> The problem is that a bridge is reported as enabled before the MEM/IO bits
>> are actually written to the PCI_COMMAND register, so another driver thread
>> starts memory requests through the not-yet-enabled bridge:
>>
>>   CPU0                                        CPU1
>>
>>   pci_enable_device_mem()                     pci_enable_device_mem()
>>     pci_enable_bridge()                         pci_enable_bridge()
>>       pci_is_enabled()
>>         return false;
>>       atomic_inc_return(enable_cnt)
>>       Start actual enabling the bridge
>>       ...                                         pci_is_enabled()
>>       ...                                           return true;
>>       ...                                     Start memory requests <-- FAIL
>>       ...
>>       Set the PCI_COMMAND_MEMORY bit <-- Must wait for this
>>
>> Protect pci_enable/disable_device() and pci_enable_bridge(), similarly to
>> the previous solution from commit 40f11adc7cd9 ("PCI: Avoid race while
>> enabling upstream bridges"), but add per-device mutexes and prevent
>> dev->enable_cnt from incrementing early.
> 
> This isn't directly related to the movable BARs functionality; is it
> here because you see the problem more frequently when moving BARs?
> 

The first two patches of this series (including this one) are fixes
for boot and hotplug; they are not related to movable BARs.

Before these fixes, we suffered from this issue on PowerNV, where NVMe
devices randomly failed to start during system boot, until commit
db2173198b9513f7add8009f225afa1f1c79bcc6 ("powerpc/powernv/pci: Work
around races in PCI bridge enabling") was backported to distros. So we
tested the fixes with that commit reverted.

On x86 the BIOS does pre-enable the bridges, but bridges are still
prone to this race when they are hot-added or were initially "empty".
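
To illustrate the ordering requirement from the diagram above, here is a
minimal user-space sketch. It is not the code from the patch: it uses
pthreads instead of kernel mutexes, and all names (model_bridge,
model_enable_bridge, model_driver_thread, run_model) are hypothetical.
The assumption is only that the "is it enabled?" check, the setting of
the memory bit and the increment of the counter happen under one
per-device mutex, so a second caller cannot return before the bit is set.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_THREADS 16

/* Hypothetical stand-in for a bridge; not the kernel's struct pci_dev. */
struct model_bridge {
	pthread_mutex_t enable_mutex;	/* per-device lock */
	int enable_cnt;			/* models dev->enable_cnt */
	bool command_memory;		/* models the PCI_COMMAND_MEMORY bit */
};

/* Enable the bridge once; later callers wait on the mutex until it is done. */
static void model_enable_bridge(struct model_bridge *b)
{
	pthread_mutex_lock(&b->enable_mutex);
	if (b->enable_cnt == 0) {
		/* Model a slow hardware enable sequence. */
		for (volatile int i = 0; i < 100000; i++)
			;
		b->command_memory = true;
	}
	/* The counter is incremented only after the bit is set. */
	b->enable_cnt++;
	pthread_mutex_unlock(&b->enable_mutex);
}

/* Models a driver that starts memory requests right after enabling. */
static void *model_driver_thread(void *arg)
{
	struct model_bridge *b = arg;

	model_enable_bridge(b);
	/* Non-NULL return means the bridge was seen as not yet enabled. */
	return b->command_memory ? NULL : (void *)b;
}

/* Returns the number of threads that saw a not-yet-enabled bridge. */
int run_model(int nthreads)
{
	struct model_bridge bridge = { .enable_cnt = 0, .command_memory = false };
	pthread_t tid[MODEL_MAX_THREADS];
	int started = 0, failures = 0;

	if (nthreads > MODEL_MAX_THREADS)
		nthreads = MODEL_MAX_THREADS;
	pthread_mutex_init(&bridge.enable_mutex, NULL);
	for (int i = 0; i < nthreads; i++) {
		if (pthread_create(&tid[started], NULL, model_driver_thread, &bridge) != 0)
			break;
		started++;
	}
	for (int i = 0; i < started; i++) {
		void *ret = NULL;

		pthread_join(tid[i], &ret);
		if (ret != NULL)
			failures++;
	}
	pthread_mutex_destroy(&bridge.enable_mutex);
	return failures;
}
```

In this model run_model() should report zero failures for any thread
count, because no caller can leave model_enable_bridge() before the
memory bit has been set.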

Serge

