[PATCH v3] PCI: Data corruption happening due to race condition
Bjorn Helgaas
helgaas at kernel.org
Sat Jul 28 08:25:40 AEST 2018
On Thu, Jul 19, 2018 at 02:18:09PM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2018-07-18 at 18:29 -0500, Bjorn Helgaas wrote:
> > [+cc Paul, Michael, linuxppc-dev]
> >
>
> ..../...
>
> > > Debugging revealed a race condition between pcie core driver
> > > enabling is_added bit(pci_bus_add_device()) and nvme driver
> > > reset work-queue enabling is_busmaster bit (by pci_set_master()).
> > > As both fields are not handled in atomic manner and that clears
> > > is_added bit.
> > >
> > > Fix moves device addition is_added bit to separate private flag
> > > variable and use different atomic functions to set and retrieve
> > > device addition state. As is_added shares different memory
> > > location so race condition is avoided.
> >
> > Really nice bit of debugging!
>
> Indeed. However I'm not fan of the solution. Shouldn't we instead have
> some locking for the content of pci_dev ? I've always been wary of us
> having other similar races in there.
>
> As for the powerpc bits, I'm probably the one who wrote them, however,
> I'm on vacation this week and right now, no bandwidth to context switch
> all that back in :-) So give me a few days and/or ping me next week.
OK, here's a ping :)
Some powerpc cleanup would be ideal, but I'd like to fix the race for
v4.19, so I'm fine with this patch as-is. But I'd definitely want
your ack before inserting the ugly #include path in the powerpc code.
> The powerpc PCI code contains a lot of cruft coming from the depth of
> history, including rather nasty assumptions. We want to progressively
> clean it up, starting with EEH, but it will take time.
>
> Cheers,
> Ben.
>
> > > Signed-off-by: Hari Vyas <hari.vyas at broadcom.com>
> > > ---
> > > arch/powerpc/kernel/pci-common.c | 4 +++-
> > > arch/powerpc/platforms/powernv/pci-ioda.c | 3 ++-
> > > arch/powerpc/platforms/pseries/setup.c | 3 ++-
> > > drivers/pci/bus.c | 6 +++---
> > > drivers/pci/hotplug/acpiphp_glue.c | 2 +-
> > > drivers/pci/pci.h | 11 +++++++++++
> > > drivers/pci/probe.c | 4 ++--
> > > drivers/pci/remove.c | 5 +++--
> > > include/linux/pci.h | 1 -
> > > 9 files changed, 27 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
> > > index fe9733f..471aac3 100644
> > > --- a/arch/powerpc/kernel/pci-common.c
> > > +++ b/arch/powerpc/kernel/pci-common.c
> > > @@ -42,6 +42,8 @@
> > > #include <asm/ppc-pci.h>
> > > #include <asm/eeh.h>
> > >
> > > +#include "../../../drivers/pci/pci.h"
> >
> > I see why you need it, but this include path is really ugly. Outside
> > of bootloaders and tools, there are very few instances of includes
> > like this that reference a different top-level directory, and I'm not
> > very keen about adding more.
More information about the Linuxppc-dev
mailing list