[PATCH 3/4] powerpc/eeh: Remove workaround from eeh_add_device_late()

Oliver O'Halloran oohall at gmail.com
Wed Apr 8 16:53:36 AEST 2020


On Wed, Apr 8, 2020 at 4:22 PM Sam Bobroff <sbobroff at linux.ibm.com> wrote:
>
> On Fri, Apr 03, 2020 at 05:08:32PM +1100, Oliver O'Halloran wrote:
> > On Mon, 2020-03-30 at 15:56 +1100, Sam Bobroff wrote:
> > > When EEH device state was released asynchronously by the device
> > > release handler, it was possible for an outstanding reference to
> > > prevent its release and it was necessary to work around that if a
> > > device was re-discovered at the same PCI location.
> >
> > I think this is a bit misleading. The main situation where you'll hit
> > this hack is when recovering a device with a driver that doesn't
> > implement the error handling callbacks. In that case the device is
> > removed, reset, then re-probed by the PCI core, but we assume it's the
> > same physical device so the eeh_device state remains active.
> >
> > If you actually changed the underlying device I suspect something bad
> > would happen.
>
> I'm not sure I understand. Isn't the case you're talking about caught by
> the earlier check (just above the patch)?
>
>         if (edev->pdev == dev) {
>                 eeh_edev_dbg(edev, "Device already referenced!\n");
>                 return;
>         }

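No, in the case I'm talking about the old pci_dev is torn down and
freed. After the PE is reset we re-probe the device and create a new
pci_dev, so edev->pdev != dev and the check above doesn't fire. If the
release of the old pci_dev is delayed, the edev still points at it when
the new one is probed, and that's when we need the hack this patch is
removing.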
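
To make the ordering concrete, here is a small userspace model of that
sequence. It is only a sketch: the structs are toy stand-ins rather than the
kernel definitions, and toy_eeh_add_device_late(), toy_release() and
toy_demo() are hypothetical names invented for illustration. It assumes the
workaround amounts to dropping a stale edev->pdev binding before attaching
the new pci_dev.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; NOT the real definitions. */
struct pci_dev { int devfn; };
struct eeh_dev { struct pci_dev *pdev; };

/*
 * Hypothetical model of the late-add path. Returns 1 if a stale binding
 * to an older pci_dev had to be cleared first (the workaround), else 0.
 */
static int toy_eeh_add_device_late(struct eeh_dev *edev, struct pci_dev *dev)
{
	int stale = 0;

	if (edev->pdev == dev)
		return 0;	/* same pci_dev: the earlier check catches this */

	if (edev->pdev != NULL) {
		/* The old pci_dev's release has not run yet: drop it here. */
		edev->pdev = NULL;
		stale = 1;
	}
	edev->pdev = dev;
	return stale;
}

/* Model of the (possibly delayed) release handler of a pci_dev. */
static void toy_release(struct eeh_dev *edev, struct pci_dev *dev)
{
	if (edev->pdev == dev)
		edev->pdev = NULL;
}

/* Walk through: probe, reset, re-probe before the old release runs. */
int toy_demo(void)
{
	struct pci_dev old_dev = { 0 }, new_dev = { 0 };
	struct eeh_dev edev = { NULL };
	int stale;

	toy_eeh_add_device_late(&edev, &old_dev);	  /* first probe */
	stale = toy_eeh_add_device_late(&edev, &new_dev); /* re-probe */
	toy_release(&edev, &old_dev);	/* late release must not unbind */
	assert(edev.pdev == &new_dev);
	return stale;
}
```

In this model the pointer comparison never matches on re-probe because the
new pci_dev is a different object, which is why the earlier check does not
cover this case.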

The check above should probably be a WARN_ON() since we should never
be re-running the EEH probe on the same device. I think there is a
case where that can happen, but I don't remember the details.
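
Roughly what I mean, as a userspace sketch rather than a real patch:
TOY_WARN_ON() below is a made-up stand-in that only mimics the kernel
WARN_ON() behaviour of reporting and returning the truth value of its
condition, and toy_probe() is a hypothetical name. The structs are toy
versions, not the kernel ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-ins for the kernel structures; NOT the real definitions. */
struct pci_dev { int devfn; };
struct eeh_dev { struct pci_dev *pdev; };

static int toy_warn_count;	/* number of warnings raised so far */

/* Report the failed expectation and hand the condition back. */
static int toy_warn_on(int cond, const char *expr)
{
	if (cond) {
		toy_warn_count++;
		fprintf(stderr, "WARNING: %s\n", expr);
	}
	return cond;
}
#define TOY_WARN_ON(cond) toy_warn_on(!!(cond), #cond)

/* Returns 0 on success, -1 if the probe was re-run on the same device. */
static int toy_probe(struct eeh_dev *edev, struct pci_dev *dev)
{
	assert(edev != NULL && dev != NULL);

	/* Should never happen, so complain loudly instead of a debug print. */
	if (TOY_WARN_ON(edev->pdev == dev))
		return -1;

	edev->pdev = dev;
	return 0;
}
```

The early return is kept, so behaviour is unchanged apart from the louder
report.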

Oliver

