[PATCH 1/2] PCI: Ensure error recoverability at all times

Lukas Wunner lukas at wunner.de
Sat Nov 15 05:58:19 AEDT 2025


On Thu, Nov 13, 2025 at 10:15:56AM -0600, Bjorn Helgaas wrote:
> It seems like there are two things going on here, and I'm not sure
> they're completely compatible:
> 
>   1) Driver calls pci_save_state() to take over device power
>      management and prevent the PCI core from doing it.
> 
>   2) Driver calls pci_save_state() to capture the device state it
>      wants to restore when recovering from an error.
> 
> Shouldn't a driver be able to do 2) without also getting 1)?

In general, it can:

A number of drivers already call pci_save_state() on probe to capture
the state for subsequent error recovery.  If the driver has modified
config space in its probe hook, then calling pci_save_state() continues
to make sense.  If the driver has *not* modified config space, then the
call becomes obsolete once this patch is accepted.

The reason I'm using the "in general" qualifier:

I've identified two corner cases where the PCI core neglects to set
state_saved = false before commencing the suspend sequence:

* If a driver has legacy PCI PM callbacks, pci_legacy_suspend() neglects
  to set state_saved = false.  Yet both pci_legacy_suspend() and
  pci_legacy_suspend_late() subsequently query the state_saved flag.

* If a device is unbound or its driver has no PM callbacks
  (driver->pm == NULL), pci_pm_freeze() neglects to set state_saved = false.
  Yet pci_pm_freeze_noirq() subsequently queries the state_saved flag.

In these corner cases, pci_legacy_suspend() and pci_pm_freeze() depend
on some other part of the PCI core to set state_saved = false.
For a freshly enumerated device, the flag is initialized to false by
kzalloc() and pci_device_add() also explicitly sets it to false for good
measure.  On resume (or thaw or restore), the flag is set to false by
pci_restore_state().  The latter is preserved as is by my patch and the
former is moved to pci_bus_add_device() to retain the current behavior.

Clearly, the two corner cases should be fixed and then setting
state_saved = false in pci_bus_add_device() becomes unnecessary.

I'd prefer doing that in a separate step though.

So drivers which use legacy PCI PM callbacks or have no PM callbacks
should currently not call pci_save_state() on probe without manually
setting state_saved = false afterwards.  If they neglect that, then
pci_legacy_suspend_late() and pci_pm_freeze_noirq() will not call
pci_save_state() on the next suspend cycle and so the state that
will be restored on resume is the one recorded on probe, not the
one that the device had on suspend.  If these two states happen
to be identical, there's no problem.

> > > > +++ b/drivers/pci/bus.c
> > > > @@ -358,6 +358,13 @@ void pci_bus_add_device(struct pci_dev *dev)
> > > >  	pci_bridge_d3_update(dev);
> > > >  
> > > >  	/*
> > > > +	 * Save config space for error recoverability.  Clear state_saved
> > > > +	 * to detect whether drivers invoked pci_save_state() on suspen
[...]
> > > Can we expand this a little to explain how this is detected and what
> > > drivers *should* be doing?
[...]
> Yes.  I should have proposed some text for the comment, e.g.,
> 
>   Save config space for error recoverability.  Clear state_saved.  If
>   driver calls pci_save_state() again, state_saved will be set and
>   we'll know that on suspend, the PCI core shouldn't call
>   pci_save_state() or change the device power state.

I'm fine with rewording the code comment like this, as well as splitting
the code comment as suggested by Rafael.  Would you prefer amending the
code comment when applying or should I respin with a reworded comment?

Again, clearing state_saved in pci_bus_add_device() is just a temporary
measure to retain the existing behavior.  It (and an accompanying code
comment) can be dropped once pci_legacy_suspend() and pci_pm_freeze()
are fixed.

> I'm just wishing for a more concrete mention of "pci_save_state()",
> since that's where the critical "state_saved" flag is updated.
> 
> And I'm not sure Documentation/ includes anything about the idea of
> a driver using pci_save_state() to capture the state it wants to
> restore after an error.  I guess that's obvious now that I write it
> down, but I'm sure learning a lot from this conversation :)

Okay, noted that the documentation could be improved.  I'd be glad
if this could be postponed to a separate step as well though.
I can only address problems one at a time. :)

Thanks,

Lukas


More information about the Linuxppc-dev mailing list