[PATCH 1/2] PCI: Ensure error recoverability at all times

Lukas Wunner lukas at wunner.de
Wed Nov 19 21:02:23 AEDT 2025


On Fri, Nov 14, 2025 at 05:39:27PM -0600, Bjorn Helgaas wrote:
> On Fri, Nov 14, 2025 at 07:58:19PM +0100, Lukas Wunner wrote:
> > On Thu, Nov 13, 2025 at 10:15:56AM -0600, Bjorn Helgaas wrote:
> > > It seems like there are two things going on here, and I'm not sure
> > > they're completely compatible:
> > > 
> > >   1) Driver calls pci_save_state() to take over device power
> > >      management and prevent the PCI core from doing it.
> > > 
> > >   2) Driver calls pci_save_state() to capture the device state it
> > >      wants to restore when recovering from an error.
> > > 
> > > Shouldn't a driver be able to do 2) without also getting 1)?
> > 
> > In general, it can:
> > 
> > A number of drivers already call pci_save_state() on probe to capture
> > the state for subsequent error recovery.  If the driver has modified
> > config space in its probe hook, then calling pci_save_state() continues
> > to make sense.  If the driver has *not* modified config space, then the
> > call becomes obsolete once this patch is accepted.
> 
> So I guess "state_saved == true" means "driver does its own power
> management and PCI core shouldn't do it", and drivers that want 2) but
> not 1) just need to set state_saved = false after they call
> pci_save_state()?
> 
> That makes sense in sort of a weird way that makes my head hurt every
> time I try to understand it.

I agree it defies common sense.  So I've just submitted a series
which adds the missing "state_saved = false" in the legacy suspend
and !pm codepaths:

https://lore.kernel.org/r/094f2aad64418710daf0940112abe5a0afdc6bce.1763483367.git.lukas@wunner.de/

After this patch, the flag is always cleared before commencing the
suspend sequence and hence there is no longer a need for drivers to
clear state_saved after they call pci_save_state().  They can just
call pci_save_state() if they've modified Config Space in their
probe hook and be done with it.
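
The interplay between the flag and the suspend sequence can be sketched as a toy user-space model. This is only an illustration of the logic described above, not kernel code: every name here (toy_dev, toy_save_state(), toy_probe(), toy_suspend(), the clear_flag_first switch and the bit values) is hypothetical, and the real suspend paths do considerably more.

```c
/* Toy user-space model. None of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_dev {
	uint16_t config;       /* stands in for live Config Space */
	uint16_t saved_config; /* stands in for the saved copy */
	bool state_saved;      /* models the state_saved flag */
};

/* Models pci_save_state(): snapshot the config and set the flag. */
void toy_save_state(struct toy_dev *dev)
{
	dev->saved_config = dev->config;
	dev->state_saved = true;
}

/* Models a probe hook that modifies Config Space, then saves it. */
void toy_probe(struct toy_dev *dev)
{
	dev->config |= 0x0004; /* arbitrary bit, for illustration only */
	toy_save_state(dev);
}

/*
 * Models the start of a suspend sequence.  clear_flag_first == true
 * models a codepath that clears the flag before suspending;
 * false models a codepath that lacks the clear.
 * Returns true if the "core" saved the state itself.
 */
bool toy_suspend(struct toy_dev *dev, bool clear_flag_first)
{
	if (clear_flag_first)
		dev->state_saved = false;

	/* A driver suspend callback would run here; this one saves nothing. */

	if (!dev->state_saved) {
		toy_save_state(dev);
		return true;
	}
	return false;
}

/* Walks through both cases for a driver that saved state on probe. */
void toy_demo(void)
{
	struct toy_dev dev = { .config = 0x0003 };

	toy_probe(&dev);
	dev.config |= 0x0400; /* config changes some time after probe */

	/* Flag left over from probe: the core skips its own save. */
	assert(!toy_suspend(&dev, false));
	assert(dev.saved_config == 0x0007);

	/* Flag cleared first: the core saves the current state. */
	assert(toy_suspend(&dev, true));
	assert(dev.saved_config == 0x0407);
}
```

In this model, the driver's probe-time save no longer needs a manual "state_saved = false" once the suspend sequence clears the flag itself.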

> After error recovery, those drivers will see the state the driver
> identified when it called pci_save_state().  But after resume, they
> will see the state the PCI core saved at suspend time.  Right?

Correct.  The expectation is generally that they're identical.

E.g. I've just double-checked that we're enabling wakeup *after*
pci_save_state() in pci_pm_suspend_noirq().  So when the saved
state is restored on resume and later re-used for error recovery,
we're restoring the device with wakeup disabled, which is the
right thing to do because the device is in D0 after error recovery
issues a reset.

(pci_pm_suspend_noirq() first calls pci_save_state() and then calls
pci_prepare_to_sleep(), which enables wakeup.)
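
The ordering argument can be mirrored in a small user-space sketch. It only models "save first, enable wakeup second"; all names (toy_wake_dev and the toy_wake_*() helpers) are hypothetical stand-ins, and the sketch assumes the wakeup setting is part of what gets saved and restored, as the reasoning above implies.

```c
/* Toy user-space model of the ordering. Not kernel code. */
#include <assert.h>
#include <stdbool.h>

struct toy_wake_dev {
	bool wakeup_enabled;       /* live wakeup setting */
	bool saved_wakeup_enabled; /* copy taken by the save step */
};

/* Stand-in for the save step: snapshot the wakeup setting. */
void toy_wake_save_state(struct toy_wake_dev *dev)
{
	dev->saved_wakeup_enabled = dev->wakeup_enabled;
}

/* Stand-in for the prepare-to-sleep step: enable wakeup. */
void toy_wake_prepare_to_sleep(struct toy_wake_dev *dev)
{
	dev->wakeup_enabled = true;
}

/* Same order as described above: save first, then enable wakeup. */
void toy_wake_suspend_noirq(struct toy_wake_dev *dev)
{
	toy_wake_save_state(dev);
	toy_wake_prepare_to_sleep(dev);
}

/* Stand-in for restoring the saved state on resume or after a reset. */
void toy_wake_restore_state(struct toy_wake_dev *dev)
{
	dev->wakeup_enabled = dev->saved_wakeup_enabled;
}

void toy_wake_demo(void)
{
	struct toy_wake_dev dev = { false, false };

	toy_wake_suspend_noirq(&dev);
	assert(dev.wakeup_enabled);        /* armed while asleep */
	assert(!dev.saved_wakeup_enabled); /* snapshot predates arming */

	toy_wake_restore_state(&dev);
	assert(!dev.wakeup_enabled);       /* back in D0 with wakeup off */
}
```

Had the two steps run in the opposite order, the snapshot would have captured wakeup as enabled and a later restore would re-arm it.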

Thanks,

Lukas

