[PATCH AUTOSEL 6.17-5.4] powerpc/eeh: Use result of error_detected() in uevent

Sasha Levin sashal at kernel.org
Sun Oct 26 02:54:48 AEDT 2025


From: Niklas Schnelle <schnelle at linux.ibm.com>

[ Upstream commit 704e5dd1c02371dfc7d22e1520102b197a3b628b ]

Ever since uevent support was added for AER and EEH with commit
856e1eb9bdd4 ("PCI/AER: Add uevents in AER and EEH error/resume"), it
reported PCI_ERS_RESULT_NONE as uevent when recovery begins.

Commit 7b42d97e99d3 ("PCI/ERR: Always report current recovery status for
udev") subsequently amended AER to report the actual return value of
error_detected().

Make the same change to EEH to align it with AER and s390.

Suggested-by: Lukas Wunner <lukas at wunner.de>
Link: https://lore.kernel.org/linux-pci/aIp6LiKJor9KLVpv@wunner.de/
Signed-off-by: Niklas Schnelle <schnelle at linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas at google.com>
Reviewed-by: Lukas Wunner <lukas at wunner.de>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy at linux.intel.com>
Acked-by: Mahesh Salgaonkar <mahesh at linux.ibm.com>
Link: https://patch.msgid.link/20250807-add_err_uevents-v5-3-adf85b0620b0@linux.ibm.com
Signed-off-by: Sasha Levin <sashal at kernel.org>
---

LLM Generated explanations, may be completely bogus:

YES

Rationale
- Fixes incorrect uevent status at start of EEH recovery: the code
  currently emits a uevent with `PCI_ERS_RESULT_NONE` regardless of what
  the driver reported via `error_detected()`. This misrepresents the
  actual recovery status to user space.
- The fix makes EEH behave like AER (already fixed by commit
  7b42d97e99d3) and s390, improving cross-arch consistency and user
  space expectations.

Evidence in code
- Current EEH behavior: emits BEGIN_RECOVERY unconditionally at error
  detection
  - `pci_uevent_ers(pdev, PCI_ERS_RESULT_NONE);` is called after
    `error_detected()` even if the driver “votes” differently (e.g.,
    DISCONNECT/NEED_RESET): arch/powerpc/kernel/eeh_driver.c:337
- Proposed change: pass actual driver result
  - Changes the above call to `pci_uevent_ers(pdev, rc);`, where `rc` is
    the result of `driver->err_handler->error_detected()` captured just
    above: arch/powerpc/kernel/eeh_driver.c:337
- uevent mapping semantics (what user space sees) are centralized in
  `pci_uevent_ers()` (drivers/pci/pci-driver.c:1595):
  - NONE/CAN_RECOVER -> `ERROR_EVENT=BEGIN_RECOVERY`, `DEVICE_ONLINE=0`
  - RECOVERED -> `ERROR_EVENT=SUCCESSFUL_RECOVERY`, `DEVICE_ONLINE=1`
  - DISCONNECT -> `ERROR_EVENT=FAILED_RECOVERY`, `DEVICE_ONLINE=0`
  - Others (e.g., NEED_RESET) -> no immediate uevent (consistent with
    AER)
- AER already reports actual `error_detected()` return value to udev:
  - `pci_uevent_ers(dev, vote);` after computing `vote` in
    `report_error_detected()`: drivers/pci/pcie/err.c:83
- EEH already emits final-stage uevents correctly (unchanged by this
  patch):
  - Success at resume: `pci_uevent_ers(edev->pdev,
    PCI_ERS_RESULT_RECOVERED);` arch/powerpc/kernel/eeh_driver.c:432
  - Failure path: `pci_uevent_ers(pdev, PCI_ERS_RESULT_DISCONNECT);`
    arch/powerpc/kernel/eeh_driver.c:462
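
The mapping listed above can be modelled in user space roughly as
follows. This is only a sketch for illustration, not the kernel
implementation: `enum ers_model`, its numbering, and `ers_model_env()`
are hypothetical names local to the example, and only the
`ERROR_EVENT`/`DEVICE_ONLINE` strings are taken from the list above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative stand-ins for the kernel's PCI_ERS_RESULT_* values.
 * The names and numbering are local to this sketch (assumption).
 */
enum ers_model {
	ERS_NONE,
	ERS_CAN_RECOVER,
	ERS_NEED_RESET,
	ERS_DISCONNECT,
	ERS_RECOVERED,
};

/*
 * Fill envp[] with the uevent variables for a given result and return
 * how many entries were written. A return value of 0 models the
 * "no immediate uevent" case.
 */
static int ers_model_env(enum ers_model res, const char *envp[2])
{
	assert(envp != NULL);

	switch (res) {
	case ERS_NONE:
	case ERS_CAN_RECOVER:
		envp[0] = "ERROR_EVENT=BEGIN_RECOVERY";
		envp[1] = "DEVICE_ONLINE=0";
		return 2;
	case ERS_RECOVERED:
		envp[0] = "ERROR_EVENT=SUCCESSFUL_RECOVERY";
		envp[1] = "DEVICE_ONLINE=1";
		return 2;
	case ERS_DISCONNECT:
		envp[0] = "ERROR_EVENT=FAILED_RECOVERY";
		envp[1] = "DEVICE_ONLINE=0";
		return 2;
	default:
		/* e.g. ERS_NEED_RESET: nothing is reported at this step */
		return 0;
	}
}
```

With the old EEH call the model would always be fed `ERS_NONE`; with
the patch it is fed whatever the driver returned, so a DISCONNECT vote
maps to `FAILED_RECOVERY` at once and a NEED_RESET vote maps to no
event.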

Why this is a bugfix suitable for stable
- User-visible correctness: With the current code, user space always
  sees “BEGIN_RECOVERY” even when drivers have already indicated an
  unrecoverable state (e.g., DISCONNECT). The patch ensures uevents
  reflect the true state immediately, matching AER behavior introduced
  by 7b42d97e99d3.
- Minimal, contained change: One-line change in a single architecture-
  specific file (PowerPC EEH). No API/ABI changes; only corrects the
  parameter passed to an existing helper.
- No architectural change: Keeps existing EEH flow; only adjusts the
  uevent status emitted at a single step.
- Low regression risk: AER has reported the actual `error_detected()`
  result since 7b42d97e99d3, and `pci_uevent_ers()` already handles
  every `rc` value. EEH already emits RECOVERED/DISCONNECT at later
  stages; this makes the initial event consistent with them.
- Aligns cross-arch semantics: Consistent uevent reporting across AER,
  EEH, and s390 reduces user space special-casing and potential errors.

Potential side effects and why acceptable
- For drivers returning `PCI_ERS_RESULT_DISCONNECT` at
  `error_detected()`, user space will now see `FAILED_RECOVERY`
  immediately instead of a misleading `BEGIN_RECOVERY`. This is a
  correctness fix.
- For returns like `PCI_ERS_RESULT_NEED_RESET`, no initial uevent is
  emitted (consistent with AER); user space will still receive the final
  `SUCCESSFUL_RECOVERY` or `FAILED_RECOVERY` event, as today. Scripts
  that expect an initial `BEGIN_RECOVERY` in every case are already
  inconsistent with AER and should not rely on that behavior.
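
A user space consumer that tolerates both side effects might track
device state along these lines. This is a hedged sketch: `enum
dev_state` and `dev_state_next()` are hypothetical names, and the
function takes only the value part of the `ERROR_EVENT` variable. The
point it illustrates is that a terminal event is accepted even when no
`BEGIN_RECOVERY` preceded it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-device state kept by a uevent listener. */
enum dev_state {
	DEV_ONLINE,
	DEV_RECOVERING,
	DEV_FAILED,
};

/*
 * Advance the tracked state from one ERROR_EVENT value. The transition
 * depends only on the event, never on having seen BEGIN_RECOVERY
 * first, so an immediate FAILED_RECOVERY or a recovery that starts
 * without an initial event is handled the same way.
 */
static enum dev_state dev_state_next(enum dev_state cur,
				     const char *error_event)
{
	assert(error_event != NULL);

	if (strcmp(error_event, "BEGIN_RECOVERY") == 0)
		return DEV_RECOVERING;
	if (strcmp(error_event, "SUCCESSFUL_RECOVERY") == 0)
		return DEV_ONLINE;
	if (strcmp(error_event, "FAILED_RECOVERY") == 0)
		return DEV_FAILED;

	/* Unknown value: keep the current state rather than guess. */
	return cur;
}
```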

Historical context
- Uevent support was added by 856e1eb9bdd4 (“PCI/AER: Add uevents in AER
  and EEH error/resume”), initially emitting `NONE` at error detection
  for both AER and EEH.
- AER was corrected by 7b42d97e99d3 (“PCI/ERR: Always report current
  recovery status for udev”) to emit the actual `error_detected()`
  result.
- This patch brings EEH to parity with that established AER behavior.

Conclusion
- This is a targeted, low-risk correctness fix that improves user space
  observability and cross-arch consistency without changing kernel-side
  recovery logic. It fits stable backport rules (important bugfix,
  minimal change, low regression risk, confined to a subsystem).

 arch/powerpc/kernel/eeh_driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index 48ad0116f3590..ef78ff77cf8f2 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -334,7 +334,7 @@ static enum pci_ers_result eeh_report_error(struct eeh_dev *edev,
 	rc = driver->err_handler->error_detected(pdev, pci_channel_io_frozen);
 
 	edev->in_error = true;
-	pci_uevent_ers(pdev, PCI_ERS_RESULT_NONE);
+	pci_uevent_ers(pdev, rc);
 	return rc;
 }
 
-- 
2.51.0


