[PPC] Boot problems after the pci-v6.18-changes

Bjorn Helgaas helgaas at kernel.org
Thu Nov 6 09:09:25 AEDT 2025


[+cc luigi, Al, Roland; we broke X5000 in v6.18-rc1 by enabling ASPM
too aggressively; Christian's original report is at:
https://lore.kernel.org/r/db5c95a1-cf3e-46f9-8045-a1b04908051a@xenosoft.de
I grubbed around in lore for other X5000 users who might possibly be
able to help test our attempts to fix it]

On Mon, Nov 03, 2025 at 07:28:19PM +0100, Christian Zigotzky wrote:
> On 11/01/2025 06:06 PM, Manivannan Sadhasivam wrote:
> ...

> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index d97335a40193..74d8596b3f62 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -2524,6 +2524,7 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev
> *dev)
> >   * disable both L0s and L1 for now to be safe.
> >   */
> >  DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080,
> quirk_disable_aspm_l0s_l1);
> > +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_FREESCALE, 0x0451,
> quirk_disable_aspm_l0s_l1);
> >
> >  /*
> >   * Some Pericom PCIe-to-PCI bridges in reverse mode need the PCIe Retrain

> I tested your patch with the RC4 of kernel 6.18 today. Unfortunately it
> doesn't solve the boot issue.

Thanks for testing that.  I see now why that approach doesn't work:
quirk_disable_aspm_l0s_l1() calls pci_disable_link_state(), which
updates the permissible ASPM link states, but pci_disable_link_state()
only works for devices at the downstream end of a link.  It doesn't
work at all for Root Ports, which are at the upstream end of a link.

Christian, you originally reported that both X5000 and X1000 were
broken.  I suspect X1000 may have been fixed in v6.18-rc3 by
df5192d9bb0e ("PCI/ASPM: Enable only L0s and L1 for devicetree
platforms"), but I would love to have confirmation of that.

The patches below read the PCIe Link Capabilities that tell us what
ASPM link states the hardware supports, cache it, and use a quirk to
remove the states we think are broken.

Since these are just for testing, I didn't want to start a new thread.
If anybody is able to test this, you should be able to just apply this
whole email as a patch on top of v6.18-rc4.

Would appreciate the complete dmesg log of the boot if possible.  This
patch adds a quirk for the [1957:0451] Root Ports in Christian's
X5000.  If ASPM is still broken for other Root Ports, we will need the
Vendor and Device IDs to add quirks for them.

commit 9d089f88fd66 ("PCI/ASPM: Cache Link Capabilities to enable quirks to override")
Author: Bjorn Helgaas <bhelgaas at google.com>
Date:   Wed Nov 5 12:23:38 2025 -0600

    PCI/ASPM: Cache Link Capabilities to enable quirks to override
    
    Cache the PCIe Link Capabilities register in struct pci_dev so quirks can
    remove features to avoid hardware defects.  The idea is:
    
      - set_pcie_port_type() reads PCIe Link Capabilities and caches it in
        dev->lnkcap
    
      - EARLY or HEADER quirks can update the cached dev->lnkcap to remove
        advertised features that don't work correctly
    
      - pcie_aspm_cap_init() relies on dev->lnkcap and ignores any features not
        advertised there
    
    Signed-off-by: Bjorn Helgaas <bhelgaas at google.com>


diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 79b965158473..c2ecfa4e92e5 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -378,15 +378,13 @@ static void pcie_set_clkpm(struct pcie_link_state *link, int enable)
 static void pcie_clkpm_cap_init(struct pcie_link_state *link, int blacklist)
 {
 	int capable = 1, enabled = 1;
-	u32 reg32;
 	u16 reg16;
 	struct pci_dev *child;
 	struct pci_bus *linkbus = link->pdev->subordinate;
 
 	/* All functions should have the same cap and state, take the worst */
 	list_for_each_entry(child, &linkbus->devices, bus_list) {
-		pcie_capability_read_dword(child, PCI_EXP_LNKCAP, &reg32);
-		if (!(reg32 & PCI_EXP_LNKCAP_CLKPM)) {
+		if (!(child->lnkcap & PCI_EXP_LNKCAP_CLKPM)) {
 			capable = 0;
 			enabled = 0;
 			break;
@@ -567,7 +565,7 @@ static void encode_l12_threshold(u32 threshold_us, u32 *scale, u32 *value)
 
 static void pcie_aspm_check_latency(struct pci_dev *endpoint)
 {
-	u32 latency, encoding, lnkcap_up, lnkcap_dw;
+	u32 latency, encoding;
 	u32 l1_switch_latency = 0, latency_up_l0s;
 	u32 latency_up_l1, latency_dw_l0s, latency_dw_l1;
 	u32 acceptable_l0s, acceptable_l1;
@@ -592,14 +590,10 @@ static void pcie_aspm_check_latency(struct pci_dev *endpoint)
 		struct pci_dev *dev = pci_function_0(link->pdev->subordinate);
 
 		/* Read direction exit latencies */
-		pcie_capability_read_dword(link->pdev, PCI_EXP_LNKCAP,
-					   &lnkcap_up);
-		pcie_capability_read_dword(dev, PCI_EXP_LNKCAP,
-					   &lnkcap_dw);
-		latency_up_l0s = calc_l0s_latency(lnkcap_up);
-		latency_up_l1 = calc_l1_latency(lnkcap_up);
-		latency_dw_l0s = calc_l0s_latency(lnkcap_dw);
-		latency_dw_l1 = calc_l1_latency(lnkcap_dw);
+		latency_up_l0s = calc_l0s_latency(link->pdev->lnkcap);
+		latency_up_l1 = calc_l1_latency(link->pdev->lnkcap);
+		latency_dw_l0s = calc_l0s_latency(dev->lnkcap);
+		latency_dw_l1 = calc_l1_latency(dev->lnkcap);
 
 		/* Check upstream direction L0s latency */
 		if ((link->aspm_capable & PCIE_LINK_STATE_L0S_UP) &&
@@ -814,7 +808,7 @@ static void pcie_aspm_override_default_link_state(struct pcie_link_state *link)
 static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
 {
 	struct pci_dev *child = link->downstream, *parent = link->pdev;
-	u32 parent_lnkcap, child_lnkcap;
+	u32 lnkcap;
 	u16 parent_lnkctl, child_lnkctl;
 	struct pci_bus *linkbus = parent->subordinate;
 
@@ -829,9 +823,7 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
 	 * If ASPM not supported, don't mess with the clocks and link,
 	 * bail out now.
 	 */
-	pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, &parent_lnkcap);
-	pcie_capability_read_dword(child, PCI_EXP_LNKCAP, &child_lnkcap);
-	if (!(parent_lnkcap & child_lnkcap & PCI_EXP_LNKCAP_ASPMS))
+	if (!(parent->lnkcap & child->lnkcap & PCI_EXP_LNKCAP_ASPMS))
 		return;
 
 	/* Configure common clock before checking latencies */
@@ -841,10 +833,18 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
 	 * Re-read upstream/downstream components' register state after
 	 * clock configuration.  L0s & L1 exit latencies in the otherwise
 	 * read-only Link Capabilities may change depending on common clock
-	 * configuration (PCIe r5.0, sec 7.5.3.6).
+	 * configuration (PCIe r5.0, sec 7.5.3.6).  Update only the exit
+	 * latencies in the cached dev->lnkcap because quirks may have
+	 * removed broken features advertised by the device.
 	 */
-	pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, &parent_lnkcap);
-	pcie_capability_read_dword(child, PCI_EXP_LNKCAP, &child_lnkcap);
+	pcie_capability_read_dword(parent, PCI_EXP_LNKCAP, &lnkcap);
+	parent->lnkcap &= ~(PCI_EXP_LNKCAP_L0SEL | PCI_EXP_LNKCAP_L1EL);
+	parent->lnkcap |= lnkcap & (PCI_EXP_LNKCAP_L0SEL | PCI_EXP_LNKCAP_L1EL);
+
+	pcie_capability_read_dword(child, PCI_EXP_LNKCAP, &lnkcap);
+	child->lnkcap &= ~(PCI_EXP_LNKCAP_L0SEL | PCI_EXP_LNKCAP_L1EL);
+	child->lnkcap |= lnkcap & (PCI_EXP_LNKCAP_L0SEL | PCI_EXP_LNKCAP_L1EL);
+
 	pcie_capability_read_word(parent, PCI_EXP_LNKCTL, &parent_lnkctl);
 	pcie_capability_read_word(child, PCI_EXP_LNKCTL, &child_lnkctl);
 
@@ -864,7 +864,7 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
 	 * given link unless components on both sides of the link each
 	 * support L0s.
 	 */
-	if (parent_lnkcap & child_lnkcap & PCI_EXP_LNKCAP_ASPM_L0S)
+	if (parent->lnkcap & child->lnkcap & PCI_EXP_LNKCAP_ASPM_L0S)
 		link->aspm_support |= PCIE_LINK_STATE_L0S;
 
 	if (child_lnkctl & PCI_EXP_LNKCTL_ASPM_L0S)
@@ -873,7 +873,7 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
 		link->aspm_enabled |= PCIE_LINK_STATE_L0S_DW;
 
 	/* Setup L1 state */
-	if (parent_lnkcap & child_lnkcap & PCI_EXP_LNKCAP_ASPM_L1)
+	if (parent->lnkcap & child->lnkcap & PCI_EXP_LNKCAP_ASPM_L1)
 		link->aspm_support |= PCIE_LINK_STATE_L1;
 
 	if (parent_lnkctl & child_lnkctl & PCI_EXP_LNKCTL_ASPM_L1)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 0ce98e18b5a8..7dab61347244 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1633,7 +1633,6 @@ void set_pcie_port_type(struct pci_dev *pdev)
 {
 	int pos;
 	u16 reg16;
-	u32 reg32;
 	int type;
 	struct pci_dev *parent;
 
@@ -1652,8 +1651,8 @@ void set_pcie_port_type(struct pci_dev *pdev)
 	pci_read_config_dword(pdev, pos + PCI_EXP_DEVCAP, &pdev->devcap);
 	pdev->pcie_mpss = FIELD_GET(PCI_EXP_DEVCAP_PAYLOAD, pdev->devcap);
 
-	pcie_capability_read_dword(pdev, PCI_EXP_LNKCAP, &reg32);
-	if (reg32 & PCI_EXP_LNKCAP_DLLLARC)
+	pcie_capability_read_dword(pdev, PCI_EXP_LNKCAP, &pdev->lnkcap);
+	if (pdev->lnkcap & PCI_EXP_LNKCAP_DLLLARC)
 		pdev->link_active_reporting = 1;
 
 	parent = pci_upstream_bridge(pdev);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index d1fdf81fbe1e..ec4133ae9cae 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -361,6 +361,7 @@ struct pci_dev {
 	struct pci_dev  *rcec;          /* Associated RCEC device */
 #endif
 	u32		devcap;		/* PCIe Device Capabilities */
+	u32		lnkcap;		/* PCIe Link Capabilities */
 	u16		rebar_cap;	/* Resizable BAR capability offset */
 	u8		pcie_cap;	/* PCIe capability offset */
 	u8		msi_cap;	/* MSI capability offset */
commit ad3106e7d1bf ("PCI/ASPM: Add pcie_aspm_disable_cap() to remove advertised features")
Author: Bjorn Helgaas <bhelgaas at google.com>
Date:   Wed Nov 5 09:11:21 2025 -0600

    PCI/ASPM: Add pcie_aspm_disable_cap() to remove advertised features
    
    Add pcie_aspm_disable_cap(), which can be used by quirks to remove
    advertised features that don't work correctly.
    
    Removing advertised features prevents aspm.c from enabling them, even if
    users try to enable them via sysfs or by building the kernel with
    CONFIG_PCIEASPM_POWERSAVE or CONFIG_PCIEASPM_POWER_SUPERSAVE.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas at google.com>


diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 4492b809094b..858d2d6c9304 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -958,6 +958,7 @@ void pci_save_aspm_l1ss_state(struct pci_dev *dev);
 void pci_restore_aspm_l1ss_state(struct pci_dev *dev);
 
 #ifdef CONFIG_PCIEASPM
+void pcie_aspm_disable_cap(struct pci_dev *pdev, int state);
 void pcie_aspm_init_link_state(struct pci_dev *pdev);
 void pcie_aspm_exit_link_state(struct pci_dev *pdev);
 void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked);
@@ -965,6 +966,7 @@ void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
 void pci_configure_ltr(struct pci_dev *pdev);
 void pci_bridge_reconfigure_ltr(struct pci_dev *pdev);
 #else
+static inline void pcie_aspm_disable_cap(struct pci_dev *pdev, int state) { }
 static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
 static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
 static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked) { }
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index c2ecfa4e92e5..4e9e3503c569 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -1530,6 +1530,18 @@ int pci_enable_link_state_locked(struct pci_dev *pdev, int state)
 }
 EXPORT_SYMBOL(pci_enable_link_state_locked);
 
+void pcie_aspm_disable_cap(struct pci_dev *pdev, int state)
+{
+	if (state & PCIE_LINK_STATE_L0S)
+		pdev->lnkcap &= ~PCI_EXP_LNKCAP_ASPM_L0S;
+	if (state & PCIE_LINK_STATE_L1)
+		pdev->lnkcap &= ~PCI_EXP_LNKCAP_ASPM_L1;
+	if (state & (PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1))
+		pci_info(pdev, "ASPM:%s%s removed from Link Capabilities to work around device defect\n",
+			 state & PCIE_LINK_STATE_L0S ? " L0s" : "",
+			 state & PCIE_LINK_STATE_L1 ? " L1" : "");
+}
+
 static int pcie_aspm_set_policy(const char *val,
 				const struct kernel_param *kp)
 {
commit 69184484d69e ("PCI/ASPM: Prevent use of ASPM on Freescale Root Ports")
Author: Bjorn Helgaas <bhelgaas at google.com>
Date:   Wed Nov 5 12:18:07 2025 -0600

    PCI/ASPM: Prevent use of ASPM on Freescale Root Ports
    
    Christian reported that f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and
    ASPM states for devicetree platforms") broke booting on the A-EON X5000 and
    AmigaOne X1000.
    
    Fixes: f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms")
    Fixes: df5192d9bb0e ("PCI/ASPM: Enable only L0s and L1 for devicetree platforms"
    )
    Reported-by: Christian Zigotzky <chzigotzky at xenosoft.de>
    Link: https://lore.kernel.org/r/db5c95a1-cf3e-46f9-8045-a1b04908051a@xenosoft.de
    Signed-off-by: Bjorn Helgaas <bhelgaas at google.com>


diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 214ed060ca1b..8cf182c28d2b 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2525,6 +2525,16 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
  */
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
 
+/*
+ * Remove ASPM L0s and L1 support from cached copy of Link Capabilities so
+ * aspm.c won't try to enable them.
+ */
+static void quirk_disable_aspm_l0s_l1_cap(struct pci_dev *dev)
+{
+	pcie_aspm_disable_cap(dev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1);
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_FREESCALE, 0x0451, quirk_disable_aspm_l0s_l1_cap);
+
 /*
  * Some Pericom PCIe-to-PCI bridges in reverse mode need the PCIe Retrain
  * Link bit cleared after starting the link retrain process to allow this


More information about the Linuxppc-dev mailing list