No subject


Thu Apr 9 12:00:03 AEST 2026


  "[...] during normal operation, the link might fail and go down. After
  this link-down event, the controller requests the DWC_pcie_clkrst.v
  module to hot-reset the controller. There is no difference in the
  handling of a link-down reset or a hot reset; the controller asserts
  the link_req_rst_not output requesting the DWC_pcie_clkrst.v module to
  reset the controller."

Some of the adjacent documentation suggests (and local testing confirms)
that this automatic reset also resets various DBI (i.e., PCIe config
space) registers. There also doesn't seem to be a good way to completely
stop this automatic reset -- the docs mention some SW methods to prevent
the reset, but they all seem racy or incomplete.

Anyway, I think this implies that patch 1 is somewhat wrong [1]. It
includes some code like this:

		pci_save_state(dev);
		ret = host->reset_root_port(host, dev);
		if (ret)
			pci_err(dev, "Failed to reset Root Port: %d\n", ret);
		else
			/* Now restore it on success */
			pci_restore_state(dev);

That first line (pci_save_state()) is prone to saving invalid state,
depending on whether the link-down event has already finished flushing
and resetting the controller. The resulting impact is hard to judge,
since it depends on what (mis)configuration you end up with.

I also noticed commit a2f1e22390ac ("PCI/ERR: Ensure error
recoverability at all times") was merged recently. With that change, I
believe it is now safe to perform pci_restore_state() even without
pci_save_state() here.

So ... can we remove pci_save_state() from
pcibios_reset_secondary_bus()? Might that help? It sounds like my above
observations *may* match Richard's reports, but I'm not sure. And
anyway, the documented hardware behavior is racy, so it's hard to
propose a foolproof solution.
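
For illustration only, the race can be modeled in user space. This is a
toy sketch, not kernel code: every toy_* name is hypothetical, the bus
numbers are borrowed from the log below, and it assumes a known-good
snapshot already exists from an earlier save (my reading of why a
restore without a fresh save would be safe).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a Root Port; NOT the kernel's struct pci_dev. */
struct toy_port {
	uint8_t secondary_bus;     /* live bridge registers */
	uint8_t subordinate_bus;
	uint8_t saved_secondary;   /* models the copy kept by pci_save_state() */
	uint8_t saved_subordinate;
};

/* Models a reset that returns the bridge registers to their reset values. */
static void toy_hw_reset(struct toy_port *p)
{
	p->secondary_bus = 0;
	p->subordinate_bus = 0;
}

/* Models pci_save_state(): snapshot whatever is live right now. */
static void toy_save_state(struct toy_port *p)
{
	p->saved_secondary = p->secondary_bus;
	p->saved_subordinate = p->subordinate_bus;
}

/* Models pci_restore_state(): write the snapshot back. */
static void toy_restore_state(struct toy_port *p)
{
	p->secondary_bus = p->saved_secondary;
	p->subordinate_bus = p->saved_subordinate;
}

/*
 * A freshly enumerated port: bus 01-ff, with a known-good snapshot taken
 * up front (assumption: an earlier save exists before any link-down).
 */
static struct toy_port toy_probe(void)
{
	struct toy_port p = { .secondary_bus = 0x01, .subordinate_bus = 0xff };

	toy_save_state(&p);
	return p;
}

/*
 * Models the recovery path.
 * hw_reset_won_race: the automatic link-down reset finished before
 *                    recovery started.
 * save_in_recovery:  recovery snapshots state first, as in the quoted
 *                    snippet.
 */
static void toy_recover(struct toy_port *p, bool hw_reset_won_race,
			bool save_in_recovery)
{
	if (hw_reset_won_race)
		toy_hw_reset(p);
	if (save_in_recovery)
		toy_save_state(p);	/* may capture already-reset registers */
	toy_hw_reset(p);		/* models host->reset_root_port() */
	toy_restore_state(p);
}
```

In this model, toy_recover(&p, true, true) leaves the port at bus 00-00
because the snapshot was taken after the registers were already wiped.
With save_in_recovery set to false, the earlier snapshot survives and
the port returns to bus 01-ff regardless of which side wins the race.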

Brian

[1] At least, for DesignWare controllers.

> [   87.864423] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
> 
> root@imx95evk:~#
> root@imx95evk:~# cat /proc/interrupts | grep lnk;
> 273:          2          0          0          0          0          0    GICv3 342 Level     PCIe PME, lnk_notify
> root@imx95evk:~#
> root@imx95evk:~#
> root@imx95evk:~# ./memtool 4c30003c=004001ff; ./memtool 4c30003c=000001ff; Writing 32-bit va
> [  107.028086] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000d00) link down detected lue 0x4001FF to a
> [  107.037018] imx6q-pcie 4c300000.pcie: Stop root bus and handle link down ddress 0x4C30003C
> [  107.045137] pcieport 0000:00:00.0: Recovering Root Port due to Link Down
> 
> Writing 32-bit
> [  107.053332] pci 0000:01:00.0: AER: can't recover (no error_detected callback)  value 0x1FF to address 0x4C30003C root@imx95evk:~#
> [  107.282146] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000700) link up detected
> [  107.470801] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
> [  107.602823] pcieport 0000:00:00.0: Root Port has been reset
> [  107.608601] pcieport 0000:00:00.0: AER: device recovery failed
> [  107.614497] imx6q-pcie 4c300000.pcie: Rescan bus after link up is detected
> [  107.623805] pcieport 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> [  107.632281] pci_bus 0000:01: busn_res: [bus 01] end is updated to 01
> 
> root@imx95evk:~#
> root@imx95evk:~# cat /proc/interrupts | grep lnk;
> 273:          4          0          0          0          0          0    GICv3 342 Level     PCIe PME, lnk_notify
> root@imx95evk:~#
> root@imx95evk:~# ./memtool 4c30003c=004001ff; ./memtool 4c30003c=000001ff; Writing 32-bit va
> [  133.424041] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000d00) link down detected lue 0x4001FF to a
> [  133.432954] imx6q-pcie 4c300000.pcie: Stop root bus and handle link down ddress 0x4C30003C
> [  133.441106] pcieport 0000:00:00.0: Recovering Root Port due to Link Down
> 
> Writing 32-bit
> [  133.449309] pci 0000:01:00.0: AER: can't recover (no error_detected callback)  value 0x1FF to address 0x4C30003C root@imx95evk:~#
> [  133.677824] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000700) link up detected
> [  133.870414] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
> [  134.002534] pcieport 0000:00:00.0: Root Port has been reset
> [  134.008307] pcieport 0000:00:00.0: AER: device recovery failed
> [  134.014193] imx6q-pcie 4c300000.pcie: Rescan bus after link up is detected
> [  134.023418] pcieport 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> [  134.031881] pci_bus 0000:01: busn_res: [bus 01] end is updated to 01
> 
> root@imx95evk:~# ./memtool 4c30003c=004001ff; ./memtool 4c30003c=000001ff; Writing 32-bit va
> [  140.149713] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000d00) link down detected lue 0x4001FF to a
> [  140.158614] imx6q-pcie 4c300000.pcie: Stop root bus and handle link down ddress 0x4C30003C
> [  140.166779] pcieport 0000:00:00.0: Recovering Root Port due to Link Down
> [  140.174981] pci 0000:01:00.0: AER: can't recover (no error_detected callback) Writing 32-bit value 0x1FF to address 0x4C30003C root@imx95evk:~#
> [  140.401605] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000700) link up detected
> [  140.590491] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
> [  140.596206] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000c00) link down detected
> 
> root@imx95evk:~#
> [  141.630311] pcieport 0000:00:00.0: Data Link Layer Link Active not set in 100 msec
> [  141.637950] pcieport 0000:00:00.0: Failed to reset Root Port: -25
> [  141.644095] pcieport 0000:00:00.0: AER: subordinate device reset failed
> [  141.650883] pcieport 0000:00:00.0: AER: device recovery failed
> [  141.656784] imx6q-pcie 4c300000.pcie: Stop root bus and handle link down
> [  141.663520] pcieport 0000:00:00.0: Recovering Root Port due to Link Down
> [  141.670271] pci 0000:01:00.0: AER: can't recover (no error_detected callback)
> [  141.897701] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000700) link up detected
> [  142.086341] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
> [  142.092038] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000c00) link down detected
> [  143.126273] pcieport 0000:00:00.0: Data Link Layer Link Active not set in 100 msec
> [  143.133919] pcieport 0000:00:00.0: Failed to reset Root Port: -25
> [  143.140052] pcieport 0000:00:00.0: AER: subordinate device reset failed
> [  143.146747] pcieport 0000:00:00.0: AER: device recovery failed
> [  143.152604] imx6q-pcie 4c300000.pcie: Stop root bus and handle link down
> [  143.159314] pcieport 0000:00:00.0: Recovering Root Port due to Link Down
> [  143.166022] pci 0000:01:00.0: AER: can't recover (no error_detected callback)
> [  143.389723] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000700) link up detected
> [  143.582294] imx6q-pcie 4c300000.pcie: PCIe Gen.3 x1 link up
> [  143.587996] imx6q-pcie 4c300000.pcie: PCIe(LNK_STS:0x00000c00) link down detected
> 
> 
> Thanks.
> Best Regards
> Richard Zhu

