[PPC] Boot problems after the pci-v6.18-changes
Herve Codina
herve.codina at bootlin.com
Thu Oct 23 20:19:47 AEDT 2025
Hi Manivannan,
On Thu, 23 Oct 2025 14:19:46 +0530
Manivannan Sadhasivam <mani at kernel.org> wrote:
> On Thu, Oct 23, 2025 at 09:38:13AM +0200, Herve Codina wrote:
> > Hi Manivannan,
> >
> > On Wed, 15 Oct 2025 18:20:22 +0530
> > Manivannan Sadhasivam <mani at kernel.org> wrote:
> >
> > > Hi Herve,
> > >
> > > On Wed, Oct 15, 2025 at 01:58:11PM +0200, Herve Codina wrote:
> > > > Hi Christian,
> > > >
> > > > On Wed, 15 Oct 2025 13:30:44 +0200
> > > > Christian Zigotzky <chzigotzky at xenosoft.de> wrote:
> > > >
> > > > > Hello Herve,
> > > > >
> > > > > > On 15 October 2025 at 10:39 am, Herve Codina <herve.codina at bootlin.com> wrote:
> > > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > I also observed issues with the commit f3ac2ff14834 ("PCI/ASPM: Enable all
> > > > > > ClockPM and ASPM states for devicetree platforms")
> > > > >
> > > > > Thanks for reporting.
> > > > >
> > > > > >
> > > > > > Also tried the quirk proposed in this discussion (quirk_disable_aspm_all)
> > > > > > and the quirk also fixes the timing issue.
> > > > >
> > > > > Where have you added quirk_disable_aspm_all?
> > > >
> > > > --- 8< ---
> > > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > > index 214ed060ca1b..a3808ab6e92e 100644
> > > > --- a/drivers/pci/quirks.c
> > > > +++ b/drivers/pci/quirks.c
> > > > @@ -2525,6 +2525,17 @@ static void quirk_disable_aspm_l0s_l1(struct pci_dev *dev)
> > > > */
> > > > DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ASMEDIA, 0x1080, quirk_disable_aspm_l0s_l1);
> > > >
> > > > +static void quirk_disable_aspm_all(struct pci_dev *dev)
> > > > +{
> > > > > +	pci_info(dev, "Disabling ASPM\n");
> > > > > +	pci_disable_link_state(dev, PCIE_LINK_STATE_ALL);
> > >
> > > Could you please try disabling L1SS and L0s separately to see which one is
> > > causing the issue? Like,
> > >
> > > pci_disable_link_state(dev, PCIE_LINK_STATE_L1_1 | PCIE_LINK_STATE_L1_2);
> > >
> > > pci_disable_link_state(dev, PCIE_LINK_STATE_L0S);
> > >
> >
> > I did tests and here are the results:
> >
> > - quirk pci_disable_link_state(dev, PCIE_LINK_STATE_ALL)
> > Issue not present
> >
> > - quirk pci_disable_link_state(dev, PCIE_LINK_STATE_L1_1 | PCIE_LINK_STATE_L1_2)
> > Issue present, timings similar to timings already reported
> > (hundreds of ms).
> >
> > - quirk pci_disable_link_state(dev, PCIE_LINK_STATE_L0S);
> > Issue present, timings still incorrect but lower
> > 64 bytes from 192.168.32.100: seq=10 ttl=64 time=16.738 ms
> > 64 bytes from 192.168.32.100: seq=11 ttl=64 time=39.500 ms
> > 64 bytes from 192.168.32.100: seq=12 ttl=64 time=62.178 ms
> > 64 bytes from 192.168.32.100: seq=13 ttl=64 time=84.709 ms
> > 64 bytes from 192.168.32.100: seq=14 ttl=64 time=107.484 ms
> >
>
> This is weird. Looks like all ASPM states (L0s, L1ss) are contributing to the
> increased latency, which is more than what should occur. This makes me ignore
> inspecting the L0s/L1 exit latency fields :/
>
> Bjorn sent out a patch [1] that enables only L0s and L1 by default. But it
> might not help you. I don't honestly know how you are seeing this much of the
> latency. This could be due to an issue in the PCI component (host or endpoint),
> or even the board routing. Identifying which one is causing the issue is going
> to be tricky as it would require some experimentation.

I've just tested the patch from Bjorn and I confirm that it doesn't fix my issue.

>
> If you are motivated, we can start to isolate this issue to the endpoint first.
> Is it possible for you to connect a different PCI card to your host and check
> whether you are seeing the increased latency? If the different PCI card is not
> exhibiting the same behavior, then the current device is the culprit and we
> should be able to quirk it.

Will see what I can do.

Best regards,
Hervé