[Skiboot] PCIe training failure on PLX PCI bridge

Oliver oohall at gmail.com
Thu Jun 6 22:33:25 AEST 2019


On Thu, Jun 6, 2019 at 10:07 PM Timothy Pearson
<tpearson at raptorengineering.com> wrote:
>
> ----- Original Message -----
> > From: "Oliver" <oohall at gmail.com>
> > To: "Timothy Pearson" <tpearson at raptorengineering.com>
> > Cc: "skiboot" <skiboot at lists.ozlabs.org>
> > Sent: Thursday, June 6, 2019 1:32:48 AM
> > Subject: Re: [Skiboot] PCIe training failure on PLX PCI bridge
>
> > On Thu, Jun 6, 2019 at 2:24 PM Timothy Pearson
> > <tpearson at raptorengineering.com> wrote:
> >>
> >> Popped in a couple of new tuner cards into our "oddball hardware"
> >> testing unit (Talos II w/ two P9 DD2.2 CPUs).  These use a Pericom
> >> bridge chip, PI7C9X2G304, as the PCIe interface.
> >>
> >> The links train but are degraded so are taken back offline.  Both cards
> >> are brand new and show the exact same problem.
> >
> > What skiboot version are you using? The workaround in 02a683bf09d9
> > ("hw/phb4: Assert Link Disable bit after ETU init") might help.
>
> Git hash d318cdb.  I'm not sure whether that patch is included; I'd have to go check.

It's not in that build. That commit is from mid-April, and the patch I
mentioned was only merged this week.

> >> Log from PHB#0 in Mata mode:
> >
> > Mata mode doesn't appear to be enabled since there's no TRACE: lines
> > in the output. Turn on the link trace output with:
> >
> > nvram -p ibm,skiboot --update-config pci-tracing=true
> >
> > It should look something like this:
> >
> >     PHB#0001[0:1]: TRACE:0x0000102101000000  0ms presence GEN1:x16:polling
> >     PHB#0001[0:1]: TRACE:0x0000001101000000 23ms          GEN1:x16:detect
> >     PHB#0001[0:1]: TRACE:0x0000102101000000 23ms presence GEN1:x16:polling
> >     PHB#0001[0:1]: TRACE:0x0000183101000000 29ms training GEN1:x16:config
> >     PHB#0001[0:1]: TRACE:0x00001c5881000000 30ms training GEN1:x08:recovery
> >     PHB#0001[0:1]: TRACE:0x00001c5883000000 30ms training GEN3:x08:recovery
> >     PHB#0001[0:1]: TRACE:0x0000144883000000 33ms presence GEN3:x08:L0
> >     PHB#0001[0:1]: TRACE:0x0000154883000000 33ms trained  GEN3:x08:L0
>
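
For checking a captured trace mechanically, here is a minimal sketch in
Python that pulls the fields out of lines like the ones above and reports
the last link state seen. The regular expression is inferred only from the
sample output in this thread, so real skiboot output may include variants
it does not cover. The names TRACE_RE, parse_trace_line and
final_link_state are hypothetical, chosen for illustration; none of this
is part of skiboot.

```python
import re

# Pattern inferred only from the sample trace lines quoted in this thread
# (an assumption, not a documented skiboot format).
TRACE_RE = re.compile(
    r"(?P<phb>PHB#[0-9A-Fa-f]+)\[\d+:\d+\]:\s+"
    r"TRACE:(?P<raw>0x[0-9A-Fa-f]+)\s+"
    r"(?P<ms>\d+)ms\s+"
    r"(?:(?P<status>[a-z]+)\s+)?"  # status word is absent on some lines
    r"GEN(?P<gen>\d+):x(?P<width>\d+):(?P<state>\w+)"
)


def parse_trace_line(line):
    """Return a dict of trace fields, or None if the line is not a trace line."""
    m = TRACE_RE.search(line)  # search() tolerates leading quote markers
    if m is None:
        return None
    return {
        "phb": m.group("phb"),
        "raw": int(m.group("raw"), 16),
        "ms": int(m.group("ms")),
        "status": m.group("status"),
        "gen": int(m.group("gen")),
        "width": int(m.group("width")),
        "state": m.group("state"),
    }


def final_link_state(lines):
    """Return the last successfully parsed trace entry, or None."""
    entries = [e for e in map(parse_trace_line, lines) if e is not None]
    return entries[-1] if entries else None


# A few of the sample lines from the message above.
SAMPLE = [
    "PHB#0001[0:1]: TRACE:0x0000102101000000  0ms presence GEN1:x16:polling",
    "PHB#0001[0:1]: TRACE:0x0000001101000000 23ms          GEN1:x16:detect",
    "PHB#0001[0:1]: TRACE:0x00001c5883000000 30ms training GEN3:x08:recovery",
    "PHB#0001[0:1]: TRACE:0x0000154883000000 33ms trained  GEN3:x08:L0",
]

if __name__ == "__main__":
    last = final_link_state(SAMPLE)
    print(last["status"], "GEN%d" % last["gen"], "x%d" % last["width"], last["state"])
```

Comparing the final gen and width against what the slot and card should
support is one way to spot a link that trained but came up degraded.
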
> You're right -- I learned the hard way that the NVRAM setting doesn't take effect if you fast reboot.  It took a traditional cold power down and re-IPL before the trace mode was activated, and by that time the cards were already removed again.  I'll see what I can do to get a proper trace, but if this is the same.

Oddly enough, I started writing a patch to fix that the other day.
Unfortunately, it's slightly annoying to fix and I figured it'd never
be useful to anyone other than me, so I dropped it. I'll go dust that
off.

> Pericom fault already known to the hardware team in Austin, I'm not sure the trace is going to help much?

Possibly, I don't know the details.

Oliver

