[Skiboot] nvlink2 topology

Alistair Popple alistair at popple.id.au
Thu Jul 26 16:38:30 AEST 2018


On Thursday, 26 July 2018 4:10:21 PM AEST Alexey Kardashevskiy wrote:
> 
> On 26/07/2018 14:34, Alistair Popple wrote:
> > Hi Alexey,
> > 
> > On Thursday, 26 July 2018 12:56:20 PM AEST Alexey Kardashevskiy wrote:
> >>
> >> On 26/07/2018 03:53, Reza Arbab wrote:
> >>> On Tue, Jul 24, 2018 at 12:12:43AM +1000, Alexey Kardashevskiy wrote:
> >>>> But before I try this, the existing tree seems to have a problem at
> >>>> (same with another xscom node):
> >>>> /sys/firmware/devicetree/base/xscom@603fc00000000/npu@5011000
> >>>> ./link@4/ibm,slot-label
> >>>>                 "GPU2"
> >>>> ./link@2/ibm,slot-label
> >>>>                 "GPU1"
> >>>> ./link@0/ibm,slot-label
> >>>>                 "GPU0"
> >>>> ./link@5/ibm,slot-label
> >>>>                 "GPU2"
> >>>> ./link@3/ibm,slot-label
> >>>>                 "GPU1"
> >>>> ./link@1/ibm,slot-label
> >>>>                 "GPU0"
> >>>>
> >>>> This comes from hostboot.
> >>>> Witherspoon_Design_Workbook_v1.7_19June2018.pdf on page 39 suggests that
> >>>> link@3 and link@5 should be swapped. Which one is correct?
> >>>
> >>> I would think link@3 should be "GPU2" and link@5 should be "GPU1".
> > 
> > The link numbering in the device-tree is based on CPU NDL link index. As the
> > workbook does not contain CPU link indices
> 
> It does, page 39.

Where? I see the GPU link numbers in the GPU boxes on the right, but none on
the CPU side (the yellow boxes on the left) - the CPU side only lists the PHY
lane masks. The numbers in the GPU boxes are GPU link numbers, not CPU link
indices.

> > I suspect you are mixing these up
> > with the GPU link numbers which are shown. The device-tree currently contains no
> > information on what the GPU side link numbers are.
> 
> Correct, this is what I want to add.
> 
> >>> If so, it's a little surprising that this hasn't broken anything. The
> >>> driver has its own way of discovering what connects to what, so maybe
> >>> there really just isn't a consumer of these labels yet.
> > 
> > You need to be careful what you are referring to here - PHY link index, NDL link
> > index or NTL link index. The lane-mask corresponds to the PHY link index which
> > is different to the CPU NDL/NTL link index as there are multiple muxes which
> > switch these around.
> 
> So what are the link@x nodes about? PHY, NDL, NTL? The workbook does not
> mention NDL/NTL. What links does page 39 refer to?

The link@x node numbering refers to the NTL link index.
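
For reference, something like the sketch below is how the link nodes could be
walked from within skiboot to cross-check this against the workbook. It is
only a rough, untested sketch using skiboot's dt helpers, and it assumes the
link@N nodes are compatible with "ibm,npu-link" and carry ibm,npu-link-index,
ibm,npu-lane-mask and ibm,slot-label, as they appear to on current Witherspoon
firmware:

/* Sketch only: dump the NTL link index, PHY lane mask and slot label
 * for each NPU link node so they can be compared against page 39 of
 * the workbook. */
#include <skiboot.h>
#include <device.h>

static void dump_npu_links(void)
{
	struct dt_node *link;

	dt_for_each_compatible(dt_root, link, "ibm,npu-link") {
		uint32_t index = dt_prop_get_u32(link, "ibm,npu-link-index");
		uint32_t lanes = dt_prop_get_u32(link, "ibm,npu-lane-mask");
		const char *label = dt_prop_get_def(link, "ibm,slot-label",
						    (char *)"<unknown>");

		prlog(PR_INFO, "NPU: link index %d lane-mask 0x%x -> %s\n",
		      index, lanes, label);
	}
}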

> >> Can you please 1) make sure we do understand things right and these are
> >> not some weird muxes somewhere between GPU and P9 2) fix it? Thanks :)
> > 
> > I don't think there is anything to fix here. On your original question we have
> > no knowledge of GPU<->GPU link topology so you would need to either hard code
> > this in Skiboot or get it added to the HDAT.
> 
> So which one is it then - HDAT or Skiboot?

Perhaps Oliver or Stewart have an opinion here? Ideally this would be in HDAT
and encoded in the MRW. In practice HDAT seems to just hardcode things anyway,
so I'm not sure what value there is in putting it there; a hardcoded
platform-specific table in Skiboot might be no worse.
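
If it did end up as a table in Skiboot, I imagine it would just be something
along the lines of the sketch below in the witherspoon platform code. This is
purely illustrative - the struct and every entry in it are made up, and the
real GPU<->GPU pairings would still have to be transcribed from the workbook
(or, better, come from HDAT/MRW):

/* Hypothetical hard-coded GPU<->GPU interconnect table for a
 * Witherspoon-style platform file.  The entries below are examples
 * only, not the actual topology. */
struct gpu_interconnect {
	uint8_t gpu;		/* GPU index (matches slot label GPUn) */
	uint8_t gpu_link;	/* NVLink number on that GPU */
	uint8_t peer_gpu;	/* GPU on the other end of the link */
	uint8_t peer_link;	/* NVLink number on the peer GPU */
};

static const struct gpu_interconnect witherspoon_gpu_links[] = {
	{ .gpu = 0, .gpu_link = 4, .peer_gpu = 1, .peer_link = 4 },
	{ .gpu = 0, .gpu_link = 5, .peer_gpu = 2, .peer_link = 5 },
	{ .gpu = 1, .gpu_link = 5, .peer_gpu = 2, .peer_link = 4 },
};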

> > Or better yet get the driver enhanced so that it uses its own topology
> > detection to only bring up CPU->GPU links in the virtualised pass-thru case.
> 
> How? Enhancing VFIO or IODA2 with topology detection does not seem
> possible without a document describing it. And we do not need to detect
> anything - we know exactly what the topology is from the workbook.

Enhance the NVIDIA device driver. The driver running in the guest should be
able to determine which links are CPU-GPU and which are GPU-GPU, and disable
just the GPU-GPU links.

- Alistair


