[Skiboot] nvlink2 topology

Alexey Kardashevskiy aik at ozlabs.ru
Fri Jul 27 11:12:32 AEST 2018



On 26/07/2018 18:08, Alexey Kardashevskiy wrote:
> 
> 
> On 26/07/2018 16:38, Alistair Popple wrote:
>> On Thursday, 26 July 2018 4:10:21 PM AEST Alexey Kardashevskiy wrote:
>>>
>>> On 26/07/2018 14:34, Alistair Popple wrote:
>>>> Hi Alexey,
>>>>
>>>> On Thursday, 26 July 2018 12:56:20 PM AEST Alexey Kardashevskiy wrote:
>>>>>
>>>>> On 26/07/2018 03:53, Reza Arbab wrote:
>>>>>> On Tue, Jul 24, 2018 at 12:12:43AM +1000, Alexey Kardashevskiy wrote:
>>>>>>> But before I try this, the existing tree seems to have a problem at
>>>>>>> (same with another xscom node):
>>>>>>> /sys/firmware/devicetree/base/xscom@603fc00000000/npu@5011000
>>>>>>> ./link@4/ibm,slot-label
>>>>>>>                 "GPU2"
>>>>>>> ./link@2/ibm,slot-label
>>>>>>>                 "GPU1"
>>>>>>> ./link@0/ibm,slot-label
>>>>>>>                 "GPU0"
>>>>>>> ./link@5/ibm,slot-label
>>>>>>>                 "GPU2"
>>>>>>> ./link@3/ibm,slot-label
>>>>>>>                 "GPU1"
>>>>>>> ./link@1/ibm,slot-label
>>>>>>>                 "GPU0"
>>>>>>>
>>>>>>> This comes from hostboot.
>>>>>>> Witherspoon_Design_Workbook_v1.7_19June2018.pdf on page 39 suggests that
>>>>>>> link@3 and link@5 should be swapped. Which one is correct?
>>>>>>
>>>>>> I would think link@3 should be "GPU2" and link@5 should be "GPU1".
>>>>
>>>> The link numbering in the device-tree is based on CPU NDL link index. As the
>>>> workbook does not contain CPU link indices
>>>
>>> It does, page 39.
>>
>> Where? I see the GPU link numbers in the GPU boxes on the right but none on the
>> CPU side (yellow boxes on the left). The CPU side only has PHY lane masks
>> listed. The numbers in the GPU boxes are GPU link numbers.
> 
> 
> Ah, counting them from top to bottom does not work. Anyway, I got this
> from Ryan:
> 
> P90_0 -> GPU0_1; P90_1 -> GPU0_5; P90_2 -> GPU1_1; P90_5 -> GPU1_5;
> P90_4 -> GPU2_3; P90_3 -> GPU2_5
> 
> and he could not tell which document this came from. It was
> specifically mentioned that 'nvlinks 3 and 5 are "swapped"'.


Update:
witherspoon_seq_red.ppt has these mappings; they are NDL indices.
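
To make the mapping above concrete, it could be captured as the kind of
hardcoded platform-specific table Alistair mentions below for Skiboot.
This is just a minimal sketch in C, keyed by the CPU-side NDL index
(the link@N unit address); the struct and names are hypothetical, not
existing skiboot code:

/*
 * Sketch of a hardcoded Witherspoon NDL -> GPU link table.  The
 * mapping is the one from Ryan quoted above; all names here are
 * hypothetical, not existing skiboot API.
 */
#include <stdio.h>

struct npu_link_map {
	int ndl_index;	/* CPU-side NDL link index (link@N in the DT) */
	int gpu;	/* which GPU this link connects to */
	int gpu_link;	/* GPU-side NVLink number */
};

static const struct npu_link_map witherspoon_links[] = {
	{ 0, 0, 1 },	/* P90_0 -> GPU0_1 */
	{ 1, 0, 5 },	/* P90_1 -> GPU0_5 */
	{ 2, 1, 1 },	/* P90_2 -> GPU1_1 */
	{ 5, 1, 5 },	/* P90_5 -> GPU1_5 */
	{ 4, 2, 3 },	/* P90_4 -> GPU2_3 */
	{ 3, 2, 5 },	/* P90_3 -> GPU2_5 */
};

int main(void)
{
	size_t i;

	/* Dump the table so it can be compared against the
	 * ibm,slot-label values quoted earlier. */
	for (i = 0; i < sizeof(witherspoon_links) / sizeof(witherspoon_links[0]); i++) {
		const struct npu_link_map *l = &witherspoon_links[i];

		printf("link@%d -> GPU%d, GPU-side nvlink %d\n",
		       l->ndl_index, l->gpu, l->gpu_link);
	}
	return 0;
}

Note this agrees with Reza's reading above: link@3 ends up on GPU2 and
link@5 on GPU1, i.e. the opposite of what hostboot currently labels.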



>>>> I suspect you are mixing these up
>>>> with the GPU link numbers which are shown. The device-tree currently contains no
>>>> information on what the GPU side link numbers are.
>>>
>>> Correct, this is what I want to add.
>>>
>>>>>> If so, it's a little surprising that this hasn't broken anything. The
>>>>>> driver has its own way of discovering what connects to what, so maybe
>>>>>> there really just isn't a consumer of these labels yet.
>>>>
>>>> You need to be careful what you are referring to here - PHY link index, NDL link
>>>> index or NTL link index. The lane-mask corresponds to the PHY link index which
>>>> is different to the CPU NDL/NTL link index as there are multiple muxes which
>>>> switch these around.
>>>
>>> So what are the link@x nodes about? PHY, NDL, NTL? The workbook does not
>>> mention NDL/NTL. What links does page 39 refer to?
>>
>> The link nodes refer to the NTL index.
> 
> What is swapped from my comment above? Or is it totally irrelevant?


Figured it out, it is the NDL index. Which spec describes this relationship?
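
For cross-checking, the labels currently exposed by the firmware can be
dumped from Linux against the sysfs device-tree path quoted earlier.
Just a sketch in plain POSIX C, nothing skiboot-specific; the xscom/npu
path is the one from this thread and is system-specific:

/*
 * Print ibm,slot-label for each link@ node under the npu node, so it
 * can be compared with the NDL mapping above.
 */
#include <dirent.h>
#include <stdio.h>
#include <string.h>

#define NPU_PATH "/sys/firmware/devicetree/base/xscom@603fc00000000/npu@5011000"

int main(void)
{
	DIR *d = opendir(NPU_PATH);
	struct dirent *e;

	if (!d) {
		perror(NPU_PATH);
		return 1;
	}

	while ((e = readdir(d)) != NULL) {
		char path[512], label[64] = "";
		FILE *f;

		if (strncmp(e->d_name, "link@", 5) != 0)
			continue;

		snprintf(path, sizeof(path), "%s/%s/ibm,slot-label",
			 NPU_PATH, e->d_name);
		f = fopen(path, "r");
		if (!f)
			continue;
		if (fgets(label, sizeof(label), f))
			printf("%s: %s\n", e->d_name, label);
		fclose(f);
	}
	closedir(d);
	return 0;
}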


>>>>> Can you please 1) make sure we understand things right and these are
>>>>> not some weird muxes somewhere between the GPU and P9, and 2) fix it? Thanks :)
>>>>
>>>> I don't think there is anything to fix here. On your original question we have
>>>> no knowledge of GPU<->GPU link topology so you would need to either hard code
>>>> this in Skiboot or get it added to the HDAT.
>>>
>>> So which one is it then - HDAT or Skiboot?
>>
>> Perhaps Oliver or Stewart have an opinion here? Ideally this would be in HDAT and
>> encoded in the MRW. In practice, HDAT seems to just hardcode things anyway, so I'm
>> not sure what value there is in putting it there; a hardcoded platform-specific
>> table in Skiboot might be no worse.
> 
> They do not, it is either you or Reza ;)
>
>>>> Or better yet, get the driver enhanced so that it uses its own topology
>>>> detection to only bring up CPU->GPU links in the virtualised pass-through case.
>>>
>>> How? Enhancing VFIO or IODA2 with topology detection does not seem
>>> possible without a document describing it. And we do not need to detect
>>> anything; we actually know exactly what the topology is from the workbook.
>>
>> Enhance the NVIDIA Device Driver. The device driver running in the guest should
>> be able to determine which links are CPU-GPU vs. GPU-GPU links and disable just
>> the GPU-GPU links.
> 
> 
> No, we do not want to trust the guest to do the right thing.



-- 
Alexey

