[PATCH kernel RFC 0/3] powerpc/pseries/iommu: GPU coherent memory pass through

Alexey Kardashevskiy aik at ozlabs.ru
Mon Oct 15 18:29:00 AEDT 2018


Ping?


On 17/09/2018 17:05, Alexey Kardashevskiy wrote:
> Ping?
> 
> The problem is still there...
> 
> 
> On 24/08/2018 13:04, Alexey Kardashevskiy wrote:
>>
>>
>> On 09/08/2018 14:41, Alexey Kardashevskiy wrote:
>>>
>>>
>>> On 25/07/2018 19:50, Alexey Kardashevskiy wrote:
>>>> I am trying to pass through a 3D controller:
>>>> [0302]: NVIDIA Corporation GV100GL [Tesla V100 SXM2] [10de:1db1] (rev a1)
>>>>
>>>> which has a rather unique feature: coherent memory that is directly
>>>> accessible from a POWER9 CPU over an NVLink2 transport.
>>>>
>>>> So in addition to passing through the PCI device and its accompanying
>>>> NPU devices, we will also be passing through the host physical address
>>>> range of this memory, as is done on a bare metal system.
>>>>
>>>> The memory on the host is presented as:
>>>>
>>>> ===
>>>> [aik@yc02goos ~]$ lsprop /proc/device-tree/memory@42000000000
>>>> ibm,chip-id      000000fe (254)
>>>> device_type      "memory"
>>>> compatible       "ibm,coherent-device-memory"
>>>> reg              00000420 00000000 00000020 00000000
>>>> linux,usable-memory
>>>>                  00000420 00000000 00000000 00000000
>>>> phandle          00000726 (1830)
>>>> name             "memory"
>>>> ibm,associativity
>>>>                  00000004 000000fe 000000fe 000000fe 000000fe
>>>> ===
>>>>
>>>> and the host does not touch it because the second 64-bit value of
>>>> "linux,usable-memory" (the size) is zero. Later on, the NVIDIA driver
>>>> trains the NVLink2 link and probes this memory, which is how it gets
>>>> onlined.
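
For readers decoding the listing above, here is a minimal sketch of how the "reg" and "linux,usable-memory" cells combine into 64-bit base/size pairs. It assumes two address cells and two size cells per entry, which matches the four-cell layout shown; the function and variable names are illustrative and not taken from the patches.

```python
# Cell values copied from the lsprop output quoted above; each cell is 32 bits.
REG_CELLS = [0x00000420, 0x00000000, 0x00000020, 0x00000000]
USABLE_CELLS = [0x00000420, 0x00000000, 0x00000000, 0x00000000]


def decode_base_size(cells):
    """Combine four 32-bit cells into a (base, size) pair of 64-bit values.

    Assumption: #address-cells = 2 and #size-cells = 2, so the first two
    cells form the base address and the last two form the size.
    """
    if len(cells) != 4:
        raise ValueError("expected exactly four 32-bit cells")
    base = (cells[0] << 32) | cells[1]
    size = (cells[2] << 32) | cells[3]
    return base, size


reg_base, reg_size = decode_base_size(REG_CELLS)
usable_base, usable_size = decode_base_size(USABLE_CELLS)

print(f"reg:    base={reg_base:#x} size={reg_size:#x} ({reg_size >> 30} GiB)")
print(f"usable: base={usable_base:#x} size={usable_size:#x}")
```

Under these assumptions the node describes a 128 GiB region at 0x42000000000, while the usable size of zero is what keeps the host from onlining it at boot.
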
>>>>
>>>> In the virtual environment I am planning on doing the same thing;
>>>> however, there is a difference in 64-bit DMA handling. The powernv
>>>> platform uses a PHB3 bypass mode and that just works, but the
>>>> pseries platform uses the DDW RTAS API to achieve the same result.
>>>> The problem with this is that we need a huge DMA window that starts
>>>> from zero (because this GPU supports less than 50 bits of DMA
>>>> address space) and covers not just the present memory but also
>>>> this new coherent memory.
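
To put a number on the window size, here is a rough sketch of the arithmetic. It assumes the window is sized as a power of two starting at zero, and that the coherent region from the listing above is the highest range that must be covered; ordinary RAM is assumed to sit below it. The helper name and constants are illustrative only, not the code in the patches.

```python
# Values derived from the "reg" property in the listing above.
COHERENT_BASE = 0x42000000000
COHERENT_SIZE = 0x2000000000

# Upper bound taken from the text: the GPU supports less than 50 bits of DMA.
GPU_DMA_BITS_LIMIT = 50


def window_bits(max_address):
    """Return the smallest n such that [0, 2**n) covers all addresses below max_address."""
    if max_address <= 0:
        raise ValueError("max_address must be positive")
    return (max_address - 1).bit_length()


# First address past the end of the coherent memory.
max_address = COHERENT_BASE + COHERENT_SIZE
bits = window_bits(max_address)

print(f"max address: {max_address:#x}")
print(f"window size: 2**{bits} bytes")
print(f"fits below the GPU limit: {bits < GPU_DMA_BITS_LIMIT}")
```

With these numbers a zero-based window needs 43 address bits, which stays within the GPU's limit only as long as the window really does start at zero.
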
>>>>
>>>>
>>>> This is based on sha1
>>>> d72e90f3 Linus Torvalds "Linux 4.18-rc6".
>>>>
>>>> Please comment. Thanks.
>>>
>>>
>>> Ping?
>>
>>
>> Ping?
>>
>>>
>>>
>>>>
>>>>
>>>>
>>>> Alexey Kardashevskiy (3):
>>>>   powerpc/pseries/iommu: Allow dynamic window to start from zero
>>>>   powerpc/pseries/iommu: Force default DMA window removal
>>>>   powerpc/pseries/iommu: Use memory@ nodes in max RAM address
>>>>     calculation
>>>>
>>>>  arch/powerpc/platforms/pseries/iommu.c | 77 ++++++++++++++++++++++++++++++----
>>>>  1 file changed, 70 insertions(+), 7 deletions(-)
>>>>
>>>
>>
> 

-- 
Alexey

