<!DOCTYPE html>

<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p>Hello Ritesh</p>

    <p>I think what you are proposing, adding <span

      style="white-space: pre-wrap">dev->bus_dma_limit to the check, might work. On PowerNV this is not set, but dev->dma_ops_bypass is set, so PowerNV will fall back to how it behaved before.</span></p>

    <p>Also, since both of these are set in LPAR mode, the current patch

      will work as-is.</p>
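
    <p>For reference, here is a minimal user-space sketch of the pre-patch
      behaviour described in the quoted message below: with
      dev->dma_ops_bypass set, going direct depends on the device DMA mask
      covering a required mask derived from max_pfn, so a larger max_pfn can
      flip the result. It is a simplified model loosely following
      dma_go_direct() and dma_direct_get_required_mask(), not kernel code. The
      struct, the helper names, the 64K page size, and the 44-bit device mask
      are assumptions made only for illustration.</p>

<pre>
```c
/* Simplified user-space model; NOT the kernel implementation. */
typedef unsigned long long u64;

#define MODEL_PAGE_SIZE 0x10000ULL /* assumption: 64K pages */

/* Hypothetical stand-in for the few struct device fields discussed here. */
struct model_dev {
    int dma_ops_bypass; /* 1 if the bypass flag is set */
    u64 dma_mask;       /* device DMA mask */
    u64 bus_dma_limit;  /* 0 means "no limit" */
};

/* Smaller of the two values, ignoring a value of zero. */
static u64 min_not_zero(u64 a, u64 b)
{
    if (a == 0)
        return b;
    if (b == 0)
        return a;
    return a > b ? b : a;
}

/* Smallest all-ones mask covering the highest address implied by max_pfn. */
static u64 required_mask(u64 max_pfn)
{
    u64 max_addr = (max_pfn - 1) * MODEL_PAGE_SIZE;
    u64 mask = 0;

    while (max_addr > mask)
        mask = mask * 2 + 1;
    return mask;
}

/* Model of the bypass decision: direct only if the mask covers all memory. */
static int model_go_direct(struct model_dev dev, u64 max_pfn)
{
    if (!dev.dma_ops_bypass)
        return 0;
    return min_not_zero(dev.dma_mask, dev.bus_dma_limit) >=
           required_mask(max_pfn);
}
```
</pre>

    <p>With a hypothetical 44-bit device mask (0xFFFFFFFFFFF) and no bus
      limit, model_go_direct() returns 1 for max_pfn = 0x1000000 (1 TiB of
      RAM) and 0 once max_pfn grows to 0x400000000, which mirrors the effect of
      device memory being added above RAM.</p>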

    <p>Dan, can you please try Ritesh's proposed fix on your PowerNV box?

      I have not been able to get my hands on a PowerNV box yet.</p>

    <p>Thanks,</p>

    <p>Gaurav</p>

    <div class="moz-cite-prefix">On 3/25/26 7:12 AM, Ritesh Harjani

      (IBM) wrote:<br>

    </div>

    <blockquote type="cite" cite="mid:5x6knm5q.ritesh.list@gmail.com">

      <pre wrap="" class="moz-quote-pre">Gaurav Batra <a class="moz-txt-link-rfc2396E" href="mailto:gbatra@linux.ibm.com">&lt;gbatra@linux.ibm.com&gt;</a> writes:

Hi Gaurav,

</pre>

      <blockquote type="cite">

        <pre wrap="" class="moz-quote-pre">Hello Ritesh/Dan,

Here is the motivation for my patch and thoughts on the issue.

Before my patch, there were 2 scenarios to consider where, even when the memory

was pre-mapped for DMA, coherent allocations were getting mapped from the 2GB

default DMA window. In the case of pre-mapped memory, the allocations should not be

directed towards the 2GB default DMA window.

1. AMD GPU which has a device DMA mask > 32 bits but less than 64 bits. In this

case the PHB is put into Limited Addressability mode.

    This scenario doesn't have vPMEM.

2. Device that supports 64-bit DMA mask. The LPAR has vPMEM assigned.

In both the above scenarios, the IOMMU has pre-mapped RAM from the DDW (64-bit PPC DMA

window).

Let's consider the code paths for both cases before my patch.

1. AMD GPU

dev->dma_ops_bypass = true

dev->bus_dma_limit = 0

- Here the AMD controller shows 3 functions on the PHB.

- After the first function is probed, it sees that the memory is pre-mapped

   and doesn't direct DMA allocations towards 2GB default window.

   So, dma_go_direct() worked as expected.

- The AMD GPU driver adds device memory to system pages. The stack is as below:

add_pages+0x118/0x130 (unreliable)

pagemap_range+0x404/0x5e0

memremap_pages+0x15c/0x3d0

devm_memremap_pages+0x38/0xa0

kgd2kfd_init_zone_device+0x110/0x210 [amdgpu]

amdgpu_device_ip_init+0x648/0x6d8 [amdgpu]

amdgpu_device_init+0xb10/0x10c0 [amdgpu]

amdgpu_driver_load_kms+0x2c/0xb0 [amdgpu]

amdgpu_pci_probe+0x2e4/0x790 [amdgpu]

- This changed max_pfn to some high value beyond max RAM.

- Subsequently, for each of the other functions on the PHB, the call to

   dma_go_direct() will return false, which will then direct DMA allocations towards

   the 2GB default DMA window even if the memory is pre-mapped.

    dev->dma_ops_bypass is true, but dma_direct_get_required_mask() resulted in a large

    value for the mask (due to the changed max_pfn), which is beyond the AMD GPU device DMA mask.

2. Device supports 64-bit DMA mask. The LPAR has vPMEM assigned

dev->dma_ops_bypass = false

dev->bus_dma_limit = has some value depending on size of RAM (e.g. 0x0800001000000000)

- Here the call to dma_go_direct() returns false since dev->dma_ops_bypass = false.

I crafted the solution to cover both cases. I tested today on an LPAR

with 7.0-rc4 and it works with AMDGPU.

With my patch, allocations will go towards direct only when dev->dma_ops_bypass = true,

which will be the case for "pre-mapped" RAM.

Ritesh mentioned that this is PowerNV. I need to revisit this patch and see why it is failing on PowerNV.

...