[PATCH] powerpc/powernv/iommu: iommu incorrectly bypass DMA APIs
Gaurav Batra
gbatra at linux.ibm.com
Tue Apr 7 08:49:39 AEST 2026
On 4/3/26 10:53 PM, Shivaprasad G Bhat wrote:
> On 4/1/26 6:35 AM, Ritesh Harjani (IBM) wrote:
>> ++CC few people who would be interested in this fix.
>>
>> Gaurav Batra <gbatra at linux.ibm.com> writes:
>>
>>> In a PowerNV environment, for devices that support a DMA mask smaller
>>> than 64 bits but larger than 32 bits, the iommu incorrectly bypasses
>>> the DMA APIs while allocating and mapping buffers for DMA operations.
>>>
>>> Devices are failing with ENOMEM during probe with the following
>>> messages:
>>>
>>> amdgpu 0000:01:00.0: [drm] Detected VRAM RAM=4096M, BAR=4096M
>>> amdgpu 0000:01:00.0: [drm] RAM width 128bits GDDR5
>>> amdgpu 0000:01:00.0: iommu: 64-bit OK but direct DMA is limited by 0
>>> amdgpu 0000:01:00.0: dma_iommu_get_required_mask: returning bypass mask 0xfffffffffffffff
>>> amdgpu 0000:01:00.0: 4096M of VRAM memory ready
>>> amdgpu 0000:01:00.0: 32570M of GTT memory ready.
>>> amdgpu 0000:01:00.0: (-12) failed to allocate kernel bo
>>> amdgpu 0000:01:00.0: [drm] Debug VRAM access will use slowpath MM access
>>> amdgpu 0000:01:00.0: [drm] GART: num cpu pages 4096, num gpu pages 65536
>>> amdgpu 0000:01:00.0: [drm] PCIE GART of 256M enabled (table at 0x000000F4FFF80000).
>>> amdgpu 0000:01:00.0: (-12) failed to allocate kernel bo
>>> amdgpu 0000:01:00.0: (-12) create WB bo failed
>>> amdgpu 0000:01:00.0: amdgpu_device_wb_init failed -12
>>> amdgpu 0000:01:00.0: amdgpu_device_ip_init failed
>>> amdgpu 0000:01:00.0: Fatal error during GPU init
>>> amdgpu 0000:01:00.0: finishing device.
>>> amdgpu 0000:01:00.0: probe with driver amdgpu failed with error -12
>>> amdgpu 0000:01:00.0: ttm finalized
>>>
>>> Fixes: 1471c517cf7d ("powerpc/iommu: bypass DMA APIs for coherent allocations for pre-mapped memory")
>>> Reported-by: Dan Horák <dan at danny.cz>
>>> Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5039
>> We could even add the lore link so that in future people can find the
>> discussion.
>> Closes: https://lore.kernel.org/linuxppc-dev/20260313142351.609bc4c3efe1184f64ca5f44@danny.cz/
>>
>>> Tested-by: Dan Horak <dan at danny.cz>
>>> Signed-off-by: Ritesh Harjani <ritesh.list at gmail.com>
>> Feel free to change this to the following, I mainly only suggested
>> the fix :)
>>
>> Suggested-by: Ritesh Harjani (IBM) <ritesh.list at gmail.com>
>>
>>
>>> Signed-off-by: Gaurav Batra <gbatra at linux.ibm.com>
>>> ---
>> Generally this info...
>>> I am working on testing the patch in an LPAR with AMDGPU. I will
>>> update the results soon.
>> ... goes below the three dashes in here ^^^
>> This is the same place where people also update the patch change log.
>>
>> But sure, thanks for updating. As for this patch, it looks good to me.
>> So, feel free to add:
>>
>> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list at gmail.com>
>>
>>
>>> arch/powerpc/kernel/dma-iommu.c | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c
>>> index 73e10bd4d56d..8b4de508d2eb 100644
>>> --- a/arch/powerpc/kernel/dma-iommu.c
>>> +++ b/arch/powerpc/kernel/dma-iommu.c
>>> @@ -67,7 +67,7 @@ bool arch_dma_unmap_sg_direct(struct device *dev, struct scatterlist *sg,
>>> }
>>> bool arch_dma_alloc_direct(struct device *dev)
>>> {
>>> - if (dev->dma_ops_bypass)
>>> + if (dev->dma_ops_bypass && dev->bus_dma_limit)
>>> return true;
>>> return false;
>>> @@ -75,7 +75,7 @@ bool arch_dma_alloc_direct(struct device *dev)
>>> bool arch_dma_free_direct(struct device *dev, dma_addr_t dma_handle)
>>> {
>>> - if (!dev->dma_ops_bypass)
>>> + if (!dev->dma_ops_bypass || !dev->bus_dma_limit)
>>> return false;
>
>
> While this works, looking further into why 1471c517cf7dae needed the
> arch_dma_[alloc|free]_direct() functions: dev->bus_dma_limit was not
> updated when new memory pages were added in the add_pages() case, so
> the can_map_direct() checks failed later when they were needed. It
> looks like these functions were modeled after
> arch_dma_[map|unmap]_phys_direct(), which were introduced to handle a
> similar need.
>
> Do we get an iommu_mem_notifier() call with action MEM_GOING_ONLINE
> when device memory is made online? If yes, we could just update
> bus_dma_limit for each device in the iommu group to include the new
> max_pfn. That sounds wrong, though: the pages, although configured as
> system memory, are still device private, and the memory notifiers are
> meant for real memory hotplug/unplug cases.
>
> In commit 7170130e4c below, Balbir points this out as a problem and
> prevents max_pfn from being updated for device private memory. He
> mentions that it should be considered for PPC too.
> dma_addressing_limited() on x86 is the same as can_map_direct() on PPC,
> so we can see why arch_dma_[alloc|free]_direct() would not be needed
> anymore.
When pmemory is converted to "system-ram" it moves "max_pfn" as well.
The command is "daxctl reconfigure-device --mode=system-ram dax0.0
--force". This needs to be considered as well.
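
To illustrate why a moving max_pfn matters, here is a rough user-space
sketch modeled loosely on dma_direct_get_required_mask(). The helper
names and the 64K page size (MODEL_PAGE_SHIFT) are assumptions made for
illustration, it assumes DMA addresses equal physical addresses, and it
relies on the GCC/Clang builtin __builtin_clzll(); it is not the kernel
code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64K pages, a common powerpc64 configuration. */
#define MODEL_PAGE_SHIFT 16

/* 1-based index of the highest set bit, 0 for x == 0 (hypothetical helper). */
static unsigned int model_fls64(uint64_t x)
{
	return x ? 64u - (unsigned int)__builtin_clzll(x) : 0u;
}

/*
 * Smallest all-ones mask that covers the last byte below max_pfn.
 * Hypothetical model: DMA address == physical address, no offset.
 */
static uint64_t model_required_mask(uint64_t max_pfn)
{
	uint64_t last_byte;

	assert(max_pfn > 0 && max_pfn <= (UINT64_MAX >> MODEL_PAGE_SHIFT));
	last_byte = (max_pfn << MODEL_PAGE_SHIFT) - 1;

	/* Round up to the next power of two, minus one. */
	return ((1ULL << (model_fls64(last_byte) - 1)) * 2) - 1;
}
```

Under these assumptions, 1 TiB of memory (max_pfn = 1 << 24) needs a
40-bit mask, and a single extra page above that raises the required mask
to 41 bits, which a device limited to a 40-bit mask can no longer cover.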
Thanks,
Gaurav
>
>
> commit 7170130e4c72ce0caa0cb42a1627c635cc262821
> Author: Balbir Singh <balbirs at nvidia.com>
> Date: Tue Apr 1 11:07:52 2025 +1100
>
> x86/mm/init: Handle the special case of device private pages in
> add_pages(), to not increase max_pfn and trigger
> dma_addressing_limited() bounce buffers
>
> So, I think we need to bring the changes from 7170130e4c72ce0c into PPC.
>
>
> With that, the original hunk setting the bypass unconditionally in
> 1471c517cf7dae can be changed as below, and the redundant
> arch_dma_[alloc|free]_direct() functions can be removed.
>
>
> diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c
> index aa3689d61917..120800b962cf 100644
> --- a/arch/powerpc/kernel/dma-iommu.c
> +++ b/arch/powerpc/kernel/dma-iommu.c
> @@ -150,7 +150,7 @@ int dma_iommu_dma_supported(struct device *dev, u64 mask)
> * 1:1 mapping but it is somehow limited.
> * ibm,pmemory is one example.
> */
> - dev->dma_ops_bypass = dev->bus_dma_limit == 0;
> + dev->dma_ops_bypass = (dev->bus_dma_limit == 0) || (min_not_zero(mask, dev->bus_dma_limit) >= dma_direct_get_required_mask(dev));
> if (!dev->dma_ops_bypass)
> dev_warn(dev,
> "iommu: 64-bit OK but direct DMA is limited by %llx\n",
>
> Suggested-by: Shivaprasad G Bhat <sbhat at linux.ibm.com>
>
>
> Thanks,
>
> Shivaprasad
>
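
For reference, the bypass condition proposed in the quoted diff can be
modeled in user space as below. This is only a sketch: model_bypass()
and model_min_not_zero() are hypothetical stand-ins for the kernel
expression and the min_not_zero() macro, and required_mask is passed in
as a plain value instead of being read from
dma_direct_get_required_mask(dev).

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimum of a and b, ignoring whichever one is zero (hypothetical helper). */
static uint64_t model_min_not_zero(uint64_t a, uint64_t b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/*
 * Bypass when there is no bus limit, or when the tighter of the device
 * mask and the bus limit still covers all of memory.
 */
static bool model_bypass(uint64_t mask, uint64_t bus_dma_limit,
			 uint64_t required_mask)
{
	return bus_dma_limit == 0 ||
	       model_min_not_zero(mask, bus_dma_limit) >= required_mask;
}
```

With a 64-bit device mask and a 40-bit bus limit, this model allows
bypass while the required mask is 40 bits and refuses it once the
required mask grows to 41 bits.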