[PATCH v4 00/16] dma-mapping: migrate to physical address-based API
Jason Gunthorpe
jgg at nvidia.com
Sat Sep 6 03:43:24 AEST 2025
On Fri, Sep 05, 2025 at 06:20:51PM +0200, Marek Szyprowski wrote:
> I've checked the most advertised use case in
> https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/log/?h=dmabuf-vfio
> and I still don't see the reason why it cannot be based
> on the dma_map_resource() API. I'm aware of the little asymmetry of the
> client calls in such a case, indeed it is not pretty, but this should work
> even now:
>
> phys = phys_vec[i].paddr;
>
> if (is_mmio)
>         dma_map_resource(phys, len, ...);
> else
>         dma_map_page(phys_to_page(phys), offset_in_page(phys), ...);
>
> What did I miss?
I have a somewhat different answer than Leon..
The link path would need a resource variation too:
+		ret = dma_iova_link(attachment->dev, state,
+				    phys_vec[i].paddr, 0,
+				    phys_vec[i].len, dir, attrs);
+		if (ret)
+			goto err_unmap_dma;
+
+		mapped_len += phys_vec[i].len;
It is an existing bug that we don't properly handle all details of
MMIO for link.
Since this is already a phys_addr_t, I would argue that should be done
by adding ATTR_MMIO to dma_iova_link().
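With that, the caller side above collapses to passing the attribute
through. A minimal sketch, assuming the DMA_ATTR_MMIO flag this series
introduces and reusing the names from the hunk quoted above:

	ret = dma_iova_link(attachment->dev, state,
			    phys_vec[i].paddr, 0, phys_vec[i].len, dir,
			    is_mmio ? attrs | DMA_ATTR_MMIO : attrs);
	if (ret)
		goto err_unmap_dma;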
If you did that, then you'd still want a dma_(un)map_phys() helper
that handled ATTR_MMIO too. It could be an inline "if () resource else
page" wrapper like you say.
So API wise I think we have the right design here.
I think the question you are really asking is how much change to the
internals of the DMA API you want to make for ATTR_MMIO. It is not
zero, but there is some minimum that is less than what this series does.
So, reason #1: much of this ATTR_MMIO work is needed anyhow. Being
consistent and unifying the dma_map_resource() path with ATTR_MMIO
should improve the long-term maintainability of the code. We have
already uncovered paths where map_resource does not behave consistently
with map_page, and it is unclear whether these are bugs or deliberate.
Reason #2: we actually do want to get rid of struct page usage to help
advance Matthew's work. This means we want to build a clean,
struct-page-less path for IO. That path may do phys-to-virt, or kmap a
phys, but none of: phys-to-page, page-to-virt, page-to-phys. Stopping at
a phys-based public API and then leaving all the phys-to-page/etc.
conversions hidden inside is not enough.
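To make the rule concrete, here is a sketch of what is and is not
allowed at a call site (the helper choice is mine, e.g. kmap_local_pfn()
standing in for the "kmap phys" option):

	void *va = phys_to_virt(phys);			/* OK: phys -> KVA (lowmem) */
	void *kva = kmap_local_pfn(PHYS_PFN(phys));	/* OK: kmap by phys/pfn */
	/* ... kunmap_local(kva) when done ... */

	struct page *pg = phys_to_page(phys);		/* not OK: phys -> page */
	void *va2 = page_address(pg);			/* not OK: page -> virt */
	phys_addr_t p2 = page_to_phys(pg);		/* not OK: page -> phys */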
This is why I was looking at the dma_ops path, to see just how much
page usage there is, and I found very little. So this dream is
achievable and with this series we are there for ARM64 and x86
environments.
> This patchset focuses only on the dma_map_page -> dma_map_phys rework.
> There are also other interfaces, like dma_alloc_pages(), and so far
> nothing has been proposed for them.
That's because they already have non-page alternatives.
Almost all places call dma_alloc_noncoherent():
static inline void *dma_alloc_noncoherent(struct device *dev, size_t size,
		dma_addr_t *dma_handle, enum dma_data_direction dir, gfp_t gfp)
{
	struct page *page = dma_alloc_pages(dev, size, dma_handle, dir, gfp);

	return page ? page_address(page) : NULL;
}
Which is KVA based.
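So a typical caller never touches a struct page at all, e.g.:

	void *cpu_addr = dma_alloc_noncoherent(dev, size, &dma_handle,
					       DMA_BIDIRECTIONAL, GFP_KERNEL);
	if (!cpu_addr)
		return -ENOMEM;
	/* ... use cpu_addr, bracketing access with dma_sync_single_*() ... */
	dma_free_noncoherent(dev, size, cpu_addr, dma_handle,
			     DMA_BIDIRECTIONAL);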
There is only one user I found of dma_alloc_pages() directly:
drivers/firewire/ohci.c: ctx->pages[i] = dma_alloc_pages(dev, PAGE_SIZE, &dma_addr,
And it deliberately uses page->private:
	set_page_private(ctx->pages[i], dma_addr);
So it is correct to use the struct page API.
Some usages of dma_alloc_noncontiguous() can be implemented using the
dma_iova_link() flow, as drivers/vfio/pci/mlx5/cmd.c shows, by using
alloc_pages_bulk() for the allocator. We don't yet have a 'dma alloc
link' operation though, and there are only 4 users of
dma_alloc_noncontiguous()..
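Loosely modeled on the mlx5 code, the shape of that flow is below. This
is only a sketch under my own naming (alloc_and_link() is made up), and
cleanup of a partially successful bulk allocation is elided:

static int alloc_and_link(struct device *dev, struct dma_iova_state *state,
			  struct page **pages, unsigned int npages,
			  enum dma_data_direction dir)
{
	size_t mapped = 0;
	unsigned int i;
	int ret;

	/* freeing a partial bulk allocation is elided for brevity */
	if (alloc_pages_bulk(GFP_KERNEL, npages, pages) != npages)
		return -ENOMEM;

	if (!dma_iova_try_alloc(dev, state, 0, (size_t)npages * PAGE_SIZE))
		return -EOPNOTSUPP; /* caller falls back to scatterlist */

	for (i = 0; i < npages; i++) {
		ret = dma_iova_link(dev, state, page_to_phys(pages[i]),
				    mapped, PAGE_SIZE, dir, 0);
		if (ret)
			goto err_destroy;
		mapped += PAGE_SIZE;
	}

	ret = dma_iova_sync(dev, state, 0, mapped);
	if (ret)
		goto err_destroy;
	return 0;

err_destroy:
	dma_iova_destroy(dev, state, mapped, dir, 0);
	return ret;
}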
Jason