[PATCH 0/8] dma-mapping: migrate to physical address-based API

Jason Gunthorpe jgg at nvidia.com
Wed Jul 30 00:03:55 AEST 2025


On Fri, Jul 25, 2025 at 09:05:46PM +0100, Robin Murphy wrote:

> But given what we do already know of from decades of experience, obvious
> question: For the tiny minority of users who know full well when they're
> dealing with a non-page-backed physical address, what's wrong with using
> dma_map_resource?

I was also pushing for this, that we would have two seperate paths:

- the phys_addr was guarenteed to have a KVA (and today also struct page)
- the phys_addr is non-cachable and no KVA may exist

This is basically already the distinction today between map resource
and map page.

The caller would have to look at what it is trying to map, do the P2P
evaluation and then call the cachable phys or resource path(s).

Leon, I think you should revive the work you had along these lines. It
would address my concerns with the dma_ops changes too. I continue to
think we should not push non-cachable, non-KVA MMIO down the map_page
ops, those should use the map_resource op.

> Does it make sense to try to consolidate our p2p infrastructure so
> dma_map_resource() could return bus addresses where appropriate?

For some users but not entirely :( The sg path for P2P relies on
storing information inside the scatterlist so unmap knows what to do.

Changing map_resource to return a similar flag and then having drivers
somehow store that flag and give it back to unmap is not a trivial
change. It would be a good API for simple drivers, and I think we
could build such a helper calling through the new flow. But places
like DMABUF that have more complex lists will not like it.

For them we've been following the approach of BIO where the
driver/subystem will maintain a mapping list and be aware of when the
P2P information is changing. Then it has to do different map/unmap
sequences based on its own existing tracking.

I view this as all very low level infrastructure, I'm really hoping we
can get an agreement with Chritain and build a scatterlist replacement
for DMABUF that encapsulates all this away from drivers like BIO does
for block.

But we can't start that until we have a DMA API working fully for
non-struct page P2P memory. That is being driven by this series and
the VFIO DMABUF implementation on top of it.

> Are we trying to remove struct page from the kernel altogether? 

Yes, it is a very long term project being pushed along with the
folios, memdesc conversion and so forth. It is huge, with many
aspects, but we can start to reasonably work on parts of them
independently.

A mid-term dream is to be able to go from pin_user_pages() -> DMA
without drivers needing to touch struct page at all. 

This is a huge project on its own, and we are progressing it slowly
"bottom up" by allowing phys_addr_t in the DMA API then we can build
more infrastructure for subsystems to be struct-page free, culminating
in some pin_user_phyr() and phys_addr_t bio_vec someday.

Certainly a big part of this series is influenced by requirements to
advance pin_user_pages() -> DMA, while the other part is about
allowing P2P to work using phys_addr_t without struct page.

Jason


More information about the Linuxppc-dev mailing list