[PATCH v5 2/3] dma: replace zone_dma_bits by zone_dma_limit
Catalin Marinas
catalin.marinas at arm.com
Thu Aug 8 23:46:54 AEST 2024
On Thu, Aug 08, 2024 at 11:35:01AM +0200, Petr Tesařík wrote:
> On Wed, 7 Aug 2024 19:14:58 +0100
> Catalin Marinas <catalin.marinas at arm.com> wrote:
> > With ZONE_DMA32, since all the DMA code assumes that ZONE_DMA32 ends at
> > 4GB CPU address, it doesn't really work for such platforms. If there are
> > 32-bit devices with a corresponding CPU address offset, ZONE_DMA32
> > should end at 36GB on Baruch's platform. But to simplify things, we just
> > ignore this on arm64 and make ZONE_DMA32 empty.
>
> Ah. That makes sense. It also seems to support my theory that Linux
> memory zones are an obsolete concept and should be replaced by a
> different mechanism.
I agree, they are too coarse-grained. From an API perspective, what we
need is an alloc_pages() that takes a DMA mask or phys address limit,
maybe something similar to memblock_alloc_range_nid(). OTOH, an
advantage of the zones is that by default you keep the lower memory free
by using ZONE_NORMAL as default, you have free lists per zone. Maybe
with some alternative data structures we could efficiently search free
pages based on phys ranges or bitmasks and get rid of the zones but I
haven't put any thoughts into it.
We'd still need some boundaries like *_dma_get_max_cpu_address() to at
least allocate an swiotlb buffer that's suitable for all devices.
> > In some cases where we have the device structure we could instead do a
> > dma_to_phys(DMA_BIT_MASK(32)) but not in the two cases above. I guess if
> > we really want to address this properly, we'd need to introduce a
> > zone_dma32_limit that's initialised by the arch code. For arm64, I'm
> > happy with just having an empty ZONE_DMA32 on such platforms.
>
> The obvious caveat is that zone boundaries are system-wide, but the
> mapping between bus addresses and CPU addresses depends on the device
> structure. After all, that's why dma_to_phys takes the device as a
> parameter... In fact, a system may have multiple busses behind
> different bridges with a different offset applied by each.
Indeed, and as Robin mentioned, the ACPI/DT code already handle this.
> FYI I want to make more people aware of these issues at this year's
> Plumbers, see https://lpc.events/event/18/contributions/1776/
Looking forward to this. I'll dial in, unfortunately can't make Plumbers
in person this year.
In the meantime, I think this series is a good compromise ;).
--
Catalin
More information about the Linuxppc-dev
mailing list