[PATCH v2 00/11] Remove device private pages from physical address space

Thu Jan 8 12:29:21 AEDT 2026

On 2026-01-08 at 08:02 +1100, Balbir Singh <balbirs at nvidia.com> wrote...
> On 1/8/26 06:54, Jason Gunthorpe wrote:
> > On Wed, Jan 07, 2026 at 12:06:08PM -0800, Andrew Morton wrote:
> > 
> >>>   2) Attempting to add the device private pages to the linear map at
> >>>      addresses beyond the actual physical memory causes issues on
> >>>      architectures like aarch64  - meaning the feature does not work there [0].
> >>
> >> Can you better help us understand the seriousness of these problems? 
> >> How much are our users really hurting from this?
> > 
> > We think it is pretty serious, in the future HW support sense, as it
> > means real systems being built do not work :)

There's actually existing HW that could benefit from this support - after all
there is nothing stopping someone plugging an Intel/AMD/NVIDIA GPU into an ARM
machine today :-)

So it would be nice if we could support this feature there. Without it,
performance when using the SVM (shared virtual memory) feature is really
sub-optimal compared with x86, because data has to be remote mapped (i.e.
accessed via the PCIe link) rather than migrated to local GPU video memory.

Having the kernel steal physical address space has also caused problems on
x86 - we have encountered virtualised environments which, depending on the
specific firmware/BIOS, don't have enough free physical address space to support
device private pages and hence migration of memory to the GPU device, again
leading to sub-optimal performance.

> > Also Willy and others were cheering this work on at LPC. I think the
> > possible followup to move DEVICE_PRIVATE from struct page and reduce
> > the memory allocation would be well celebrated.

For reference the recording of my LPC presentation covering both this series and
the above is here - https://www.youtube.com/watch?v=CFe_c8-tEuM

The hope is that, in addition to enabling this support more broadly across
other platforms/architectures, it will also enable further clean-ups to
reduce memory allocation overhead (I almost convinced myself we wouldn't need a
struct at all ... almost)

> > The Intel Xe and AMD GPU teams are the two drivers most important to
> > be testing this as they consume the feature.
> > 
> 
> And the ultravisor usage in powerpc as well (book3s_hv_uvmem).

As does Nouveau (which I've tested). But I agree AMD GPU and Intel Xe are the
most important drivers here. I would be surprised if anyone was actually using
the powerpc ultravisor, and I don't have access to a setup for this, so unless
some PPC folk can offer to help I wouldn't like to see testing there hold up
the series.

Especially as I believe most of the driver side changes are relatively
straightforward.

 - Alistair

> Balbir