[PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing

Peter Xu peterx at redhat.com
Fri Dec 1 08:30:58 AEDT 2023

On Fri, Nov 24, 2023 at 11:07:51AM -0500, Peter Xu wrote:
> On Fri, Nov 24, 2023 at 09:06:01AM +0000, Ryan Roberts wrote:
> > I don't have any micro-benchmarks for GUP though, if that's your question. Is
> > there an easy-to-use test I can run to get some numbers? I'd be happy to try it out.
> Thanks Ryan.  Then nothing is needed to be tested if gup is not yet touched
> from your side, afaict.  I'll see whether I can provide some rough numbers
> instead in the next post (I'll probably only be able to test it in a VM,
> though, but hopefully that should still reflect mostly the truth).

An update: I finished a round of 64K cont_pte test, in the slow gup micro
benchmark I see ~15% perf degrade with this patchset applied on a VM on top
of Apple M1.

Frankly that's even less than I expected, considering not only how slow gup
THP used to be, but also on the fact that that's a tight loop over slow
gup, which in normal cases shouldn't happen: "present" ptes normally goes
to fast-gup, while !present goes into a fault following it.  I assume
that's why nobody cared slow gup for THP before.  I think adding cont_pte
support shouldn't be very hard, but that will include making cont_pte idea
global just for arm64 and riscv Svnapot.

The current plan is I'll add that performance number into my commit message
only, as I don't ever expect any real workload will regress with it.  Maybe
a global cont_pte api will be needed at some point, but perhaps not yet
feel strongly for this use case.

Please feel free to raise any concerns otherwise.


Peter Xu

More information about the Linuxppc-dev mailing list