[RFC PATCH] powerpc: ideas to improve page table frag allocator

Christophe Leroy christophe.leroy at csgroup.eu
Tue Sep 6 04:36:51 AEST 2022



Le 05/09/2022 à 10:50, Nicholas Piggin a écrit :
> The page table fragment allocator is a simple per-mm slab allocator.
> It can be quite wasteful of space for small processes, as well as being
> expensive to initialise.  It does not do well at NUMA awareness.
> 
> This is a quick hack at addressing some of those problems, but it's not
> complete. It doesn't support THP because it doesn't deal with the page
> table deposit. There are certain cases where cross-CPU locking could
> increase (though other cases see a reduction, including on the ptl).
> NUMA still has some corner-case issues, but it is improved. So
> it's not mergeable yet or necessarily the best way to solve the
> problems. Just a quick hack for some testing.
> 
> It saves 1-2MB on a simple distro boot on a small (4 CPU) system. The
> powerpc fork selftests benchmark with --fork performance is improved by
> 15% on a POWER9 (14.5k/s -> 17k/s). This is close to a worst-case
> microbenchmark, but it would still be good to fix.
> 
> What would really be nice is if we could avoid writing our own allocator
> and use the slab allocator. The problem being we need a page table lock
> spinlock associated with the page table, and that must be able to be
> derived from the page table pointer, and I don't think slab has anything
> that fits the bill.
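The constraint described above — a page table lock that must be derivable from the page table pointer alone — is roughly what page-aligned backing buys. Below is a minimal user-space sketch of that idea, not the kernel's pte_frag code: all names, sizes, and the plain-int "lock" slot are illustrative assumptions. Each mm caches a partially consumed page, fragments are carved off it, and masking any fragment pointer down to the page boundary recovers the per-page metadata.

```c
#include <stdint.h>
#include <stdlib.h>

/* Illustrative sizes, not the kernel's. */
#define FRAG_ALIGN     8192   /* allocation/alignment granule */
#define PTE_FRAG_SIZE  256    /* assumed fragment size */
#define PTE_FRAG_NR    16     /* fragments carved from one page */

struct frag_page {
	unsigned char frags[PTE_FRAG_NR * PTE_FRAG_SIZE];
	int ptl;      /* stand-in slot for the page-table spinlock */
	int refcount; /* fragments handed out; page freed at zero */
};

struct mm_ctx {                /* stand-in for a per-mm cache */
	struct frag_page *cur; /* partially consumed page, if any */
	int next;              /* next free fragment index in cur */
};

static struct frag_page *frag_to_page(void *frag)
{
	/* Mask the fragment pointer down to its aligned page: this is
	 * the "lock derivable from the page table pointer" property
	 * that a generic slab cache does not provide. */
	return (struct frag_page *)((uintptr_t)frag &
				    ~(uintptr_t)(FRAG_ALIGN - 1));
}

static void *pte_frag_alloc(struct mm_ctx *mm)
{
	if (!mm->cur || mm->next == PTE_FRAG_NR) {
		_Static_assert(sizeof(struct frag_page) <= FRAG_ALIGN,
			       "metadata must fit in the granule");
		/* The alignment is what makes frag_to_page() work. */
		struct frag_page *fp = aligned_alloc(FRAG_ALIGN, FRAG_ALIGN);
		if (!fp)
			return NULL;
		fp->refcount = 0;
		mm->cur = fp;
		mm->next = 0;
	}
	mm->cur->refcount++;
	return mm->cur->frags + (size_t)PTE_FRAG_SIZE * mm->next++;
}

static void pte_frag_free(void *frag)
{
	struct frag_page *fp = frag_to_page(frag);
	/* Sketch only: assumes the mm's cached pointer does not
	 * outlive its last outstanding fragment. */
	if (--fp->refcount == 0)
		free(fp);
}
```

In the kernel the per-page metadata lives in struct page rather than inside the allocation, but the recovery step is the same shape: alignment lets the lock be found from the table pointer, which is the property a generic slab allocator does not guarantee.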

I have not looked at it in detail yet, but I have the feeling that the 
handling of single-fragment architectures has disappeared.

That's commit 2a146533bf96 ("powerpc/mm: Avoid useless lock with single 
page fragments").

Thanks to that optimisation, all platforms were converted to page 
fragments with:
- commit 32ea4c149990 ("powerpc/mm: Extend pte_fragment functionality to 
PPC32")
- commit 737b434d3d55 ("powerpc/mm: convert Book3E 64 to pte_fragment")


But if that optimisation is removed, I guess the cost will likely be 
higher than before.

Christophe