[RFC PATCH] powerpc: ideas to improve page table frag allocator

Nicholas Piggin npiggin at gmail.com
Wed Sep 7 14:24:52 AEST 2022


On Tue Sep 6, 2022 at 4:36 AM AEST, Christophe Leroy wrote:
>
>
> On 05/09/2022 at 10:50, Nicholas Piggin wrote:
> > The page table fragment allocator is a simple per-mm slab allocator.
> > It can be quite wasteful of space for small processes, and it is
> > expensive to initialise. It is also not very NUMA-aware.
> > 
> > This is a quick hack at addressing some of those problems, but it's not
> > complete. It doesn't support THP because it doesn't deal with the page
> > table deposit. There are certain cases where cross-CPU locking could
> > increase (though it is reduced in other cases, including on the ptl).
> > NUMA still has some corner-case issues, but it is improved. So it's
> > not mergeable yet, and not necessarily the best way to solve the
> > problems. Just a quick hack for some testing.
> > 
> > It saves 1-2MB on a simple distro boot on a small (4 CPU) system.
> > Performance of the powerpc fork selftests benchmark with --fork is
> > improved by 15% on a POWER9 (14.5k/s -> 17k/s). This is just about a
> > worst-case microbenchmark, but it would still be good to fix it.
> > 
> > What would really be nice is if we could avoid writing our own allocator
> > and use the slab allocator. The problem is that we need a page table
> > lock (a spinlock) associated with the page table, and it must be
> > possible to derive that lock from the page table pointer. I don't think
> > slab has anything that fits the bill.
>
> I have not looked at it in detail yet, but I have the feeling that the
> handling of single-fragment architectures has disappeared.

Yes, that's gone from my hack. It should be special-cased, of course,
to reduce or avoid unnecessary overhead.
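
To illustrate the constraint from the quoted patch description (the lock
has to be derivable from the page table pointer alone), here is a
minimal user-space sketch. It is not the code from the patch. Every name
in it (sketch_init, sketch_frag_to_meta, the SKETCH_* sizes, the arena
and meta arrays) is hypothetical, and a plain int stands in for the
spinlock. The idea it shows is that fragments carved from one page share
the metadata of that page, found by address arithmetic on the fragment
pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */
#define SKETCH_FRAG_SIZE 256u   /* assumed fragment size */
#define SKETCH_NR_PAGES  4u     /* assumed arena size, in pages */

/* Hypothetical per-page metadata shared by all fragments in the page. */
struct sketch_page_meta {
	int lock;           /* stand-in for the page table spinlock */
	unsigned int refs;  /* fragments of this page currently in use */
};

static unsigned char *arena;
static struct sketch_page_meta meta[SKETCH_NR_PAGES];

/* Allocate a page-aligned arena; returns 0 on success, -1 on failure. */
static int sketch_init(void)
{
	arena = aligned_alloc(SKETCH_PAGE_SIZE,
			      (size_t)SKETCH_PAGE_SIZE * SKETCH_NR_PAGES);
	return arena ? 0 : -1;
}

/* Derive the per-page metadata (and so the lock) from a fragment pointer. */
static struct sketch_page_meta *sketch_frag_to_meta(const void *frag)
{
	uintptr_t off = (uintptr_t)frag - (uintptr_t)arena;

	return &meta[off / SKETCH_PAGE_SIZE];
}
```

Any two fragments inside the same page resolve to the same lock, and a
fragment in the next page resolves to a different one. A generic slab
object has no such guaranteed mapping back to per-page state, which is
the gap described in the quoted text.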

Thanks,
Nick

>
> That's commit 2a146533bf96 ("powerpc/mm: Avoid useless lock with single 
> page fragments").
>
> Thanks to that optimisation, all platforms were converted to page 
> fragments with:
> - commit 32ea4c149990 ("powerpc/mm: Extend pte_fragment functionality to 
> PPC32")
> - commit 737b434d3d55 ("powerpc/mm: convert Book3E 64 to pte_fragment")
>
>
> But if the optimisation is removed then I guess the cost will likely be 
> higher than before.
>
> Christophe
