[PATCH 00/10] powerpc/mm/slice: improve slice speed and stack use

Nicholas Piggin npiggin at gmail.com
Wed Mar 7 00:24:57 AEDT 2018


Since the last posting, this series has been ported on top of
Christophe's 8xx slice implementation, which is merged in powerpc
next. It also takes into account feedback and bug reports from Aneesh
and Christophe -- thanks.

There are a few significant changes. The first is a refactoring of
slice_set_user_psize, which makes it more obvious how the slice state
is initialized and, I think, makes it easier to reason about using
dynamic high slice size limits.

Second is a significant change to how the slice masks are kept. No
longer are they bolted on the side and hit with a big recalculation
call that redoes everything whenever something changes. Now they are
just maintained as part of slice conversion.
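
To illustrate the idea of maintaining the masks as part of conversion,
here is a toy sketch. It is not the code from this series: the toy_*
names, the 16-slice layout and the 3 page sizes are assumptions made up
for the example. Each conversion moves only the affected bits between
the cached per-page-size masks, instead of rebuilding every mask from
the per-slice page size array.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model, not the kernel's data structures. */
#define TOY_NR_SLICES 16
#define TOY_NR_PSIZES 3

struct toy_ctx {
	uint8_t psize[TOY_NR_SLICES];	/* page size index of each slice */
	uint16_t mask[TOY_NR_PSIZES];	/* cached: bit i set if slice i uses that size */
};

/* Start with every slice using the same page size. */
static void toy_init(struct toy_ctx *ctx, unsigned int psize)
{
	unsigned int i;

	assert(psize < TOY_NR_PSIZES);
	for (i = 0; i < TOY_NR_SLICES; i++)
		ctx->psize[i] = (uint8_t)psize;
	for (i = 0; i < TOY_NR_PSIZES; i++)
		ctx->mask[i] = 0;
	ctx->mask[psize] = 0xffff;
}

/*
 * Convert the slices selected by 'which' to new_psize. The cached masks
 * are updated bit by bit here, so no separate recalculation pass is needed.
 */
static void toy_convert(struct toy_ctx *ctx, uint16_t which,
			unsigned int new_psize)
{
	unsigned int i;

	assert(new_psize < TOY_NR_PSIZES);
	for (i = 0; i < TOY_NR_SLICES; i++) {
		uint16_t bit = (uint16_t)(1u << i);

		if (!(which & bit))
			continue;
		ctx->mask[ctx->psize[i]] &= (uint16_t)~bit;	/* leave old mask */
		ctx->mask[new_psize] |= bit;			/* join new mask */
		ctx->psize[i] = (uint8_t)new_psize;
	}
}
```

In this toy, a lookup of "which slices use page size N" is a single read
of mask[N], and the cost of keeping it current is paid only for the
slices that actually change.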

This now passes vm selftests including the 128TB boundary case tests.
I also added a process microbenchmark and redid benchmarks and stack
measurements.

Overall on POWER8, this series increases the vfork+exec+exit
microbenchmark rate by 15.6%, and the mmap+munmap rate by 81%. Slice
code/data size is reduced by 1kB, and max stack overhead through the
slice_get_unmapped_area call goes from 992 to 448 bytes. The cost is
288 bytes added to the mm_context_t per mm for the slice masks on
Book3S.
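
For readers who want a feel for what an mmap+munmap rate measures, a
minimal userspace sketch follows. It is not the benchmark used for the
numbers above; the function name, mapping length and iteration count
are assumptions chosen for illustration. It times a loop of anonymous
private mappings that are immediately unmapped.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>

/*
 * Hypothetical helper: returns mmap+munmap pairs per second,
 * or -1.0 if a call fails or the elapsed time cannot be measured.
 */
static double toy_mmap_munmap_rate(unsigned long iterations, size_t length)
{
	struct timespec start, end;
	unsigned long i;
	double secs;

	if (clock_gettime(CLOCK_MONOTONIC, &start))
		return -1.0;

	for (i = 0; i < iterations; i++) {
		/* Each iteration asks the kernel for a fresh unmapped area. */
		void *p = mmap(NULL, length, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			return -1.0;
		if (munmap(p, length))
			return -1.0;
	}

	if (clock_gettime(CLOCK_MONOTONIC, &end))
		return -1.0;

	secs = (double)(end.tv_sec - start.tv_sec) +
	       (double)(end.tv_nsec - start.tv_nsec) / 1e9;
	if (secs <= 0.0)
		return -1.0;

	return (double)iterations / secs;
}
```

Calling toy_mmap_munmap_rate(1000000, 65536) before and after a kernel
change gives a rough comparison, though results will vary with the
machine, the page size and the MMU mode in use.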

Thanks,
Nick

Nicholas Piggin (10):
  selftests/powerpc: add process creation benchmark
  powerpc/mm/slice: Simplify and optimise slice context initialisation
  powerpc/mm/slice: tidy lpsizes and hpsizes update loops
  powerpc/mm/slice: pass pointers to struct slice_mask where possible
  powerpc/mm/slice: implement a slice mask cache
  powerpc/mm/slice: implement slice_check_range_fits
  powerpc/mm/slice: Switch to 3-operand slice bitops helpers
  powerpc/mm/slice: Use const pointers to cached slice masks where
    possible
  powerpc/mm/slice: use the dynamic high slice size to limit bitmap
    operations
  powerpc/mm/slice: remove radix calls to the slice code

 arch/powerpc/include/asm/book3s/64/mmu.h           |  18 +
 arch/powerpc/include/asm/hugetlb.h                 |   9 +-
 arch/powerpc/include/asm/mmu-8xx.h                 |  14 +
 arch/powerpc/include/asm/slice.h                   |   8 +-
 arch/powerpc/mm/hugetlbpage.c                      |   5 +-
 arch/powerpc/mm/mmu_context_book3s64.c             |   9 +-
 arch/powerpc/mm/mmu_context_nohash.c               |   5 +-
 arch/powerpc/mm/slice.c                            | 458 +++++++++++----------
 .../selftests/powerpc/benchmarks/.gitignore        |   2 +
 .../testing/selftests/powerpc/benchmarks/Makefile  |   8 +-
 .../selftests/powerpc/benchmarks/exec_target.c     |   5 +
 tools/testing/selftests/powerpc/benchmarks/fork.c  | 339 +++++++++++++++
 12 files changed, 632 insertions(+), 248 deletions(-)
 create mode 100644 tools/testing/selftests/powerpc/benchmarks/exec_target.c
 create mode 100644 tools/testing/selftests/powerpc/benchmarks/fork.c

-- 
2.16.1


