[PATCH v2 0/7] Nesting support for lazy MMU mode

Kevin Brodsky kevin.brodsky at arm.com
Mon Sep 8 17:39:24 AEST 2025


When the lazy MMU mode was introduced eons ago, it wasn't made clear
whether such a sequence was legal:

	arch_enter_lazy_mmu_mode()
	...
		arch_enter_lazy_mmu_mode()
		...
		arch_leave_lazy_mmu_mode()
	...
	arch_leave_lazy_mmu_mode()

It seems fair to say that nested calls to
arch_{enter,leave}_lazy_mmu_mode() were not expected, and most
architectures never explicitly supported it.

Ryan Roberts' series from March [1] attempted to prevent nesting from
ever occurring, and mostly succeeded. Unfortunately, a corner case
(DEBUG_PAGEALLOC) may still cause nesting to occur on arm64. Ryan
proposed [2] to address that corner case at the generic level but this
approach received pushback; [3] then attempted to solve the issue on
arm64 only, but it was deemed too fragile.

It feels generally fragile to rely on lazy_mmu sections not to nest,
because callers of various standard mm functions do not know if the
function uses lazy_mmu itself. This series therefore performs a U-turn
and adds support for nested lazy_mmu sections, on all architectures.

The main change enabling nesting is patch 2, following the approach
suggested by Catalin Marinas [4]: have enter() return some state and
the matching leave() take that state. In this series, the state is only
used to handle nesting, but it could be used for other purposes such as
restoring context modified by enter(); the proposed kpkeys framework
would be an immediate user [5].

Patch overview:

* Patch 1: general cleanup - not directly related, but avoids any doubt
  regarding the expected behaviour of arch_flush_lazy_mmu_mode() outside
  x86

* Patch 2: main API change, no functional change

* Patch 3-6: nesting support for all architectures that support lazy_mmu

* Patch 7: clarification that nesting is supported in the documentation

Patch 4-6 are technically not required at this stage since nesting is
only observed on arm64, but they ensure future correctness in case
nesting is (re)introduced in generic paths. For instance, it could be
beneficial in some configurations to enter lazy_mmu set_ptes() once
again.

This series has been tested by running the mm kselfetsts on arm64 with
DEBUG_PAGEALLOC and KFENCE. It was also build-tested on other
architectures (with and without XEN_PV on x86).

- Kevin

[1] https://lore.kernel.org/all/20250303141542.3371656-1-ryan.roberts@arm.com/
[2] https://lore.kernel.org/all/20250530140446.2387131-1-ryan.roberts@arm.com/
[3] https://lore.kernel.org/all/20250606135654.178300-1-ryan.roberts@arm.com/
[4] https://lore.kernel.org/all/aEhKSq0zVaUJkomX@arm.com/
[5] https://lore.kernel.org/linux-hardening/20250815085512.2182322-19-kevin.brodsky@arm.com/
---
Changelog

v1..v2:
- Rebased on mm-unstable.
- Patch 2: handled new calls to enter()/leave(), clarified how the "flush"
  pattern (leave() followed by enter()) is handled.
- Patch 5,6: removed unnecessary local variable [Alexander Gordeev's
  suggestion].
- Added Mike Rapoport's Acked-by.

v1: https://lore.kernel.org/all/20250904125736.3918646-1-kevin.brodsky@arm.com/
---
Cc: Alexander Gordeev <agordeev at linux.ibm.com>
Cc: Andreas Larsson <andreas at gaisler.com>
Cc: Andrew Morton <akpm at linux-foundation.org>
Cc: Boris Ostrovsky <boris.ostrovsky at oracle.com>
Cc: Borislav Petkov <bp at alien8.de>
Cc: Catalin Marinas <catalin.marinas at arm.com>
Cc: Christophe Leroy <christophe.leroy at csgroup.eu>
Cc: Dave Hansen <dave.hansen at linux.intel.com>
Cc: David Hildenbrand <david at redhat.com>
Cc: "David S. Miller" <davem at davemloft.net>
Cc: "H. Peter Anvin" <hpa at zytor.com>
Cc: Ingo Molnar <mingo at redhat.com>
Cc: Jann Horn <jannh at google.com>
Cc: Juergen Gross <jgross at suse.com>
Cc: "Liam R. Howlett" <Liam.Howlett at oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes at oracle.com>
Cc: Madhavan Srinivasan <maddy at linux.ibm.com>
Cc: Michael Ellerman <mpe at ellerman.id.au>
Cc: Michal Hocko <mhocko at suse.com>
Cc: Mike Rapoport <rppt at kernel.org>
Cc: Nicholas Piggin <npiggin at gmail.com>
Cc: Peter Zijlstra <peterz at infradead.org>
Cc: Ryan Roberts <ryan.roberts at arm.com>
Cc: Suren Baghdasaryan <surenb at google.com>
Cc: Thomas Gleixner <tglx at linutronix.de>
Cc: Vlastimil Babka <vbabka at suse.cz>
Cc: Will Deacon <will at kernel.org>
Cc: Yeoreum Yun <yeoreum.yun at arm.com>
Cc: linux-arm-kernel at lists.infradead.org
Cc: linux-kernel at vger.kernel.org
Cc: linuxppc-dev at lists.ozlabs.org
Cc: sparclinux at vger.kernel.org
Cc: xen-devel at lists.xenproject.org
---
Kevin Brodsky (7):
  mm: remove arch_flush_lazy_mmu_mode()
  mm: introduce local state for lazy_mmu sections
  arm64: mm: fully support nested lazy_mmu sections
  x86/xen: support nested lazy_mmu sections (again)
  powerpc/mm: support nested lazy_mmu sections
  sparc/mm: support nested lazy_mmu sections
  mm: update lazy_mmu documentation

 arch/arm64/include/asm/pgtable.h              | 34 ++++++-------------
 .../include/asm/book3s/64/tlbflush-hash.h     | 22 ++++++++----
 arch/powerpc/mm/book3s64/hash_tlb.c           | 10 +++---
 arch/powerpc/mm/book3s64/subpage_prot.c       |  5 +--
 arch/sparc/include/asm/tlbflush_64.h          |  6 ++--
 arch/sparc/mm/tlb.c                           | 17 +++++++---
 arch/x86/include/asm/paravirt.h               |  8 ++---
 arch/x86/include/asm/paravirt_types.h         |  6 ++--
 arch/x86/include/asm/pgtable.h                |  3 +-
 arch/x86/xen/enlighten_pv.c                   |  2 +-
 arch/x86/xen/mmu_pv.c                         | 13 ++++---
 fs/proc/task_mmu.c                            |  5 +--
 include/linux/mm_types.h                      |  3 ++
 include/linux/pgtable.h                       | 21 +++++++++---
 mm/kasan/shadow.c                             |  4 +--
 mm/madvise.c                                  | 20 ++++++-----
 mm/memory.c                                   | 20 ++++++-----
 mm/migrate_device.c                           |  5 +--
 mm/mprotect.c                                 |  5 +--
 mm/mremap.c                                   |  5 +--
 mm/userfaultfd.c                              |  5 +--
 mm/vmalloc.c                                  | 15 ++++----
 mm/vmscan.c                                   | 15 ++++----
 23 files changed, 148 insertions(+), 101 deletions(-)


base-commit: b024763926d2726978dff6588b81877d000159c1
-- 
2.47.0



More information about the Linuxppc-dev mailing list