[PATCH v2 1/4] mm: Add optional close() to struct vm_special_mapping

Michael Ellerman mpe at ellerman.id.au
Mon Aug 12 18:26:02 AEST 2024


Add an optional close() callback to struct vm_special_mapping. It will
be used, by powerpc at least, to handle unmapping of the VDSO.

Although support for unmapping the VDSO was initially added
for CRIU[1], it is not desirable to guard that support behind
CONFIG_CHECKPOINT_RESTORE.

There are other known users of unmapping the VDSO which are not related
to CRIU, eg. Valgrind [2] and void-ship [3].

The powerpc arch_unmap() hook has been in place for ~9 years, with no
ifdef, so there may be other unknown users that have come to rely on
unmapping the VDSO. Even if the code was behind an ifdef, major distros
enable CHECKPOINT_RESTORE so users may not realise unmapping the VDSO
depends on that configuration option.

It's also undesirable to have such core mm behaviour behind a relatively
obscure CONFIG option.

Longer term the unmap behaviour should be standardised across
architectures, however that is complicated by the fact the VDSO pointer
is stored differently across architectures. There was a previous attempt
to unify that handling [4], which could be revived.

See [5] for further discussion.

[1]: commit 83d3f0e90c6c ("powerpc/mm: tracking vDSO remap")
[2]: https://sourceware.org/git/?p=valgrind.git;a=commit;h=3a004915a2cbdcdebafc1612427576bf3321eef5
[3]: https://github.com/insanitybit/void-ship
[4]: https://lore.kernel.org/lkml/20210611180242.711399-17-dima@arista.com/
[5]: https://lore.kernel.org/linuxppc-dev/shiq5v3jrmyi6ncwke7wgl76ojysgbhrchsk32q4lbx2hadqqc@kzyy2igem256

Suggested-by: Linus Torvalds <torvalds at linux-foundation.org>
Signed-off-by: Michael Ellerman <mpe at ellerman.id.au>
Reviewed-by: David Hildenbrand <david at redhat.com>
---
 include/linux/mm_types.h | 3 +++
 mm/mmap.c                | 6 ++++++
 2 files changed, 9 insertions(+)

v2:
- Add some blank lines as requested.
- Expand special_mapping_close() comment.
- Add David's reviewed-by.
- Expand change log to capture review discussion.

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 485424979254..78bdfc59abe5 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1313,6 +1313,9 @@ struct vm_special_mapping {
 
 	int (*mremap)(const struct vm_special_mapping *sm,
 		     struct vm_area_struct *new_vma);
+
+	void (*close)(const struct vm_special_mapping *sm,
+		      struct vm_area_struct *vma);
 };
 
 enum tlb_flush_reason {
diff --git a/mm/mmap.c b/mm/mmap.c
index d0dfc85b209b..af4dbf0d3bd4 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3620,10 +3620,16 @@ void vm_stat_account(struct mm_struct *mm, vm_flags_t flags, long npages)
 static vm_fault_t special_mapping_fault(struct vm_fault *vmf);
 
 /*
+ * Close hook, called for unmap() and on the old vma for mremap().
+ *
  * Having a close hook prevents vma merging regardless of flags.
  */
 static void special_mapping_close(struct vm_area_struct *vma)
 {
+	const struct vm_special_mapping *sm = vma->vm_private_data;
+
+	if (sm->close)
+		sm->close(sm, vma);
 }
 
 static const char *special_mapping_name(struct vm_area_struct *vma)
-- 
2.45.2



More information about the Linuxppc-dev mailing list