[PATCH v10 01/10] powerpc/mm: Implement set_memory() routines

Michael Ellerman mpe at ellerman.id.au
Wed Mar 31 22:16:23 AEDT 2021


Hi Jordan,

A few nits below ...

Jordan Niethe <jniethe5 at gmail.com> writes:
> From: Russell Currey <ruscur at russell.cc>
>
> The set_memory_{ro/rw/nx/x}() functions are required for STRICT_MODULE_RWX,
> and are generally useful primitives to have.  This implementation is
> designed to be completely generic across powerpc's many MMUs.
>
> It's possible that this could be optimised to be faster for specific
> MMUs, but the focus is on having a generic and safe implementation for
> now.
>
> This implementation does not handle cases where the caller is attempting
> to change the mapping of the page it is executing from, or if another
> CPU is concurrently using the page being altered.  These cases likely
> shouldn't happen, but a more complex implementation with MMU-specific code
> could safely handle them, so that is left as a TODO for now.
>
> On hash the linear mapping is not kept in the linux pagetable, so this
> will not change the protection if used on that range. Currently these
> functions are not used on the linear map so just WARN for now.
>
> These functions do nothing if STRICT_KERNEL_RWX is not enabled.
>
> Reviewed-by: Daniel Axtens <dja at axtens.net>
> Signed-off-by: Russell Currey <ruscur at russell.cc>
> Signed-off-by: Christophe Leroy <christophe.leroy at csgroup.eu>
> [jpn: -rebase on next plus "powerpc/mm/64s: Allow STRICT_KERNEL_RWX again"
>       - WARN on hash linear map]
> Signed-off-by: Jordan Niethe <jniethe5 at gmail.com>
> ---
> v10: WARN if trying to change the hash linear map
> ---
>  arch/powerpc/Kconfig                  |  1 +
>  arch/powerpc/include/asm/set_memory.h | 32 ++++++++++
>  arch/powerpc/mm/Makefile              |  2 +-
>  arch/powerpc/mm/pageattr.c            | 88 +++++++++++++++++++++++++++
>  4 files changed, 122 insertions(+), 1 deletion(-)
>  create mode 100644 arch/powerpc/include/asm/set_memory.h
>  create mode 100644 arch/powerpc/mm/pageattr.c
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index fc7f5c5933e6..4498a27ac9db 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -135,6 +135,7 @@ config PPC
>  	select ARCH_HAS_MEMBARRIER_CALLBACKS
>  	select ARCH_HAS_MEMBARRIER_SYNC_CORE
>  	select ARCH_HAS_SCALED_CPUTIME		if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
> +	select ARCH_HAS_SET_MEMORY

Below you do:

	if (!IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
		return 0;

Which suggests we should instead just only select ARCH_HAS_SET_MEMORY if
STRICT_KERNEL_RWX ?


> diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
> index 3b4e9e4e25ea..d8a08abde1ae 100644
> --- a/arch/powerpc/mm/Makefile
> +++ b/arch/powerpc/mm/Makefile
> @@ -5,7 +5,7 @@
>  
>  ccflags-$(CONFIG_PPC64)	:= $(NO_MINIMAL_TOC)
>  
> -obj-y				:= fault.o mem.o pgtable.o mmap.o maccess.o \
> +obj-y				:= fault.o mem.o pgtable.o mmap.o maccess.o pageattr.o \

.. and then the file should only be built if ARCH_HAS_SET_MEMORY = y.

>  				   init_$(BITS).o pgtable_$(BITS).o \
>  				   pgtable-frag.o ioremap.o ioremap_$(BITS).o \
>  				   init-common.o mmu_context.o drmem.o
> diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
> new file mode 100644
> index 000000000000..9efcb01088da
> --- /dev/null
> +++ b/arch/powerpc/mm/pageattr.c
> @@ -0,0 +1,88 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/*
> + * MMU-generic set_memory implementation for powerpc
> + *
> + * Copyright 2019, IBM Corporation.

Should be 2019-2021.

> + */
> +
> +#include <linux/mm.h>
> +#include <linux/set_memory.h>
> +
> +#include <asm/mmu.h>
> +#include <asm/page.h>
> +#include <asm/pgtable.h>
> +
> +
> +/*
> + * Updates the attributes of a page in three steps:
> + *
> + * 1. invalidate the page table entry
> + * 2. flush the TLB
> + * 3. install the new entry with the updated attributes
> + *
> + * This is unsafe if the caller is attempting to change the mapping of the
> + * page it is executing from, or if another CPU is concurrently using the
> + * page being altered.

Is the 2nd part of that statement true?

Or, I guess maybe it is true depending on what "unsafe" means.

AIUI it's unsafe to use this on the page you're executing from, and by
unsafe we mean the kernel will potentially crash because it will lose
the mapping for the currently executing text.

Using this on a page that another CPU is accessing could be safe, if eg.
the other CPU is reading from the page and we are just changing it from
RW->RO.

So I'm not sure they're the same type of "unsafe".

> + * TODO make the implementation resistant to this.
> + *
> + * NOTE: can be dangerous to call without STRICT_KERNEL_RWX

I don't think we need that anymore?

> + */
> +static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
> +{
> +	long action = (long)data;
> +	pte_t pte;
> +
> +	spin_lock(&init_mm.page_table_lock);
> +
> +	/* invalidate the PTE so it's safe to modify */
> +	pte = ptep_get_and_clear(&init_mm, addr, ptep);
> +	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +
> +	/* modify the PTE bits as desired, then apply */
> +	switch (action) {
> +	case SET_MEMORY_RO:
> +		pte = pte_wrprotect(pte);
> +		break;

So set_memory_ro() removes write, but doesn't remove execute.

That doesn't match my mental model of what "set to ro" means, but I
guess I'm wrong because the other implementations seem to do something
similar.


> +	case SET_MEMORY_RW:
> +		pte = pte_mkwrite(pte);

I think we want to add pte_mkdirty() here also to avoid a fault when the
mapping is written to.

eg. pmd_mkwrite(pmd_mkdirty(pte));

> +		break;
> +	case SET_MEMORY_NX:
> +		pte = pte_exprotect(pte);
> +		break;
> +	case SET_MEMORY_X:
> +		pte = pte_mkexec(pte);
> +		break;
> +	default:
> +		WARN_ON_ONCE(1);
> +		break;
> +	}
> +
> +	set_pte_at(&init_mm, addr, ptep, pte);
> +	spin_unlock(&init_mm.page_table_lock);
> +
> +	return 0;
> +}
> +
> +int change_memory_attr(unsigned long addr, int numpages, long action)
> +{
> +	unsigned long start = ALIGN_DOWN(addr, PAGE_SIZE);
> +	unsigned long sz = numpages * PAGE_SIZE;
> +
> +	if (!IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
> +		return 0;
> +
> +	if (numpages <= 0)
> +		return 0;
> +

This ↓ should have a comment explaining what it's doing:

> +#ifdef CONFIG_PPC_BOOK3S_64
> +	if (WARN_ON_ONCE(!radix_enabled() &&
> +		     get_region_id(addr) == LINEAR_MAP_REGION_ID)) {
> +		return -1;
> +	}
> +#endif

Maybe:

	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
	    WARN_ON_ONCE(!radix_enabled() && get_region_id(addr) == LINEAR_MAP_REGION_ID)) {
		return -1;
	}

But then Aneesh pointed out that we should also block VMEMMAP_REGION_ID.

It might be better to just check for the permitted regions.

	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && !radix_enabled()) {
        	int region = get_region_id(addr);

	    	if (WARN_ON_ONCE(region != VMALLOC_REGION_ID && region != IO_REGION_ID))
                	return -1;
	}

> +
> +	return apply_to_existing_page_range(&init_mm, start, sz,
> +					    change_page_attr, (void *)action);
> +}


cheers


More information about the Linuxppc-dev mailing list