[PATCH v3] powerpc/mm: Support execute-only memory on the Radix MMU

Russell Currey ruscur at russell.cc
Tue Aug 9 16:28:48 AEST 2022


On Tue, 2022-08-09 at 05:51 +0000, Christophe Leroy wrote:
> On 09/08/2022 at 04:44, Russell Currey wrote:
> > The Hash MMU already supports XOM (i.e. mmap with PROT_EXEC only)
> > through the execute-only pkey.  A PROT_EXEC-only mapping will
> > actually map to RX, and then the pkey will be applied on top of it.
> 
> I don't think XOM is a commonly understood acronym. Maybe the first
> time you use it, it'd be better to say something like:
> 
> The Hash MMU already supports execute-only memory (XOM)

Yes, that's better.

> 
> When you say that the Hash MMU supports it through the execute-only
> pkey, does that mean it is taken into account automatically at mmap
> time, or does the userspace app have to do something special to use
> the key?
> If it is the latter, then depending on whether you are on Radix or
> not, you must do something different. Is that expected?

It happens at mmap time, see do_mmap() in mm/mmap.c (and similar for
mprotect).  That calls into execute_only_pkey() which can return
something on x86 & Hash, and if it does that pkey gets used.  The
userspace process doesn't have to do anything, it's transparent.  So
there's no difference in program behaviour switching between Hash/Radix
- at least in the basic cases I've tested.

> 
> > 
> > Radix doesn't have pkeys, but it does have execute permissions
> > built-in to the MMU, so all we have to do to support XOM is expose
> > it.
> > 
> > Signed-off-by: Russell Currey <ruscur at russell.cc>
> > ---
> > v3: Incorporate Aneesh's suggestions, leave protection_map untouched
> > Basic test: https://github.com/ruscur/junkcode/blob/main/mmap_test.c
> > 
> >   arch/powerpc/include/asm/book3s/64/pgtable.h |  2 ++
> >   arch/powerpc/mm/book3s64/pgtable.c           | 11 +++++++++--
> >   arch/powerpc/mm/fault.c                      |  6 +++++-
> >   3 files changed, 16 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> > index 392ff48f77df..486902aff040 100644
> > --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> > +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> > @@ -151,6 +151,8 @@
> >   #define PAGE_COPY_X   __pgprot(_PAGE_BASE | _PAGE_READ | _PAGE_EXEC)
> >   #define PAGE_READONLY __pgprot(_PAGE_BASE | _PAGE_READ)
> >   #define PAGE_READONLY_X       __pgprot(_PAGE_BASE | _PAGE_READ | _PAGE_EXEC)
> > +/* Radix only, Hash uses PAGE_READONLY_X + execute-only pkey instead */
> > +#define PAGE_EXECONLY  __pgprot(_PAGE_BASE | _PAGE_EXEC)
> >   
> >   /* Permission masks used for kernel mappings */
> >   #define PAGE_KERNEL   __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
> > diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
> > index 7b9966402b25..62f63d344596 100644
> > --- a/arch/powerpc/mm/book3s64/pgtable.c
> > +++ b/arch/powerpc/mm/book3s64/pgtable.c
> > @@ -553,8 +553,15 @@ EXPORT_SYMBOL_GPL(memremap_compat_align);
> >   
> >   pgprot_t vm_get_page_prot(unsigned long vm_flags)
> >   {
> > -       unsigned long prot = pgprot_val(protection_map[vm_flags &
> > -                               (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]);
> > +       unsigned long prot;
> > +
> > +       /* Radix supports execute-only, but protection_map maps X -> RX */
> > +       if (radix_enabled() && ((vm_flags & (VM_READ|VM_WRITE|VM_EXEC)) == VM_EXEC)) {
> 
> Maybe use VM_ACCESS_FLAGS ?

I was looking for something like that but only checked powerpc, thanks.
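
As a rough illustration of what that change would look like, here is a
userspace model of the exec-only test written with VM_ACCESS_FLAGS.  The
flag values are redefined locally so the snippet compiles outside the
kernel, and is_exec_only() is a hypothetical name, not something in the
patch.

```c
#include <stdbool.h>

/* Redefined here for a userspace model; in the kernel these come from
 * include/linux/mm.h. */
#define VM_READ		0x1UL
#define VM_WRITE	0x2UL
#define VM_EXEC		0x4UL
#define VM_SHARED	0x8UL
#define VM_ACCESS_FLAGS	(VM_READ | VM_WRITE | VM_EXEC)

/* Hypothetical helper: true when X is the only access bit set. */
bool is_exec_only(unsigned long vm_flags)
{
	/* VM_SHARED is masked out, so shared X-only mappings also match. */
	return (vm_flags & VM_ACCESS_FLAGS) == VM_EXEC;
}
```

This is equivalent to the open-coded (VM_READ|VM_WRITE|VM_EXEC) mask in
the hunk above, just shorter.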

> 
> > +               prot = pgprot_val(PAGE_EXECONLY);
> > +       } else {
> > +               prot = pgprot_val(protection_map[vm_flags &
> > +                               (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]);
> > +       }
> >   
> >         if (vm_flags & VM_SAO)
> >                 prot |= _PAGE_SAO;
> > diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> > index 014005428687..59e4cbcf3109 100644
> > --- a/arch/powerpc/mm/fault.c
> > +++ b/arch/powerpc/mm/fault.c
> > @@ -270,7 +270,11 @@ static bool access_error(bool is_write, bool is_exec, struct vm_area_struct *vma
> >                 return false;
> >         }
> >   
> > -       if (unlikely(!vma_is_accessible(vma)))
> > +       /* On Radix, a read fault could be from PROT_NONE or PROT_EXEC */
> > +       if (unlikely(radix_enabled() && !(vma->vm_flags & VM_READ)))
> > +               return true;
> 
> Why do you need the radix_enabled() here?
> Even if it doesn't fault directly, reading a non-readable area is
> still an error and should be handled as such, even on hardware that
> will not generate a fault for it in the first place. So I'd just do:
> 
>         if (!(vma->vm_flags & VM_READ))
>                 return true;
> 

I don't think we need it, just concerned I might break something.  I
can do that.

> > +       /* Check for a PROT_NONE fault on other MMUs */
> > +       else if (unlikely(!vma_is_accessible(vma)))
> >                 return true;
> >         /*
> >          * We should ideally do the vma pkey access check here. But in the
> 
> Don't use an if/else construct; there is no other 'else' in that
> function, or in similar functions like bad_kernel_fault() for
> instance.
> 
> So leave the !vma_is_accessible(vma) untouched and add your check as
> a standalone check before or after it.

I think checking vma_is_accessible(vma) is redundant if we're checking
for a read fault.  It doesn't satisfy the Radix exec-only case because
PROT_EXEC will be set, but as far as I can tell other MMUs don't have a
no-read mode other than PROT_NONE.  Unless I'm missing something,
checking if PROT_READ is there should be enough.
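
To spell out the redundancy argument, here is a small userspace model
(not kernel code) that walks every R/W/X combination and compares the two
checks.  The flag values are redefined locally and all function names are
hypothetical; model_vma_is_accessible() assumes vma_is_accessible() means
"any of R/W/X is set".

```c
#include <stdbool.h>

/* Redefined for a userspace model; the kernel definitions live in
 * include/linux/mm.h. */
#define VM_READ		0x1UL
#define VM_WRITE	0x2UL
#define VM_EXEC		0x4UL
#define VM_ACCESS_FLAGS	(VM_READ | VM_WRITE | VM_EXEC)

/* Assumed model of vma_is_accessible(): any access bit set. */
bool model_vma_is_accessible(unsigned long vm_flags)
{
	return (vm_flags & VM_ACCESS_FLAGS) != 0;
}

/* Existing check: a read fault is an error on a PROT_NONE mapping. */
bool inaccessible_check(unsigned long vm_flags)
{
	return !model_vma_is_accessible(vm_flags);
}

/* Proposed check: a read fault is an error whenever VM_READ is clear. */
bool no_read_check(unsigned long vm_flags)
{
	return !(vm_flags & VM_READ);
}

/* True if every case caught by inaccessible_check() is also caught by
 * no_read_check(), over all eight R/W/X combinations. */
bool no_read_check_subsumes_inaccessible(void)
{
	unsigned long f;

	for (f = 0; f <= VM_ACCESS_FLAGS; f++)
		if (inaccessible_check(f) && !no_read_check(f))
			return false;
	return true;
}
```

In this model the VM_READ test catches everything the accessibility test
does, plus the exec-only case that the latter misses.  It also flags
write-only mappings; whether that matters for read faults is not
something this sketch settles.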

