[PATCH v5] powerpc: Do not make the entire heap executable

Kees Cook keescook at chromium.org
Thu Sep 29 03:16:33 AEST 2016


On Tue, Sep 27, 2016 at 6:42 PM, Michael Ellerman <mpe at ellerman.id.au> wrote:
> Denys Vlasenko <dvlasenk at redhat.com> writes:
>
>> On 32-bit powerpc the ELF PLT sections of binaries (built with --bss-plt,
>> or with a toolchain which defaults to it) look like this:
>
> Or (it seems), for all programs built with -pg (profiling).
>
>>   [17] .sbss             NOBITS          0002aff8 01aff8 000014 00  WA  0   0  4
>>   [18] .plt              NOBITS          0002b00c 01aff8 000084 00 WAX  0   0  4
>>   [19] .bss              NOBITS          0002b090 01aff8 0000a4 00  WA  0   0  4
>>
>> Which results in an ELF load header:
>>
>>   Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
>>   LOAD           0x019c70 0x00029c70 0x00029c70 0x01388 0x014c4 RWE 0x10000
>>
>> This is all correct, the load region containing the PLT is marked as
>> executable. Note that the PLT starts at 0002b00c but the file mapping ends at
>> 0002aff8, so the PLT falls in the 0 fill section described by the load header,
>> and after a page boundary.
>>
>> Unfortunately the generic ELF loader ignores the X bit in the load headers
>> when it creates the 0 filled non-file backed mappings. It assumes all of these
>> mappings are RW BSS sections, which is not the case for PPC.
>>
>> gcc/ld has an option (--secure-plt) to not do this, this is said to incur
>> a small performance penalty.
>>
>> Currently, to support 32-bit binaries with PLT in BSS kernel maps *entire
>> brk area* with executable rights for all binaries, even --secure-plt ones.
>>
>> Stop doing that.
>>
>> Teach the ELF loader to check the X bit in the relevant load header
>> and create 0 filled anonymous mappings that are executable
>> if the load header requests that.
> ...
>>
>> Signed-off-by: Jason Gunthorpe <jgunthorpe at obsidianresearch.com>
>> Signed-off-by: Denys Vlasenko <dvlasenk at redhat.com>
>> Reviewed-by: Kees Cook <keescook at chromium.org>
>> CC: Michael Ellerman <mpe at ellerman.id.au>
>
> This looks OK to me, I've tested it with --bss-plt and --secure-plt and
> also -pg.
>
> So for the powerpc part I'll give you an:
>
> Acked-by: Michael Ellerman <mpe at ellerman.id.au>
>
> But this is not really a powerpc patch, and I'm not an ELF expert. So
> I'm not comfortable merging it via the powerpc tree. It doesn't look
> like we really have a maintainer for binfmt_elf.c, so I'm not sure who
> should be acking that part.

I've done a bunch of recent work in and around binfmt_elf.c, so I'd be
happy to change my "Reviewed-by" to an "Acked-by" a change that I
suggested earlier. (Details below.)

> I've added Al Viro to Cc, he maintains fs/ and might be interested.
>
> I've also added Andrew Morton who might be happy to put this in his
> tree, and see if anyone complains?
>
> Also here: https://patchwork.ozlabs.org/patch/661595/
>
> cheers
>
>> ---
>> Changes since v4:
>> * if (current->personality & READ_IMPLIES_EXEC), still use VM_EXEC
>>   for 32-bit executables.
>>
>> Changes since v3:
>> * typo fix in commit message
>> * rebased to current Linus tree
>>
>> Changes since v2:
>> * moved capability to map with VM_EXEC into vm_brk_flags()
>>
>> Changes since v1:
>> * wrapped lines to not exceed 79 chars
>> * improved comment
>> * expanded CC list
>>
>>  arch/powerpc/include/asm/page.h |  3 ++-
>>  fs/binfmt_elf.c                 | 30 ++++++++++++++++++++++--------
>>  include/linux/mm.h              |  1 +
>>  mm/mmap.c                       | 21 ++++++++++++++++-----
>>  4 files changed, 41 insertions(+), 14 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
>> index 56398e7..d188f51 100644
>> --- a/arch/powerpc/include/asm/page.h
>> +++ b/arch/powerpc/include/asm/page.h
>> @@ -230,7 +230,9 @@ extern long long virt_phys_offset;
>>   * and needs to be executable.  This means the whole heap ends
>>   * up being executable.
>>   */
>> -#define VM_DATA_DEFAULT_FLAGS32      (VM_READ | VM_WRITE | VM_EXEC | \
>> +#define VM_DATA_DEFAULT_FLAGS32 \
>> +     (((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) | \
>> +                              VM_READ | VM_WRITE | \
>>                                VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
>>
>>  #define VM_DATA_DEFAULT_FLAGS64      (VM_READ | VM_WRITE | \
>> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
>> index 7f6aff3f..2b57b5a 100644
>> --- a/fs/binfmt_elf.c
>> +++ b/fs/binfmt_elf.c
>> @@ -91,12 +91,18 @@ static struct linux_binfmt elf_format = {
>>
>>  #define BAD_ADDR(x) ((unsigned long)(x) >= TASK_SIZE)
>>
>> -static int set_brk(unsigned long start, unsigned long end)
>> +static int set_brk(unsigned long start, unsigned long end, int prot)
>>  {
>>       start = ELF_PAGEALIGN(start);
>>       end = ELF_PAGEALIGN(end);
>>       if (end > start) {
>> -             int error = vm_brk(start, end - start);
>> +             /*
>> +              * Map the last of the bss segment.
>> +              * If the header is requesting these pages to be
>> +              * executable, honour that (ppc32 needs this).
>> +              */
>> +             int error = vm_brk_flags(start, end - start,
>> +                             prot & PROT_EXEC ? VM_EXEC : 0);
>>               if (error)
>>                       return error;
>>       }
>> @@ -524,6 +530,7 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
>>       unsigned long load_addr = 0;
>>       int load_addr_set = 0;
>>       unsigned long last_bss = 0, elf_bss = 0;
>> +     int bss_prot = 0;
>>       unsigned long error = ~0UL;
>>       unsigned long total_size;
>>       int i;
>> @@ -606,8 +613,10 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
>>                        * elf_bss and last_bss is the bss section.
>>                        */
>>                       k = load_addr + eppnt->p_vaddr + eppnt->p_memsz;
>> -                     if (k > last_bss)
>> +                     if (k > last_bss) {
>>                               last_bss = k;
>> +                             bss_prot = elf_prot;
>> +                     }
>>               }
>>       }
>>
>> @@ -623,13 +632,14 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
>>       /*
>>        * Next, align both the file and mem bss up to the page size,
>>        * since this is where elf_bss was just zeroed up to, and where
>> -      * last_bss will end after the vm_brk() below.
>> +      * last_bss will end after the vm_brk_flags() below.
>>        */
>>       elf_bss = ELF_PAGEALIGN(elf_bss);
>>       last_bss = ELF_PAGEALIGN(last_bss);
>>       /* Finally, if there is still more bss to allocate, do it. */
>>       if (last_bss > elf_bss) {
>> -             error = vm_brk(elf_bss, last_bss - elf_bss);
>> +             error = vm_brk_flags(elf_bss, last_bss - elf_bss,
>> +                             bss_prot & PROT_EXEC ? VM_EXEC : 0);
>>               if (error)
>>                       goto out;
>>       }
>> @@ -674,6 +684,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
>>       unsigned long error;
>>       struct elf_phdr *elf_ppnt, *elf_phdata, *interp_elf_phdata = NULL;
>>       unsigned long elf_bss, elf_brk;
>> +     int bss_prot = 0;
>>       int retval, i;
>>       unsigned long elf_entry;
>>       unsigned long interp_load_addr = 0;
>> @@ -881,7 +892,8 @@ static int load_elf_binary(struct linux_binprm *bprm)
>>                          before this one. Map anonymous pages, if needed,
>>                          and clear the area.  */
>>                       retval = set_brk(elf_bss + load_bias,
>> -                                      elf_brk + load_bias);
>> +                                      elf_brk + load_bias,
>> +                                      bss_prot);
>>                       if (retval)
>>                               goto out_free_dentry;
>>                       nbyte = ELF_PAGEOFFSET(elf_bss);
>> @@ -975,8 +987,10 @@ static int load_elf_binary(struct linux_binprm *bprm)
>>               if (end_data < k)
>>                       end_data = k;
>>               k = elf_ppnt->p_vaddr + elf_ppnt->p_memsz;
>> -             if (k > elf_brk)
>> +             if (k > elf_brk) {
>> +                     bss_prot = elf_prot;
>>                       elf_brk = k;
>> +             }
>>       }
>>
>>       loc->elf_ex.e_entry += load_bias;
>> @@ -992,7 +1006,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
>>        * mapping in the interpreter, to make sure it doesn't wind
>>        * up getting placed where the bss needs to go.
>>        */
>> -     retval = set_brk(elf_bss, elf_brk);
>> +     retval = set_brk(elf_bss, elf_brk, bss_prot);
>>       if (retval)
>>               goto out_free_dentry;
>>       if (likely(elf_bss != elf_brk) && unlikely(padzero(elf_bss))) {
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 08ed53e..3ffa76c 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -2058,6 +2058,7 @@ static inline void mm_populate(unsigned long addr, unsigned long len) {}
>>
>>  /* These take the mm semaphore themselves */
>>  extern int __must_check vm_brk(unsigned long, unsigned long);
>> +extern int __must_check vm_brk_flags(unsigned long, unsigned long, unsigned long);
>>  extern int vm_munmap(unsigned long, size_t);
>>  extern unsigned long __must_check vm_mmap(struct file *, unsigned long,
>>          unsigned long, unsigned long,
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index ca9d91b..4d5b3f3 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -2653,11 +2653,11 @@ static inline void verify_mm_writelocked(struct mm_struct *mm)
>>   *  anonymous maps.  eventually we may be able to do some
>>   *  brk-specific accounting here.
>>   */
>> -static int do_brk(unsigned long addr, unsigned long request)
>> +static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long flags)
>>  {
>>       struct mm_struct *mm = current->mm;
>>       struct vm_area_struct *vma, *prev;
>> -     unsigned long flags, len;
>> +     unsigned long len;
>>       struct rb_node **rb_link, *rb_parent;
>>       pgoff_t pgoff = addr >> PAGE_SHIFT;
>>       int error;
>> @@ -2668,7 +2668,7 @@ static int do_brk(unsigned long addr, unsigned long request)
>>       if (!len)
>>               return 0;
>>
>> -     flags = VM_DATA_DEFAULT_FLAGS | VM_ACCOUNT | mm->def_flags;
>> +     flags |= VM_DATA_DEFAULT_FLAGS | VM_ACCOUNT | mm->def_flags;

This is where the flags are actually built from what's coming in
through the newly created exported function vm_brk_flags() below. The
only flag we're acting on is VM_EXEC (passed in from set_brk() above).
I think do_brk_flags() should mask the valid flags, or we'll regret it
in the future. I'd like to see something like:

    /* Until we need other flags, refuse anything except VM_EXEC. */
    if ((flags & (~VM_EXEC)) != 0)
        return -EINVAL;
    flags |= VM_DATA_DEFAULT_FLAGS | VM_ACCOUNT | mm->def_flags;

With this, I'd be happy to Ack. :)

-Kees

>>
>>       error = get_unmapped_area(NULL, addr, len, 0, MAP_FIXED);
>>       if (offset_in_page(error))
>> @@ -2736,7 +2736,12 @@ out:
>>       return 0;
>>  }
>>
>> -int vm_brk(unsigned long addr, unsigned long len)
>> +static int do_brk(unsigned long addr, unsigned long len)
>> +{
>> +     return do_brk_flags(addr, len, 0);
>> +}
>> +
>> +int vm_brk_flags(unsigned long addr, unsigned long len, unsigned long flags)
>>  {
>>       struct mm_struct *mm = current->mm;
>>       int ret;
>> @@ -2745,13 +2750,19 @@ int vm_brk(unsigned long addr, unsigned long len)
>>       if (down_write_killable(&mm->mmap_sem))
>>               return -EINTR;
>>
>> -     ret = do_brk(addr, len);
>> +     ret = do_brk_flags(addr, len, flags);
>>       populate = ((mm->def_flags & VM_LOCKED) != 0);
>>       up_write(&mm->mmap_sem);
>>       if (populate && !ret)
>>               mm_populate(addr, len);
>>       return ret;
>>  }
>> +EXPORT_SYMBOL(vm_brk_flags);
>> +
>> +int vm_brk(unsigned long addr, unsigned long len)
>> +{
>> +     return vm_brk_flags(addr, len, 0);
>> +}
>>  EXPORT_SYMBOL(vm_brk);
>>
>>  /* Release all mmaps. */
>> --
>> 2.9.2



-- 
Kees Cook
Nexus Security


More information about the Linuxppc-dev mailing list