[PATCH 1/2] powerpc/64s/radix: Fix crash with unaligned relocated kernel

Sachin Sant sachinp at linux.ibm.com
Wed Jan 11 16:06:41 AEDT 2023



> On 10-Jan-2023, at 6:17 PM, Michael Ellerman <mpe at ellerman.id.au> wrote:
> 
> If a relocatable kernel is loaded at an address that is not 2MB aligned
> and told not to relocate to zero, the kernel can crash due to
> mark_rodata_ro() incorrectly changing some read-write data to read-only.
> 
> Scenarios where the misalignment can occur are when the kernel is
> loaded by kdump or using the RELOCATABLE_TEST config option.
> 
> Example crash with the kernel loaded at 5MB:
> 
>  Run /sbin/init as init process
>  BUG: Unable to handle kernel data access on write at 0xc000000000452000
>  Faulting instruction address: 0xc0000000005b6730
>  Oops: Kernel access of bad area, sig: 11 [#1]
>  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
>  CPU: 1 PID: 1 Comm: init Not tainted 6.2.0-rc1-00011-g349188be4841 #166
>  Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,git-5b4c5a hv:linux,kvm pSeries
>  NIP:  c0000000005b6730 LR: c000000000ae9ab8 CTR: 0000000000000380
>  REGS: c000000004503250 TRAP: 0300   Not tainted  (6.2.0-rc1-00011-g349188be4841)
>  MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 44288480  XER: 00000000
>  CFAR: c0000000005b66ec DAR: c000000000452000 DSISR: 0a000000 IRQMASK: 0
>  ...
>  NIP memset+0x68/0x104
>  LR  zero_user_segments.constprop.0+0xa8/0xf0
>  Call Trace:
>    ext4_mpage_readpages+0x7f8/0x830
>    ext4_readahead+0x48/0x60
>    read_pages+0xb8/0x380
>    page_cache_ra_unbounded+0x19c/0x250
>    filemap_fault+0x58c/0xae0
>    __do_fault+0x60/0x100
>    __handle_mm_fault+0x1230/0x1a40
>    handle_mm_fault+0x120/0x300
>    ___do_page_fault+0x20c/0xa80
>    do_page_fault+0x30/0xc0
>    data_access_common_virt+0x210/0x220
> 
> This happens because mark_rodata_ro() tries to change permissions on the
> range _stext..__end_rodata, but _stext sits in the middle of the 2MB
> page from 4MB to 6MB:
> 
>  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
>  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
>  radix-mmu: Mapped 0x0000000000400000-0x0000000002400000 with 2.00 MiB pages (exec)
> 
> The logic that changes the permissions assumes the linear mapping was
> split correctly at boot, so it marks the entire 2MB page read-only. That
> leads to the write fault above.
> 
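As a rough illustration of the arithmetic above (not code from the patch), the sketch below rounds _stext down to its 2MB page and checks whether an address below _stext falls inside that page. The helper names and the linear-map base of 0xc000000000000000 are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M       0x200000ULL
/* Assumed base of the kernel linear mapping, inferred from the oops addresses. */
#define LINEAR_BASE 0xc000000000000000ULL

/* Hypothetical helper: convert a linear-map virtual address to an offset. */
static uint64_t linear_to_phys(uint64_t vaddr)
{
    assert(vaddr >= LINEAR_BASE);
    return vaddr - LINEAR_BASE;
}

/* Start of the 2MB page that contains addr. */
static uint64_t align_down_2m(uint64_t addr)
{
    return addr & ~(SZ_2M - 1);
}

/*
 * Nonzero if addr lies below stext but inside the 2MB page containing
 * stext, i.e. it would be caught when that whole page is made read-only.
 */
static int spills_over(uint64_t stext, uint64_t addr)
{
    return addr >= align_down_2m(stext) && addr < stext;
}
```

With the kernel at 5MB, _stext is 0x500000 and its 2MB page starts at 0x400000. The faulting DAR 0xc000000000452000 corresponds to offset 0x452000, which lies in 0x400000..0x500000, so spills_over(0x500000, linear_to_phys(0xc000000000452000ULL)) is nonzero.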
> To fix it, the boot time mapping logic needs to consider that if the
> kernel is running at a non-zero address then _stext is a boundary where
> it must split the mapping.
> 
> That leads to the mapping being split correctly, allowing the rodata
> permission change to happen correctly, with no spillover:
> 
>  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
>  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
>  radix-mmu: Mapped 0x0000000000400000-0x0000000000500000 with 64.0 KiB pages
>  radix-mmu: Mapped 0x0000000000500000-0x0000000000600000 with 64.0 KiB pages (exec)
>  radix-mmu: Mapped 0x0000000000600000-0x0000000002400000 with 2.00 MiB pages (exec)
> 
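The split shown in the log above can be modelled with a small sketch. It is not the kernel's mapping code: pick_page_size(), next_boundary() and count_64k_pages() are hypothetical names, only 64K and 2MB pages are considered, and the (exec) attribute is not modelled. The idea is simply that a 2MB page is used only when the address is 2MB aligned and at least 2MB remains before the next boundary, where _stext counts as a boundary.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_64K 0x10000ULL
#define SZ_2M  0x200000ULL

/* Hypothetical: nearest split point above addr (stext if ahead of us, else end). */
static uint64_t next_boundary(uint64_t addr, uint64_t end, uint64_t stext)
{
    if (addr < stext && stext < end)
        return stext;
    return end;
}

/* Pick the largest page size that is aligned and does not cross a boundary. */
static uint64_t pick_page_size(uint64_t addr, uint64_t end, uint64_t stext)
{
    uint64_t gap;

    assert(addr < end && addr % SZ_64K == 0);
    gap = next_boundary(addr, end, stext) - addr;
    if (addr % SZ_2M == 0 && gap >= SZ_2M)
        return SZ_2M;
    return SZ_64K;
}

/* Walk [start, end) and count how many 64K pages the mapping needs. */
static unsigned count_64k_pages(uint64_t start, uint64_t end, uint64_t stext)
{
    unsigned n = 0;
    uint64_t addr = start;

    while (addr < end) {
        uint64_t sz = pick_page_size(addr, end, stext);
        if (sz == SZ_64K)
            n++;
        addr += sz;
    }
    return n;
}
```

For the range 0x400000..0x2400000 with _stext at 0x500000, this model uses 64K pages from 0x400000 up to 0x600000 and 2MB pages from there on, matching the log. Passing 0 for stext (no extra boundary) keeps the whole range on 2MB pages.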
> If the kernel is loaded at a 2MB aligned address, the mapping continues
> to use 2MB pages as before:
> 
>  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
>  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
>  radix-mmu: Mapped 0x0000000000400000-0x0000000002c00000 with 2.00 MiB pages (exec)
>  radix-mmu: Mapped 0x0000000002c00000-0x0000000100000000 with 2.00 MiB pages
> 
> Fixes: c55d7b5e6426 ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE")
> Signed-off-by: Michael Ellerman <mpe at ellerman.id.au>
> ---

Tested successfully with different crash kernel memory values.
Tested-by: Sachin Sant <sachinp at linux.ibm.com>

- Sachin

