POWER9 crash due to STRICT_KERNEL_RWX (WAS: Re: Linux-next POWER9 NULL pointer NIP...)

Fri Apr 17 21:45:34 AEST 2020

Steven Rostedt <rostedt at goodmis.org> writes:
> On Thu, 16 Apr 2020 21:19:10 -0400
> Qian Cai <cai at lca.pw> wrote:
>
>> OK, reverted the commit,
>> 
>> c55d7b5e6426 ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE")
>> 
>> or set STRICT_KERNEL_RWX=n fixed the crash below and also mentioned in this thread,
>
> This may be a symptom and not a cure.

I think it is a cure.

But we still have a bug, which is that when STRICT_KERNEL_RWX is enabled
we have some sort of corruption going on.

Enabling STRICT_KERNEL_RWX changes our implementation of
patch_instruction() which is used by ftrace, so I suspect this is a
powerpc bug.
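
For anyone not familiar with that path: as I understand it, with
STRICT_KERNEL_RWX the text mapping is read-only, so patch_instruction()
can't store to the target address directly and instead writes through a
temporary writable alias of the same page. Below is a rough userspace
analogy of that idea, not the kernel code. The helper names
(patch_via_alias, demo_patch), the tmpfile-backed page and the offset
are all made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: store one 32-bit word at byte offset `off` of the
 * page backing `fd`, using a short-lived writable mapping (the "alias").
 * The long-lived read-only view of the page is never made writable.
 */
static int patch_via_alias(int fd, size_t page_size, size_t off, uint32_t insn)
{
    uint8_t *alias = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);
    if (alias == MAP_FAILED)
        return -1;

    memcpy(alias + off, &insn, sizeof(insn));

    /* Tear the alias down again, as the kernel does after patching. */
    return munmap(alias, page_size);
}

/*
 * Hypothetical demo: create a one-page "text" region mapped read-only,
 * patch it through the alias, and return the word as seen through the
 * read-only view (0 on any error).
 */
static uint32_t demo_patch(uint32_t insn)
{
    long ps = sysconf(_SC_PAGESIZE);
    FILE *f = tmpfile();
    uint32_t seen = 0;

    if (!f || ps <= 0)
        return 0;

    int fd = fileno(f);
    if (ftruncate(fd, ps) != 0) {
        fclose(f);
        return 0;
    }

    /* Stand-in for the read-only kernel text mapping. */
    const uint32_t *text = mmap(NULL, (size_t)ps, PROT_READ, MAP_SHARED, fd, 0);
    if (text == MAP_FAILED) {
        fclose(f);
        return 0;
    }

    if (patch_via_alias(fd, (size_t)ps, 16, insn) == 0)
        seen = text[16 / sizeof(uint32_t)];

    munmap((void *)text, (size_t)ps);
    fclose(f);
    return seen;
}
```

Calling demo_patch(0x60000000) (the powerpc nop encoding) should hand
back the same value read through the read-only view. The point of the
analogy is that the store goes to a different virtual address than the
one being patched, so a wrong alias address or offset would silently
corrupt some other location.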

>> [  148.110969][T13115] LTP: starting chown04_16
>> [  148.255048][T13380] kernel tried to execute exec-protected page (c0000000016804ac) - exploit attempt? (uid: 0)
>> [  148.255099][T13380] BUG: Unable to handle kernel instruction fetch
>> [  148.255122][T13380] Faulting instruction address: 0xc0000000016804ac
>> [  148.255136][T13380] Oops: Kernel access of bad area, sig: 11 [#1]
>> [  148.255157][T13380] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
>> [  148.255171][T13380] Modules linked in: loop kvm_hv kvm xfs sd_mod bnx2x mdio ahci tg3 libahci libphy libata firmware_class dm_mirror dm_region_hash dm_log dm_mod
>> [  148.255213][T13380] CPU: 45 PID: 13380 Comm: chown04_16 Tainted: G        W         5.6.0+ #7
>> [  148.255236][T13380] NIP:  c0000000016804ac LR: c00800000fa60408 CTR: c0000000016804ac
>> [  148.255250][T13380] REGS: c0000010a6fafa00 TRAP: 0400   Tainted: G        W          (5.6.0+)
>> [  148.255281][T13380] MSR:  9000000010009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 84000248  XER: 20040000
>> [  148.255310][T13380] CFAR: c00800000fa66534 IRQMASK: 0 
>> [  148.255310][T13380] GPR00: c000000000973268 c0000010a6fafc90 c000000001648200 0000000000000000 
>> [  148.255310][T13380] GPR04: c000000d8a22dc00 c0000010a6fafd30 00000000b5e98331 ffffffff00012c9f 
>> [  148.255310][T13380] GPR08: c000000d8a22dc00 0000000000000000 0000000000000000 c00000000163c520 
>> [  148.255310][T13380] GPR12: c0000000016804ac c000001ffffdad80 0000000000000000 0000000000000000 
>> [  148.255310][T13380] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
>> [  148.255310][T13380] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
>> [  148.255310][T13380] GPR24: 00007fff8f5e2e48 0000000000000000 c00800000fa6a488 c0000010a6fafd30 
>> [  148.255310][T13380] GPR28: 0000000000000000 000000007fffffff c00800000fa60400 c000000efd0c6780 
>> [  148.255494][T13380] NIP [c0000000016804ac] sysctl_net_busy_read+0x0/0x4
>
> The instruction pointer is on sysctl_net_busy_read? Isn't that data and
> not code?

Yes.

But we're corrupting the text, or data, somewhere, so we can jump
anywhere.
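
One thing that can be read off the dump above: NIP, CTR and GPR12 all
hold c0000000016804ac. That would be consistent with an indirect branch
through CTR (the ELFv2 ABI has the callee address in r12 for such
calls), i.e. we branched to a bad target rather than wandering there
gradually. That is only my reading of the registers, not a confirmed
diagnosis. A small sketch for checking this on other oops logs follows.
The function name nip_matches_ctr is made up, and it assumes the
"NIP: ... LR: ... CTR: ..." line layout shown in this report.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: given one oops line containing
 * "NIP: <hex> LR: <hex> CTR: <hex>", return 1 if NIP equals CTR,
 * 0 if they differ, and -1 if the line does not have that layout.
 */
static int nip_matches_ctr(const char *line)
{
    unsigned long long nip, lr, ctr;

    /* Skip the "[ time][Tpid]" prefix, if any. */
    const char *p = strstr(line, "NIP:");
    if (!p)
        return -1;

    /* Whitespace in the format matches any run of blanks in the log. */
    if (sscanf(p, "NIP: %llx LR: %llx CTR: %llx", &nip, &lr, &ctr) != 3)
        return -1;

    return nip == ctr;
}
```

Fed the NIP/LR/CTR line from this report it returns 1. Lines without
that layout, such as the "NIP [...]" symbol line, give -1.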

I have another trace where vhost_init() appears to call into
proc_dointvec() before crashing. vhost_init() is an empty function, so
it should never call anything.

cheers