POWER9 crash due to STRICT_KERNEL_RWX (WAS: Re: Linux-next POWER9 NULL pointer NIP...)

Russell Currey ruscur at russell.cc
Fri Apr 17 12:46:28 AEST 2020


On Thu, 2020-04-16 at 22:40 -0400, Qian Cai wrote:
> > On Apr 16, 2020, at 10:27 PM, Russell Currey <ruscur at russell.cc> wrote:
> > 
> > Reverting the patch with the given config will have the same effect as
> > STRICT_KERNEL_RWX=n.  Not discounting that it could be a bug on the
> > powerpc side (i.e. relocatable kernels with strict RWX on haven't been
> > exhaustively tested yet), but we should definitely figure out what's
> > going on with this bad access first.
> 
> BTW, this bad access only happened once. The overwhelming rest of
> crashes are with NULL pointer NIP like below. How can you explain
> that STRICT_KERNEL_RWX=n would also make those NULL NIP disappear if
> STRICT_KERNEL_RWX is just a messenger?

What happens if you revert my patch and test with STRICT_KERNEL_RWX=y and
RELOCATABLE=n?  That would tell us whether something broke recently or
whether something else is going on.
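
As a rough sketch, the requested combination corresponds to a .config
fragment like the one below.  It assumes the rest of the config stays as
in the report and that the option dependencies permit this pairing once
the patch is reverted; neither is confirmed in this thread.

```
# Sketch only: strict RWX on, relocatable kernel off (patch reverted)
CONFIG_STRICT_KERNEL_RWX=y
# CONFIG_RELOCATABLE is not set
```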

> 
> [  215.281666][T16896] LTP: starting chown04_16
> [  215.424203][T18297] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
> [  215.424289][T18297] Faulting instruction address: 0x00000000
> [  215.424313][T18297] Oops: Kernel access of bad area, sig: 11 [#1]
> [  215.424341][T18297] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
> [  215.424383][T18297] Modules linked in: loop kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x mdio tg3 ahci libahci libphy libata firmware_class dm_mirror dm_region_hash dm_log dm_mod
> [  215.424459][T18297] CPU: 85 PID: 18297 Comm: chown04_16 Tainted: G        W         5.6.0-next-20200405+ #3
> [  215.424489][T18297] NIP:  0000000000000000 LR: c00800000fbc0408 CTR: 0000000000000000
> [  215.424530][T18297] REGS: c000200b8606f990 TRAP: 0400   Tainted: G        W          (5.6.0-next-20200405+)
> [  215.424570][T18297] MSR:  9000000040009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 84000248  XER: 20040000
> [  215.424619][T18297] CFAR: c00800000fbc64f4 IRQMASK: 0
> [  215.424619][T18297] GPR00: c0000000006c2238 c000200b8606fc20 c00000000165ce00 0000000000000000
> [  215.424619][T18297] GPR04: c000201a58106400 c000200b8606fcc0 000000005f037e7d ffffffff00013bfb
> [  215.424619][T18297] GPR08: c000201a58106400 0000000000000000 0000000000000000 c000000001652ee0
> [  215.424619][T18297] GPR12: 0000000000000000 c000201fff69a600 0000000000000000 0000000000000000
> [  215.424619][T18297] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [  215.424619][T18297] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000007
> [  215.424619][T18297] GPR24: 0000000000000000 0000000000000000 c00800000fbc8688 c000200b8606fcc0
> [  215.424619][T18297] GPR28: 0000000000000000 000000007fffffff c00800000fbc0400 c00020068b8c0e70
> [  215.424914][T18297] NIP [0000000000000000] 0x0
> [  215.424953][T18297] LR [c00800000fbc0408] find_free_cb+0x8/0x30 [loop]
> find_free_cb at drivers/block/loop.c:2129
> [  215.424997][T18297] Call Trace:
> [  215.425036][T18297] [c000200b8606fc20] [c0000000006c2290] idr_for_each+0xf0/0x170 (unreliable)
> [  215.425073][T18297] [c000200b8606fca0] [c00800000fbc2744] loop_lookup.part.2+0x4c/0xb0 [loop]
> loop_lookup at drivers/block/loop.c:2144
> [  215.425105][T18297] [c000200b8606fce0] [c00800000fbc3558] loop_control_ioctl+0x120/0x1d0 [loop]
> [  215.425149][T18297] [c000200b8606fd40] [c0000000004eb688] ksys_ioctl+0xd8/0x130
> [  215.425190][T18297] [c000200b8606fd90] [c0000000004eb708] sys_ioctl+0x28/0x40
> [  215.425233][T18297] [c000200b8606fdb0] [c00000000003cc30] system_call_exception+0x110/0x1e0
> [  215.425274][T18297] [c000200b8606fe20] [c00000000000c9f0] system_call_common+0xf0/0x278
> [  215.425314][T18297] Instruction dump:
> [  215.425338][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [  215.425374][T18297] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [  215.425422][T18297] ---[ end trace ebed248fad431966 ]---
> [  215.642114][T18297] 
> [  216.642220][T18297] Kernel panic - not syncing: Fatal exception


