[powerpc] linux-next 20220520 boot failure (drc_pmem_query_stats)
Sachin Sant
sachinp at linux.ibm.com
Mon May 23 17:03:31 AEST 2022
While booting linux-next (5.18.0-rc7-next-20220520) on a Power10 LPAR
configured with pmem, the following oops is seen. The LPAR fails to boot
to the login prompt.
[ 10.948211] papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Permission denied while accessing performance stats
[ 10.948536] Kernel attempted to write user page (1c) - exploit attempt? (uid: 0)
[ 10.948539] BUG: Kernel NULL pointer dereference on write at 0x0000001c
[ 10.948540] Faulting instruction address: 0xc008000001b90844
[ 10.948542] Oops: Kernel access of bad area, sig: 11 [#1]
[ 10.948563] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[ 10.948568] Modules linked in: papr_scm(E+) libnvdimm(E) vmx_crypto(E) ext4(E) mbcache(E) jbd2(E) sd_mod(E) t10_pi(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) fuse(E)
[ 10.948587] CPU: 25 PID: 796 Comm: systemd-udevd Tainted: G E 5.18.0-rc7-next-20220520 #2
[ 10.948592] NIP: c008000001b90844 LR: c008000001b92794 CTR: c008000001b907f8
[ 10.948595] REGS: c00000003082b110 TRAP: 0300 Tainted: G E (5.18.0-rc7-next-20220520)
[ 10.948600] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 44222822 XER: 00000001
[ 10.948613] CFAR: c00000000007c744 DAR: 000000000000001c DSISR: 42000000 IRQMASK: 0
[ 10.948613] GPR00: c008000001b92794 c00000003082b3b0 c008000001bc8000 c00000000941bc00
[ 10.948613] GPR04: 0000000000000010 0000000000000000 c000000016001800 c00000003082b420
[ 10.948613] GPR08: 000000000000001c 0000000001000000 0000000053544154 c008000001b92c98
[ 10.948613] GPR12: c008000001b907f8 c000000abfd02b00 c00000003082bd00 00000001372bd8b0
[ 10.948613] GPR16: 000000000000ff20 c0080000008911b8 c008000000890000 00000000000011d0
[ 10.948613] GPR20: 0000000000000001 c00000003082bbc0 c008000001bc0a88 0000000000000000
[ 10.948613] GPR24: 0000000000000000 c000000002950e30 00000000ffffffff 0000000000000010
[ 10.948613] GPR28: c00000000941bc00 0000000000000010 0000000000000020 c00000000941bc00
[ 10.948660] NIP [c008000001b90844] drc_pmem_query_stats+0x5c/0x270 [papr_scm]
[ 10.948667] LR [c008000001b92794] papr_scm_probe+0x2ac/0x6ec [papr_scm]
[ 10.948673] Call Trace:
[ 10.948675] [c00000003082b3b0] [c00000000941bca0] 0xc00000000941bca0 (unreliable)
[ 10.948680] [c00000003082b460] [c008000001b92794] papr_scm_probe+0x2ac/0x6ec [papr_scm]
[ 10.948687] [c00000003082b550] [c0000000009809b8] platform_probe+0x98/0x150
[ 10.948694] [c00000003082b5d0] [c00000000097bf2c] really_probe+0xfc/0x510
[ 10.948699] [c00000003082b650] [c00000000097c4bc] __driver_probe_device+0x17c/0x230
[ 10.948704] [c00000003082b6d0] [c00000000097c5c8] driver_probe_device+0x58/0x120
[ 10.948709] [c00000003082b710] [c00000000097ce0c] __driver_attach+0xfc/0x230
[ 10.948714] [c00000003082b790] [c000000000978458] bus_for_each_dev+0xa8/0x130
[ 10.948718] [c00000003082b7f0] [c00000000097b2c4] driver_attach+0x34/0x50
[ 10.948722] [c00000003082b810] [c00000000097a508] bus_add_driver+0x1e8/0x350
[ 10.948729] [c00000003082b8a0] [c00000000097def8] driver_register+0x98/0x1a0
[ 10.948736] [c00000003082b910] [c0000000009804a8] __platform_driver_register+0x38/0x50
[ 10.948741] [c00000003082b930] [c008000001b92c10] papr_scm_init+0x3c/0x78 [papr_scm]
[ 10.948747] [c00000003082b960] [c000000000011ff0] do_one_initcall+0x60/0x2d0
[ 10.948753] [c00000003082ba30] [c00000000023627c] do_init_module+0x6c/0x2d0
[ 10.948760] [c00000003082bab0] [c000000000239650] load_module+0x1e90/0x2290
[ 10.948765] [c00000003082bc90] [c000000000239d9c] __do_sys_finit_module+0xdc/0x180
[ 10.948771] [c00000003082bdb0] [c0000000000335fc] system_call_exception+0x17c/0x350
[ 10.948777] [c00000003082be10] [c00000000000c53c] system_call_common+0xec/0x270
[ 10.948782] --- interrupt: c00 at 0x7fffa3f2f1d4
[ 10.948785] NIP: 00007fffa3f2f1d4 LR: 00007fffa456ea9c CTR: 0000000000000000
[ 10.948789] REGS: c00000003082be80 TRAP: 0c00 Tainted: G E (5.18.0-rc7-next-20220520)
[ 10.948793] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 28222204 XER: 00000000
[ 10.948805] IRQMASK: 0
[ 10.948805] GPR00: 0000000000000161 00007fffd70550b0 00007fffa4007300 0000000000000011
[ 10.948805] GPR04: 00007fffa457ad30 0000000000000000 0000000000000011 0000000000000000
[ 10.948805] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 10.948805] GPR12: 0000000000000000 00007fffa4656580 0000000000020000 00000001372bd8b0
[ 10.948805] GPR16: 0000000137300108 00000001372c5c68 0000000000000000 0000000000000000
[ 10.948805] GPR20: 0000000000000000 00000001372c5ca0 000000016f049240 00007fffd70552d0
[ 10.948805] GPR24: 0000000137300128 0000000000020000 0000000000000000 000000016f038a80
[ 10.948805] GPR28: 00007fffa457ad30 0000000000020000 0000000000000000 000000016f049240
[ 10.948849] NIP [00007fffa3f2f1d4] 0x7fffa3f2f1d4
[ 10.948851] LR [00007fffa456ea9c] 0x7fffa456ea9c
[ 10.948854] --- interrupt: c00
[ 10.948856] Instruction dump:
[ 10.948859] f8010010 f821ff51 e92d0c80 f9210088 39200000 41820118 3d405354 614a4154
[ 10.948869] 2fa50000 3d200100 391d000c 3bc00020 <7ca0452c> 913d0008 794a07c6 654a534d
[ 10.948878] ---[ end trace 0000000000000000 ]---
[ 10.951576]
[ 11.951579] Kernel panic - not syncing: Fatal exception
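As a cross-check of the register dump, the faulting word marked <7ca0452c>
in the instruction dump and the 391d000c word two slots before it can be
decoded by hand. The sketch below is only an illustration: the helper names
are made up, the register values are copied from the oops above, and the
mnemonics in the comments are my reading of the Power ISA opcode tables.
It suggests the store address is GPR29 + 12 = 0x10 + 0xc = 0x1c, which
matches DAR.

```python
# Sketch: decode two words from the "Instruction dump" in the oops above.
# Power ISA numbers bits big-endian: bit 0 is the most significant bit.

def bits(word, start, length):
    """Return `length` bits of a 32-bit word, starting at big-endian bit `start`."""
    return (word >> (32 - start - length)) & ((1 << length) - 1)

def decode_x_form(word):
    """X-form layout: opcode | RS | RA | RB | XO | Rc."""
    return {
        "opcode": bits(word, 0, 6),
        "rs": bits(word, 6, 5),
        "ra": bits(word, 11, 5),
        "rb": bits(word, 16, 5),
        "xo": bits(word, 21, 10),
    }

def decode_d_form(word):
    """D-form layout: opcode | RT | RA | SI (sign extension omitted here)."""
    return {
        "opcode": bits(word, 0, 6),
        "rt": bits(word, 6, 5),
        "ra": bits(word, 11, 5),
        "si": bits(word, 16, 16),
    }

# Values copied from the oops text.
FAULTING_WORD = 0x7CA0452C   # the word shown in <...>
EARLIER_WORD = 0x391D000C    # two words before the faulting one
GPR29 = 0x10                 # from the "GPR28:" row
GPR08 = 0x1C                 # from the "GPR08:" row
DAR = 0x1C                   # faulting data address

store = decode_x_form(FAULTING_WORD)
# opcode 31 / XO 662 reads as stwbrx RS,RA,RB; RA == 0 means the
# effective address is just the contents of RB.
add = decode_d_form(EARLIER_WORD)
# opcode 14 reads as addi RT,RA,SI.

# addi r8,r29,12 would compute the address that the store then uses.
computed_address = GPR29 + add["si"]

print(store)
print(add)
print(hex(computed_address))
```

If this reading is right, the store targets offset 12 from a base pointer
of 0x10 rather than from a valid kernel address.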
The following patch seems to be the cause of this regression.
commit 8b8fb1355917
powerpc/papr_scm: Fix leaking nvdimm_events_map elements
With this patch reverted, the kernel boots.
This crash is only seen when the following option is disabled in the
profile of the said LPAR:
Enable Performance Information Collection
- Sachin