[powerpc] linux-next 20220520 boot failure (drc_pmem_query_stats)

Sachin Sant sachinp at linux.ibm.com
Mon May 23 17:03:31 AEST 2022


While booting linux-next (5.18.0-rc7-next-20220520) on a Power10 LPAR
configured with pmem, the following oops is seen. The LPAR fails to boot
to the login prompt.

[   10.948211] papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Permission denied while accessing performance stats
[   10.948536] Kernel attempted to write user page (1c) - exploit attempt? (uid: 0)
[   10.948539] BUG: Kernel NULL pointer dereference on write at 0x0000001c
[   10.948540] Faulting instruction address: 0xc008000001b90844
[   10.948542] Oops: Kernel access of bad area, sig: 11 [#1]
[   10.948563] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[   10.948568] Modules linked in: papr_scm(E+) libnvdimm(E) vmx_crypto(E) ext4(E) mbcache(E) jbd2(E) sd_mod(E) t10_pi(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) fuse(E)
[   10.948587] CPU: 25 PID: 796 Comm: systemd-udevd Tainted: G            E     5.18.0-rc7-next-20220520 #2
[   10.948592] NIP:  c008000001b90844 LR: c008000001b92794 CTR: c008000001b907f8
[   10.948595] REGS: c00000003082b110 TRAP: 0300   Tainted: G            E      (5.18.0-rc7-next-20220520)
[   10.948600] MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 44222822  XER: 00000001
[   10.948613] CFAR: c00000000007c744 DAR: 000000000000001c DSISR: 42000000 IRQMASK: 0 
[   10.948613] GPR00: c008000001b92794 c00000003082b3b0 c008000001bc8000 c00000000941bc00 
[   10.948613] GPR04: 0000000000000010 0000000000000000 c000000016001800 c00000003082b420 
[   10.948613] GPR08: 000000000000001c 0000000001000000 0000000053544154 c008000001b92c98 
[   10.948613] GPR12: c008000001b907f8 c000000abfd02b00 c00000003082bd00 00000001372bd8b0 
[   10.948613] GPR16: 000000000000ff20 c0080000008911b8 c008000000890000 00000000000011d0 
[   10.948613] GPR20: 0000000000000001 c00000003082bbc0 c008000001bc0a88 0000000000000000 
[   10.948613] GPR24: 0000000000000000 c000000002950e30 00000000ffffffff 0000000000000010 
[   10.948613] GPR28: c00000000941bc00 0000000000000010 0000000000000020 c00000000941bc00 
[   10.948660] NIP [c008000001b90844] drc_pmem_query_stats+0x5c/0x270 [papr_scm]
[   10.948667] LR [c008000001b92794] papr_scm_probe+0x2ac/0x6ec [papr_scm]
[   10.948673] Call Trace:
[   10.948675] [c00000003082b3b0] [c00000000941bca0] 0xc00000000941bca0 (unreliable)
[   10.948680] [c00000003082b460] [c008000001b92794] papr_scm_probe+0x2ac/0x6ec [papr_scm]
[   10.948687] [c00000003082b550] [c0000000009809b8] platform_probe+0x98/0x150
[   10.948694] [c00000003082b5d0] [c00000000097bf2c] really_probe+0xfc/0x510
[   10.948699] [c00000003082b650] [c00000000097c4bc] __driver_probe_device+0x17c/0x230
[   10.948704] [c00000003082b6d0] [c00000000097c5c8] driver_probe_device+0x58/0x120
[   10.948709] [c00000003082b710] [c00000000097ce0c] __driver_attach+0xfc/0x230
[   10.948714] [c00000003082b790] [c000000000978458] bus_for_each_dev+0xa8/0x130
[   10.948718] [c00000003082b7f0] [c00000000097b2c4] driver_attach+0x34/0x50
[   10.948722] [c00000003082b810] [c00000000097a508] bus_add_driver+0x1e8/0x350
[   10.948729] [c00000003082b8a0] [c00000000097def8] driver_register+0x98/0x1a0
[   10.948736] [c00000003082b910] [c0000000009804a8] __platform_driver_register+0x38/0x50
[   10.948741] [c00000003082b930] [c008000001b92c10] papr_scm_init+0x3c/0x78 [papr_scm]
[   10.948747] [c00000003082b960] [c000000000011ff0] do_one_initcall+0x60/0x2d0
[   10.948753] [c00000003082ba30] [c00000000023627c] do_init_module+0x6c/0x2d0
[   10.948760] [c00000003082bab0] [c000000000239650] load_module+0x1e90/0x2290
[   10.948765] [c00000003082bc90] [c000000000239d9c] __do_sys_finit_module+0xdc/0x180
[   10.948771] [c00000003082bdb0] [c0000000000335fc] system_call_exception+0x17c/0x350
[   10.948777] [c00000003082be10] [c00000000000c53c] system_call_common+0xec/0x270
[   10.948782] --- interrupt: c00 at 0x7fffa3f2f1d4
[   10.948785] NIP:  00007fffa3f2f1d4 LR: 00007fffa456ea9c CTR: 0000000000000000
[   10.948789] REGS: c00000003082be80 TRAP: 0c00   Tainted: G            E      (5.18.0-rc7-next-20220520)
[   10.948793] MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 28222204  XER: 00000000
[   10.948805] IRQMASK: 0 
[   10.948805] GPR00: 0000000000000161 00007fffd70550b0 00007fffa4007300 0000000000000011 
[   10.948805] GPR04: 00007fffa457ad30 0000000000000000 0000000000000011 0000000000000000 
[   10.948805] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[   10.948805] GPR12: 0000000000000000 00007fffa4656580 0000000000020000 00000001372bd8b0 
[   10.948805] GPR16: 0000000137300108 00000001372c5c68 0000000000000000 0000000000000000 
[   10.948805] GPR20: 0000000000000000 00000001372c5ca0 000000016f049240 00007fffd70552d0 
[   10.948805] GPR24: 0000000137300128 0000000000020000 0000000000000000 000000016f038a80 
[   10.948805] GPR28: 00007fffa457ad30 0000000000020000 0000000000000000 000000016f049240 
[   10.948849] NIP [00007fffa3f2f1d4] 0x7fffa3f2f1d4
[   10.948851] LR [00007fffa456ea9c] 0x7fffa456ea9c
[   10.948854] --- interrupt: c00
[   10.948856] Instruction dump:
[   10.948859] f8010010 f821ff51 e92d0c80 f9210088 39200000 41820118 3d405354 614a4154 
[   10.948869] 2fa50000 3d200100 391d000c 3bc00020 <7ca0452c> 913d0008 794a07c6 654a534d 
[   10.948878] ---[ end trace 0000000000000000 ]---
[   10.951576] 
[   11.951579] Kernel panic - not syncing: Fatal exception
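
As an aid to reading the dump above, here is a minimal sketch that decodes
two words from the "Instruction dump" and checks them against the register
values. It assumes the standard Power ISA encodings (primary opcode 14 for
addi, primary opcode 31 with extended opcode 662 for stwbrx). The helper
name `fields` and the `gpr` table are illustrative only; the values are
copied by hand from the log. It shows only where the faulting address comes
from, not why the base register holds 0x10.

```python
def fields(word):
    """Split a 32-bit Power ISA instruction word into its common fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # primary opcode
        "rt": (word >> 21) & 0x1F,      # target (or source, for stores)
        "ra": (word >> 16) & 0x1F,
        "rb": (word >> 11) & 0x1F,
        "xo": (word >> 1) & 0x3FF,      # extended opcode (X-form)
        "si": word & 0xFFFF,            # immediate (D-form)
    }

# Register values copied from the GPR dump in the oops.
gpr = {5: 0x0, 8: 0x1C, 29: 0x10}
dar = 0x1C  # DAR from the oops

addi = fields(0x391D000C)   # third word on the last dump line
fault = fields(0x7CA0452C)  # the word marked <...> in the dump

# 391d000c decodes as: addi r8, r29, 12
computed_r8 = gpr[addi["ra"]] + addi["si"]

# 7ca0452c decodes as: stwbrx r5, 0, r8 (RA = 0 means a base of zero)
base = 0 if fault["ra"] == 0 else gpr[fault["ra"]]
effective_address = base + gpr[fault["rb"]]

print(f"addi   r{addi['rt']}, r{addi['ra']}, {addi['si']}")
print(f"stwbrx r{fault['rt']}, {fault['ra']}, r{fault['rb']}")
print(f"store address = {effective_address:#x}, DAR = {dar:#x}")
```

Under these assumptions the faulting store is a 32-bit byte-reversed write
to r29 + 12, with r29 = 0x10, which matches the reported DAR of 0x1c.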

The following patch seems to be the cause of this regression:
commit 8b8fb1355917
    powerpc/papr_scm: Fix leaking nvdimm_events_map elements

Reverting this patch allows the kernel to boot.

This crash is seen only when the following option is disabled in the
profile of the said LPAR:
    Enable Performance Information Collection

- Sachin

