early soft lockup in 6.15-rc2 on PowerNV

Ritesh Harjani (IBM) ritesh.list at gmail.com
Wed Apr 16 20:25:15 AEST 2025


Dan Horák <dan at danny.cz> writes:

> Hi,
>
> after updating from 6.14 to the Fedora-built 6.15-rc2 kernel, I am
> getting a soft lockup early in boot and an NVMe-related timeout/crash
> later (could they be related?). I am first checking whether this is a
> known issue, as I have not started bisecting yet.
>
> [    2.866399] Memory: 63016960K/67108864K available (25152K kernel code, 4416K rwdata, 24000K rodata, 9792K init, 1796K bss, 476160K reserved, 3356672K cma-reserved)
> [    2.874121] devtmpfs: initialized
> [   24.037685] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1]
> [   24.037690] CPU#0 Utilization every 4s during lockup:
> [   24.037692] 	#1: 101% system,	  0% softirq,	  0% hardirq,	  0% idle
> [   24.037697] 	#2: 100% system,	  0% softirq,	  0% hardirq,	  0% idle
> [   24.037701] 	#3: 100% system,	  0% softirq,	  0% hardirq,	  0% idle
> [   24.037704] 	#4: 101% system,	  0% softirq,	  0% hardirq,	  0% idle
> [   24.037707] 	#5: 100% system,	  0% softirq,	  0% hardirq,	  0% idle
> [   24.037711] Modules linked in:
> [   24.037716] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.15.0-0.rc2.22.fc43.ppc64le #1 VOLUNTARY 
> [   24.037722] Hardware name: T2P9D01 REV 1.00 POWER9 0x4e1202 opal:skiboot-bc106a0 PowerNV
> [   24.037725] NIP:  c00000000308a72c LR: c00000000308a7d0 CTR: c0000000018012c0
> [   24.037729] REGS: c000200006637a50 TRAP: 0900   Not tainted  (6.15.0-0.rc2.22.fc43.ppc64le)
> [   24.037733] MSR:  9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE>  CR: 48000828  XER: 00000000
> [   24.037750] CFAR: 0000000000000000 IRQMASK: 0 
> [   24.037750] GPR00: c00000000308a7d0 c000200006637cf0 c0000000025baa00 0000000000000040 
> [   24.037750] GPR04: c0002007ff390b00 0000000000010000 0000000000000000 c0002007ff3a0b00 
> [   24.037750] GPR08: 00000000002007ff 000000000012d092 0000000000000000 0000000000000000 
> [   24.037750] GPR12: 0000000000000000 c000000003fb0000 c000000000011320 0000000000000000 
> [   24.037750] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [   24.037750] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [   24.037750] GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [   24.037750] GPR28: 0000000000000000 c000000003f10be0 c0000000019efaf8 0000000000037940 
> [   24.037806] NIP [c00000000308a72c] memory_dev_init+0xb4/0x194
> [   24.037815] LR [c00000000308a7d0] memory_dev_init+0x158/0x194
> [   24.037820] Call Trace:
> [   24.037822] [c000200006637cf0] [c00000000308a7d0] memory_dev_init+0x158/0x194 (unreliable)
> [   24.037830] [c000200006637d70] [c000000003089bd0] driver_init+0x74/0xa0
> [   24.037836] [c000200006637d90] [c00000000300f628] kernel_init_freeable+0x204/0x288
> [   24.037843] [c000200006637df0] [c000000000011344] kernel_init+0x2c/0x1b8
> [   24.037849] [c000200006637e50] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c
> [   24.037855] --- interrupt: 0 at 0x0
> [   24.037858] Code: 7c651b78 40820010 3fa20195 3bbd61e0 48000080 3c62ff89 389e00c8 3863e510 4bf7a625 60000000 39290001 7c284840 <41800088> 792aaac2 7c2a2840 4080ffec 
> [   48.045039] watchdog: BUG: soft lockup - CPU#0 stuck for 44s! [swapper/0:1]
> [   48.045043] CPU#0 Utilization every 4s during lockup:
> [   48.045045] 	#1: 101% system,	  0% softirq,	  0% hardirq,	  0% idle
> [   48.045049] 	#2: 100% system,	  0% softirq,	  0% hardirq,	  0% idle
> [   48.045053] 	#3: 100% system,	  0% softirq,	  0% hardirq,	  0% idle
> [   48.045056] 	#4: 101% system,	  0% softirq,	  0% hardirq,	  0% idle
> [   48.045059] 	#5: 100% system,	  0% softirq,	  0% hardirq,	  0% idle
> [   48.045063] Modules linked in:
> [   48.045067] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G             L     ------  ---  6.15.0-0.rc2.22.fc43.ppc64le #1 VOLUNTARY 
> [   48.045073] Tainted: [L]=SOFTLOCKUP
> [   48.045075] Hardware name: T2P9D01 REV 1.00 POWER9 0x4e1202 opal:skiboot-bc106a0 PowerNV
> [   48.045077] NIP:  c00000000308a72c LR: c00000000308a7d0 CTR: c0000000018012c0
> [   48.045081] REGS: c000200006637a50 TRAP: 0900   Tainted: G             L     ------  ---   (6.15.0-0.rc2.22.fc43.ppc64le)
> [   48.045085] MSR:  9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE>  CR: 48000828  XER: 00000000
> [   48.045100] CFAR: 0000000000000000 IRQMASK: 0 
> [   48.045100] GPR00: c00000000308a7d0 c000200006637cf0 c0000000025baa00 0000000000000040 
> [   48.045100] GPR04: c0002007ff390b00 0000000000010000 0000000000000000 c0002007ff3a0b00 
> [   48.045100] GPR08: 00000000002007ff 00000000000a65fd 0000000000000000 0000000000000000 
> [   48.045100] GPR12: 0000000000000000 c000000003fb0000 c000000000011320 0000000000000000 
> [   48.045100] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [   48.045100] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [   48.045100] GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [   48.045100] GPR28: 0000000000000000 c000000003f10be0 c0000000019efaf8 000000000007f880 
> [   48.045155] NIP [c00000000308a72c] memory_dev_init+0xb4/0x194
> [   48.045161] LR [c00000000308a7d0] memory_dev_init+0x158/0x194
> [   48.045166] Call Trace:
> [   48.045167] [c000200006637cf0] [c00000000308a7d0] memory_dev_init+0x158/0x194 (unreliable)
> [   48.045175] [c000200006637d70] [c000000003089bd0] driver_init+0x74/0xa0
> [   48.045181] [c000200006637d90] [c00000000300f628] kernel_init_freeable+0x204/0x288
> [   48.045187] [c000200006637df0] [c000000000011344] kernel_init+0x2c/0x1b8
> [   48.045193] [c000200006637e50] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c
> [   48.045199] --- interrupt: 0 at 0x0

The above looks similar to
https://lore.kernel.org/all/20250410125110.1232329-1-gshan@redhat.com/

Maybe you can give this patch a try for the above soft lockup.
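
For comparing a saved boot log against another report, a rough sketch along
these lines can pull out the CPU, the stuck duration, and the symbol the NIP
points at for each soft lockup. The helper name summarize_lockups and both
regexes are assumptions, written only against the log format quoted above;
this is not an existing kernel tool.

```python
import re

# Watchdog header line, e.g.
# "[   24.037685] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1]"
LOCKUP_RE = re.compile(r"soft lockup - CPU#(\d+) stuck for (\d+)s! \[([^\]]+)\]")

# Symbolized NIP line, e.g.
# "[   24.037806] NIP [c00000000308a72c] memory_dev_init+0xb4/0x194"
# (the raw "NIP:  c000... LR: ..." register line does not match this pattern)
NIP_RE = re.compile(r"NIP \[[0-9a-f]+\] (\S+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_lockups(log_text):
    """Return one dict per soft lockup report found in log_text."""
    reports = []
    for line in log_text.splitlines():
        m = LOCKUP_RE.search(line)
        if m:
            # Start a new report; the NIP symbol is filled in when seen.
            reports.append({
                "cpu": int(m.group(1)),
                "stuck_s": int(m.group(2)),
                "task": m.group(3),
                "nip": None,
            })
            continue
        m = NIP_RE.search(line)
        if m and reports and reports[-1]["nip"] is None:
            # Attach the first symbolized NIP to the most recent report.
            reports[-1]["nip"] = m.group(1)
    return reports
```

Feeding it the text of a saved dmesg (for example the contents of a file
read with open(...).read()) returns a list of dicts, one per watchdog
report, so it is easy to see whether every report is stuck in the same
function.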

-ritesh


