early soft lockup in 6.15-rc2 on PowerNV
Dan Horák
dan at danny.cz
Thu Apr 17 00:05:17 AEST 2025
Hi Ritesh,
On Wed, 16 Apr 2025 15:55:15 +0530
Ritesh Harjani (IBM) <ritesh.list at gmail.com> wrote:
> Dan Horák <dan at danny.cz> writes:
>
> > Hi,
> >
> > after updating to Fedora built 6.15-rc2 kernel from 6.14 I am getting a
> > soft lockup early in the boot and NVME related timeout/crash later
> > (could it be related?). I am first checking if this is a known issue
> > as I have not started bisecting yet.
> >
> > [ 2.866399] Memory: 63016960K/67108864K available (25152K kernel code, 4416K rwdata, 24000K rodata, 9792K init, 1796K bss, 476160K reserved, 3356672K cma-reserved)
> > [ 2.874121] devtmpfs: initialized
> > [ 24.037685] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1]
> > [ 24.037690] CPU#0 Utilization every 4s during lockup:
> > [ 24.037692] #1: 101% system, 0% softirq, 0% hardirq, 0% idle
> > [ 24.037697] #2: 100% system, 0% softirq, 0% hardirq, 0% idle
> > [ 24.037701] #3: 100% system, 0% softirq, 0% hardirq, 0% idle
> > [ 24.037704] #4: 101% system, 0% softirq, 0% hardirq, 0% idle
> > [ 24.037707] #5: 100% system, 0% softirq, 0% hardirq, 0% idle
> > [ 24.037711] Modules linked in:
> > [ 24.037716] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.15.0-0.rc2.22.fc43.ppc64le #1 VOLUNTARY
> > [ 24.037722] Hardware name: T2P9D01 REV 1.00 POWER9 0x4e1202 opal:skiboot-bc106a0 PowerNV
> > [ 24.037725] NIP: c00000000308a72c LR: c00000000308a7d0 CTR: c0000000018012c0
> > [ 24.037729] REGS: c000200006637a50 TRAP: 0900 Not tainted (6.15.0-0.rc2.22.fc43.ppc64le)
> > [ 24.037733] MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 48000828 XER: 00000000
> > [ 24.037750] CFAR: 0000000000000000 IRQMASK: 0
> > [ 24.037750] GPR00: c00000000308a7d0 c000200006637cf0 c0000000025baa00 0000000000000040
> > [ 24.037750] GPR04: c0002007ff390b00 0000000000010000 0000000000000000 c0002007ff3a0b00
> > [ 24.037750] GPR08: 00000000002007ff 000000000012d092 0000000000000000 0000000000000000
> > [ 24.037750] GPR12: 0000000000000000 c000000003fb0000 c000000000011320 0000000000000000
> > [ 24.037750] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > [ 24.037750] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > [ 24.037750] GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > [ 24.037750] GPR28: 0000000000000000 c000000003f10be0 c0000000019efaf8 0000000000037940
> > [ 24.037806] NIP [c00000000308a72c] memory_dev_init+0xb4/0x194
> > [ 24.037815] LR [c00000000308a7d0] memory_dev_init+0x158/0x194
> > [ 24.037820] Call Trace:
> > [ 24.037822] [c000200006637cf0] [c00000000308a7d0] memory_dev_init+0x158/0x194 (unreliable)
> > [ 24.037830] [c000200006637d70] [c000000003089bd0] driver_init+0x74/0xa0
> > [ 24.037836] [c000200006637d90] [c00000000300f628] kernel_init_freeable+0x204/0x288
> > [ 24.037843] [c000200006637df0] [c000000000011344] kernel_init+0x2c/0x1b8
> > [ 24.037849] [c000200006637e50] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c
> > [ 24.037855] --- interrupt: 0 at 0x0
> > [ 24.037858] Code: 7c651b78 40820010 3fa20195 3bbd61e0 48000080 3c62ff89 389e00c8 3863e510 4bf7a625 60000000 39290001 7c284840 <41800088> 792aaac2 7c2a2840 4080ffec
> > [ 48.045039] watchdog: BUG: soft lockup - CPU#0 stuck for 44s! [swapper/0:1]
> > [ 48.045043] CPU#0 Utilization every 4s during lockup:
> > [ 48.045045] #1: 101% system, 0% softirq, 0% hardirq, 0% idle
> > [ 48.045049] #2: 100% system, 0% softirq, 0% hardirq, 0% idle
> > [ 48.045053] #3: 100% system, 0% softirq, 0% hardirq, 0% idle
> > [ 48.045056] #4: 101% system, 0% softirq, 0% hardirq, 0% idle
> > [ 48.045059] #5: 100% system, 0% softirq, 0% hardirq, 0% idle
> > [ 48.045063] Modules linked in:
> > [ 48.045067] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G L ------ --- 6.15.0-0.rc2.22.fc43.ppc64le #1 VOLUNTARY
> > [ 48.045073] Tainted: [L]=SOFTLOCKUP
> > [ 48.045075] Hardware name: T2P9D01 REV 1.00 POWER9 0x4e1202 opal:skiboot-bc106a0 PowerNV
> > [ 48.045077] NIP: c00000000308a72c LR: c00000000308a7d0 CTR: c0000000018012c0
> > [ 48.045081] REGS: c000200006637a50 TRAP: 0900 Tainted: G L ------ --- (6.15.0-0.rc2.22.fc43.ppc64le)
> > [ 48.045085] MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 48000828 XER: 00000000
> > [ 48.045100] CFAR: 0000000000000000 IRQMASK: 0
> > [ 48.045100] GPR00: c00000000308a7d0 c000200006637cf0 c0000000025baa00 0000000000000040
> > [ 48.045100] GPR04: c0002007ff390b00 0000000000010000 0000000000000000 c0002007ff3a0b00
> > [ 48.045100] GPR08: 00000000002007ff 00000000000a65fd 0000000000000000 0000000000000000
> > [ 48.045100] GPR12: 0000000000000000 c000000003fb0000 c000000000011320 0000000000000000
> > [ 48.045100] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > [ 48.045100] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > [ 48.045100] GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > [ 48.045100] GPR28: 0000000000000000 c000000003f10be0 c0000000019efaf8 000000000007f880
> > [ 48.045155] NIP [c00000000308a72c] memory_dev_init+0xb4/0x194
> > [ 48.045161] LR [c00000000308a7d0] memory_dev_init+0x158/0x194
> > [ 48.045166] Call Trace:
> > [ 48.045167] [c000200006637cf0] [c00000000308a7d0] memory_dev_init+0x158/0x194 (unreliable)
> > [ 48.045175] [c000200006637d70] [c000000003089bd0] driver_init+0x74/0xa0
> > [ 48.045181] [c000200006637d90] [c00000000300f628] kernel_init_freeable+0x204/0x288
> > [ 48.045187] [c000200006637df0] [c000000000011344] kernel_init+0x2c/0x1b8
> > [ 48.045193] [c000200006637e50] [c00000000000debc] ret_from_kernel_user_thread+0x14/0x1c
> > [ 48.045199] --- interrupt: 0 at 0x0
>
> The above looks similar to
> https://lore.kernel.org/all/20250410125110.1232329-1-gshan@redhat.com/
>
> Maybe you can give this patch a try for above softlockup.
Yes, that was it: the mentioned patch fixes the soft lockup, so feel free
to add
Tested-by: Dan Horák <dan at danny.cz>
The NVMe issue seems to be unrelated; I will keep looking ...
Dan