[ BUG: Invalid wait context ], BUG: scheduling while atomic: swapper/0/1/0x00000002 on kernel 6.2.0-rc4

Erhard F. erhard_f at mailbox.org
Tue Jan 17 11:37:26 AEDT 2023


Getting this at boot on my Talos II POWER9 box:

[...]
=============================
[ BUG: Invalid wait context ]
6.2.0-rc4-P9 #1 Tainted: G                T 
-----------------------------
swapper/0/1 is trying to lock:
c0000000021b57c8 (cpuhp_state_mutex){+.+.}-{3:3}, at: __cpuhp_setup_state_cpuslocked+0xb0/0x5f0
other info that might help us debug this:
context-{4:4}
3 locks held by swapper/0/1:
 #0: c00000000dd738f8 (&dev->mutex){....}-{3:3}, at: __driver_attach+0x124/0x330
 #1: c00000000218ef58 (nest_init_lock){+.+.}-{2:2}, at: init_imc_pmu+0x1104/0x1790
 #2: c0000000021b58f0 (cpu_hotplug_lock){++++}-{0:0}, at: init_imc_pmu+0x137c/0x1790
stack backtrace:
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G                T  6.2.0-rc4-P9 #1
Hardware name: T2P9D01 REV 1.01 POWER9 0x4e1202 opal:skiboot-bc106a0 PowerNV
Call Trace:
[c000000006c67050] [c0000000012a3c80] dump_stack_lvl+0xb4/0x124 (unreliable)
[c000000006c67090] [c0000000001e233c] __lock_acquire+0x351c/0x3550
[c000000006c67200] [c0000000001e33f0] lock_acquire+0x1a0/0x5a0
[c000000006c67300] [c0000000012fcd08] __mutex_lock+0xe8/0x610
[c000000006c673f0] [c0000000000f3680] __cpuhp_setup_state_cpuslocked+0xb0/0x5f0
[c000000006c674b0] [c0000000000f4ef8] __cpuhp_setup_state+0x168/0x3f0
[c000000006c67530] [c0000000000d3e5c] init_imc_pmu+0x137c/0x1790
[c000000006c67690] [c0000000000c4764] opal_imc_counters_probe+0x3a4/0x7e0
[c000000006c677e0] [c000000000dd09c0] platform_probe+0xa0/0x150
[c000000006c67860] [c000000000dcaf20] really_probe+0x170/0x590
[c000000006c67900] [c000000000dcb448] __driver_probe_device+0x108/0x1d0
[c000000006c67940] [c000000000dcb594] driver_probe_device+0x84/0x1a0
[c000000006c67990] [c000000000dcb9c4] __driver_attach+0x134/0x330
[c000000006c679e0] [c000000000dc5e6c] bus_for_each_dev+0xdc/0x150
[c000000006c67a30] [c000000000dc9fc0] driver_attach+0x40/0x70
[c000000006c67a60] [c000000000dc92b8] bus_add_driver+0x338/0x420
[c000000006c67b10] [c000000000dcd8d4] driver_register+0x154/0x310
[c000000006c67ba0] [c000000000dd0144] __platform_driver_register+0x54/0x80
[c000000006c67bd0] [c00000000203183c] opal_imc_driver_init+0x60/0x90
[c000000006c67c00] [c000000000011ee8] do_one_initcall+0xc8/0x630
[c000000006c67cf0] [c0000000020036cc] kernel_init_freeable+0x72c/0x864
[c000000006c67de0] [c000000000012b98] kernel_init+0x28/0x1d0
[c000000006c67e50] [c00000000000ce5c] ret_from_kernel_thread+0x5c/0x64
--- interrupt: 0 at 0x0
NIP:  0000000000000000 LR: 0000000000000000 CTR: 0000000000000000
REGS: c000000006c67e80 TRAP: 0000   Tainted: G                T   (6.2.0-rc4-P9)
MSR:  0000000000000000 <>  CR: 00000000  XER: 00000000
CFAR: 0000000000000000 IRQMASK: 0 
GPR00: 0000000000000000 c000000006c68000 0000000000000000 0000000000000000 
GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR12: 0000000000000000 0000000000000000 c000000000012b78 0000000000000000 
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR28: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
NIP [0000000000000000] 0x0
LR [0000000000000000] 0x0
--- interrupt: 0
BUG: scheduling while atomic: swapper/0/1/0x00000002
INFO: lockdep is turned off.
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G                T  6.2.0-rc4-P9 #1
Hardware name: T2P9D01 REV 1.01 POWER9 0x4e1202 opal:skiboot-bc106a0 PowerNV
Call Trace:
[c000000006c66f90] [c0000000012a3c80] dump_stack_lvl+0xb4/0x124 (unreliable)
[c000000006c66fd0] [c000000000163c68] __schedule_bug+0xf8/0x120
[c000000006c67050] [c0000000012f8c28] __schedule+0x1218/0x14d0
[c000000006c67150] [c0000000012f8fa4] schedule+0xc4/0x200
[c000000006c671d0] [c0000000013052e4] schedule_timeout+0x174/0x1f0
[c000000006c672b0] [c0000000012fa20c] __wait_for_common+0x15c/0x310
[c000000006c67360] [c0000000000f2ce8] cpuhp_issue_call+0x398/0x570
[c000000006c673f0] [c0000000000f3778] __cpuhp_setup_state_cpuslocked+0x1a8/0x5f0
[c000000006c674b0] [c0000000000f4ef8] __cpuhp_setup_state+0x168/0x3f0
[c000000006c67530] [c0000000000d3e5c] init_imc_pmu+0x137c/0x1790
[c000000006c67690] [c0000000000c4764] opal_imc_counters_probe+0x3a4/0x7e0
[c000000006c677e0] [c000000000dd09c0] platform_probe+0xa0/0x150
[c000000006c67860] [c000000000dcaf20] really_probe+0x170/0x590
[c000000006c67900] [c000000000dcb448] __driver_probe_device+0x108/0x1d0
[c000000006c67940] [c000000000dcb594] driver_probe_device+0x84/0x1a0
[c000000006c67990] [c000000000dcb9c4] __driver_attach+0x134/0x330
[c000000006c679e0] [c000000000dc5e6c] bus_for_each_dev+0xdc/0x150
[c000000006c67a30] [c000000000dc9fc0] driver_attach+0x40/0x70
[c000000006c67a60] [c000000000dc92b8] bus_add_driver+0x338/0x420
[c000000006c67b10] [c000000000dcd8d4] driver_register+0x154/0x310
[c000000006c67ba0] [c000000000dd0144] __platform_driver_register+0x54/0x80
[c000000006c67bd0] [c00000000203183c] opal_imc_driver_init+0x60/0x90
[c000000006c67c00] [c000000000011ee8] do_one_initcall+0xc8/0x630
[c000000006c67cf0] [c0000000020036cc] kernel_init_freeable+0x72c/0x864
[c000000006c67de0] [c000000000012b98] kernel_init+0x28/0x1d0
[c000000006c67e50] [c00000000000ce5c] ret_from_kernel_thread+0x5c/0x64
--- interrupt: 0 at 0x0
NIP:  0000000000000000 LR: 0000000000000000 CTR: 0000000000000000
REGS: c000000006c67e80 TRAP: 0000   Tainted: G                T   (6.2.0-rc4-P9)
MSR:  0000000000000000 <>  CR: 00000000  XER: 00000000
CFAR: 0000000000000000 IRQMASK: 0 
GPR00: 0000000000000000 c000000006c68000 0000000000000000 0000000000000000 
GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR12: 0000000000000000 0000000000000000 c000000000012b78 0000000000000000 
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR28: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
NIP [0000000000000000] 0x0
LR [0000000000000000] 0x0
--- interrupt: 0
BUG: scheduling while atomic: swapper/0/1/0x00000000
INFO: lockdep is turned off.
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W       T  6.2.0-rc4-P9 #1
Hardware name: T2P9D01 REV 1.01 POWER9 0x4e1202 opal:skiboot-bc106a0 PowerNV
Call Trace:
[c000000006c66f90] [c0000000012a3c80] dump_stack_lvl+0xb4/0x124 (unreliable)
[c000000006c66fd0] [c000000000163c68] __schedule_bug+0xf8/0x120
[c000000006c67050] [c0000000012f8c28] __schedule+0x1218/0x14d0
[c000000006c67150] [c0000000012f8fa4] schedule+0xc4/0x200
[c000000006c671d0] [c0000000013052e4] schedule_timeout+0x174/0x1f0
[c000000006c672b0] [c0000000012fa20c] __wait_for_common+0x15c/0x310
[c000000006c67360] [c0000000000f2ce8] cpuhp_issue_call+0x398/0x570
[c000000006c673f0] [c0000000000f3778] __cpuhp_setup_state_cpuslocked+0x1a8/0x5f0
[c000000006c674b0] [c0000000000f4ef8] __cpuhp_setup_state+0x168/0x3f0
[c000000006c67530] [c0000000000d2cd8] init_imc_pmu+0x1f8/0x1790
[c000000006c67690] [c0000000000c4764] opal_imc_counters_probe+0x3a4/0x7e0
[c000000006c677e0] [c000000000dd09c0] platform_probe+0xa0/0x150
[c000000006c67860] [c000000000dcaf20] really_probe+0x170/0x590
[c000000006c67900] [c000000000dcb448] __driver_probe_device+0x108/0x1d0
[c000000006c67940] [c000000000dcb594] driver_probe_device+0x84/0x1a0
[c000000006c67990] [c000000000dcb9c4] __driver_attach+0x134/0x330
[c000000006c679e0] [c000000000dc5e6c] bus_for_each_dev+0xdc/0x150
[c000000006c67a30] [c000000000dc9fc0] driver_attach+0x40/0x70
[c000000006c67a60] [c000000000dc92b8] bus_add_driver+0x338/0x420
[c000000006c67b10] [c000000000dcd8d4] driver_register+0x154/0x310
[c000000006c67ba0] [c000000000dd0144] __platform_driver_register+0x54/0x80
[c000000006c67bd0] [c00000000203183c] opal_imc_driver_init+0x60/0x90
[c000000006c67c00] [c000000000011ee8] do_one_initcall+0xc8/0x630
[c000000006c67cf0] [c0000000020036cc] kernel_init_freeable+0x72c/0x864
[c000000006c67de0] [c000000000012b98] kernel_init+0x28/0x1d0
[c000000006c67e50] [c00000000000ce5c] ret_from_kernel_thread+0x5c/0x64
--- interrupt: 0 at 0x0
NIP:  0000000000000000 LR: 0000000000000000 CTR: 0000000000000000
REGS: c000000006c67e80 TRAP: 0000   Tainted: G        W       T   (6.2.0-rc4-P9)
MSR:  0000000000000000 <>  CR: 00000000  XER: 00000000
CFAR: 0000000000000000 IRQMASK: 0 
GPR00: 0000000000000000 c000000006c68000 0000000000000000 0000000000000000 
GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR12: 0000000000000000 0000000000000000 c000000000012b78 0000000000000000 
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR28: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
NIP [0000000000000000] 0x0
LR [0000000000000000] 0x0
--- interrupt: 0
[...]

Some data about the machine:
 # inxi -bZ
System:
  Host: T1000 Kernel: 6.2.0-rc4-P9 arch: ppc64 bits: 64 Console: pty pts/0
    Distro: Gentoo Base System release 2.9
Machine:
  Type: PPC System: T2P9D01 REV 1.01 details: N/A
CPU:
  Info: 2x 4-core POWER9 altivec supported [MT MCP SMP] speed (MHz):
    avg: 2581 min/max: 2154/3800
Graphics:
  Device-1: ASPEED Graphics Family driver: N/A
  Device-2: AMD R480 [Radeon X800 GTO] driver: radeon v: kernel
  Device-3: N/A driver: N/A
  Display: x11 server: X.Org v: 21.1.1 driver: X: loaded: radeon
    gpu: radeon resolution: 1440x900~60Hz
  OpenGL: renderer: llvmpipe (LLVM 15.0.6 128 bits) v: 4.5 Mesa 22.3.3
Network:
  Device-1: Broadcom NetXtreme BCM5719 Gigabit Ethernet PCIe driver: tg3
  Device-2: Broadcom NetXtreme BCM5719 Gigabit Ethernet PCIe driver: tg3
Drives:
  Local Storage: total: 447.13 GiB used: 11.63 GiB (2.6%)
Info:
  Processes: 399 Uptime: 1m Memory: 54.7 GiB used: 3.05 GiB (5.6%)
  Shell: Bash inxi: 3.3.17


The issue is reproducible; I get it on every boot. Kernel .config and full dmesg are attached.

Regards,
Erhard
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmesg_62-rc4_p9
Type: application/octet-stream
Size: 60067 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20230117/e3bb50a6/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: config_62-rc4_p9
Type: application/octet-stream
Size: 117755 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20230117/e3bb50a6/attachment-0003.obj>

