ppc: hard lockup / hang in v5.17-rc1 under QEMU

Miguel Ojeda miguel.ojeda.sandonis at gmail.com
Thu Jan 27 01:16:06 AEDT 2022


Hi PPC folks,

Our ppc64le CI has been deterministically hitting a hard lockup / hang
under QEMU since we upgraded from v5.16 to v5.17-rc1.

Bisecting points to 0faf20a1ad16 ("powerpc/64s/interrupt: Don't enable
MSR[EE] in irq handlers unless perf is in use").
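
For anyone wanting to repeat the bisection, a run like this can be
automated with `git bisect run`, which treats exit status 0 as "good",
1-127 (except 125) as "bad", and 125 as "skip". The sketch below is
illustrative only and is not the CI's actual script: the function names
are hypothetical, and it assumes the QEMU console output has already
been captured as text. The patterns are taken from the watchdog
messages in the log further down.

```python
import re

# Messages that indicate the failure, copied from the log in this report.
LOCKUP_PATTERNS = (
    re.compile(r"watchdog: CPU \d+ Hard LOCKUP"),
    re.compile(r"watchdog: CPU \d+ detected hard LOCKUP on other CPUs"),
    re.compile(r"Kernel panic - not syncing: Hard LOCKUP"),
)


def has_hard_lockup(console_log: str) -> bool:
    """Return True if any hard-lockup message appears in the console log."""
    return any(p.search(console_log) for p in LOCKUP_PATTERNS)


def bisect_exit_code(console_log: str) -> int:
    """Map a console log to a `git bisect run` status: 1 = bad, 0 = good."""
    return 1 if has_hard_lockup(console_log) else 0
```

A wrapper that builds the kernel, boots it under QEMU with a timeout,
and exits with `bisect_exit_code(...)` on the captured output would be
one way to drive `git bisect run`. A boot that simply hangs without
printing anything would need the timeout to be treated as "bad" too.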

Cheers,
Miguel

[   16.328310] watchdog: CPU 1 detected hard LOCKUP on other CPUs 0
[   16.328955] watchdog: CPU 1 TB:16743325700, last SMP heartbeat
TB:8453096925 (16191ms ago)
[   16.330786] watchdog: CPU 0 Hard LOCKUP
[   16.331078] watchdog: CPU 0 TB:16744720354, last heartbeat
TB:8453109168 (16194ms ago)
[   16.331295] Kernel panic - not syncing: Hard LOCKUP
[   16.331312] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc2+ #28
[   16.331729] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.16.0-rc2+ #28
[   16.332294] NIP:  c000000000009784 LR: c00000000034ee60 CTR: c0000000000096e0
[   16.332339] REGS: c00000001ff87d60 TRAP: 0100   Not tainted  (5.16.0-rc2+)
[   16.332410] MSR:  8000000000001031 <SF
[   16.332520] Call Trace:
[   16.334429] ,ME,IR,DR,LE>  CR: 24000088  XER: 20040000
[   16.334770] CFAR: c00000000000977c IRQMASK: 3
[   16.334770] GPR00: c00000000034ee60 c00000000111f590
c00000000111f400 0000000000008002
[   16.334770] GPR04: 0000000000000001 c00000000113f400
c00000000106f400 c00000000110f400
[   16.334770] GPR08: c0000000003af400 0000000024000088
c00000000111f8a0 c000000000011d00
[   16.334770] GPR12: 8000000000009033 c0000000011d0000
0000000000000000 0000000000000000
[   16.334770] GPR16: c00000000110ab90 c0000000010722f0
c000000001142100 0000000000000001
[   16.334770] GPR20: 0000000000000000 000000000000000a
ffffffff00049233 c0000000010722a8
[   16.334770] GPR24: 0000000000000282 c000000001076580
c000000001143a00 0000000004200002
[   16.334770] GPR28: c0000000010b0480 c0000000010722a8
c0000000003b6cfa 00000000015e0000
[   16.334464] [c0000000062675c0] [c0000000003283e0] dump_stack_lvl+0x78/0xb8
[   16.335274] NIP [c000000000009784] decrementer_common_virt+0xa4/0x210
[   16.336451]  (unreliable)
[   16.335629] LR [c00000000034ee60] __do_softirq+0xe0/0x2c4
[   16.336658] [c000000006267600] [c00000000009378c] panic+0x150/0x3a4
[   16.336797] Call Trace:
[   16.337294]
[   16.336809] [c00000000111f590] [c00000000000c6d8]
interrupt_return_srr_kernel+0x8/0xec (unreliable)
[   16.337615] [c00000000111f8a0] [c0000000000d2b24]
trigger_load_balance+0x94/0x480
[   16.337863] [c00000000111f8d0] [c00000000034ee60] __do_softirq+0xe0/0x2c4
[   16.338079] [c00000000111f9c0] [c00000000009b018] irq_exit+0xa8/0x130
[   16.338225] [c00000000111fa00] [c00000000001b590] timer_interrupt+0x1b0/0x200
[   16.338496] [c00000000111fa60] [c0000000000098e8]
decrementer_common_virt+0x208/0x210
[   16.338803] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
[   16.339089] NIP:  c000000000072a48 LR: c000000000074d04 CTR: c000000000074c60
[   16.339103] REGS: c00000000111fad0 TRAP: 0900   Not tainted  (5.16.0-rc2+)
[   16.339117] MSR:  8000000002009033 <SF,VEC,EE,ME
[   16.337355] [c0000000062676b0] [c000000000092d78] nmi_panic+0x78/0x90
[   16.339231] ,IR,DR,RI,LE>  CR: 22000888  XER: 00000000
[   16.339354] CFAR: c000000000074d00 IRQMASK: 0
[   16.339354] GPR00: 0000000022000888
[   16.340017]
[   16.340120] [c000000006267710] [c000000000027650]
watchdog_smp_panic+0x420/0x4e0
[   16.340441] [c000000006267800] [c00000000002717c]
watchdog_timer_fn+0xac/0x160
[   16.340804] c00000000111fd70 c00000000111f400
[   16.340715] [c000000006267840] [c0000000001202d8] __run_hrtimer+0xc8/0x190
[   16.341234] 0000000000000000
[   16.341234] GPR04: 0000000000000001 c00000000106f400
0000000000000000 0000000001f40000
[   16.341234] GPR08: 001908b100000000 00000000000002ea
0000000000000000 000000000000006b
[   16.341234] GPR12: c000000000074c60 c0000000011d0000
0000000000000000 0000000000000000
[   16.341234] GPR16: 0000000000000000 0000000000000000
0000000000000000 0000000000000000
[   16.341234] GPR20: 0000000000000000 0000000000000000
0000000000000000 0000000000000000
[   16.341234] GPR24:
[   16.341779]
[   16.342182] c00000000113f400 0000000000000001 c000000001145928
0000000000000001
[   16.342182] GPR28: c0000000010717a0 c000000001071798
c0000000010b0480 00000000015e0000
[   16.342160] [c000000006267890] [c00000000011f300]
hrtimer_run_queues+0x150/0x1c0
[   16.342276] NIP [c000000000072a48] plpar_hcall_norets_notrace+0x18/0x2c
[   16.342783]
[   16.342820] [c000000006267910] [c00000000011cf58]
update_process_times+0x88/0x110
[   16.343127] [c000000006267960] [c00000000012dbc8]
tick_nohz_handler+0xd8/0x150
[   16.344080] [c0000000062679a0] [c00000000001b570] timer_interrupt+0x190/0x200
[   16.344439] [c000000006267a00] [c0000000000098e8]
decrementer_common_virt+0x208/0x210
[   16.342542] LR [c000000000074d04] pseries_lpar_idle+0xa4/0x160
[   16.347000] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
[   16.347224] --- interrupt: 900
[   16.347525] NIP:  c000000000072a48 LR: c000000000074d04 CTR: c000000000074c60
[   16.347637] REGS: c000000006267a70 TRAP: 0900   Not tainted  (5.16.0-rc2+)
[   16.347716] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE
[   16.347269] [c00000000111fd70] [c00000000111fdc0] .TOC.+0x9c0/0xc00
[   16.347871] >  CR: 22000048  XER: 00000000
[   16.347910] CFAR: c000000000074d00 IRQMASK: 0
[   16.347910] GPR00: 0000000022000048 c000000006267d10
c00000000111f400 0000000000000000
[   16.347910] GPR04: 0000000000000001
[   16.348004]  (unreliable)
[   16.348331] c00000000106f400 0000000000000800 0000000000007b26
[   16.348331] GPR08: 0005204180000000 0000000000000000
0000000000000000 000000000000002c
[   16.348331] GPR12: c000000000074c60 c00000001ffe4d00
[   16.348347] [c00000000111fdf0] [c0000000000179a4] arch_cpu_idle+0x74/0x110
[   16.348453] c000000001145a28 c00000000118f400
[   16.348453] GPR16: c000000001070488 c0000000011458d0
c0000000011458d8 0000000000000001
[   16.348453] GPR20: c00000000118f400 0000000000000000
0000000000000000 0000000000000002
[   16.348749]
[   16.349107]
[   16.349107] GPR24: c00000000113f400 0000000000000002
c000000001145928 0000000000000001
[   16.349107] GPR28: c0000000010717a0 c000000001071798
c0000000061aa200 c00000000003d350
[   16.349137] [c00000000111fe30] [c00000000034e28c] default_idle_call+0x4c/0x90
[   16.349398] NIP [c000000000072a48] plpar_hcall_norets_notrace+0x18/0x2c
[   16.349562]
[   16.349771] [c00000000111fe50] [c0000000000d0b40] do_idle+0x110/0x1d0
[   16.349946] [c00000000111feb0] [c0000000000d0c34] cpu_startup_entry+0x34/0x50
[   16.350175] [c00000000111fee0] [c000000000011180] rest_init+0xe0/0x110
[   16.350307] [c00000000111ff10] [c0000000010046e8] start_kernel+0x3ac/0x424
[   16.350910] [c00000000111ff90] [c00000000000d560] start_here_common+0x1c/0x3c
[   16.351302] Instruction dump:
[   16.351659] 4182000c 39400001 48000008 894d0932 714a0001 39400008
408223b4 718a4000
[   16.351782] 7c2a0b78 3821fcf0 41c20008 e82d0910 <0981fcf0> f92101a0
f9610170 f9810178
[   16.350674] LR [c000000000074d04] pseries_lpar_idle+0xa4/0x160
[   16.352526] Oops: Unrecoverable System Reset, sig: 6 [#1]
[   16.353142] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2 pSeries
[   16.353575] --- interrupt: 900
[   16.353629] [c000000006267d10] [c0000000000159fc]
__switch_to+0x1cc/0x290 (unreliable)
[   16.353954] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc2+ #28
[   16.353890] [c000000006267d90] [c0000000000179a4] arch_cpu_idle+0x74/0x110
[   16.354193] NIP:  c000000000009784 LR: c00000000034ee60 CTR: c0000000000096e0
[   16.354350] REGS: c00000001ff87d60 TRAP: 0100   Not tainted  (5.16.0-rc2+)
[   16.354497] MSR:  8000000000001031 <SF,ME,IR
[   16.354351] [c000000006267dd0] [c00000000034e28c] default_idle_call+0x4c/0x90
[   16.354774] [c000000006267df0] [c0000000000d0b40] do_idle+0x110/0x1d0
[   16.355001] [c000000006267e50] [c0000000000d0c34] cpu_startup_entry+0x34/0x50
[   16.355318] [c000000006267e80] [c00000000003e6d0]
start_secondary+0xc30/0x1060
[   16.355597] [c000000006267f90] [c00000000000ce54]
start_secondary_prolog+0x10/0x14
-------------- next part --------------
A non-text attachment was scrubbed...
Name: .config
Type: application/octet-stream
Size: 35096 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20220126/e1026d1c/attachment-0001.obj>

