[Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()

Jason A. Donenfeld Jason at zx2c4.com
Thu Dec 19 21:41:21 AEDT 2019


Hi folks,

I'm actually still experiencing this sporadically in the WireGuard test
suite, which you can see being run on https://build.wireguard.com/.
About 50% of the time, the powerpc64 build fails at a place like this:

[   65.147823] Oops: Exception in kernel mode, sig: 4 [#1]
[   65.149198] LE PAGE_SIZE=4K MMU=Hash PREEMPT SMP NR_CPUS=4 NUMA pSeries
[   65.149595] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.5.0-rc1+ #1
[   65.149745] NIP:  c000000000033330 LR: c00000000007eda0 CTR: c00000000007ed80
[   65.149934] REGS: c000000000a47970 TRAP: 0700   Not tainted  (5.5.0-rc1+)
[   65.150032] MSR:  800000000288b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 48008288  XER: 00000000
[   65.150352] CFAR: c0000000000332bc IRQMASK: 1
[   65.150352] GPR00: c000000000036508 c000000000a47c00 c000000000a4c100 0000000000000001
[   65.150352] GPR04: c000000000a50998 0000000000000000 c000000000a50908 000000000f509000
[   65.150352] GPR08: 0000000028000000 0000000000000000 0000000000000000 c00000000ff24f00
[   65.150352] GPR12: c00000000007ed80 c000000000ad9000 0000000000000000 0000000000000000
[   65.150352] GPR16: 00000000008c9190 00000000008c94a8 00000000008c92f8 00000000008c98b0
[   65.150352] GPR20: 00000000008f2f88 fffffffffffffffd 0000000000000014 0000000000e6c100
[   65.150352] GPR24: 0000000000e6c100 0000000000000001 0000000000000000 c000000000a50998
[   65.150352] GPR28: c000000000a9e280 c000000000a50aa4 0000000000000002 0000000000000000
[   65.151591] NIP [c000000000033330] doorbell_try_core_ipi+0xd0/0xf0
[   65.151750] LR [c00000000007eda0] smp_pseries_cause_ipi+0x20/0x70
[   65.151913] Call Trace:
[   65.152109] [c000000000a47c00] [c0000000000c7c9c] _nohz_idle_balance+0xbc/0x300 (unreliable)
[   65.152370] [c000000000a47c30] [c000000000036508] smp_send_reschedule+0x98/0xb0
[   65.152711] [c000000000a47c50] [c0000000000c1634] kick_ilb+0x114/0x140
[   65.152962] [c000000000a47ca0] [c0000000000c86d8] newidle_balance+0x4e8/0x500
[   65.153213] [c000000000a47d20] [c0000000000c8788] pick_next_task_fair+0x48/0x3a0
[   65.153424] [c000000000a47d80] [c000000000466620] __schedule+0xf0/0x430
[   65.153612] [c000000000a47de0] [c000000000466b04] schedule_idle+0x34/0x70
[   65.153786] [c000000000a47e10] [c0000000000c0bc8] do_idle+0x1a8/0x220
[   65.154121] [c000000000a47e70] [c0000000000c0e94] cpu_startup_entry+0x34/0x40
[   65.154313] [c000000000a47ea0] [c00000000000ef1c] rest_init+0x10c/0x124
[   65.154414] [c000000000a47ee0] [c000000000500004] start_kernel+0x568/0x594
[   65.154585] [c000000000a47f90] [c00000000000a7cc] start_here_common+0x1c/0x330
[   65.154854] Instruction dump:
[   65.155191] 38210030 e8010010 7c0803a6 4e800020 3d220004 39295228 81290000 3929ffff
[   65.155498] 7d284038 7c0004ac 5508017e 65082800 <7c00411c> e94d0178 812a0000 3929ffff
[   65.156155] ---[ end trace 6180d12e268ffdaf ]---
[   65.185452]
[   66.187490] Kernel panic - not syncing: Fatal exception

This is with "qemu-system-ppc64 -smp 4 -machine pseries" on QEMU 4.0.0.

I'm not totally sure what's going on here. The kernel is configured for
pseries and runs on QEMU's pseries machine model, and I see that selecting
pseries forces the selection of 'config PPC_DOORBELL' (twice in the same
section, actually). Then inside the kernel there appears to be some
runtime CPU check for doorbell support. Is this a case in which QEMU is
advertising doorbell support that TCG doesn't implement? Or is something
else happening here?
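
For reference, the faulting word marked <7c00411c> in the instruction dump
can be decoded by hand. The sketch below is only an illustration: the
helper names are made up for this mail, and the mnemonic table is written
from memory of the kernel's ppc-opcode.h, so it is an assumption that
should be checked against the Power ISA before relying on it.

```python
# Hand-decode a 32-bit Power ISA X-form instruction word.
# ASSUMPTION: the extended-opcode table below is written from memory of the
# kernel's ppc-opcode.h and covers only the four doorbell instructions.
DOORBELL_XO = {
    142: "msgsndp",
    174: "msgclrp",
    206: "msgsnd",
    238: "msgclr",
}


def decode_xform(word):
    """Return (primary opcode, extended opcode, RB register number)."""
    primary = (word >> 26) & 0x3F   # top 6 bits
    xo = (word >> 1) & 0x3FF        # 10-bit extended opcode
    rb = (word >> 11) & 0x1F        # 5-bit RB field
    return primary, xo, rb


def describe(word):
    """Name the instruction if it is one of the known doorbell opcodes."""
    primary, xo, rb = decode_xform(word)
    if primary == 31 and xo in DOORBELL_XO:
        return f"{DOORBELL_XO[xo]} r{rb}"
    return f"unknown (primary={primary}, xo={xo})"


def doorbell_type(rb_value):
    """Extract the 5-bit message type from the low word of the RB operand."""
    return (rb_value >> 27) & 0x1F


if __name__ == "__main__":
    print(describe(0x7C00411C))        # faulting word from the dump
    print(doorbell_type(0x28000000))   # GPR08 from the register dump
```

If that decoding is right, the trap is on msgsndp with r8 as its operand,
and GPR08 = 0x28000000 carries message type 5. That would fit TRAP 0700
being raised for an instruction the emulated CPU does not accept, but by
itself it does not say whether the kernel's feature check or QEMU's
advertisement is at fault.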

Thanks,
Jason

