[Qemu-ppc] pseries on qemu-system-ppc64le crashes in doorbell_core_ipi()
Jason A. Donenfeld
Jason at zx2c4.com
Thu Dec 19 21:41:21 AEDT 2019
Hi folks,
I'm actually still experiencing this sporadically in the WireGuard test
suite, which you can see being run on https://build.wireguard.com/ .
About 50% of the time the powerpc64 build will fail at a place like this:
[ 65.147823] Oops: Exception in kernel mode, sig: 4 [#1]
[ 65.149198] LE PAGE_SIZE=4K MMU=Hash PREEMPT SMP NR_CPUS=4 NUMA pSeries
[ 65.149595] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.5.0-rc1+ #1
[ 65.149745] NIP: c000000000033330 LR: c00000000007eda0 CTR: c00000000007ed80
[ 65.149934] REGS: c000000000a47970 TRAP: 0700 Not tainted (5.5.0-rc1+)
[ 65.150032] MSR: 800000000288b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 48008288 XER: 00000000
[ 65.150352] CFAR: c0000000000332bc IRQMASK: 1
[ 65.150352] GPR00: c000000000036508 c000000000a47c00 c000000000a4c100 0000000000000001
[ 65.150352] GPR04: c000000000a50998 0000000000000000 c000000000a50908 000000000f509000
[ 65.150352] GPR08: 0000000028000000 0000000000000000 0000000000000000 c00000000ff24f00
[ 65.150352] GPR12: c00000000007ed80 c000000000ad9000 0000000000000000 0000000000000000
[ 65.150352] GPR16: 00000000008c9190 00000000008c94a8 00000000008c92f8 00000000008c98b0
[ 65.150352] GPR20: 00000000008f2f88 fffffffffffffffd 0000000000000014 0000000000e6c100
[ 65.150352] GPR24: 0000000000e6c100 0000000000000001 0000000000000000 c000000000a50998
[ 65.150352] GPR28: c000000000a9e280 c000000000a50aa4 0000000000000002 0000000000000000
[ 65.151591] NIP [c000000000033330] doorbell_try_core_ipi+0xd0/0xf0
[ 65.151750] LR [c00000000007eda0] smp_pseries_cause_ipi+0x20/0x70
[ 65.151913] Call Trace:
[ 65.152109] [c000000000a47c00] [c0000000000c7c9c] _nohz_idle_balance+0xbc/0x300 (unreliable)
[ 65.152370] [c000000000a47c30] [c000000000036508] smp_send_reschedule+0x98/0xb0
[ 65.152711] [c000000000a47c50] [c0000000000c1634] kick_ilb+0x114/0x140
[ 65.152962] [c000000000a47ca0] [c0000000000c86d8] newidle_balance+0x4e8/0x500
[ 65.153213] [c000000000a47d20] [c0000000000c8788] pick_next_task_fair+0x48/0x3a0
[ 65.153424] [c000000000a47d80] [c000000000466620] __schedule+0xf0/0x430
[ 65.153612] [c000000000a47de0] [c000000000466b04] schedule_idle+0x34/0x70
[ 65.153786] [c000000000a47e10] [c0000000000c0bc8] do_idle+0x1a8/0x220
[ 65.154121] [c000000000a47e70] [c0000000000c0e94] cpu_startup_entry+0x34/0x40
[ 65.154313] [c000000000a47ea0] [c00000000000ef1c] rest_init+0x10c/0x124
[ 65.154414] [c000000000a47ee0] [c000000000500004] start_kernel+0x568/0x594
[ 65.154585] [c000000000a47f90] [c00000000000a7cc] start_here_common+0x1c/0x330
[ 65.154854] Instruction dump:
[ 65.155191] 38210030 e8010010 7c0803a6 4e800020 3d220004 39295228 81290000 3929ffff
[ 65.155498] 7d284038 7c0004ac 5508017e 65082800 <7c00411c> e94d0178 812a0000 3929ffff
[ 65.156155] ---[ end trace 6180d12e268ffdaf ]---
[ 65.185452]
[ 66.187490] Kernel panic - not syncing: Fatal exception
This is with "qemu-system-ppc64 -smp 4 -machine pseries" on QEMU 4.0.0.
I'm not totally sure what's going on here. The kernel is built for
pseries and runs on QEMU's pseries machine model, and I see that
selecting pseries forces the selection of 'config PPC_DOORBELL' (twice
in the same section, actually). Then inside the kernel there appears to
be some runtime CPU feature check for doorbell support. Is this a case
in which QEMU is advertising doorbell support that TCG doesn't have? Or
is something else happening here?
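
For reference, the word marked <7c00411c> in the instruction dump is the
one at NIP, and "sig: 4" with TRAP 0700 points at an illegal-instruction
program exception. A rough sketch of decoding that word as a PowerPC
X-form instruction follows. The field layout is the standard X-form one;
the extended-opcode table is my reading of the Power ISA doorbell
instructions and should be treated as an assumption, and decode_xform and
DOORBELL_XO are just illustrative names.

```python
def decode_xform(word):
    """Split a 32-bit PowerPC X-form instruction word into its fields."""
    return {
        "opcode": (word >> 26) & 0x3F,   # primary opcode (top 6 bits)
        "rt": (word >> 21) & 0x1F,
        "ra": (word >> 16) & 0x1F,
        "rb": (word >> 11) & 0x1F,
        "xo": (word >> 1) & 0x3FF,       # 10-bit extended opcode
    }

# Assumed mapping of primary-opcode-31 extended opcodes to doorbell
# mnemonics, taken from my reading of the Power ISA.
DOORBELL_XO = {206: "msgsnd", 238: "msgclr", 142: "msgsndp", 174: "msgclrp"}

def describe(word):
    """Return a 'mnemonic rN' string for doorbell instructions, else None."""
    f = decode_xform(word)
    if f["opcode"] == 31 and f["xo"] in DOORBELL_XO:
        return "%s r%d" % (DOORBELL_XO[f["xo"]], f["rb"])
    return None

if __name__ == "__main__":
    print(describe(0x7C00411C))
```

If that table is right, the faulting word decodes to msgsndp with r8 as
its operand, which matches GPR08 = 0000000028000000 in the register dump
and the preceding "65082800" (oris r8,r8,0x2800).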
Thanks,
Jason