[powerpc][next-20210727] Boot failure - kernel BUG at arch/powerpc/kernel/interrupt.c:98!

Nathan Chancellor nathan at kernel.org
Thu Jul 29 03:35:34 AEST 2021


On Wed, Jul 28, 2021 at 01:31:06PM +0530, Sachin Sant wrote:
> linux-next fails to boot on Power servers (POWER8/POWER9). The following
> traces are seen during boot:
> 
> [    0.010799] software IO TLB: tearing down default memory pool
> [    0.010805] ------------[ cut here ]------------
> [    0.010808] kernel BUG at arch/powerpc/kernel/interrupt.c:98!
> [    0.010812] Oops: Exception in kernel mode, sig: 5 [#1]
> [    0.010816] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
> [    0.010820] Modules linked in:
> [    0.010824] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc3-next-20210727 #1
> [    0.010830] NIP:  c000000000032cfc LR: c00000000000c764 CTR: c00000000000c670
> [    0.010834] REGS: c000000003603b10 TRAP: 0700   Not tainted  (5.14.0-rc3-next-20210727)
> [    0.010838] MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000222  XER: 00000002
> [    0.010848] CFAR: c00000000000c760 IRQMASK: 3 
> [    0.010848] GPR00: c00000000000c764 c000000003603db0 c0000000029bd000 0000000000000001 
> [    0.010848] GPR04: 0000000000000a68 0000000000000400 c000000003603868 ffffffffffffffff 
> [    0.010848] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000003 
> [    0.010848] GPR12: ffffffffffffffff c00000001ec9ee80 c000000000012a28 0000000000000000 
> [    0.010848] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [    0.010848] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [    0.010848] GPR24: 000000000000f134 0000000000000000 ffffffffffffffff c000000003603868 
> [    0.010848] GPR28: 0000000000000400 0000000000000a68 c00000000202e9c0 c000000003603e80 
> [    0.010896] NIP [c000000000032cfc] system_call_exception+0x8c/0x2e0
> [    0.010901] LR [c00000000000c764] system_call_common+0xf4/0x258
> [    0.010907] Call Trace:
> [    0.010909] [c000000003603db0] [c00000000016a6dc] calculate_sigpending+0x4c/0xe0 (unreliable)
> [    0.010915] [c000000003603e10] [c00000000000c764] system_call_common+0xf4/0x258
> [    0.010921] --- interrupt: c00 at kvm_template_end+0x4/0x8
> [    0.010926] NIP:  c000000000092dec LR: c000000000114fc8 CTR: 0000000000000000
> [    0.010930] REGS: c000000003603e80 TRAP: 0c00   Not tainted  (5.14.0-rc3-next-20210727)
> [    0.010934] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000222  XER: 00000000
> [    0.010943] IRQMASK: 0 
> [    0.010943] GPR00: c00000000202e9c0 c000000003603b00 c0000000029bd000 000000000000f134 
> [    0.010943] GPR04: 0000000000000a68 0000000000000400 c000000003603868 ffffffffffffffff 
> [    0.010943] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [    0.010943] GPR12: 0000000000000000 c00000001ec9ee80 c000000000012a28 0000000000000000 
> [    0.010943] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [    0.010943] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [    0.010943] GPR24: c0000000020033c4 c00000000110afc0 c000000002081950 c000000003277d40 
> [    0.010943] GPR28: 0000000000000000 c00000000a680000 0000000004000000 00000000000d0000 
> [    0.010989] NIP [c000000000092dec] kvm_template_end+0x4/0x8
> [    0.010993] LR [c000000000114fc8] set_memory_encrypted+0x38/0x60
> [    0.010999] --- interrupt: c00
> [    0.011001] [c000000003603b00] [c00000000000c764] system_call_common+0xf4/0x258 (unreliable)
> [    0.011008] Instruction dump:
> [    0.011011] 694a0003 312affff 7d495110 0b0a0000 60000000 60000000 e87f0108 68690002 
> [    0.011019] 7929ffe2 0b090000 68634000 786397e2 <0b030000> e93f0138 792907e0 0b090000 
> [    0.011029] ---[ end trace a20ad55589efcb10 ]---
> [    0.012297] 
> [    1.012304] Kernel panic - not syncing: Fatal exception
> 
> next-20210723 was good. The boot failure seems to have been introduced with next-20210726.
> 
> I have attached the boot log.

I noticed this with openSUSE's ppc64le config [1], and my bisect landed on
commit ad6c00283163 ("swiotlb: Free tbl memory in swiotlb_exit()"). That
series just keeps on giving... Adding some people from that thread to
this one. Original thread:
https://lore.kernel.org/r/1905CD70-7656-42AE-99E2-A31FC3812EAC@linux.vnet.ibm.com/

[1]: https://github.com/openSUSE/kernel-source/raw/master/config/ppc64le/default
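
As a side note for anyone reading the instruction dump in the quoted oops:
the bracketed word <0b030000> is the instruction at NIP. Below is a rough
Python sketch of how it can be decoded by hand. It assumes the standard
Power ISA D-form layout for trap-immediate instructions (primary opcode,
TO, RA, SI); the helper name decode_trap_immediate is made up for this
example and is not from any kernel tool.

```python
def decode_trap_immediate(word: int) -> str:
    """Decode a 32-bit Power ISA trap-immediate instruction (tdi or twi).

    D-form layout, most significant bits first:
    6-bit primary opcode | 5-bit TO | 5-bit RA | 16-bit signed SI
    """
    opcode = (word >> 26) & 0x3F
    to = (word >> 21) & 0x1F
    ra = (word >> 16) & 0x1F
    si = word & 0xFFFF
    if si & 0x8000:  # sign-extend the 16-bit immediate
        si -= 0x10000

    mnemonics = {2: "tdi", 3: "twi"}  # primary opcodes for trap immediate
    if opcode not in mnemonics:
        raise ValueError(f"not a trap-immediate instruction: {word:#010x}")
    return f"{mnemonics[opcode]} {to},r{ra},{si}"


# The bracketed word from the "Instruction dump" lines of the oops.
print(decode_trap_immediate(0x0B030000))
```

TO=24 is the "not equal" condition, so this is tdnei r3,0: trap if r3 is
nonzero. GPR03 is 1 in the register dump, which is consistent with the
program check (TRAP: 0700) reported at system_call_exception+0x8c.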

Cheers,
Nathan

