[powerpc][next-20210727] Boot failure - kernel BUG at arch/powerpc/kernel/interrupt.c:98!

Nicholas Piggin npiggin at gmail.com
Thu Jul 29 14:08:52 AEST 2021


Excerpts from Nathan Chancellor's message of July 29, 2021 3:35 am:
> On Wed, Jul 28, 2021 at 01:31:06PM +0530, Sachin Sant wrote:
>> linux-next fails to boot on Power server (POWER8/POWER9). Following traces
>> are seen during boot
>> 
>> [    0.010799] software IO TLB: tearing down default memory pool
>> [    0.010805] ------------[ cut here ]------------
>> [    0.010808] kernel BUG at arch/powerpc/kernel/interrupt.c:98!
>> [    0.010812] Oops: Exception in kernel mode, sig: 5 [#1]
>> [    0.010816] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
>> [    0.010820] Modules linked in:
>> [    0.010824] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc3-next-20210727 #1
>> [    0.010830] NIP:  c000000000032cfc LR: c00000000000c764 CTR: c00000000000c670
>> [    0.010834] REGS: c000000003603b10 TRAP: 0700   Not tainted  (5.14.0-rc3-next-20210727)
>> [    0.010838] MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000222  XER: 00000002
>> [    0.010848] CFAR: c00000000000c760 IRQMASK: 3 
>> [    0.010848] GPR00: c00000000000c764 c000000003603db0 c0000000029bd000 0000000000000001 
>> [    0.010848] GPR04: 0000000000000a68 0000000000000400 c000000003603868 ffffffffffffffff 
>> [    0.010848] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000003 
>> [    0.010848] GPR12: ffffffffffffffff c00000001ec9ee80 c000000000012a28 0000000000000000 
>> [    0.010848] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
>> [    0.010848] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
>> [    0.010848] GPR24: 000000000000f134 0000000000000000 ffffffffffffffff c000000003603868 
>> [    0.010848] GPR28: 0000000000000400 0000000000000a68 c00000000202e9c0 c000000003603e80 
>> [    0.010896] NIP [c000000000032cfc] system_call_exception+0x8c/0x2e0
>> [    0.010901] LR [c00000000000c764] system_call_common+0xf4/0x258
>> [    0.010907] Call Trace:
>> [    0.010909] [c000000003603db0] [c00000000016a6dc] calculate_sigpending+0x4c/0xe0 (unreliable)
>> [    0.010915] [c000000003603e10] [c00000000000c764] system_call_common+0xf4/0x258
>> [    0.010921] --- interrupt: c00 at kvm_template_end+0x4/0x8
>> [    0.010926] NIP:  c000000000092dec LR: c000000000114fc8 CTR: 0000000000000000
>> [    0.010930] REGS: c000000003603e80 TRAP: 0c00   Not tainted  (5.14.0-rc3-next-20210727)
>> [    0.010934] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28000222  XER: 00000000
>> [    0.010943] IRQMASK: 0 
>> [    0.010943] GPR00: c00000000202e9c0 c000000003603b00 c0000000029bd000 000000000000f134 
>> [    0.010943] GPR04: 0000000000000a68 0000000000000400 c000000003603868 ffffffffffffffff 
>> [    0.010943] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
>> [    0.010943] GPR12: 0000000000000000 c00000001ec9ee80 c000000000012a28 0000000000000000 
>> [    0.010943] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
>> [    0.010943] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
>> [    0.010943] GPR24: c0000000020033c4 c00000000110afc0 c000000002081950 c000000003277d40 
>> [    0.010943] GPR28: 0000000000000000 c00000000a680000 0000000004000000 00000000000d0000 
>> [    0.010989] NIP [c000000000092dec] kvm_template_end+0x4/0x8
>> [    0.010993] LR [c000000000114fc8] set_memory_encrypted+0x38/0x60
>> [    0.010999] --- interrupt: c00
>> [    0.011001] [c000000003603b00] [c00000000000c764] system_call_common+0xf4/0x258 (unreliable)
>> [    0.011008] Instruction dump:
>> [    0.011011] 694a0003 312affff 7d495110 0b0a0000 60000000 60000000 e87f0108 68690002 
>> [    0.011019] 7929ffe2 0b090000 68634000 786397e2 <0b030000> e93f0138 792907e0 0b090000 
>> [    0.011029] ---[ end trace a20ad55589efcb10 ]---
>> [    0.012297] 
>> [    1.012304] Kernel panic - not syncing: Fatal exception
>> 
>> next-20210723 was good. The boot failure seems to have been introduced with next-20210726.
>> 
>> I have attached the boot log.
> 
> I noticed this with OpenSUSE's ppc64le config [1] and my bisect landed on
> commit ad6c00283163 ("swiotlb: Free tbl memory in swiotlb_exit()"). That
> series just keeps on giving... Adding some people from that thread to
> this one. Original thread:
> https://lore.kernel.org/r/1905CD70-7656-42AE-99E2-A31FC3812EAC@linux.vnet.ibm.com/

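This is because powerpc's set_memory_encrypted makes an ultracall, but
ultracalls do not exist on that processor. The call is instead taken as
an ordinary system call from kernel mode (the c00 interrupt in the trace
above), which hits the BUG in system_call_exception.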

x86's set_memory_encrypted/decrypted have

        /* Nothing to do if memory encryption is not active */
        if (!mem_encrypt_active())
                return 0;

Probably powerpc should just do that too.
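
Roughly, the same early return on the powerpc side might look like the
sketch below. This is only a user-space illustration, not the kernel
code: the mem_encrypt_active() body, the encryption_active flag, the
ultracalls_issued counter and ultracall_unshare() are hypothetical
stand-ins so that it builds on its own, and the real function has more
to it than this.

```c
#include <stdbool.h>

/* Hypothetical stand-ins so the sketch builds in user space. */
static bool encryption_active;	/* what mem_encrypt_active() reports */
static int ultracalls_issued;	/* calls that would reach the ultravisor */

static bool mem_encrypt_active(void)
{
	return encryption_active;
}

/* Stand-in for the ultracall; it only counts invocations here. */
static void ultracall_unshare(unsigned long addr, int numpages)
{
	(void)addr;
	(void)numpages;
	ultracalls_issued++;
}

/* Guarded helper, mirroring the x86 early return quoted above. */
int set_memory_encrypted(unsigned long addr, int numpages)
{
	/* Nothing to do if memory encryption is not active */
	if (!mem_encrypt_active())
		return 0;

	ultracall_unshare(addr, numpages);
	return 0;
}
```

With the check in place the ultracall is never attempted when memory
encryption is not active, so the swiotlb_exit() path would return 0
without trapping. set_memory_decrypted would presumably want the same
guard.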

Thanks,
Nick

