[6.0.0-rc7-next-20220930] kernel BUG at arch/powerpc/kernel/exceptions-64s.S:2831!

Sachin Sant sachinp at linux.ibm.com
Sun Oct 2 19:51:59 AEDT 2022


With recent versions of linux-next I am observing kernel crashes on Power servers.
I saw this crash once just after boot, and saw similar crashes while compiling a
kernel and during a git clone of the kernel source. They seem to occur at random times.

[  175.165592] ------------[ cut here ]------------
[  175.165618] kernel BUG at arch/powerpc/kernel/exceptions-64s.S:2831!
[  175.165637] Oops: Exception in kernel mode, sig: 5 [#1]
[  175.165647] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
[  175.165657] Modules linked in: dm_mod(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bonding(E) rfkill(E) tls(E) ip_set(E) nf_tables(E) libcrc32c(E) nfnetlink(E) sunrpc(E) pseries_rng(E) vmx_crypto(E) ext4(E) mbcache(E) jbd2(E) sd_mod(E) t10_pi(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) ipmi_devintf(E) ipmi_msghandler(E) fuse(E)
[  175.165805] CPU: 6 PID: 11059 Comm: sed Tainted: G            E      6.0.0-rc7-next-20220930 #1
[  175.165820] Hardware name: IBM,8375-42A POWER9 (raw) 0x4e0202 0xf000005 of:IBM,FW950.50 (VL950_105) hv:phyp pSeries
[  175.165832] NIP:  c00000000000be38 LR: c00000000001cdfc CTR: c000000000008ed0
[  175.165844] REGS: c00000002840b5b0 TRAP: 0700   Tainted: G            E       (6.0.0-rc7-next-20220930)
[  175.165856] MSR:  8000000000021031 <SF,ME,IR,DR,LE>  CR: 44828844  XER: 00000000
[  175.165881] CFAR: c000000000008f74 IRQMASK: 1 
[  175.165881] GPR00: c00000000001df08 c00000002840b850 c00000000135e800 0000000002802000 
[  175.165881] GPR04: c000000003717e80 0000000000000000 0000000000000166 c000000002a3aa80 
[  175.165881] GPR08: 800000000280b033 0000000000000005 0000000000000004 c00000000001c864 
[  175.165881] GPR12: 800000000280b033 c00000001ec58b00 0000000000000000 c00000002840bbd8 
[  175.165881] GPR16: c000000002988b70 0000000000000009 61c8864680b583eb 0000000000000002 
[  175.165881] GPR20: c000000002aa3e00 c00000002840bac8 0000000000000001 c000000201eb9f40 
[  175.165881] GPR24: c000000201eba420 c000000003718ca0 000000063a4a0000 c000000002160db0 
[  175.165881] GPR28: c000000002160db0 c000000201eb9600 800000004280f033 c000000201eb9600 
[  175.166058] NIP [c00000000000be38] masked_interrupt+0xc/0xe4
[  175.166076] LR [c00000000001cdfc] giveup_all+0x6c/0x130
[  175.166088] Call Trace:
[  175.166094] [c00000002840b850] [c00000002840b8e0] 0xc00000002840b8e0 (unreliable)
[  175.166113] [c00000002840b880] [c00000000001df08] __switch_to+0x108/0x4b0
[  175.166131] [c00000002840b8e0] [c000000000ed07c0] __schedule+0x2b0/0x9e0
[  175.166147] [c00000002840b9b0] [c000000000ed0f68] schedule+0x78/0x140
[  175.166163] [c00000002840ba20] [c000000000ed169c] io_schedule+0x2c/0x50
[  175.166182] [c00000002840ba50] [c000000000419fb4] filemap_fault+0xc74/0x1240
[  175.166199] [c00000002840bb70] [c00000000047a484] __do_fault+0x64/0x240
[  175.166215] [c00000002840bbb0] [c00000000047e598] __handle_mm_fault+0x1078/0x16f0
[  175.166232] [c00000002840bcb0] [c00000000047ed38] handle_mm_fault+0x128/0x320
[  175.166247] [c00000002840bd00] [c000000000092054] ___do_page_fault+0x2f4/0xb50
[  175.166265] [c00000002840bdb0] [c000000000092ac0] hash__do_page_fault+0x30/0x70
[  175.166281] [c00000002840bde0] [c00000000009b918] do_hash_fault+0x278/0x470
[  175.166304] [c00000002840be10] [c000000000008ce8] instruction_access_common_virt+0x198/0x1a0
[  175.166325] Instruction dump:
[  175.166337] e96a0110 e96a0198 e96a0220 e96a02a8 e96a0330 e96a03b8 394a0400 4200ffdc 
[  175.166368] 4e800020 912d00b4 892d0933 71290025 <0b090000> 892d0933 7d295378 992d0933 
[  175.166401] ---[ end trace 0000000000000000 ]---
[  175.173284]  

Another instance of this crash:

[    3.109142] ------------[ cut here ]------------
[    3.109151] kernel BUG at arch/powerpc/kernel/exceptions-64s.S:2831!
[    3.109156] Oops: Exception in kernel mode, sig: 5 [#1]
[    3.109160] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[    3.109164] Modules linked in: sd_mod(E) t10_pi(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) fuse(E)
[    3.109177] CPU: 14 PID: 600 Comm: fsck.ext4 Tainted: G            E      6.0.0-rc7-next-20220930 #1
[    3.109182] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.00 (NH1030_026) hv:phyp pSeries
[    3.109187] NIP:  c00000000000be38 LR: c00000000001d5e8 CTR: c000000000008ed0
[    3.109191] REGS: c0000000210b3a90 TRAP: 0700   Tainted: G            E       (6.0.0-rc7-next-20220930)
[    3.109195] MSR:  8000000000021031 <SF,ME,IR,DR,LE>  CR: 44042874  XER: 20040000
[    3.109202] CFAR: c000000000008f74 IRQMASK: 1 
[    3.109202] GPR00: c0000000000340b4 c0000000210b3d30 c00000000135e800 0000000002800000 
[    3.109202] GPR04: c0000000210b3e80 0000000000000000 0000000000000000 0000000000000000 
[    3.109202] GPR08: 8000000002809033 0000000000000005 0000000000000004 c00000000001c864 
[    3.109202] GPR12: 8000000002809033 c000000affd00300 000000013850b460 0000000138514380 
[    3.109202] GPR16: 000000013851af10 0000000000000002 0000000138514a84 000000000000dd20 
[    3.109202] GPR20: 000000012f682548 000000012f682498 00007fffcf59c7b0 0000000000000001 
[    3.109202] GPR24: 0000000000b823f5 0000000000000001 0000000002002000 0000000002802000 
[    3.109202] GPR28: 0000000002800000 c0000000210b3e80 0000000000800000 0000000002800000 
[    3.109246] NIP [c00000000000be38] masked_interrupt+0xc/0xe4
[    3.109254] LR [c00000000001d5e8] restore_math+0xf8/0x2e0
[    3.109259] Call Trace:
[    3.109260] [c0000000210b3d30] [00000001384f8f90] 0x1384f8f90 (unreliable)
[    3.109266] [c0000000210b3d80] [c0000000000340b4] interrupt_exit_user_prepare_main+0x84/0x270
[    3.109272] [c0000000210b3de0] [c000000000034314] syscall_exit_prepare+0x74/0x160
[    3.109277] [c0000000210b3e10] [c00000000000c6e0] system_call_common+0x100/0x278
[    3.109283] --- interrupt: c00 at 0x7fffa0a1e744
[    3.109286] NIP:  00007fffa0a1e744 LR: 00007fffa0cb72b0 CTR: 0000000000000000
[    3.109290] REGS: c0000000210b3e80 TRAP: 0c00   Tainted: G            E       (6.0.0-rc7-next-20220930)
[    3.109294] MSR:  800000000000f033 <SF,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 22042472  XER: 00000000
[    3.109303] IRQMASK: 0 
[    3.109303] GPR00: 00000000000000b4 00007fffcf59c6a0 00007fffa0b07300 0000000000001000 
[    3.109303] GPR04: 00000001384f8f90 0000000000001000 0000000b823f5000 00000001384f8f90 
[    3.109303] GPR08: 00000001384f1608 0000000000000000 0000000000000000 0000000000000000 
[    3.109303] GPR12: 0000000000000000 00007fffa0dace90 000000013850b460 0000000138514380 
[    3.109303] GPR16: 000000013851af10 0000000000000002 0000000138514a84 000000000000dd20 
[    3.109303] GPR20: 000000012f682548 000000012f682498 00007fffcf59c7b0 0000000000000001 
[    3.109303] GPR24: 0000000000b823f5 0000000000000001 00000001384f8f90 00000001384f14a0 
[    3.109303] GPR28: 0000000b823f5000 0000000000001000 00000001384f1570 0000000000000000 
[    3.109344] NIP [00007fffa0a1e744] 0x7fffa0a1e744
[    3.109347] LR [00007fffa0cb72b0] 0x7fffa0cb72b0
[    3.109350] --- interrupt: c00
[    3.109352] Instruction dump:
[    3.109355] e96a0110 e96a0198 e96a0220 e96a02a8 e96a0330 e96a03b8 394a0400 4200ffdc 
[    3.109362] 4e800020 912d00b4 892d0933 71290025 <0b090000> 892d0933 7d295378 992d0933 
[    3.109369] ---[ end trace 0000000000000000 ]---

This BUG check was added by
commit c39fb71a54f09977eba7584ef0eebb25047097c6
    powerpc/64s/interrupt: masked handler debug check for previous hard disable

CONFIG_PPC_IRQ_SOFT_MASK_DEBUG is set.
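
For reference, the words around the marked instruction in both instruction
dumps can be decoded by hand. The following is a minimal sketch, not kernel
code: decode() is a hypothetical helper that handles only the three primary
opcodes seen here (tdi, andi., lbz). Reading r13 as the paca pointer follows
the usual powerpc64 kernel convention; treating offset 0x933 as
paca->irq_happened is an assumption based on the commit title, not something
the log itself states.

```python
# Decode the three D-form words ending at the trapping instruction <0b090000>.
# Only the primary opcodes that appear in this dump are handled.

def sext16(value):
    """Sign-extend a 16-bit immediate."""
    return value - 0x10000 if value & 0x8000 else value

def decode(word):
    opcode = word >> 26          # primary opcode: top 6 bits
    field = (word >> 21) & 0x1F  # RT, RS or TO depending on the opcode
    ra = (word >> 16) & 0x1F     # RA register number
    imm = word & 0xFFFF          # 16-bit immediate or displacement
    if opcode == 2:              # tdi TO,RA,SI (trap doubleword immediate)
        if field == 24:          # TO=24: trap if less or greater, i.e. not equal
            return f"tdnei r{ra},{sext16(imm)}"
        return f"tdi {field},r{ra},{sext16(imm)}"
    if opcode == 28:             # andi. RA,RS,UI
        return f"andi. r{ra},r{field},{imm}"
    if opcode == 34:             # lbz RT,D(RA)
        return f"lbz r{field},{sext16(imm)}(r{ra})"
    return f".long 0x{word:08x}"

# Words copied from the "Instruction dump" lines of the oops.
for word in (0x892D0933, 0x71290025, 0x0B090000):
    print(f"{word:08x}  {decode(word)}")
```

Read this way, the sequence loads one byte from offset 0x933 (2355) off r13,
masks it with 0x25 (37), and traps if the result is non-zero, which matches
the shape of a BUG_ON-style debug check in masked_interrupt.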

Thanks
 - Sachin

