[PATCH v3 0/4] Fixes for 3 separate NMI reentrancy bugs
Satheesh Rajendran
sathnaga at linux.vnet.ibm.com
Tue Feb 26 17:51:10 AEDT 2019
On Tue, Feb 26, 2019 at 04:08:57PM +1000, Nicholas Piggin wrote:
> This series fixes several similar but unrelated bugs with NMIs
> clobbering live registers without noticing it, because MSR[RI] is set.
> Pretty rare bugs, but serious silent corruption consequences.
>
> For the most part these can be observed and tested quite easily
> with the mambo simulator, except that it does not seem to follow
> the architecture wrt leaving MSR[RI] unchanged for HV interrupts.
> Mambo clears MSR[RI], so you have to account for that manually.
>
> Since v1:
> - Fixed several build bugs.
>
> Since v2:
> - Improved changelog and comments.
> - Fixed the NIA test for virt mode interrupts.
Hit the crash below on a Power8 box. The patches were applied on top of the linuxppc merge branch and built with `ppc64le_defconfig`.
UnknownStateTransition: Something happened system state="8" and we transitioned to UNKNOWN state. Review the following for more details
Message="OpTestSystem in run_IPLing and Exception="Kernel OOPS (machine in state '5'): Oops: Kernel access of bad area, sig: 11 [#1]
[ 0.000000] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7-gf46b87021 #1
[ 0.000000] NIP: c000000000c1306c LR: c000000000c12f64 CTR: c00000000033d860
[ 0.000000] REGS: c0000000014878b0 TRAP: 0380 Not tainted (5.0.0-rc7-gf46b87021)
[ 0.000000] MSR: 9000000000001033 <SF,HV,ME,IR,DR,RI,LE> CR: 28002224 XER: 00000000
[ 0.000000] CFAR: c000000000c12f7c IRQMASK: 1
[ 0.000000] GPR00: c000000000c12f64 c000000001487b40 c000000001488400 f000000000000000
[ 0.000000] GPR04: c000000001487b18 c000000001487b20 0000000000000000 c000000001388400
[ 0.000000] GPR08: f000000000000000 f000000000000008 0000000000000000 0000000800000000
[ 0.000000] GPR12: c0000000015e1ed0 c000000001670000 0000000000000000 0000000000000000
[ 0.000000] GPR16: 0000000000000000 0000000000000000 c0000000015e0d40 0000000000000001
[ 0.000000] GPR20: ffffffffffffffff ffffffffffffffff 0000000008000000 c000000001413b90
[ 0.000000] GPR24: c000000001413b98 007ffff000000000 0000000000080000 0000000000000000
[ 0.000000] GPR28: 0000000000000000 0000000000000000 007ffff000001000 0000000000000000
[ 0.000000] NIP [c000000000c1306c] memmap_init_zone+0x258/0x308
[ 0.000000] LR [c000000000c12f64] memmap_init_zone+0x150/0x308
[ 0.000000] Call Trace:
[ 0.000000] [c000000001487b40] [c000000000c12f64] memmap_init_zone+0x150/0x308 (unreliable)
[ 0.000000] [c000000001487be0] [c000000000f87acc] free_area_init_node+0x480/0x518
[ 0.000000] [c000000001487cf0] [c000000000f88630] free_area_init_nodes+0x838/0x940
[ 0.000000] [c000000001487e10] [c000000000f6340c] paging_init+0x8c/0xa8
[ 0.000000] [c000000001487e80] [c000000000f5bc00] setup_arch+0x3b4/0x3f0
[ 0.000000] [c000000001487ef0] [c000000000f53b68] start_kernel+0x94/0x630
[ 0.000000] [c000000001487f90] [c00000000000b37c] start_here_common+0x1c/0x520
[ 0.000000] Instruction dump:
[ 0.000000] 71290002 41820014 ebea0008 7cc6fa14 78df8402 48000070 3d22000c 7bea3664
[ 0.000000] 39299d20 e9090000 7c685214 39230008 <fa290010> fa290018 fa290020 fa290030
[ 0.000000] random: get_random_bytes called from print_oops_end_marker+0x40/0x80 with crng_init=0
[ 0.000000] ---[ end trace 0000000000000000 ]---
[ 0.000000]
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] Rebooting in 10 seconds" caused the system to go to UNKNOWN_BAD and the system will be stopping."
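For anyone triaging the trace, here is a minimal sketch that decodes two fields of the oops above: the MSR value and the instruction word marked `<fa290010>` in the instruction dump. The helper names `decode_msr` and `decode_ds_store` are made up for illustration and are not part of the series. The bit table covers only a subset of MSR bits, and the GPR09 value is copied from the register dump. The result suggests the faulting instruction is a store through r9 to an address in the 0xf region, which would be consistent with `memmap_init_zone` writing struct pages, but treat that reading as an assumption rather than a diagnosis.

```python
# Illustrative decoder for values copied from the oops; not kernel code.

# Subset of MSR bit positions (bit 0 = least significant bit).
MSR_BITS = {
    63: "SF", 60: "HV", 25: "VEC", 23: "VSX", 15: "EE", 14: "PR",
    13: "FP", 12: "ME", 5: "IR", 4: "DR", 1: "RI", 0: "LE",
}


def decode_msr(msr):
    """Return the names of the known MSR bits set in msr, highest bit first."""
    return [name for bit, name in sorted(MSR_BITS.items(), reverse=True)
            if msr & (1 << bit)]


def decode_ds_store(insn):
    """Decode a DS-form std/stdu word into (mnemonic, rs, ra, displacement)."""
    if insn >> 26 != 62:
        raise ValueError("primary opcode is not 62 (std/stdu)")
    mnemonic = {0: "std", 1: "stdu"}.get(insn & 0x3)
    if mnemonic is None:
        raise ValueError("extended opcode not handled by this sketch")
    rs = (insn >> 21) & 0x1F          # source register
    ra = (insn >> 16) & 0x1F          # base register
    disp = insn & 0xFFFC              # DS field shifted left by 2
    if disp & 0x8000:                 # sign-extend the 16-bit displacement
        disp -= 0x10000
    return mnemonic, rs, ra, disp


msr_names = decode_msr(0x9000000000001033)          # MSR from the oops
mnemonic, rs, ra, disp = decode_ds_store(0xFA290010)  # word marked <...>
gpr9 = 0xF000000000000008                           # GPR09 from the dump
ea = (gpr9 + disp) & 0xFFFFFFFFFFFFFFFF             # effective address

print(msr_names)
print(f"{mnemonic} r{rs},{disp}(r{ra}) -> EA {ea:#018x}")
```

The decoded MSR names match the `<SF,HV,ME,IR,DR,RI,LE>` list printed by the kernel, so RI was set when the 0380 trap was taken.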
Regards,
-Satheesh.
>
> Nicholas Piggin (4):
> powerpc/64s: Fix HV NMI vs HV interrupt recoverability test
> powerpc/64s: system reset interrupt preserve HSRRs
> powerpc/64s: Prepare to handle data interrupts vs d-side MCE reentrancy
> powerpc/64s: Fix data interrupts vs d-side MCE reentrancy
>
> arch/powerpc/include/asm/asm-prototypes.h |  8 ++
> arch/powerpc/include/asm/nmi.h            |  2 +
> arch/powerpc/kernel/exceptions-64s.S      | 92 +++++++++++++++++++----
> arch/powerpc/kernel/mce.c                 |  3 +
> arch/powerpc/kernel/traps.c               | 91 +++++++++++++++++++++-
> 5 files changed, 179 insertions(+), 17 deletions(-)
>
> --
> 2.18.0
>