[PATCH 3/4] powerpc/64e/interrupt: Prevent NMI PMI causing a dangerous warning

Nicholas Piggin npiggin at gmail.com
Fri Oct 7 01:04:12 AEDT 2022


As explained in the fix for 64s, NMI PMIs should not return using
the normal interrupt_return function. If such a PMI hits in code
returning to user with the context already switched to user mode, the
CT_WARN_ON() in interrupt_exit_kernel_prepare() can fire. This was
enough to cause crashes when reproducing on 64s, because another perf
interrupt would hit while reporting the bug, and that would cause
another bug, and so on.

Work around this for now just by disabling that warning on 64e, which
improves stability. Make a note of what the cleaner fix would be.

Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
---
 arch/powerpc/kernel/exceptions-64e.S |  7 +++++++
 arch/powerpc/kernel/interrupt.c      | 13 ++++++++++---
 2 files changed, 17 insertions(+), 3 deletions(-)
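
For readers skimming the diff, the changed condition in
interrupt_exit_kernel_prepare() can be modelled in isolation. This is a
minimal userspace sketch, not kernel code: the enum and the function
name should_check_context() are hypothetical stand-ins for TRAP(regs),
INTERRUPT_PROGRAM and IS_ENABLED(CONFIG_PPC_BOOK3E_64).

```c
#include <stdbool.h>

/* Hypothetical trap kinds; the real kernel uses numeric trap vectors. */
enum trap_kind {
	TRAP_KIND_PROGRAM,	/* stands in for INTERRUPT_PROGRAM */
	TRAP_KIND_PERFMON,	/* stands in for a performance monitor interrupt */
	TRAP_KIND_OTHER,
};

/*
 * Models the guard in front of CT_WARN_ON() after this patch:
 * the assertion runs only when the interrupt is not a program check
 * (to avoid recursion) and the build is not 64e (the workaround).
 * book3e_64 stands in for IS_ENABLED(CONFIG_PPC_BOOK3E_64).
 */
static bool should_check_context(enum trap_kind trap, bool book3e_64)
{
	return trap != TRAP_KIND_PROGRAM && !book3e_64;
}
```

With this model, a 64e build never runs the assertion, while other
builds keep the previous behaviour of skipping it only for program
checks.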

diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 930e36099015..d8bf8b94401b 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -813,6 +813,13 @@ kernel_dbg_exc:
 	EXCEPTION_COMMON(0x260)
 	CHECK_NAPPING()
 	addi	r3,r1,STACK_FRAME_OVERHEAD
+	/*
+	 * XXX: Returning from performance_monitor_exception taken as a
+	 * soft-NMI (Linux irqs disabled) may be risky to use interrupt_return
+	 * and could cause bugs in return or elsewhere. That case should just
+	 * restore registers and return. There is a workaround for this for one
+	 * known problem in interrupt_exit_kernel_prepare().
+	 */
 	bl	performance_monitor_exception
 	b	interrupt_return
 
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index f9db0a172401..299683d1f8e5 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -374,10 +374,17 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
 	if (regs_is_unrecoverable(regs))
 		unrecoverable_exception(regs);
 	/*
-	 * CT_WARN_ON comes here via program_check_exception,
-	 * so avoid recursion.
+	 * CT_WARN_ON comes here via program_check_exception, so avoid
+	 * recursion.
+	 *
+	 * Skip the assertion on 64e to work around a problem caused by NMI
+	 * PMIs incorrectly taking this interrupt return path: the PMI can
+	 * hit after interrupt exit to user has switched context to user.
+	 * See also the performance monitor handler in
+	 * exceptions-64e.S.
 	 */
-	if (TRAP(regs) != INTERRUPT_PROGRAM)
+	if (TRAP(regs) != INTERRUPT_PROGRAM &&
+			!(IS_ENABLED(CONFIG_PPC_BOOK3E_64)))
 		CT_WARN_ON(ct_state() == CONTEXT_USER);
 
 	kuap = kuap_get_and_assert_locked();
-- 
2.37.2
