[PATCH v3] powerpc: Handle MCE on POWER9 with only DSISR bit 33 set

Michael Neuling mikey at neuling.org
Fri Sep 22 13:32:21 AEST 2017


On POWER9 DD2.1 and below, it's possible for a paste instruction to
cause a Machine Check Exception (MCE) where only DSISR bit 33 is
set. This will result in the MCE handler seeing an unknown event,
which causes Linux to crash.

Fix this by detecting unknown events caused by load/store
instructions in the MCE handler and marking them as handled so that
we no longer crash.

An MCE that occurs like this is spurious, so we don't need to do
anything in terms of servicing it. If there is something that needs to
be serviced, the CPU will raise the MCE again with the correct DSISR
so that it can be serviced properly.

Signed-off-by: Michael Neuling <mikey at neuling.org>
Reviewed-by: Nicholas Piggin <npiggin at gmail.com>
--
v3: Simplification and SRR1 check suggestions from Nick
v2: update commit message based on Balbir's comments
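
For readers unfamiliar with IBM bit numbering, the following is a
minimal standalone sketch (not kernel code) of why "DSISR bit 33"
corresponds to the constant 0x40000000 tested in the patch. Bit 0 is
the most significant bit of the 64-bit register, so bit 33 is
1 << (63 - 33). The helper names ibm_bit64(), srr1_is_loadstore() and
mce_is_spurious_paste() are hypothetical and exist only for
illustration; the use of SRR1 bit 42 for the load/store indication is
an assumption about how the kernel's SRR1_MC_LOADSTORE() is defined.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: IBM numbering counts bit 0 as the most
 * significant bit of a 64-bit register.
 */
static inline uint64_t ibm_bit64(unsigned int bit)
{
	return UINT64_C(1) << (63 - bit);
}

/*
 * Assumption: mirrors the kernel's SRR1_MC_LOADSTORE(), taken here
 * to test SRR1 bit 42 (IBM numbering).
 */
static inline bool srr1_is_loadstore(uint64_t srr1)
{
	return (srr1 & ibm_bit64(42)) != 0;
}

/*
 * Same predicate as the check added by the patch: a load/store
 * machine check whose DSISR has bit 33 set and nothing else.
 */
static inline bool mce_is_spurious_paste(uint64_t srr1, uint32_t dsisr)
{
	return srr1_is_loadstore(srr1) && dsisr == (uint32_t)ibm_bit64(33);
}
```

Note that the comparison is an exact match on DSISR, not a mask test:
if any other DSISR bit is set alongside bit 33, the event still goes
through the normal error tables.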
---
 arch/powerpc/kernel/mce_power.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index b76ca198e0..e423cf0e43 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -624,5 +624,15 @@ long __machine_check_early_realmode_p8(struct pt_regs *regs)
 
 long __machine_check_early_realmode_p9(struct pt_regs *regs)
 {
+	/*
+	 * On POWER9 DD2.1 and below, it's possible to get a machine
+	 * check caused by a paste instruction where only DSISR bit 33
+	 * is set. This will result in the MCE handler seeing an
+	 * unknown event and us crashing.  Mark such events as
+	 * handled instead.
+	 */
+	if (SRR1_MC_LOADSTORE(regs->msr) && regs->dsisr == 0x40000000)
+		return 1;
+
 	return mce_handle_error(regs, mce_p9_derror_table, mce_p9_ierror_table);
 }
-- 
2.11.0


