[Skiboot] [PATCH 02/22] core/mce: POWER9 fix machine check decoding of async errors

Vasant Hegde hegdevasant at linux.vnet.ibm.com
Fri Jun 25 16:19:17 AEST 2021


From: Nicholas Piggin <npiggin at gmail.com>

Async machine check errors due to a bad real address from a store or
a foreign link timeout come with the load/store bit (PPC bit 42) set
in SRR1, but the cause is reported in SRR1 rather than DSISR, unlike
other errors that have the load/store bit set.

This behaviour was omitted from the POWER9 User Manual, but it has
been confirmed to be the expected behaviour. Update the machine check
decoder to match.

Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
---
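Illustration only, not part of the change: a minimal standalone sketch
of the SRR1 fixup described above. The function name
p9_fixup_async_srr1 and the P9_ASYNC_* macro names are hypothetical.
PPC_BIT and SRR1_MC_LOADSTORE are redefined locally so the snippet
compiles on its own; the local definitions are assumed to match the
ones skiboot uses.

```c
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of 64. */
#define PPC_BIT(bit)            (0x8000000000000000ULL >> (bit))
#define SRR1_MC_LOADSTORE(srr1) ((srr1) & PPC_BIT(42))

/* Hypothetical names for the mask and the two values the patch tests. */
#define P9_ASYNC_CAUSE_MASK     0x081c0000ULL
#define P9_ASYNC_CAUSE_VAL1     0x08140000ULL
#define P9_ASYNC_CAUSE_VAL2     0x08180000ULL

/*
 * Return srr1 with the load/store bit cleared when the cause field
 * matches one of the two async patterns, so that a decoder keyed on
 * that bit falls through to the SRR1-based (ierror) table.
 * Any other srr1 value is returned unchanged.
 */
static uint64_t p9_fixup_async_srr1(uint64_t srr1)
{
	uint64_t cause = srr1 & P9_ASYNC_CAUSE_MASK;

	if (SRR1_MC_LOADSTORE(srr1) &&
	    (cause == P9_ASYNC_CAUSE_VAL1 || cause == P9_ASYNC_CAUSE_VAL2))
		srr1 &= ~PPC_BIT(42);

	return srr1;
}
```

Because PPC bit 42 (0x00200000) lies outside the 0x081c0000 mask,
clearing it does not disturb the cause field that the ierror table
lookup later matches against.
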
 core/mce.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/core/mce.c b/core/mce.c
index a07eeb6a8..3f5091628 100644
--- a/core/mce.c
+++ b/core/mce.c
@@ -175,6 +175,19 @@ void decode_mce(uint64_t srr0, uint64_t srr1,
 		return;
 	}
 
+	/*
+	 * Async machine check due to bad real address from store or foreign
+	 * link timeout comes with the load/store bit (PPC bit 42) set in
+	 * SRR1, but the cause comes in SRR1 not DSISR. Clear bit 42 so that
+	 * decoding is directed to the ierror table, which finds the cause
+	 * (and describes it correctly as a store error).
+	 */
+	if (SRR1_MC_LOADSTORE(srr1) &&
+			((srr1 & 0x081c0000) == 0x08140000 ||
+			 (srr1 & 0x081c0000) == 0x08180000)) {
+		srr1 &= ~PPC_BIT(42);
+	}
+
 	if (SRR1_MC_LOADSTORE(srr1)) {
 		decode_derror(mce_p9_derror_table, dsisr, type, error_str);
 		if (*type & MCE_INVOLVED_EA)
-- 
2.31.1


