[Skiboot] [PATCH v2] opal/hmi: Fix the soft lockup issue on HMI for certain TB errors.

Wed Oct 7 05:48:06 AEDT 2015

From: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>

A while loop from wait_for_subcore_threads() function loops until one
thread from each subcore completes the pre-cleanup task and sets a
cleanup done bit.

	while (!(*(this_cpu()->core_hmi_state_ptr) & HMI_STATE_CLEANUP_DONE))
		cpu_relax();

Without a memory barrier we see that the compiler optimizes the above while
loop not to re-fetch the data from memory pointed by
this_cpu()->core_hmi_state_ptr. This makes CPU to spin infinitely
even though the other CPUs have modified the data causing soft lockup in
kernel.

There are two ways to fix this, 1) introduce volatile specifier to force
re-read the fresh value from the memory. 2) Add barrier() call to cpu_relax().
Second approach will avoid similar bugs in future.

This patch uses the second approach to fix this issue.

This patch also introduces a timeout for the while loop to handle a worst
situation where all other threads are badly stuck without setting a
cleanup done bit. Under such situation timeout will help to avoid soft
lockups and report failure to kernel.

Signed-off-by: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>
---
 core/hmi.c    |   18 +++++++++++++++++-
 include/cpu.h |    1 +
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/core/hmi.c b/core/hmi.c
index ee556fc..aeeabe8 100644
--- a/core/hmi.c
+++ b/core/hmi.c
@@ -164,6 +164,9 @@
  */
 #define NX_HMI_ACTIVE		PPC_BIT(54)
 
+/* Number of iterations for the various timeouts */
+#define TIMEOUT_LOOPS		20000000
+
 static const struct core_xstop_bit_info {
 	uint8_t bit;		/* CORE FIR bit number */
 	enum OpalHMI_CoreXstopReason reason;
@@ -448,8 +451,21 @@ static int decode_malfunction(struct OpalHMIEvent *hmi_evt)
 
 static void wait_for_subcore_threads(void)
 {
-	while (!(*(this_cpu()->core_hmi_state_ptr) & HMI_STATE_CLEANUP_DONE))
+	uint64_t timeout = 0;
+
+	while (!(*(this_cpu()->core_hmi_state_ptr) & HMI_STATE_CLEANUP_DONE)) {
+		if (++timeout >= (TIMEOUT_LOOPS*3)) {
+			/*
+			 * Break out the loop here and fall through
+			 * recovery code. If recovery fails, kernel will get
+			 * informed about the failure. This way we can avoid
+			 * looping here if other threads are stuck.
+			 */
+			prlog(PR_DEBUG, "HMI: TB pre-recovery timeout\n");
+			break;
+		}
 		cpu_relax();
+	}
 }
 
 /*
diff --git a/include/cpu.h b/include/cpu.h
index 03a51f9..982b48a 100644
--- a/include/cpu.h
+++ b/include/cpu.h
@@ -113,6 +113,7 @@ static inline void __nomcount cpu_relax(void)
 		     "nop; nop; nop; nop;\n"
 		     "nop; nop; nop; nop;\n");
 	smt_medium();
+	barrier();
 }
 
 /* Initialize CPUs */