[PATCH REPOST] powerpc/rtas: Fix hang in race against concurrent cpu offline

Juliet Kim julietk at linux.vnet.ibm.com
Wed Jun 26 03:48:49 AEST 2019


The commit dfd718a2ed1f
("powerpc/rtas: Fix a potential race between CPU-Offline & Migration")
attempted to fix a hang in Live Partition Mobility (LPM) by abandoning
the LPM attempt if a race between LPM and concurrent CPU offline was
detected.

However, that fix failed to notify the hypervisor that the LPM attempt
had been abandoned, which results in a system hang.

Fix this by sending a signal to PHYP to cancel the migration, so that
PHYP can stop waiting and clean up the migration.

Fixes: dfd718a2ed1f ("powerpc/rtas: Fix a potential race between CPU-Offline & Migration")
Signed-off-by: Juliet Kim <julietk at linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/hvcall.h | 7 +++++++
 arch/powerpc/kernel/rtas.c        | 8 ++++++++
 2 files changed, 15 insertions(+)
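
Note for readers: the control flow this patch adds can be sketched in
userspace as below. This is only an illustrative model under stated
assumptions, not the kernel code. fake_hcall_norets(), struct fake_hv and
suspend_check_race() are hypothetical names that stand in for
plpar_hcall_norets(), the hypervisor, and the race check in
rtas_ibm_suspend_me(). The CPU masks are modelled as plain bitmaps, and
H_SUCCESS is assumed to be 0.

```c
#include <errno.h>
#include <stdio.h>

/* Constants as defined by this patch (H_SUCCESS assumed to be 0). */
#define H_SUCCESS               0
#define H_VASI_SIGNAL           0x2A0
#define H_SIGNAL_CANCEL_MIG     0x01
#define H_CPU_OFFLINE_DETECTED  0x0000000006000004

/* Hypothetical stand-in for the hypervisor: records the last call seen. */
struct fake_hv {
	unsigned long last_opcode;
	unsigned long last_handle;
	unsigned long last_signal;
	unsigned long last_reason;
	long ret;	/* value the fake hcall returns */
	int calls;	/* number of hcalls issued so far */
};

static struct fake_hv hv;

/* Hypothetical stub used here in place of plpar_hcall_norets(). */
static long fake_hcall_norets(unsigned long opcode, unsigned long handle,
			      unsigned long signal, unsigned long reason)
{
	hv.last_opcode = opcode;
	hv.last_handle = handle;
	hv.last_signal = signal;
	hv.last_reason = reason;
	hv.calls++;
	return hv.ret;
}

/*
 * Model of the race check: if the present and online CPU bitmaps differ,
 * tell the (fake) hypervisor to cancel the migration before giving up,
 * so that it does not keep waiting for a suspend that will never happen.
 */
static int suspend_check_race(unsigned long handle,
			      unsigned long present_mask,
			      unsigned long online_mask)
{
	long rc;

	if (present_mask != online_mask) {
		rc = fake_hcall_norets(H_VASI_SIGNAL, handle,
				       H_SIGNAL_CANCEL_MIG,
				       H_CPU_OFFLINE_DETECTED);
		if (rc != H_SUCCESS)
			fprintf(stderr, "%s: vasi_signal failed %ld\n",
				__func__, rc);

		fprintf(stderr, "%s: Raced against a concurrent CPU-Offline\n",
			__func__);
		return -EBUSY;
	}

	return 0;
}
```

In this model, suspend_check_race(handle, 0xF, 0xF) returns 0 without
touching the fake hypervisor, while suspend_check_race(handle, 0xF, 0x7)
issues exactly one H_VASI_SIGNAL cancel and returns -EBUSY. The cancel is
sent even though the migration is abandoned either way, and a failure of
the signal is only logged, mirroring the hunk in rtas.c.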

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index 463c63a..29ca285 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -261,6 +261,7 @@
 #define H_ADD_CONN		0x284
 #define H_DEL_CONN		0x288
 #define H_JOIN			0x298
+#define H_VASI_SIGNAL           0x2A0
 #define H_VASI_STATE            0x2A4
 #define H_VIOCTL		0x2A8
 #define H_ENABLE_CRQ		0x2B0
@@ -348,6 +349,12 @@
 #define H_SIGNAL_SYS_RESET_ALL_OTHERS		-2
 /* >= 0 values are CPU number */
 
+/* Values for argument to H_VASI_SIGNAL */
+#define H_SIGNAL_CANCEL_MIG     0x01
+
+/* Values for 2nd argument to H_VASI_SIGNAL */
+#define H_CPU_OFFLINE_DETECTED  0x0000000006000004
+
 /* H_GET_CPU_CHARACTERISTICS return values */
 #define H_CPU_CHAR_SPEC_BAR_ORI31	(1ull << 63) // IBM bit 0
 #define H_CPU_CHAR_BCCTRL_SERIALISED	(1ull << 62) // IBM bit 1
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index b824f4c..f9002b7 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -981,6 +981,14 @@ int rtas_ibm_suspend_me(u64 handle)
 
 	/* Check if we raced with a CPU-Offline Operation */
 	if (unlikely(!cpumask_equal(cpu_present_mask, cpu_online_mask))) {
+
+		/* We use CANCEL, not ABORT, to gracefully cancel the migration */
+		rc = plpar_hcall_norets(H_VASI_SIGNAL, handle,
+			H_SIGNAL_CANCEL_MIG, H_CPU_OFFLINE_DETECTED);
+
+		if (rc != H_SUCCESS)
+			pr_err("%s: vasi_signal failed %ld\n", __func__, rc);
+
 		pr_err("%s: Raced against a concurrent CPU-Offline\n",
 		       __func__);
 		atomic_set(&data.error, -EBUSY);
-- 
1.8.3.1


