[PATCH REPOST] powerpc/rtas: Fix hang in race against concurrent cpu offline
Juliet Kim
julietk at linux.vnet.ibm.com
Wed Jun 26 03:48:49 AEST 2019
Commit dfd718a2ed1f
("powerpc/rtas: Fix a potential race between CPU-Offline & Migration")
attempted to fix a hang in Live Partition Mobility (LPM) by abandoning
the LPM attempt if a race between LPM and a concurrent CPU offline was
detected.
However, that fix failed to notify the hypervisor (PHYP) that the LPM
attempt had been abandoned, which results in a system hang.
Fix this by sending a signal to PHYP to cancel the migration, so that
PHYP can stop waiting and clean up the migration.
Fixes: dfd718a2ed1f ("powerpc/rtas: Fix a potential race between CPU-Offline & Migration")
Signed-off-by: Juliet Kim <julietk at linux.vnet.ibm.com>
---
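The flow described in the commit message can be sketched in user space as
follows. This is only an illustrative model, not kernel code: struct
fake_phyp, fake_vasi_signal() and suspend_precheck() are hypothetical names
that stand in for the hypervisor, for the plpar_hcall_norets(H_VASI_SIGNAL,
...) call and for the race check in rtas_ibm_suspend_me(). CPU masks are
modelled as plain 64-bit bitmasks.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the patch (H_SUCCESS is 0 in hvcall.h). */
#define H_SUCCESS              0
#define H_SIGNAL_CANCEL_MIG    0x01
#define H_CPU_OFFLINE_DETECTED 0x0000000006000004ULL

/* Hypothetical model of the hypervisor side of a migration. */
struct fake_phyp {
	bool     waiting;     /* true while PHYP waits for the partition */
	uint64_t last_reason; /* reason code of the last cancel signal */
};

/* Hypothetical stand-in for plpar_hcall_norets(H_VASI_SIGNAL, ...). */
long fake_vasi_signal(struct fake_phyp *hv, uint64_t handle,
		      uint64_t signal, uint64_t reason)
{
	(void)handle; /* unused in this model */
	if (signal != H_SIGNAL_CANCEL_MIG)
		return -1; /* the model only understands the cancel signal */
	hv->waiting = false; /* PHYP stops waiting and can clean up */
	hv->last_reason = reason;
	return H_SUCCESS;
}

/*
 * Model of the race check: if some present CPU is not online, cancel the
 * migration on the hypervisor side before giving up with -EBUSY. As in the
 * patch, a failed signal does not change the outcome of the check.
 */
int suspend_precheck(struct fake_phyp *hv, uint64_t handle,
		     uint64_t present_mask, uint64_t online_mask)
{
	if (present_mask != online_mask) {
		(void)fake_vasi_signal(hv, handle, H_SIGNAL_CANCEL_MIG,
				       H_CPU_OFFLINE_DETECTED);
		return -EBUSY;
	}
	return 0;
}
```

In this model, calling suspend_precheck() with present_mask 0xF and
online_mask 0x7 returns -EBUSY and leaves hv.waiting false, which is the
behaviour the patch adds. Without the signal, the modelled hypervisor would
keep waiting.
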
arch/powerpc/include/asm/hvcall.h | 7 +++++++
arch/powerpc/kernel/rtas.c | 8 ++++++++
2 files changed, 15 insertions(+)
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index 463c63a..29ca285 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -261,6 +261,7 @@
#define H_ADD_CONN 0x284
#define H_DEL_CONN 0x288
#define H_JOIN 0x298
+#define H_VASI_SIGNAL 0x2A0
#define H_VASI_STATE 0x2A4
#define H_VIOCTL 0x2A8
#define H_ENABLE_CRQ 0x2B0
@@ -348,6 +349,12 @@
#define H_SIGNAL_SYS_RESET_ALL_OTHERS -2
/* >= 0 values are CPU number */
+/* Values for argument to H_VASI_SIGNAL */
+#define H_SIGNAL_CANCEL_MIG 0x01
+
+/* Values for 2nd argument to H_VASI_SIGNAL */
+#define H_CPU_OFFLINE_DETECTED 0x0000000006000004
+
/* H_GET_CPU_CHARACTERISTICS return values */
#define H_CPU_CHAR_SPEC_BAR_ORI31 (1ull << 63) // IBM bit 0
#define H_CPU_CHAR_BCCTRL_SERIALISED (1ull << 62) // IBM bit 1
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index b824f4c..f9002b7 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -981,6 +981,14 @@ int rtas_ibm_suspend_me(u64 handle)
/* Check if we raced with a CPU-Offline Operation */
if (unlikely(!cpumask_equal(cpu_present_mask, cpu_online_mask))) {
+
+ /* We use CANCEL, not ABORT, to gracefully cancel the migration */
+ rc = plpar_hcall_norets(H_VASI_SIGNAL, handle,
+ H_SIGNAL_CANCEL_MIG, H_CPU_OFFLINE_DETECTED);
+
+ if (rc != H_SUCCESS)
+ pr_err("%s: vasi_signal failed %ld\n", __func__, rc);
+
pr_err("%s: Raced against a concurrent CPU-Offline\n",
__func__);
atomic_set(&data.error, -EBUSY);
--
1.8.3.1