[Skiboot] [PATCH 19/22] core/cpu: Initialize all cpu thread areas to avoid invalid memory access.

Vasant Hegde hegdevasant at linux.vnet.ibm.com
Fri Jun 25 16:19:34 AEST 2021


From: Mahesh Salgaonkar <mahesh at linux.ibm.com>

Starting with P10, hostboot no longer clears system memory other than its
own space. OPAL uses the memory at SKIBOOT_BASE + SKIBOOT_SIZE for cpu
stacks, indexed by pir. With hostboot no longer clearing memory, this region
may hold junk. Currently OPAL initializes cpu stack memory only for the
cpu pirs found in the device-tree; for the rest, the cpu thread contents
are left uninitialized. This sometimes causes the for_each_cpu* macros to
return a cpu thread for a pir/cpu that isn't present on the system. The
for_each_cpu* macros iterate over the cpu stacks using pir as the index and
return a cpu thread pointer if state != cpu_state_no_cpu. For cpus not
found in the device-tree, the state may hold a junk value, leading OPAL to
access an invalid cpu thread area. That in turn leads to dereferencing
pointers with junk values, causing a machine check (MCE) during OPAL init.
Fix this by initializing all the cpu thread areas up to cpu_max_pir.
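
The failure mode described above can be sketched with a small standalone
model. This is only an illustration under stated assumptions: the struct
layout, MAX_PIR, the 0xa5 fill pattern and all helper names below are
hypothetical stand-ins, not skiboot's real definitions, and unlike the
actual patch the model does not special-case the boot cpu.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for skiboot's types; not the real definitions. */
enum cpu_thread_state {
	cpu_state_no_cpu = 0,
	cpu_state_present,
};

struct cpu_thread {
	unsigned int pir;
	enum cpu_thread_state state;
};

#define MAX_PIR 7	/* hypothetical cpu_max_pir for this model */

static struct cpu_thread stacks[MAX_PIR + 1];

/* Model memory that the earlier boot stage did not clear. */
static void fill_with_junk(void)
{
	memset(stacks, 0xa5, sizeof(stacks));
}

/* Model the device-tree pass: only populated pirs get initialized. */
static void init_present_cpu(unsigned int pir)
{
	stacks[pir].pir = pir;
	stacks[pir].state = cpu_state_present;
}

/* The idea of the fix: give every slot a known state up front. */
static void mark_all_absent(void)
{
	unsigned int pir;

	for (pir = 0; pir <= MAX_PIR; pir++)
		stacks[pir].state = cpu_state_no_cpu;
}

/* Count the slots an iterator testing state != cpu_state_no_cpu visits. */
static unsigned int count_visible_cpus(void)
{
	unsigned int pir, n = 0;

	for (pir = 0; pir <= MAX_PIR; pir++)
		if (stacks[pir].state != cpu_state_no_cpu)
			n++;
	return n;
}
```

In this model, calling fill_with_junk() and then initializing only pirs 0
and 4 leaves every slot looking like a cpu to the iterator. Calling
mark_all_absent() before the per-cpu initialization leaves only the two
populated slots visible.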

[  182.049714372,3] ***********************************************
[  182.049878580,3] Fatal MCE at 0000000030039738   .init_trace_buffers+0x21c  MSR 9000000000201002
[  182.049943811,3] Cause: load real address error
[  182.049968681,3] Effective address: 0x480113a4791c4a50
[  182.050000736,3] CFAR : 00000000300395b8 MSR  : 9000000000201002
[  182.050035376,3] SRR0 : 0000000030039738 SRR1 : 9000000000201002
[  182.050072878,3] HSRR0: 0000000030020024 HSRR1: 9000000000001000
[  182.050117303,3] DSISR: 00000040         DAR  : 480113a4791c4a50
[  182.050149054,3] LR   : 0000000030039744 CTR  : 0000000000000000
[  182.050182991,3] CR   : 42000224         XER  : 00000000
[  182.050217262,3] GPR00: 000000003003962c GPR16: 0000000032d50000
[  182.050255746,3] GPR01: 0000000032d53a50 GPR17: 0000000030003198
[  182.050288081,3] GPR02: 000000003014cb00 GPR18: 0000000000000000
[  182.050331474,3] GPR03: 0000000031c50000 GPR19: 0000000000000000
[  182.050371934,3] GPR04: 0000000000000000 GPR20: 0000000000000000
[  182.050416212,3] GPR05: ffffffffffffffff GPR21: 0000000000000001
[  182.050454130,3] GPR06: 0000000000000005 GPR22: 00000000300f74eb
[  182.050488053,3] GPR07: 0000000000000028 GPR23: 00000000000fffd8
[  182.050522774,3] GPR08: 000000000000067f GPR24: 00000000000fff40
[  182.050566878,3] GPR09: 480113a4791c4a18 GPR25: 0000000000000070
[  182.050601524,3] GPR10: 00000000078b0353 GPR26: 00000000300f7527
[  182.050640345,3] GPR11: 0000000000000000 GPR27: 00000000300f7516
[  182.050680816,3] GPR12: 0000000042000222 GPR28: 000000003acd0000
[  182.050724099,3] GPR13: 000000000025a908 GPR29: 000000003acd0000
[  182.050759728,3] GPR14: 0000000000000000 GPR30: 0000000000000000
[  182.050790430,3] GPR15: 0000000000000000 GPR31: 00000000301f0038
CPU 0228 Backtrace:
 S: 0000000032d53d60 R: 000000003003962c   .init_trace_buffers+0x110
 S: 0000000032d53e30 R: 0000000030022f84   .main_cpu_entry+0x550
 S: 0000000032d53f00 R: 00000000300031f8   not_fused+0x11c

Signed-off-by: Mahesh Salgaonkar <mahesh at linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
[Folded in Nick's patch that added mark_all_secondary_cpus_absent() - Vasant]
Signed-off-by: Vasant Hegde <hegdevasant at linux.vnet.ibm.com>
---
 core/cpu.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/core/cpu.c b/core/cpu.c
index f2b5bbc5d..dbc1ff445 100644
--- a/core/cpu.c
+++ b/core/cpu.c
@@ -1150,10 +1150,30 @@ void init_cpu_max_pir(void)
 	prlog(PR_DEBUG, "CPU: New max PIR set to 0x%x\n", cpu_max_pir);
 }
 
+/*
+ * Set cpu->state to cpu_state_no_cpu for all secondaries, before the dt is
+ * parsed and they will be flipped to present as populated CPUs are found.
+ *
+ * Some configurations (e.g., with memory encryption) will not zero system
+ * memory at boot, so can't rely on cpu->state to be zero (== cpu_state_no_cpu).
+ */
+static void mark_all_secondary_cpus_absent(void)
+{
+	unsigned int pir;
+	struct cpu_thread *cpu;
+
+	for (pir = 0; pir <= cpu_max_pir; pir++) {
+		cpu = &cpu_stacks[pir].cpu;
+		if (cpu == boot_cpu)
+			continue;
+		cpu->state = cpu_state_no_cpu;
+	}
+}
+
 void init_all_cpus(void)
 {
 	struct dt_node *cpus, *cpu;
-	unsigned int thread;
+	unsigned int pir, thread;
 	int dec_bits = find_dec_bits();
 
 	cpus = dt_find_by_path(dt_root, "/cpus");
@@ -1161,9 +1181,11 @@ void init_all_cpus(void)
 
 	init_tm_suspend_mode_property();
 
+	mark_all_secondary_cpus_absent();
+
 	/* Iterate all CPUs in the device-tree */
 	dt_for_each_child(cpus, cpu) {
-		unsigned int pir, server_no, chip_id, threads;
+		unsigned int server_no, chip_id, threads;
 		enum cpu_thread_state state;
 		const struct dt_property *p;
 		struct cpu_thread *t, *pt0, *pt1;
-- 
2.31.1