[PATCH v2 1/2] sched/topology: Skip updating masks for non-online nodes

Srikar Dronamraju srikar at linux.vnet.ibm.com
Thu Jul 1 14:15:51 AEST 2021


Currently the scheduler doesn't check whether a node is online before
adding CPUs to the node mask. However, on some architectures, node
distance is only available for nodes that are online. It is unclear how
much the node distance can be relied upon when one of the nodes is
offline.

If said node distance is fake (since one of the nodes is offline) and
the actual node distance is different, then the cpumask of such nodes
will be wrong when the nodes come online.

This can cause topology_span_sane() to emit a warning and leave the
rest of the topology not updated properly.

Resolve this by skipping update of cpumask for nodes that are not
online.

However, by skipping, relevant CPUs may not be set when nodes are
onlined. That is, when computing the NUMA mask at a given NUMA
distance, CPUs that belong to other, already online nodes will not be
part of the newly onlined node's NUMA mask. Hence, the first time a CPU
is added to the newly onlined node, add the other CPUs to the numa_mask.

Cc: LKML <linux-kernel at vger.kernel.org>
Cc: linuxppc-dev at lists.ozlabs.org
Cc: Nathan Lynch <nathanl at linux.ibm.com>
Cc: Michael Ellerman <mpe at ellerman.id.au>
Cc: Ingo Molnar <mingo at kernel.org>
Cc: Peter Zijlstra <peterz at infradead.org>
Cc: Valentin Schneider <valentin.schneider at arm.com>
Cc: Gautham R Shenoy <ego at linux.vnet.ibm.com>
Cc: Dietmar Eggemann <dietmar.eggemann at arm.com>
Cc: Mel Gorman <mgorman at techsingularity.net>
Cc: Vincent Guittot <vincent.guittot at linaro.org>
Cc: Rik van Riel <riel at surriel.com>
Cc: Geetika Moolchandani <Geetika.Moolchandani1 at ibm.com>
Cc: Laurent Dufour <ldufour at linux.ibm.com>
Reported-by: Geetika Moolchandani <Geetika.Moolchandani1 at ibm.com>
Signed-off-by: Srikar Dronamraju <srikar at linux.vnet.ibm.com>
---
Changelog v1->v2:
v1 link: http://lore.kernel.org/lkml/20210520154427.1041031-4-srikar@linux.vnet.ibm.com/t/#u
Update the NUMA masks whenever the first CPU is added to a cpuless node
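
The mask update described in the commit message can be modeled in
userspace roughly as follows. This is only an illustrative sketch, not
kernel code: struct numa_model, model_two_nodes() and model_masks_set()
are hypothetical names, the cpumasks are reduced to 64-bit words, and
the two-node distance table (10 local, 20 remote) is an assumed example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MODEL_NODES  2
#define MODEL_LEVELS 2

struct numa_model {
	bool online[MODEL_NODES];                  /* stands in for node_online() */
	int distance[MODEL_NODES][MODEL_NODES];    /* stands in for node_distance() */
	int level_distance[MODEL_LEVELS];          /* sched_domains_numa_distance[] */
	uint64_t masks[MODEL_LEVELS][MODEL_NODES]; /* one bit per CPU */
};

/* Assumed example: local distance 10, remote 20, only node 0 online. */
static struct numa_model model_two_nodes(void)
{
	struct numa_model m = {
		.online = { true, false },
		.distance = { { 10, 20 }, { 20, 10 } },
		.level_distance = { 10, 20 },
	};
	return m;
}

/* Mirrors the control flow of the patched sched_domains_numa_masks_set(). */
static void model_masks_set(struct numa_model *m, unsigned int cpu, int node)
{
	bool empty;
	int i, j;

	assert(cpu < 64 && node >= 0 && node < MODEL_NODES);

	/* True only when this is the first CPU seen on 'node'. */
	empty = (m->masks[0][node] == 0);

	for (i = 0; i < MODEL_LEVELS; i++) {
		for (j = 0; j < MODEL_NODES; j++) {
			/* Offline nodes are skipped: their distance may be fake. */
			if (!m->online[j])
				continue;

			if (m->distance[j][node] <= m->level_distance[i]) {
				m->masks[i][j] |= UINT64_C(1) << cpu;

				/* Backfill CPUs of nodes that were online earlier. */
				if (node != j && empty)
					m->masks[i][node] |= m->masks[0][j];
			}
		}
	}
}
```

In this model, adding CPUs 0 and 1 to node 0 while node 1 is offline
leaves node 1's masks empty. Once node 1 is marked online and CPU 2 is
added to it, the backfill step pulls CPUs 0 and 1 into node 1's mask at
distance 20, so masks[1][1] ends up as 0x7 rather than 0x4.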

 kernel/sched/topology.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index b77ad49dc14f..f25dbcab4fd2 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1833,6 +1833,9 @@ void sched_init_numa(void)
 			sched_domains_numa_masks[i][j] = mask;
 
 			for_each_node(k) {
+				if (!node_online(j))
+					continue;
+
 				if (sched_debug() && (node_distance(j, k) != node_distance(k, j)))
 					sched_numa_warn("Node-distance not symmetric");
 
@@ -1891,12 +1894,30 @@ void sched_init_numa(void)
 void sched_domains_numa_masks_set(unsigned int cpu)
 {
 	int node = cpu_to_node(cpu);
-	int i, j;
+	int i, j, empty;
 
+	empty = cpumask_empty(sched_domains_numa_masks[0][node]);
 	for (i = 0; i < sched_domains_numa_levels; i++) {
 		for (j = 0; j < nr_node_ids; j++) {
-			if (node_distance(j, node) <= sched_domains_numa_distance[i])
+			if (!node_online(j))
+				continue;
+
+			if (node_distance(j, node) <= sched_domains_numa_distance[i]) {
 				cpumask_set_cpu(cpu, sched_domains_numa_masks[i][j]);
+
+				/*
+				 * We skip updating numa_masks for offline
+				 * nodes. However now that the node is
+				 * finally online, CPUs that were added
+				 * earlier, should now be accommodated into
+				 * newly online node's numa mask.
+				 */
+				if (node != j && empty) {
+					cpumask_or(sched_domains_numa_masks[i][node],
+							sched_domains_numa_masks[i][node],
+							sched_domains_numa_masks[0][j]);
+				}
+			}
 		}
 	}
 }
-- 
2.27.0