[Skiboot] [PATCH] numa/associativity: Add a new level of NUMA for GPUs
Balbir Singh
bsingharora at gmail.com
Thu Jul 6 11:57:54 AEST 2017
Today we have an issue where the NUMA nodes corresponding
to GPUs have the same affinity/distance as normal memory
nodes. Our reference points today support two levels:
[0x4, 0x4] for normal systems and [0x4, 0x3] for POWER8E
systems. This patch adds a third level, [0x4, X, 0x2], and
uses the node ID at all levels for GPU nodes.
Cc: Reza Arbab <arbab at linux.vnet.ibm.com>
Cc: Alistair Popple <alistair at popple.id.au>
Cc: Benjamin Herrenschmidt <benh at kernel.crashing.org>
Signed-off-by: Balbir Singh <bsingharora at gmail.com>
---
Tested on a system; verified that existing node distances
are not impacted. GPU nodes have a distance of 80 w.r.t.
all other nodes. No changes are needed in the Linux kernel.
core/affinity.c | 14 +++++++++-----
doc/device-tree/ibm,opal.rst | 2 +-
hw/npu2.c | 3 ++-
3 files changed, 12 insertions(+), 7 deletions(-)
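For illustration, the device-tree properties after this patch would look
roughly as follows (chip/node IDs are hypothetical, and the GPU memory
node's unit address is made up):

```dts
/* Second cell is 0x3 on POWER8E, 0x4 otherwise */
ibm,associativity-reference-points = <0x4 0x4 0x2>;

/* A GPU memory node carries its own node ID at every level, so it
 * mismatches every other node at all three reference points: */
memory@100000000 {
	ibm,associativity = <4 0x7 0x7 0x7 0x7>;
};
```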
diff --git a/core/affinity.c b/core/affinity.c
index 9f489d3..10d483d 100644
--- a/core/affinity.c
+++ b/core/affinity.c
@@ -72,10 +72,10 @@ void add_associativity_ref_point(void)
/*
* Note about our use of reference points:
*
- * Linux currently supports two levels of NUMA. We use the first
- * reference point for the node ID and the second reference point
- * for a second level of affinity. We always use the chip ID (4)
- * for the first reference point.
+ * Linux currently supports up to three levels of NUMA. We use the
+ * first reference point for the node ID and the second reference
+ * point for a second level of affinity. We always use the chip ID
+ * (4) for the first reference point.
*
* Choosing the second level of affinity is model specific
* unfortunately. Current POWER8E models should use the DCM
@@ -83,12 +83,16 @@ void add_associativity_ref_point(void)
*
* If there is a way to obtain this information from the FSP
* that would be ideal, but for now hardwire our POWER8E setting.
+ *
+ * For GPU nodes we add a third level of NUMA, such that the
+ * distance of the GPU node from all other nodes is uniformly
+ * the highest.
*/
if (PVR_TYPE(mfspr(SPR_PVR)) == PVR_TYPE_P8E)
ref2 = 0x3;
dt_add_property_cells(opal_node, "ibm,associativity-reference-points",
- 0x4, ref2);
+ 0x4, ref2, 0x2);
}
void add_chip_dev_associativity(struct dt_node *dev)
diff --git a/doc/device-tree/ibm,opal.rst b/doc/device-tree/ibm,opal.rst
index 149050c..932f41d 100644
--- a/doc/device-tree/ibm,opal.rst
+++ b/doc/device-tree/ibm,opal.rst
@@ -25,7 +25,7 @@ Top level ibm,opal node
* ibm,opal-v2 is *NOT* present on POWER9 and above.
*/
- ibm,associativity-reference-points = <0x4 0x3>;
+ ibm,associativity-reference-points = <0x4 0x3 0x2>;
ibm,heartbeat-ms = <0x7d0>;
/* how often any OPAL call needs to be made to avoid a watchdog timer on BMC
diff --git a/hw/npu2.c b/hw/npu2.c
index b81e49d..83451c3 100644
--- a/hw/npu2.c
+++ b/hw/npu2.c
@@ -521,7 +521,8 @@ static struct dt_node *npu2_create_memory_dn(uint64_t addr, uint64_t size)
dt_add_property_u64s(mem, "reg", addr, size);
dt_add_property_cells(mem, "ibm,chip-id", chip_id);
dt_add_property_u64s(mem, "linux,usable-memory", addr, 0);
- dt_add_property_cells(mem, "ibm,associativity", 4, 0, 0, 0, chip_id--);
+ dt_add_property_cells(mem, "ibm,associativity", 4, chip_id, chip_id, chip_id, chip_id);
+ chip_id--;
assert(chip_id);
return mem;
--
2.9.4