[RFC PATCH 8/8] powerpc/papr_scm: Use FORM2 associativity details
Aneesh Kumar K.V
aneesh.kumar at linux.ibm.com
Fri Jun 18 00:04:11 AEST 2021
Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com> writes:
> David Gibson <david at gibson.dropbear.id.au> writes:
>
>> On Tue, Jun 15, 2021 at 12:35:17PM +0530, Aneesh Kumar K.V wrote:
>>> David Gibson <david at gibson.dropbear.id.au> writes:
>>>
>>> > On Tue, Jun 15, 2021 at 11:27:50AM +0530, Aneesh Kumar K.V wrote:
>>> >> David Gibson <david at gibson.dropbear.id.au> writes:
>>> >>
>>> >> > On Mon, Jun 14, 2021 at 10:10:03PM +0530, Aneesh Kumar K.V wrote:
> .....
>
>> I'm still not understanding why the latency we care about is different
>> in the two cases. Can you give an example of when this would result
>> in different actual node assignments for the two different cases?
>
> How about the below update?
>
> With Form2, the "ibm,associativity" property for resources is listed as below:
>
> "ibm,associativity" property for resources in node 0, 8 and 40
> { 3, 6, 7, 0 }
> { 3, 6, 9, 8 }
> { 4, 6, 7, 0, 40}
>
> With "ibm,associativity-reference-points" { 0x3, 0x2 }
>
> Form2 adds an additional property that can be used with devices such as persistent
> memory devices, which may also need to be presented as memory-only NUMA nodes.
>
> The "ibm,associativity-memory-node-reference-point" property contains a number
> giving the domainID index used to find the domainID that applies when the
> resource is used as a memory-only NUMA node. The NUMA distance information
> w.r.t. this domainID takes the latency of the media into account. A
> high-latency memory device will have a large NUMA distance value assigned w.r.t.
> the domainID found at the "ibm,associativity-memory-node-reference-point" domainID index.
>
> prop-encoded-array: An integer encoded as with encode-int specifying the domainID index
>
> In the above example:
> "ibm,associativity-memory-node-reference-point" { 0x4 }
>
> ex:
>
> ----------------------------------------
> |                       NUMA node0     |
> |    ProcA -----> MEMA                 |
> |      |                               |
> |      |                               |
> |      -------------------> PMEMB      |
> |                                      |
> ----------------------------------------
>
> ----------------------------------------
> |                       NUMA node1     |
> |                                      |
> |    ProcB -------> MEMC               |
> |      |                               |
> |      -------------------> PMEMD      |
> |                                      |
> |                                      |
> ----------------------------------------
>
> --------------------------------------------------------------------------------
> |                                  domainID 20                                 |
> |  ----------------------------------------                                    |
> |  |                       NUMA node0     |                                    |
> |  |                                      |           --------------------     |
> |  |    ProcA -------> MEMA               |           |   NUMA node40    |     |
> |  |      |                               |           |                  |     |
> |  |      --------------------------------|---------> |      PMEMB       |     |
> |  |                                      |           --------------------     |
> |  |                                      |                                    |
> |  ----------------------------------------                                    |
> |                                                                              |
> |  ----------------------------------------                                    |
> |  |                       NUMA node1     |                                    |
> |  |                                      |                                    |
> |  |    ProcB -------> MEMC               |           --------------------     |
> |  |      |                               |           |   NUMA node41    |     |
> |  |      ------------------------------------------> |      PMEMD       |     |
> |  |                                      |           --------------------     |
> |  |                                      |                                    |
> |  ----------------------------------------                                    |
> |                                                                              |
> --------------------------------------------------------------------------------
>
> For a topology like the above, an application running on ProcA wants to find the
> persistent memory mount local to its NUMA node. Hence, when using it as a
> pmem fsdax mount or devdax device, we want PMEMB to have the associativity
> of NUMA node0 and PMEMD to have the associativity of NUMA node1. But when
> we want to use it as memory via the dax kmem driver, we want both PMEMB
> and PMEMD to appear as memory-only NUMA nodes at a distance that is
> derived from the latency of the media.
>
> "ibm,associativity":
> PROCA/MEMA -> { 2, 20, 0 }
> PROCB/MEMC -> { 2, 20, 1 }
> PMEMB -> { 3, 20, 0, 40}
> PMEMD -> { 3, 20, 1, 41}
>
> "ibm,associativity-reference-points" -> { 2, 1 }
> "ibm,associativity-memory-node-reference-points" -> { 3 }
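
To make the lookup concrete, here is a rough Python sketch of how the two
reference-point properties would pick a node for the second example. It is
an illustration only, not the papr_scm code. The helper names
(domain_id_at, device_node, memory_node) are made up for this mail, and the
sketch assumes that each "ibm,associativity" array starts with a count of the
domainIDs that follow and that a reference point is a 1-based index into
those domainIDs. That reading is the one under which the values quoted above
are consistent. The second persistent memory region (PMEMD in the diagram)
is taken to be the one with associativity { 3, 20, 1, 41 }.

```python
# Illustrative sketch only: helper names and the array-layout assumption
# (element 0 is a count, reference points are 1-based) are not from the patch.

# "ibm,associativity" arrays from the example above.
ASSOC = {
    "PROCA/MEMA": [2, 20, 0],
    "PROCB/MEMC": [2, 20, 1],
    "PMEMB": [3, 20, 0, 40],
    "PMEMD": [3, 20, 1, 41],
}

REFERENCE_POINTS = [2, 1]        # "ibm,associativity-reference-points"
MEMORY_NODE_REFERENCE_POINT = 3  # memory-node reference point from the example


def domain_id_at(assoc, index):
    """Return the domainID stored at a 1-based domainID index."""
    count = assoc[0]
    if not 1 <= index <= count:
        raise IndexError(f"domainID index {index} outside 1..{count}")
    return assoc[index]


def device_node(assoc):
    """Node when the region is used as an fsdax mount or devdax device."""
    return domain_id_at(assoc, REFERENCE_POINTS[0])


def memory_node(assoc):
    """Node when the region is onlined as memory through the dax kmem driver."""
    return domain_id_at(assoc, MEMORY_NODE_REFERENCE_POINT)


if __name__ == "__main__":
    for name in ("PMEMB", "PMEMD"):
        assoc = ASSOC[name]
        print(name, "device node:", device_node(assoc),
              "memory-only node:", memory_node(assoc))
```

With these inputs PMEMB resolves to node 0 as a device and node 40 as
memory, and PMEMD to node 1 and node 41.
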
Another option is to make sure that numa-distance-value is populated
such that the PMEMB distance indicates it is closer to node0 than
to node1, i.e., node_distance[40][0] < node_distance[40][1]. One could
then infer the grouping from the distance values and not depend
on ibm,associativity for that purpose.
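
A small Python sketch of what inferring the grouping from distances could
look like. The distance numbers below are invented purely for illustration;
the only property taken from the discussion is the ordering
node_distance[40][0] < node_distance[40][1] (and the mirror case for node 41).
The function name nearest_cpu_node is likewise hypothetical.

```python
# Hypothetical distance table: the values are made up, only the ordering
# "node 40 is closer to node 0 than to node 1" reflects the discussion.
NODE_DISTANCE = {
    0: {0: 10, 1: 20, 40: 80, 41: 120},
    1: {0: 20, 1: 10, 40: 120, 41: 80},
    40: {0: 80, 1: 120, 40: 10, 41: 160},
    41: {0: 120, 1: 80, 40: 160, 41: 10},
}

CPU_NODES = [0, 1]  # nodes that have processors in the example topology


def nearest_cpu_node(mem_node, distance=NODE_DISTANCE, cpu_nodes=CPU_NODES):
    """Pick the CPU node with the smallest distance from a memory-only node."""
    return min(cpu_nodes, key=lambda cpu: distance[mem_node][cpu])


if __name__ == "__main__":
    for mem_node in (40, 41):
        print("node", mem_node, "groups with node", nearest_cpu_node(mem_node))
```

This only works if the firmware populates the distances so that the
ordering holds; with equal distances to both CPU nodes the sketch just
returns the first one in the list.
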
-aneesh