[RFC PATCH 8/8] powerpc/papr_scm: Use FORM2 associativity details

Aneesh Kumar K.V aneesh.kumar at linux.ibm.com
Thu Jun 17 20:59:01 AEST 2021


On 6/17/21 1:16 PM, David Gibson wrote:
> On Tue, Jun 15, 2021 at 12:35:17PM +0530, Aneesh Kumar K.V wrote:
>> David Gibson <david at gibson.dropbear.id.au> writes:
>>
>>> On Tue, Jun 15, 2021 at 11:27:50AM +0530, Aneesh Kumar K.V wrote:
>>>> David Gibson <david at gibson.dropbear.id.au> writes:

...

>>> It's weird to me that you'd want to consider them in different nodes
>>> for those different purposes.
>>
>>
>>     --------------------------------------
>>    |                            NUMA node0 |
>>    |    ProcA -----> MEMA                  |
>>    |     |                                 |
>>    |	|                                 |
>>    |	-------------------> PMEMB        |
>>    |                                       |
>>     ---------------------------------------
>>
>>     ---------------------------------------
>>    |                            NUMA node1 |
>>    |                                       |
>>    |    ProcB -------> MEMC                |
>>    |	|                                 |
>>    |	-------------------> PMEMD        |
>>    |                                       |
>>    |                                       |
>>     ---------------------------------------
>>   
>>
>> For a topology like the above application running of ProcA wants to find out
>> persistent memory mount local to its NUMA node. Hence when using it as
>> pmem fsdax mount or devdax device we want PMEMB to have associativity
>> of NUMA node0 and PMEMD to have associativity of NUMA node 1. But when
>> we want to use it as memory using dax kmem driver, we want both PMEMB
>> and PMEMD to appear as memory only NUMA node at a distance that is
>> derived based on the latency of the media.
> 
> I'm still not understanding why the latency we care about is different
> in the two cases.  Can you give an example of when this would result
> in different actual node assignments for the two different cases?
> 

In the above example in order allow use of PMEMB and PMEMD as memory 
only NUMA nodes
we need platform to represent them in its own domainID. Let's assume that
platform assigned id 40 and 41 and hence both PMEMB and PMEMD will have 
associativity array like below

{ 4, 6, 0}  -> PROCA/MEMA
{ 4, 6, 40} -> PMEMB
{ 4, 6, 41} -> PMEMD
{ 4, 6, 1} ->  PROCB/MEMB

When we want to use this device PMEMB and PMEMD as fsdax/devdax devices, 
we essentially look for the first nearest online node. Which means both 
PMEMB and PMEMD will appear as devices attached to node0. That is not 
ideal for for many applications.

using secondary domainID index as explained here helps to associate
each PMEM device to the right group. On a non virtualized config or hard 
partitioned config such a device tree representation can be looked at as 
a hint to identify which socket the actual device is connected to.

-aneesh


More information about the Linuxppc-dev mailing list