[RFC PATCH 8/8] powerpc/papr_scm: Use FORM2 associativity details

Aneesh Kumar K.V aneesh.kumar at linux.ibm.com
Thu Jun 17 21:11:13 AEST 2021


Daniel Henrique Barboza <danielhb413 at gmail.com> writes:

> On 6/17/21 4:46 AM, David Gibson wrote:
>> On Tue, Jun 15, 2021 at 12:35:17PM +0530, Aneesh Kumar K.V wrote:
>>> David Gibson <david at gibson.dropbear.id.au> writes:
>>>
>>>> On Tue, Jun 15, 2021 at 11:27:50AM +0530, Aneesh Kumar K.V wrote:
>>>>> David Gibson <david at gibson.dropbear.id.au> writes:
>>>>>
>>>>>> On Mon, Jun 14, 2021 at 10:10:03PM +0530, Aneesh Kumar K.V wrote:
>>>>>>> FORM2 introduces a concept of secondary domain which is identical to the
>>>>>>> concept of FORM1 primary domain. Use secondary domain as the numa node
>>>>>>> when using persistent memory device. For DAX kmem use the logical domain
>>>>>>> id introduced in FORM2. This new numa node
>>>>>>>
>>>>>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com>
>>>>>>> ---
>>>>>>>   arch/powerpc/mm/numa.c                    | 28 +++++++++++++++++++++++
>>>>>>>   arch/powerpc/platforms/pseries/papr_scm.c | 26 +++++++++++++--------
>>>>>>>   arch/powerpc/platforms/pseries/pseries.h  |  1 +
>>>>>>>   3 files changed, 45 insertions(+), 10 deletions(-)
>>>>>>>
>>>>>>> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
>>>>>>> index 86cd2af014f7..b9ac6d02e944 100644
>>>>>>> --- a/arch/powerpc/mm/numa.c
>>>>>>> +++ b/arch/powerpc/mm/numa.c
>>>>>>> @@ -265,6 +265,34 @@ static int associativity_to_nid(const __be32 *associativity)
>>>>>>>   	return nid;
>>>>>>>   }
>>>>>>>   
>>>>>>> +int get_primary_and_secondary_domain(struct device_node *node, int *primary, int *secondary)
>>>>>>> +{
>>>>>>> +	int secondary_index;
>>>>>>> +	const __be32 *associativity;
>>>>>>> +
>>>>>>> +	if (!numa_enabled) {
>>>>>>> +		*primary = NUMA_NO_NODE;
>>>>>>> +		*secondary = NUMA_NO_NODE;
>>>>>>> +		return 0;
>>>>>>> +	}
>>>>>>> +
>>>>>>> +	associativity = of_get_associativity(node);
>>>>>>> +	if (!associativity)
>>>>>>> +		return -ENODEV;
>>>>>>> +
>>>>>>> +	if (of_read_number(associativity, 1) >= primary_domain_index) {
>>>>>>> +		*primary = of_read_number(&associativity[primary_domain_index], 1);
>>>>>>> +		secondary_index = of_read_number(&distance_ref_points[1], 1);
>>>>>>
>>>>>> Secondary ID is always the second reference point, but primary depends
>>>>>> on the length of resources?  That seems very weird.
>>>>>
>>>>> primary_domain_index is distance_ref_points[0]. With Form2 we would find
>>>>> both primary and secondary domain ID to be the same for all resources other than
>>>>> persistent memory device. The usage w.r.t. persistent memory is
>>>>> explained in patch 7.
>>>>
>>>> Right, I misunderstood
>>>>
>>>>>
>>>>> With Form2 the primary domainID and secondary domainID are used to identify the NUMA nodes
>>>>> the kernel should use when using persistent memory devices.
>>>>
>>>> This seems kind of bogus.  With Form1, the primary/secondary ID are a
>>>> sort of hierarchy of distance (things with same primary ID are very
>>>> close, things with same secondary are kinda-close, etc.).  With Form2,
>>>> it's referring to their effective node for different purposes.
>>>>
>>>> Using the same terms for different meanings seems unnecessarily
>>>> confusing.
>>>
>>> They are essentially domainIDs. Their interpretation is different
>>> between Form1 and Form2. Hence I kept referring to them as primary and
>>> secondary domainID. Any suggestion on what to name them with Form2?
>> 
>> My point is that reusing associativity-reference-points for something
>> with completely unrelated semantics seems like a very poor choice.
>
>
> I agree that this reuse can be confusing. I could argue that there is
> precedent for that in PAPR - FORM0 puts a different spin on the same
> property as well - but there is no need to keep following existing PAPR
> practices in a new spec (and some might argue it's best not to).
>
> As far as QEMU goes, renaming this property to "numa-associativity-mode"
> (just an example) is a quick change to do since we separated FORM1 and FORM2
> code over there.
>
> Doing such a rename can also help with the issue of having to describe new
> FORM2 semantics using "least significant boundary" or "primary domain" or
> any FORM0|FORM1 related terminology.
>

It is not just a matter of changing the name; we would then also have to
explain the meaning of ibm,associativity-reference-points with FORM2, right?

With FORM2 we want to represent the topology better:

 --------------------------------------------------------------------------------
|                                                         domainID 20            |
|   ---------------------------------------                                      |
|  |                            NUMA node1 |                                     |
|  |                                       |            --------------------     |
|  |    ProcB -------> MEMC                |           |        NUMA node40 |    |
|  |    |                                  |           |                    |    |
|  |    ---------------------------------- |-------->  |  PMEMD             |    |
|  |                                       |            --------------------     |
|  |                                       |                                     |
|   ---------------------------------------                                      |
 --------------------------------------------------------------------------------

ibm,associativity:
        { 20, 1, 40}  -> PMEMD
        { 20, 1, 1}  -> PROCB/MEMC

is the suggested FORM2 representation.
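
As a rough illustration only (this is not the code from the patch), the
lookup discussed above could be sketched as below. The sketch assumes the
usual device-tree layout where the first cell of ibm,associativity holds the
number of domain IDs that follow, and it assumes reference points of {2, 3},
picked only so that the two IDs match for PROCB/MEMC and differ for PMEMD.
The names sketch_get_domains and SKETCH_NO_NODE are hypothetical, and values
are host-endian rather than __be32.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's NUMA_NO_NODE. */
#define SKETCH_NO_NODE (-1)

/*
 * assoc[0] is assumed to hold the count of domain IDs that follow, so the
 * domain ID selected by reference point k lives at assoc[k].
 * Returns 0 on success, -1 if an argument is missing or an index is out of
 * range for this associativity array.
 */
static int sketch_get_domains(const uint32_t *assoc,
			      uint32_t primary_index,
			      uint32_t secondary_index,
			      int *primary, int *secondary)
{
	if (!assoc || !primary || !secondary)
		return -1;

	*primary = SKETCH_NO_NODE;
	*secondary = SKETCH_NO_NODE;

	/* Index 0 is the length cell, never a domain ID. */
	if (primary_index == 0 || secondary_index == 0)
		return -1;

	/* Both reference points must fall inside the array. */
	if (assoc[0] < primary_index || assoc[0] < secondary_index)
		return -1;

	*primary = (int)assoc[primary_index];
	*secondary = (int)assoc[secondary_index];
	return 0;
}
```

Under those assumptions, { 3, 20, 1, 40 } with indexes 2 and 3 gives primary 1
and secondary 40 for PMEMD, while { 3, 20, 1, 1 } gives 1 for both for
PROCB/MEMC.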

-aneesh

