[RFC PATCH 8/8] powerpc/papr_scm: Use FORM2 associativity details
Aneesh Kumar K.V
aneesh.kumar at linux.ibm.com
Thu Jun 17 21:11:13 AEST 2021
Daniel Henrique Barboza <danielhb413 at gmail.com> writes:
> On 6/17/21 4:46 AM, David Gibson wrote:
>> On Tue, Jun 15, 2021 at 12:35:17PM +0530, Aneesh Kumar K.V wrote:
>>> David Gibson <david at gibson.dropbear.id.au> writes:
>>>
>>>> On Tue, Jun 15, 2021 at 11:27:50AM +0530, Aneesh Kumar K.V wrote:
>>>>> David Gibson <david at gibson.dropbear.id.au> writes:
>>>>>
>>>>>> On Mon, Jun 14, 2021 at 10:10:03PM +0530, Aneesh Kumar K.V wrote:
>>>>>>> FORM2 introduces a concept of secondary domain which is identical to the
>>>>>>> concept of FORM1 primary domain. Use secondary domain as the numa node
>>>>>>> when using persistent memory device. For DAX kmem use the logical domain
>>>>>>> id introduced in FORM2. This new numa node
>>>>>>>
>>>>>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com>
>>>>>>> ---
>>>>>>> arch/powerpc/mm/numa.c | 28 +++++++++++++++++++++++
>>>>>>> arch/powerpc/platforms/pseries/papr_scm.c | 26 +++++++++++++--------
>>>>>>> arch/powerpc/platforms/pseries/pseries.h | 1 +
>>>>>>> 3 files changed, 45 insertions(+), 10 deletions(-)
>>>>>>>
>>>>>>> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
>>>>>>> index 86cd2af014f7..b9ac6d02e944 100644
>>>>>>> --- a/arch/powerpc/mm/numa.c
>>>>>>> +++ b/arch/powerpc/mm/numa.c
>>>>>>> @@ -265,6 +265,34 @@ static int associativity_to_nid(const __be32 *associativity)
>>>>>>> return nid;
>>>>>>> }
>>>>>>>
>>>>>>> +int get_primary_and_secondary_domain(struct device_node *node, int *primary, int *secondary)
>>>>>>> +{
>>>>>>> + int secondary_index;
>>>>>>> + const __be32 *associativity;
>>>>>>> +
>>>>>>> + if (!numa_enabled) {
>>>>>>> + *primary = NUMA_NO_NODE;
>>>>>>> + *secondary = NUMA_NO_NODE;
>>>>>>> + return 0;
>>>>>>> + }
>>>>>>> +
>>>>>>> + associativity = of_get_associativity(node);
>>>>>>> + if (!associativity)
>>>>>>> + return -ENODEV;
>>>>>>> +
>>>>>>> + if (of_read_number(associativity, 1) >= primary_domain_index) {
>>>>>>> + *primary = of_read_number(&associativity[primary_domain_index], 1);
>>>>>>> + secondary_index = of_read_number(&distance_ref_points[1], 1);
>>>>>>
>>>>>> Secondary ID is always the second reference point, but primary depends
>>>>>> on the length of resources? That seems very weird.
>>>>>
>>>>> primary_domain_index is distance_ref_point[0]. With Form2 we would find
>>>>> both primary and secondary domain ID same for all resources other than
>>>>> persistent memory device. The usage w.r.t. persistent memory is
>>>>> explained in patch 7.
>>>>
>>>> Right, I misunderstood
>>>>
>>>>>
>>>>> With Form2 the primary domainID and secondary domainID are used to identify the NUMA nodes
>>>>> the kernel should use when using persistent memory devices.
>>>>
>>>> This seems kind of bogus. With Form1, the primary/secondary ID are a
>>>> sort of hierarchy of distance (things with same primary ID are very
>>>> close, things with same secondary are kinda-close, etc.). With Form2,
>>>> it's referring to their effective node for different purposes.
>>>>
>>>> Using the same terms for different meanings seems unnecessarily
>>>> confusing.
>>>
>>> They are essentially domainIDs. Their interpretation differs
>>> between Form1 and Form2. Hence I kept referring to them as primary and
>>> secondary domainID. Any suggestion on what to name them with Form2?
>>
>> My point is that reusing associativity-reference-points for something
>> with completely unrelated semantics seems like a very poor choice.
>
>
> I agree that this reuse can be confusing. I could argue that there is
> precedent for that in PAPR - FORM0 puts a different spin on the same
> property as well - but there is no need to keep following existing PAPR
> practices in a new spec (and some might argue it's best not to).
>
> As far as QEMU goes, renaming this property to "numa-associativity-mode"
> (just an example) is a quick change to do since we separated FORM1 and FORM2
> code over there.
>
> Doing such a rename can also help with the issue of having to describe new
> FORM2 semantics using "least significant boundary" or "primary domain" or
> any FORM0|FORM1 related terminology.
>
It is not just a matter of changing the name; we would then also have to
explain the meaning of ibm,associativity-reference-points with FORM2, right?

With FORM2 we want to represent the topology better:
------------------------------------------------------------------------------
| domainID 20                                                                |
|  -----------------------------------------                                 |
|  | NUMA node1                            |                                 |
|  |                                       |         ---------------------   |
|  |  ProcB -------> MEMC                  |         | NUMA node40       |   |
|  |    |                                  |         |                   |   |
|  |    -----------------------------------|-------->|       PMEMD       |   |
|  |                                       |         ---------------------   |
|  |                                       |                                 |
|  -----------------------------------------                                 |
------------------------------------------------------------------------------

ibm,associativity:
        { 20, 1, 40 }  -> PMEMD
        { 20, 1, 1 }   -> PROCB/MEMC

is the suggested FORM2 representation.
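
To make the lookup concrete, here is a rough userspace sketch of how the two
domain IDs could be read out of such arrays. It is only an illustration, not
the kernel code: the helper names (domain_at, get_domains), the NO_NODE
constant, and the convention that cell 0 holds the entry count are
assumptions, as is the choice of reference-point indices in the note below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_NODE (-1)	/* hypothetical stand-in for NUMA_NO_NODE */

/*
 * Assumed layout: assoc[0] is the number of domain IDs that follow, and
 * the cells are already in host byte order. Returns the domain ID at the
 * 1-based position 'index', or NO_NODE if the array is too short.
 */
static int domain_at(const uint32_t *assoc, uint32_t index)
{
	if (index == 0 || assoc[0] < index)
		return NO_NODE;
	return (int)assoc[index];
}

/*
 * Look up both domains for one resource. 'primary_index' and
 * 'secondary_index' play the role of the first and second reference
 * points; their values are supplied by the caller.
 */
static int get_domains(const uint32_t *assoc,
		       uint32_t primary_index, uint32_t secondary_index,
		       int *primary, int *secondary)
{
	assert(primary && secondary);

	if (!assoc)
		return -1;

	*primary = domain_at(assoc, primary_index);
	*secondary = domain_at(assoc, secondary_index);
	return 0;
}
```

Assuming the first reference point selects position 3 and the second selects
position 2, PMEMD encoded as { 3, 20, 1, 40 } would yield primary 40 and
secondary 1, while PROCB/MEMC encoded as { 3, 20, 1, 1 } would yield 1 for
both, i.e. the two IDs differ only for the persistent memory device.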
-aneesh