[PATCH v2 1/3] powerpc/numa: Introduce logical numa id
Srikar Dronamraju
srikar at linux.vnet.ibm.com
Mon Aug 17 21:49:08 AEST 2020
* Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com> [2020-08-17 17:04:24]:
> On 8/17/20 4:29 PM, Srikar Dronamraju wrote:
> > * Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com> [2020-08-17 16:02:36]:
> >
> > > We use ibm,associativity and ibm,associativity-lookup-arrays to derive the numa
> > > node numbers. These device tree properties are firmware indicated grouping of
> > > resources based on their hierarchy in the platform. These numbers (group id) are
> > > not sequential and hypervisor/firmware can follow different numbering schemes.
> > > For ex: on powernv platforms, we group them in the below order.
> > >
> > > * - CCM node ID
> > > * - HW card ID
> > > * - HW module ID
> > > * - Chip ID
> > > * - Core ID
> > >
> > > Based on ibm,associativity-reference-points we use one of the above group ids as
> > > Linux NUMA node id. (On PowerNV platform Chip ID is used). This results
> > > in Linux reporting non-linear NUMA node ids, and also in Linux
> > > reporting an empty NUMA node 0.
> > >
> > > This can be resolved by mapping the firmware provided group id to a logical Linux
> > > NUMA id. In this patch, we do this only for pseries platforms considering the
> > > firmware group id is a virtualized entity and users would not have drawn any
> > > conclusion based on the Linux Numa Node id.
> > >
> > > On PowerNV platform since we have historically mapped Chip ID as Linux NUMA node
> > > id, we keep the existing Linux NUMA node id numbering.
> >
> > I still don't understand how you are going to handle numa distances.
> > With your patch, have you tried dlpar add/remove on a sparsely noded machine?
> >
>
> We follow the same steps when fetching distance information. Instead of
> using affinity domain id, we now use the mapped node id. The relevant hunk
> in the patch is
>
> + nid = affinity_domain_to_nid(&domain);
>
> if (nid > 0 &&
> - of_read_number(associativity, 1) >= distance_ref_points_depth) {
> + of_read_number(associativity, 1) >= distance_ref_points_depth) {
> /*
> * Skip the length field and send start of associativity array
> */
>
> I haven't tried dlpar add/remove. I don't have a setup to try that. Do you
> see a problem there?
>
Yes, I think there can be two problems.
1. The distance table may be filled with incorrect data.
2. The distance table shown by numactl -H is symmetric; that symmetry may
be lost.
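
To make the two concerns concrete, here is a minimal user-space sketch in
C. It is not the code from the patch. It assumes logical node ids are handed
out in first-seen order, which is only one possible scheme, and every name in
it (struct nid_map, domain_to_logical_nid, distance_table_is_symmetric,
MAX_NODES) is hypothetical and chosen for illustration. The first helper maps
a firmware domain id to a logical id; the second checks the symmetry property
that a distance table indexed by those logical ids should keep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8      /* hypothetical limit, for illustration only */
#define NO_NODE   (-1)

/* Hypothetical table: index is the logical nid, value is the firmware
 * domain id that was assigned to it. */
struct nid_map {
	int domain[MAX_NODES];
	int count;
};

/* Return the logical nid for a firmware domain id. A domain seen for the
 * first time gets the next free logical nid (assumed first-seen order).
 * Returns NO_NODE when the table is full. */
int domain_to_logical_nid(struct nid_map *map, int domain_id)
{
	assert(map != NULL);

	for (int nid = 0; nid < map->count; nid++)
		if (map->domain[nid] == domain_id)
			return nid;

	if (map->count == MAX_NODES)
		return NO_NODE;

	map->domain[map->count] = domain_id;
	return map->count++;
}

/* True if dist[i][j] == dist[j][i] for every pair of the first n nodes. */
bool distance_table_is_symmetric(int dist[MAX_NODES][MAX_NODES], int n)
{
	for (int i = 0; i < n; i++)
		for (int j = i + 1; j < n; j++)
			if (dist[i][j] != dist[j][i])
				return false;
	return true;
}
```

With a mapping like this, the logical id of a node depends on the order in
which domains are first seen, so a check such as
distance_table_is_symmetric() after a dlpar add/remove on a sparsely noded
machine would be one way to look for the second problem.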
> -aneesh
>
>
--
Thanks and Regards
Srikar Dronamraju