[PATCH v2 1/3] powerpc/numa: Introduce logical numa id

Srikar Dronamraju srikar at linux.vnet.ibm.com
Mon Aug 17 21:49:08 AEST 2020


* Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com> [2020-08-17 17:04:24]:

> On 8/17/20 4:29 PM, Srikar Dronamraju wrote:
> > * Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com> [2020-08-17 16:02:36]:
> > 
> > > We use ibm,associativity and ibm,associativity-lookup-arrays to derive the numa
> > > node numbers. These device tree properties are firmware indicated grouping of
> > > resources based on their hierarchy in the platform. These numbers (group id) are
> > > not sequential and hypervisor/firmware can follow different numbering schemes.
> > > For ex: on powernv platforms, we group them in the below order.
> > > 
> > >   *     - CCM node ID
> > >   *     - HW card ID
> > >   *     - HW module ID
> > >   *     - Chip ID
> > >   *     - Core ID
> > > 
> > > Based on ibm,associativity-reference-points we use one of the above group ids as
> > > Linux NUMA node id. (On PowerNV platform Chip ID is used). This results
> > > in Linux reporting non-linear NUMA node id and which also results in Linux
> > > reporting empty node 0 NUMA nodes.
> > > 
> > > This can  be resolved by mapping the firmware provided group id to a logical Linux
> > > NUMA id. In this patch, we do this only for pseries platforms considering the
> > > firmware group id is a virtualized entity and users would not have drawn any
> > > conclusion based on the Linux Numa Node id.
> > > 
> > > On PowerNV platform since we have historically mapped Chip ID as Linux NUMA node
> > > id, we keep the existing Linux NUMA node id numbering.
> > 
> > I still dont understand how you are going to handle numa distances.
> > With your patch, have you tried dlpar add/remove on a sparsely noded machine?
> > 
> 
> We follow the same steps when fetching distance information. Instead of
> using affinity domain id, we now use the mapped node id. The relevant hunk
> in the patch is
> 
> +	nid = affinity_domain_to_nid(&domain);
> 
>  	if (nid > 0 &&
> -		of_read_number(associativity, 1) >= distance_ref_points_depth) {
> +	    of_read_number(associativity, 1) >= distance_ref_points_depth) {
>  		/*
>  		 * Skip the length field and send start of associativity array
>  		 */
> 
> I haven't tried dlpar add/remove. I don't have a setup to try that. Do you
> see a problem there?
> 

Yes, I think there can be 2 problems.

1. distance table may be filled with incorrect data.
2. numactl -H distance table shows symmetric data, the symmetric nature may
be lost.

> -aneesh
> 
> 

-- 
Thanks and Regards
Srikar Dronamraju


More information about the Linuxppc-dev mailing list