[PATCH] powerpc/numa: Restrict possible nodes based on platform

Tyrel Datwyler tyreld at linux.ibm.com
Tue Jul 7 06:58:42 AEST 2020


On 7/5/20 11:40 PM, Srikar Dronamraju wrote:
> As per PAPR, there are two device tree properties:
> ibm,max-associativity-domains (which defines the maximum number of
> domains that the firmware, i.e. PowerVM, can support) and
> ibm,current-associativity-domains (which defines the maximum number of
> domains that the platform can support). The value of the
> ibm,max-associativity-domains property is always greater than or equal
> to that of the ibm,current-associativity-domains property.
> 
> Powerpc currently uses the ibm,max-associativity-domains property when
> setting the possible number of nodes. This is currently set at 32.
> However, the possible number of nodes for a platform may be significantly
> lower. Hence set the possible number of nodes based on the
> ibm,current-associativity-domains property.
> 
> $ lsprop /proc/device-tree/rtas/ibm,*associ*-domains
> /proc/device-tree/rtas/ibm,current-associativity-domains
> 		 00000005 00000001 00000002 00000002 00000002 00000010
> /proc/device-tree/rtas/ibm,max-associativity-domains
> 		 00000005 00000001 00000008 00000020 00000020 00000100
> 
> $ cat /sys/devices/system/node/possible ##Before patch
> 0-31
> 
> $ cat /sys/devices/system/node/possible ##After patch
> 0-1
> 
> Note that this platform can support at most 2 nodes, but before the
> patch the number of possible nodes is set to 32.
> 
> This is important because a lot of kernel and user space code allocates
> structures for all possible nodes, leading to a lot of memory that is
> allocated but not used.
> 
> I ran a simple experiment to create and destroy 100 memory cgroups on
> boot on an 8-node machine (Power8 Alpine).
> 
> Before patch
> free -k at boot
>               total        used        free      shared  buff/cache   available
> Mem:      523498176     4106816   518820608       22272      570752   516606720
> Swap:       4194240           0     4194240
> 
> free -k after creating 100 memory cgroups
>               total        used        free      shared  buff/cache   available
> Mem:      523498176     4628416   518246464       22336      623296   516058688
> Swap:       4194240           0     4194240
> 
> free -k after destroying 100 memory cgroups
>               total        used        free      shared  buff/cache   available
> Mem:      523498176     4697408   518173760       22400      627008   515987904
> Swap:       4194240           0     4194240
> 
> After patch
> free -k at boot
>               total        used        free      shared  buff/cache   available
> Mem:      523498176     3969472   518933888       22272      594816   516731776
> Swap:       4194240           0     4194240
> 
> free -k after creating 100 memory cgroups
>               total        used        free      shared  buff/cache   available
> Mem:      523498176     4181888   518676096       22208      640192   516496448
> Swap:       4194240           0     4194240
> 
> free -k after destroying 100 memory cgroups
>               total        used        free      shared  buff/cache   available
> Mem:      523498176     4232320   518619904       22272      645952   516443264
> Swap:       4194240           0     4194240
> 
> Observations:
> Fixed kernel takes 137344 kB (4106816-3969472) less to boot.
> Fixed kernel takes 309184 kB (4628416-4181888-137344) less to create 100 memcgs.
> 
> Cc: Nathan Lynch <nathanl at linux.ibm.com>
> Cc: Michael Ellerman <mpe at ellerman.id.au>
> Cc: linuxppc-dev at lists.ozlabs.org
> Cc: Anton Blanchard <anton at ozlabs.org>
> Cc: Bharata B Rao <bharata at linux.ibm.com>
> Signed-off-by: Srikar Dronamraju <srikar at linux.vnet.ibm.com>
> ---
>  arch/powerpc/mm/numa.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index 9fcf2d195830..3d55cef1a2dc 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -897,7 +897,7 @@ static void __init find_possible_nodes(void)
>  		return;
> 
>  	if (of_property_read_u32_index(rtas,
> -				"ibm,max-associativity-domains",
> +				"ibm,current-associativity-domains",

I'm not sure ibm,current-associativity-domains is guaranteed to exist on older
firmware. You may need to check that it exists and fall back to
ibm,max-associativity-domains in the event it doesn't.
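
Something along these lines is what I have in mind. This is only a rough
user-space sketch of the fallback, not a tested kernel change: struct prop,
read_u32_index() and possible_nodes() are hypothetical stand-ins (the first two
model the rtas node's properties and of_property_read_u32_index()), and the
depth argument stands in for min_common_depth.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one device tree property: name plus u32 cells. */
struct prop {
	const char *name;
	const uint32_t *cells;
	size_t ncells;
};

/*
 * User-space model of of_property_read_u32_index(): returns 0 and stores
 * the requested cell on success, -1 if the property is missing or the
 * index is out of range.
 */
int read_u32_index(const struct prop *props, size_t nprops,
		   const char *name, size_t index, uint32_t *out)
{
	for (size_t i = 0; i < nprops; i++) {
		if (strcmp(props[i].name, name) != 0)
			continue;
		if (index >= props[i].ncells)
			return -1;
		*out = props[i].cells[index];
		return 0;
	}
	return -1;
}

/*
 * Prefer ibm,current-associativity-domains; if it cannot be read (for
 * example on older firmware), fall back to ibm,max-associativity-domains.
 * Returns 0 on success, -1 if neither property yields a value.
 */
int possible_nodes(const struct prop *props, size_t nprops,
		   size_t depth, uint32_t *numnodes)
{
	if (read_u32_index(props, nprops,
			   "ibm,current-associativity-domains",
			   depth, numnodes) == 0)
		return 0;

	return read_u32_index(props, nprops,
			      "ibm,max-associativity-domains",
			      depth, numnodes);
}
```

Fed the cells from the lsprop output quoted above and an assumed depth of 3,
this picks 2 when both properties are present and 32 when only
ibm,max-associativity-domains is.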

-Tyrel

>  				min_common_depth, &numnodes))
>  		goto out;
> 


