[PATCH v5 4/5] powerpc/numa: Early request for home node associativity

Michael Ellerman mpe at ellerman.id.au
Thu Jan 16 15:58:34 AEDT 2020


Srikar Dronamraju <srikar at linux.vnet.ibm.com> writes:
> Currently the kernel detects if it's running on a shared lpar platform
> and requests home node associativity before the scheduler sched_domains
> are set up. However between the time NUMA setup is initialized and the
> request for home node associativity, workqueue initializes its per node
> cpumask. The per node workqueue possible cpumask may turn invalid
> after home node associativity resulting in weird situations like
> workqueue possible cpumask being a subset of workqueue online cpumask.
>
> This can be fixed by requesting home node associativity earlier just
> before NUMA setup. However at the NUMA setup time, kernel may not be in
> a position to detect if it's running on a shared lpar platform. So
> request for home node associativity and if the request fails, fall back
> on the device tree property.
>
> Signed-off-by: Srikar Dronamraju <srikar at linux.vnet.ibm.com>
> Cc: Michael Ellerman <mpe at ellerman.id.au>
> Cc: Nicholas Piggin <npiggin at gmail.com>
> Cc: Nathan Lynch <nathanl at linux.ibm.com>
> Cc: linuxppc-dev at lists.ozlabs.org
> Cc: Abdul Haleem <abdhalee at linux.vnet.ibm.com>
> Cc: Satheesh Rajendran <sathnaga at linux.vnet.ibm.com>
> Reported-by: Abdul Haleem <abdhalee at linux.vnet.ibm.com>
> Reviewed-by: Nathan Lynch <nathanl at linux.ibm.com>
> ---
> Changelog (v2->v3):
> - Handled comments from Nathan Lynch
>   * Use first thread of the core for cpu-to-node map.
>   * get hardware-id in numa_setup_cpu
>
> Changelog (v1->v2):
> - Handled comments from Nathan Lynch
>   * Dont depend on pacas to be setup for the hwid
>
>
>  arch/powerpc/mm/numa.c | 45 +++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 40 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index 63ec0c3c817f..f837a0e725bc 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -461,13 +461,27 @@ static int of_drconf_to_nid_single(struct drmem_lmb *lmb)
>  	return nid;
>  }
>  
> +static int vphn_get_nid(long hwid)
> +{
> +	__be32 associativity[VPHN_ASSOC_BUFSIZE] = {0};
> +	long rc;
> +
> +	rc = hcall_vphn(hwid, VPHN_FLAG_VCPU, associativity);

This breaks the build for some defconfigs.

e.g. ppc64_book3e_allmodconfig:

  arch/powerpc/mm/numa.c: In function ‘vphn_get_nid’:
  arch/powerpc/mm/numa.c:469:7: error: implicit declaration of function ‘hcall_vphn’ [-Werror=implicit-function-declaration]
    469 |  rc = hcall_vphn(hwid, VPHN_FLAG_VCPU, associativity);
        |       ^~~~~~~~~~

The hcall_vphn() call needs to be inside #ifdef CONFIG_PPC_SPLPAR.
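
For illustration, one way to structure that guard is an #ifdef/#else pair
where the #else branch is a stub returning NUMA_NO_NODE, so callers need no
#ifdef of their own. The sketch below is a standalone userspace mock, not
kernel code: mock_hcall_vphn() and the constant values are stand-ins
invented here, and the stub is an assumption rather than something taken
from the patch.

```c
/*
 * Userspace mock, not kernel code. NUMA_NO_NODE, H_SUCCESS and
 * mock_hcall_vphn() are stand-ins invented for this sketch.
 */
#define NUMA_NO_NODE	(-1)
#define H_SUCCESS	0

#ifdef CONFIG_PPC_SPLPAR
/* Stand-in for hcall_vphn(): pretends the hypervisor reports node 1. */
long mock_hcall_vphn(long hwid, int *nid)
{
	(void)hwid;
	*nid = 1;
	return H_SUCCESS;
}

int vphn_get_nid(long hwid)
{
	int nid;

	if (mock_hcall_vphn(hwid, &nid) == H_SUCCESS)
		return nid;

	return NUMA_NO_NODE;
}
#else
/* Without SPLPAR there is no VPHN hcall, so always report "no node". */
int vphn_get_nid(long hwid)
{
	(void)hwid;
	return NUMA_NO_NODE;
}
#endif
```

Built without -DCONFIG_PPC_SPLPAR, vphn_get_nid() always returns
NUMA_NO_NODE, so the caller falls through to the device tree lookup.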

> +	if (rc == H_SUCCESS)
> +		return associativity_to_nid(associativity);
> +
> +	return NUMA_NO_NODE;
> +}
> +
>  /*
>   * Figure out to which domain a cpu belongs and stick it there.
> + * cpu_to_phys_id is only valid between smp_setup_cpu_maps() and
> + * smp_setup_pacas(). If called outside this window, set get_hwid to true.
>   * Return the id of the domain used.
>   */
> -static int numa_setup_cpu(unsigned long lcpu)
> +static int numa_setup_cpu(unsigned long lcpu, bool get_hwid)

I really dislike this bool.

> @@ -485,6 +499,27 @@ static int numa_setup_cpu(unsigned long lcpu)
>  		return nid;
>  	}
>  
> +	/*
> +	 * On a shared lpar, device tree will not have node associativity.
> +	 * At this time lppaca, or its __old_status field may not be
> +	 * updated. Hence kernel cannot detect if its on a shared lpar. So
> +	 * request an explicit associativity irrespective of whether the
> +	 * lpar is shared or dedicated. Use the device tree property as a
> +	 * fallback.
> +	 */
> +	if (firmware_has_feature(FW_FEATURE_VPHN)) {
> +		long hwid;
> +
> +		if (get_hwid)
> +			hwid = get_hard_smp_processor_id(lcpu);
> +		else
> +			hwid = cpu_to_phys_id[lcpu];

This should move inside vphn_get_nid(), and just do:

	if (cpu_to_phys_id)
		hwid = cpu_to_phys_id[lcpu];
	else
		hwid = get_hard_smp_processor_id(lcpu);
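
To show why the NULL check can replace the bool, here is a standalone
userspace mock of just the hwid selection. It assumes cpu_to_phys_id is
NULL outside the window between smp_setup_cpu_maps() and
smp_setup_pacas(), which is what the check above relies on. The names
lcpu_to_hwid(), mock_get_hard_smp_processor_id() and MOCK_NR_CPUS, and all
the ID values, are hypothetical.

```c
/*
 * Userspace mock, not kernel code. MOCK_NR_CPUS, lcpu_to_hwid() and
 * mock_get_hard_smp_processor_id() are invented for this sketch.
 */
#include <assert.h>
#include <stddef.h>

#define MOCK_NR_CPUS	4

/* Assumed to be non-NULL only during the early-boot window. */
unsigned int *cpu_to_phys_id;

/* Stand-in for the paca-based lookup used once the pacas exist. */
long mock_get_hard_smp_processor_id(long lcpu)
{
	return 100 + lcpu;
}

/* Pick the hardware id source based on what is currently valid. */
long lcpu_to_hwid(long lcpu)
{
	assert(lcpu >= 0 && lcpu < MOCK_NR_CPUS);

	if (cpu_to_phys_id)
		return cpu_to_phys_id[lcpu];

	return mock_get_hard_smp_processor_id(lcpu);
}
```

With this shape the caller only passes lcpu, and the choice of lookup
follows the state of the pointer instead of a flag each call site has to
get right.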


> +		nid = vphn_get_nid(hwid);
> +	}
> +
> +	if (nid != NUMA_NO_NODE)
> +		goto out_present;
> +
>  	cpu = of_get_cpu_node(lcpu, NULL);


cheers

