[PATCH v3] powerpc/topology: Check at boot for topology updates

Michael Bringmann mbringm at us.ibm.com
Sat Aug 11 00:50:34 AEST 2018


On 08/10/2018 08:21 AM, Srikar Dronamraju wrote:
> * Michael Ellerman <mpe at ellerman.id.au> [2018-08-10 21:42:28]:
>
>> Srikar Dronamraju <srikar at linux.vnet.ibm.com> writes:
>>> diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
>>> index 16b077801a5f..70f2d2285ba7 100644
>>> --- a/arch/powerpc/include/asm/topology.h
>>> +++ b/arch/powerpc/include/asm/topology.h
>>> @@ -113,6 +114,10 @@ static inline int timed_topology_update(int nsecs)
>>>  {
>>>  	return 0;
>>>  }
>>> +
>>> +#ifdef CONFIG_SMP
>>> +static inline void check_topology_updates(void) {}
>>> +#endif
>> I realise that's only called from smp.c but it doesn't really need to be
>> inside #ifdef CONFIG_SMP, it's harmless to have an empty version for
>> SMP=n.
>>
> I did that just to fix
> https://openpower.xyz/job/snowpatch/job/snowpatch-linux-sparse/577//
>
>
>>> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
>>> index 4794d6b4f4d2..2aa0ffd954c9 100644
>>> --- a/arch/powerpc/kernel/smp.c
>>> +++ b/arch/powerpc/kernel/smp.c
>>> @@ -1156,6 +1156,12 @@ void __init smp_cpus_done(unsigned int max_cpus)
>>>  	if (smp_ops && smp_ops->bringup_done)
>>>  		smp_ops->bringup_done();
>>>  
>>> +	/*
>>> +	 * On a shared LPAR, associativity needs to be requested.
>>> +	 * Hence, check for numa topology updates before dumping
>>> +	 * cpu topology
>>> +	 */
>>> +	check_topology_updates();
>> I get that you're calling it here because you want to get it correct
>> before we do the dump below, but we could move the dump later also.
>>
>> You mention we need to do the check before the sched domains are
>> initialised, but where exactly does that happen?
> Almost immediately after smp_cpus_done, 
>
> kernel_init_freeable() calls smp_init() and sched_init_smp() back to
> back.
>
> smp_init() calls smp_cpus_done()
> sched_init_smp() calls sched_init_numa() and sched_init_domains()
>
> sched_init_numa() initialises sched_domains_numa_masks which is the
> cpumask that matters the most in our discussions.
> sched_init_domains initialised the sched_domain.
>
> So I dont think we can move any later.
>
>>> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
>>> index 0c7e05d89244..32c13a208589 100644
>>> --- a/arch/powerpc/mm/numa.c
>>> +++ b/arch/powerpc/mm/numa.c
>>> @@ -1515,6 +1515,7 @@ int start_topology_update(void)
>>>  		   lppaca_shared_proc(get_lppaca())) {
>>>  		if (!vphn_enabled) {
>>>  			vphn_enabled = 1;
>>> +			topology_update_needed = 1;
>> I really don't understand what topology_update_needed is for.
>> We set it here.
> topology_update_needed is even set by numa_update_cpu_topology which
> mighet get called from rtasd.
Yes.  While looking at problems with Shared CPU Topology updates
for NUMA last year, I observed that numa_update_cpu_topology
was being called before topology_update_init(), and thus, the next
call to numa_update_cpu_topology did not correctly initialize the
topology based upon VPHN.
>
>>>  			setup_cpu_associativity_change_counters();
>>>  			timer_setup(&topology_timer, topology_timer_fn,
>>>  				    TIMER_DEFERRABLE);
>>> @@ -1551,6 +1552,19 @@ int prrn_is_enabled(void)
>>>  	return prrn_enabled;
>>>  }
>>>  
>>> +void __init check_topology_updates(void)
>>> +{
>>> +	/* Do not poll for changes if disabled at boot */
>>> +	if (topology_updates_enabled)
>>> +		start_topology_update();
>> Here we call start_topology_update(), which might set
>> topology_update_needed to 1.
>>
>>> +
>>> +	if (topology_update_needed) {
>> And then we check it here?
>> Why is this logic not *in* start_topology_update() ?
> I have split topology_update_init into two parts.  First part being
> check_topology_update() and the next part being topology_update_init().
>
> By the time the current topology_update_init() gets called,
> sched_domains are already created. So we need to move this part before.
>
> However the scheduled topology updates because of vphn can wait, atleast
> that how the current code is written. topology_inited indicates if the
> scheduled topology updates have been setup.
>
>>> +		bitmap_fill(cpumask_bits(&cpu_associativity_changes_mask),
>>> +			    nr_cpumask_bits);
>>> +		numa_update_cpu_topology(false);
>>> +	}
>>> +}
>> I'm not really clear on what check_topology_update() is responsible for
>> doing vs topology_update_init(). Why do we need both?
> I am okay to move the complete topology_update_init above and delete the
> device_initcall. But then we would be moving the polling of vphn events
> to a little bit earlier stage in the initialisation.
>
>>> @@ -1608,10 +1618,6 @@ static int topology_update_init(void)
>>>  		return -ENOMEM;
>>>  
>>>  	topology_inited = 1;
>> What does that mean? Didn't we already do the init earlier?
> If there is an explicit request for topology update via
> numa_update_cpu_topology() even before we looked at the feature flags
> and set vphn_enabled/prrn_enabled, then we would defer handle it till
> the feature flags are checked. topology_inited helps us in achieving
> this.
Yes.
>
>>> -	if (topology_update_needed)
>>> -		bitmap_fill(cpumask_bits(&cpu_associativity_changes_mask),
>>> -					nr_cpumask_bits);
>>> -
>>>  	return 0;
>>>  }
>>>  device_initcall(topology_update_init);
> Thanks for the review.
>
Looks good.

-- 
----------
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line  363-5196
External: (512) 286-5196
Cell:       (512) 466-0650
mbringm at us.ibm.com



More information about the Linuxppc-dev mailing list