[PATCH v03] powerpc/numa: Perform full re-add of CPU for PRRN/VPHN topology update
Michael Bringmann
mwb at linux.vnet.ibm.com
Sat Feb 9 06:43:43 AEDT 2019
On 2/7/19 11:44 PM, Srikar Dronamraju wrote:
>>
>> int arch_update_cpu_topology(void)
>> {
>> - return numa_update_cpu_topology(true);
>> + int changed = topology_changed;
>> +
>> + topology_changed = 0;
>> + return changed;
>> }
>>
>
> Do we need Powerpc override for arch_update_cpu_topology() now? That
> topology_changed sometime back doesn't seem to have help. The scheduler
> atleast now is neglecting whether the topology changed or not.
I was dealing with a a concurrency problem. Revisiting again.
>
> Also we can do away with the new topology_changed.
>
>> static void topology_work_fn(struct work_struct *work)
>> {
>> - rebuild_sched_domains();
>> + lock_device_hotplug();
>> + if (numa_update_cpu_topology(true))
>> + rebuild_sched_domains();
>> + unlock_device_hotplug();
>> }
>
> Should this hunk be a separate patch by itself to say why
> rebuild_sched_domains with a changelog that explains why it should be under
> lock_device_hotplug? rebuild_sched_domains already takes cpuset_mutex.
> So I am not sure if we need to take device_hotplug_lock.
topology_work_fn runs in its own thread like the DLPAR operations.
This patch adds calls to Nathan's 'dlpar_cpu_readd' from the topology_work_fn
thread. The lock/unlock_device_hotplug guard against concurrency issues
with the DLPAR operations, grabbing that lock here to avoid overlap with
those other operations. This mod is dependent upon using dlpar_cpu_readd.
>
>> static DECLARE_WORK(topology_work, topology_work_fn);
>>
>> -static void topology_schedule_update(void)
>> +void topology_schedule_update(void)
>> {
>> - schedule_work(&topology_work);
>> + if (!topology_update_in_progress)
>> + schedule_work(&topology_work);
>> }
>>
>> static void topology_timer_fn(struct timer_list *unused)
>> {
>> + bool sdo = false;
>
> Is sdo any abbrevation?
'for do the schedule update'. Will remove per below.
>
>> +
>> + if (topology_scans < 1)
>> + bitmap_fill(cpumask_bits(&cpu_associativity_changes_mask),
>> + nr_cpumask_bits);
>
> Why do we need topology_scan? Just to make sure
> cpu_associativity_changes_mask is populated only once?
> cant we use a static bool inside the function for the same?
I was running into a race condition. On one of my test systems,
start_topology_update via shared_proc_topology_init and the PHYP did
not provide any change info about the CPUs that early in the boot.
The first run erased the cpu bits in cpu_associativity_changes_mask,
and subsequent runs did not pay attention to the reported updates.
Taking another look.
>
>
>> +
>> if (prrn_enabled && cpumask_weight(&cpu_associativity_changes_mask))
>> - topology_schedule_update();
>> - else if (vphn_enabled) {
>> + sdo = true;
>> + if (vphn_enabled) {
>
> Any reason to remove the else above?
When vphn_enabled and prrn_enabled, it was not calling 'update_cpu_associativity_changes_mask()',
so was not getting the necessary change info.
>> if (update_cpu_associativity_changes_mask() > 0)
>> - topology_schedule_update();
>> + sdo = true;
>> reset_topology_timer();
>> }
>> + if (sdo)
>> + topology_schedule_update();
>> + topology_scans++;
>> }
>
> Are the above two hunks necessary? Not getting how the current changes are
> different from the previous.
Not important. Will undo.
>
--
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line 363-5196
External: (512) 286-5196
Cell: (512) 466-0650
mwb at linux.vnet.ibm.com
More information about the Linuxppc-dev
mailing list