[V3] powerpc/perf: Enable PMU counters post partition migration if PMU is active

Nicholas Piggin npiggin at gmail.com
Tue Nov 2 22:35:47 AEDT 2021


Excerpts from Michael Ellerman's message of October 29, 2021 11:15 pm:
> Nicholas Piggin <npiggin at gmail.com> writes:
>> Excerpts from Athira Rajeev's message of October 29, 2021 1:05 pm:
>>> @@ -631,12 +632,18 @@ static int pseries_migrate_partition(u64 handle)
>>>  	if (ret)
>>>  		return ret;
>>>  
>>> +	/* Disable PMU before suspend */
>>> +	on_each_cpu(&mobility_pmu_disable, NULL, 0);
>>
>> Why was this moved out of stop machine and to an IPI?
>>
>> My concern would be, what are the other CPUs doing at this time? Is it 
>> possible they could take interrupts and schedule? Could that mess up the
>> perf state here?
> 
> pseries_migrate_partition() is called directly from migration_store(),
> which is the sysfs store function, which can be called concurrently by
> different CPUs.
> 
> It's also potentially called from rtas_syscall_dispatch_ibm_suspend_me(),
> from sys_rtas(), again with no locking.
> 
> So we could have two CPUs calling into here at the same time, which
> might not crash, but is unlikely to work well.
> 
> I think the lack of locking might have been OK in the past because only
> one CPU will successfully get the other CPUs to call do_join() in
> pseries_suspend(). But I could be wrong.
> 
> Anyway, now that we're mutating the PMU state before suspending we need
> to be more careful. So I think we need a lock around the whole sequence.

My concern is still that we wouldn't necessarily have the other CPUs 
under control at that point even if we serialize the migrate path.
They could take interrupts, possibly call into perf subsystem after
the mobility_pmu_disable (e.g., via syscall or context switch) which 
might mess things up.

I think the stop machine is a reasonable place for the code in this 
case. It's a low level disabling of hardware facility and saving off 
registers.

Thanks,
Nick



More information about the Linuxppc-dev mailing list