[PATCH V2] powerpc/perf: Enable PMU counters post partition migration if PMU is active

Athira Rajeev atrajeev at linux.vnet.ibm.com
Tue Oct 26 04:09:48 AEDT 2021



> On 21-Oct-2021, at 10:47 PM, Nathan Lynch <nathanl at linux.ibm.com> wrote:
> 
> Athira Rajeev <atrajeev at linux.vnet.ibm.com> writes:
>> During Live Partition Migration (LPM), it is observed that perf
>> counter values report zero after migration completes. However,
>> 'perf stat' with a workload continues to show counts post migration
>> since the PMU gets disabled/enabled during sched switches. But in the
>> case of system/cpu wide monitoring, zero counts were reported with
>> 'perf stat' after migration completion.
>> 
>> Example:
>> ./perf stat -e r1001e -I 1000
>>           time             counts unit events
>>     1.001010437         22,137,414      r1001e
>>     2.002495447         15,455,821      r1001e
>> <<>> As seen in the logs below, the counter values show zero
>>        after migration is completed.
>> <<>>
>>    86.142535370    129,392,333,440      r1001e
>>    87.144714617                  0      r1001e
>>    88.146526636                  0      r1001e
>>    89.148085029                  0      r1001e
> 
> Confirmed in my environment:
> 
>    51.099987985            300,338      cache-misses
>    52.101839374            296,586      cache-misses
>    53.116089796            263,150      cache-misses
>    54.117949249            232,290      cache-misses
>    55.602029375     68,700,421,711      cache-misses
>    56.610073969                  0      cache-misses
>    57.614732000                  0      cache-misses
> 
> I wonder what it means that there is an implausibly large value just
> before the counter stops working -- I believe your example has this
> phenomenon too.
> 
> 
>> diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
>> index e83e089..ff7a77c 100644
>> --- a/arch/powerpc/platforms/pseries/mobility.c
>> +++ b/arch/powerpc/platforms/pseries/mobility.c
>> @@ -476,6 +476,8 @@ static int do_join(void *arg)
>> retry:
>> 	/* Must ensure MSR.EE off for H_JOIN. */
>> 	hard_irq_disable();
>> +	/* Disable PMU before suspend */
>> +	mobility_pmu_disable();
>> 	hvrc = plpar_hcall_norets(H_JOIN);
>> 
>> 	switch (hvrc) {
>> @@ -530,6 +532,8 @@ static int do_join(void *arg)
>> 	 * reset the watchdog.
>> 	 */
>> 	touch_nmi_watchdog();
>> +	/* Enable PMU after resuming */
>> +	mobility_pmu_enable();
>> 	return ret;
>> }
> 
> We should minimize calls into other subsystems from this context (the
> callback function we've passed to stop_machine); it's fairly sensitive.
> Can this be moved out to pseries_migrate_partition() or similar?

Hi Nathan

Thanks for the review.
I will move the callbacks to pseries_migrate_partition() in the next version.
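
A rough user-space sketch of the ordering this implies, not the actual
next-version patch. The function bodies are stubs, trace/record() and
migrate_partition_sketch() are hypothetical names used only for
illustration, and do_join() here is a stand-in for the stop_machine()
callback. It also ignores that PMU state is per-CPU, so the real change
would likely need to run the disable/enable on every CPU.

```c
#include <string.h>

/* Hypothetical call log so the ordering can be inspected. */
static char trace[128];

static void record(const char *step)
{
	strncat(trace, step, sizeof(trace) - strlen(trace) - 1);
	strncat(trace, ";", sizeof(trace) - strlen(trace) - 1);
}

/* Stub: the real helper would stop the PMU before suspend. */
static void mobility_pmu_disable(void)
{
	record("pmu_disable");
}

/* Stub: the real helper would re-enable the PMU after resume. */
static void mobility_pmu_enable(void)
{
	record("pmu_enable");
}

/* Stand-in for the stop_machine() callback; it no longer touches the PMU. */
static int do_join(void *arg)
{
	(void)arg;
	record("join");
	return 0;
}

/* Stand-in for pseries_migrate_partition(): PMU calls wrap the join. */
static int migrate_partition_sketch(void)
{
	int ret;

	mobility_pmu_disable();
	ret = do_join(NULL);
	mobility_pmu_enable();
	return ret;
}
```

The point is only that the PMU calls sit outside the sensitive callback
and bracket it: disable, join, enable.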

Athira.
