[PATCH V2] powerpc/perf: Enable PMU counters post partition migration if PMU is active

Madhavan Srinivasan maddy at linux.ibm.com
Fri Oct 22 14:33:51 AEDT 2021


On 10/21/21 11:03 PM, Nathan Lynch wrote:
> Nicholas Piggin <npiggin at gmail.com> writes:
>> Excerpts from Athira Rajeev's message of July 11, 2021 10:25 pm:
>>> During Live Partition Migration (LPM), it is observed that perf
>>> counter values report zero after migration completes. However,
>>> 'perf stat' with a workload continues to show counts post migration
>>> since the PMU gets disabled/enabled during sched switches. But in the
>>> case of system/cpu-wide monitoring, zero counts were reported with
>>> 'perf stat' after migration completion.
>>>
>>> Example:
>>>   ./perf stat -e r1001e -I 1000
>>>             time             counts unit events
>>>       1.001010437         22,137,414      r1001e
>>>       2.002495447         15,455,821      r1001e
>>> <<>> As seen in the logs below, the counter values show zero
>>>          after migration is completed.
>>> <<>>
>>>      86.142535370    129,392,333,440      r1001e
>>>      87.144714617                  0      r1001e
>>>      88.146526636                  0      r1001e
>>>      89.148085029                  0      r1001e
>>>
>>> Here the PMU is enabled at the start of the perf session and counter
>>> values are read at intervals. Counters are only disabled at the
>>> end of the session. The powerpc mobility code presently does not handle
>>> disabling and re-enabling of PMU counters during partition
>>> migration. Also, since the PMU register values are not saved/restored
>>> during migration, PMU registers like Monitor Mode Control Register 0
>>> (MMCR0), Monitor Mode Control Register 1 (MMCR1) will not contain
>>> the values they were programmed with. Hence PMU counters will not be
>>> enabled correctly post migration.
>>>
>>> Fix this in mobility code by handling disabling and enabling of
>>> the PMU on all CPUs before and after migration. Patch introduces two
>>> functions 'mobility_pmu_disable' and 'mobility_pmu_enable'.
>>> mobility_pmu_disable() is called before the processor threads go
>>> to suspend state so as to disable the PMU counters. The disable is
>>> done only if there are any active events running on that cpu.
>>> mobility_pmu_enable() is called after the processor threads are
>>> back online to re-enable the PMU counters.
>>>
>>> Since the Performance Monitor Counters (PMCs) are not
>>> saved/restored during LPM, the PMC value ends up being zero while
>>> 'event->hw.prev_count' is a non-zero value. This causes problem
>> Interesting. Are they defined to not be migrated, or may not be
>> migrated?
> PAPR may be silent on this... at least I haven't found anything yet. But
> I'm not very familiar with perf counters.

IIUC, from the internal discussion with pHYP, migration of counters is
left to the OS.

> How much assurance do we have that hardware events we've programmed on
> the source can be reliably re-enabled on the destination, with the same
> semantics? Aren't there some model-specific counters that don't make
> sense to handle this way?

Migration to the same processor generation/model should be OK,
but not to a different generation/model (though that is a case
we still need to handle). That said, this patch is to fix the issue
of the large value seen when migrating.
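
To illustrate where the large value comes from, here is a minimal
userspace sketch. It is not the kernel code: the helper names and the
32-bit counter width are assumptions made only for this example. The
idea is that if a PMC reads back as zero after migration while the
saved previous count is still non-zero, a wrap-around style delta
computation produces a huge count.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (name and 32-bit width are assumptions):
 * delta between two reads of a 32-bit counter, where a smaller
 * new value is treated as a counter wrap-around.
 */
uint64_t pmc_delta(uint64_t prev, uint64_t now)
{
	return (now - prev) & 0xffffffffULL;
}

/* Hypothetical self-check comparing the normal and post-migration cases. */
void pmc_delta_selfcheck(void)
{
	/* Normal case: counter advanced from 100 to 150. */
	assert(pmc_delta(100, 150) == 50);

	/*
	 * Post-migration case: the PMC was not restored and reads 0,
	 * but the saved previous count is still 1000. The subtraction
	 * wraps and yields 2^32 - 1000.
	 */
	assert(pmc_delta(1000, 0) == 4294966296ULL);
}
```

Disabling the PMU before suspend and re-enabling it afterwards avoids
computing a delta against a stale previous count like this.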

>
>
>>> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
>>> index 9dc97d2..cea72d7 100644
>>> --- a/arch/powerpc/include/asm/rtas.h
>>> +++ b/arch/powerpc/include/asm/rtas.h
>>> @@ -380,5 +380,13 @@ static inline void rtas_initialize(void) { }
>>>   static inline void read_24x7_sys_info(void) { }
>>>   #endif
>>>   
>>> +#ifdef CONFIG_PPC_PERF_CTRS
>>> +void mobility_pmu_disable(void);
>>> +void mobility_pmu_enable(void);
>>> +#else
>>> +static inline void mobility_pmu_disable(void) { }
>>> +static inline void mobility_pmu_enable(void) { }
>>> +#endif
>>> +
>>>   #endif /* __KERNEL__ */
>>>   #endif /* _POWERPC_RTAS_H */
>> It's not implemented in rtas, maybe consider putting this into a perf
>> header?
> +1
>

