[V3] powerpc/perf: Enable PMU counters post partition migration if PMU is active
Nicholas Piggin
npiggin at gmail.com
Fri Oct 29 17:16:23 AEDT 2021
Excerpts from Athira Rajeev's message of October 29, 2021 1:05 pm:
> During Live Partition Migration (LPM), it is observed that perf
> counter values report zero after migration completes. 'perf stat'
> with a workload continues to show counts across the migration,
> since the PMU gets disabled/enabled on sched switches. But in the
> case of system/CPU-wide monitoring, 'perf stat' reports zero
> counts after migration completes.
>
> Example:
> ./perf stat -e r1001e -I 1000
> time counts unit events
> 1.001010437 22,137,414 r1001e
> 2.002495447 15,455,821 r1001e
> <<>> As seen in the logs below, the counter values show zero
> after migration completes.
> <<>>
> 86.142535370 129,392,333,440 r1001e
> 87.144714617 0 r1001e
> 88.146526636 0 r1001e
> 89.148085029 0 r1001e
This is the output without the patch? After the patch it keeps counting
I suppose? And does the very large count go away too?
>
> Here the PMU is enabled at the start of the perf session and the
> counter values are read at intervals; counters are only disabled
> at the end of the session. The powerpc mobility code presently
> does not handle disabling and re-enabling the PMU counters across
> partition migration. Also, since the PMU register values are not
> saved/restored during migration, PMU registers like Monitor Mode
> Control Register 0 (MMCR0) and Monitor Mode Control Register 1
> (MMCR1) will not contain the values they were programmed with.
> Hence the PMU counters will not be enabled correctly post
> migration.
>
> Fix this in the mobility code by disabling and re-enabling the
> PMU on all CPUs before and after migration. The patch introduces
> two functions, 'mobility_pmu_disable' and 'mobility_pmu_enable'.
> mobility_pmu_disable() is called before the processor threads go
> into suspend state, to disable the PMU counters; the disable is
> only done if there are active events running on that CPU.
> mobility_pmu_enable() is called after the migration is done, to
> re-enable the PMU counters.
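
The enable-side call site isn't visible in the quoted mobility.c hunk,
but from the description the intended pairing is presumably something
like this (a sketch of the flow, not the literal patch; pseries_suspend()
is the existing suspend path in mobility.c):

	/* in pseries_migrate_partition(), sketch of the intended flow */
	on_each_cpu(&mobility_pmu_disable, NULL, 0);	/* before suspend */
	ret = pseries_suspend(handle);			/* partition migrates */
	on_each_cpu(&mobility_pmu_enable, NULL, 0);	/* after resume */
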
>
> Since the Performance Monitor Counters (PMCs) are not
> saved/restored during LPM, the PMC value reads as zero after
> migration while 'event->hw.prev_count' still holds a non-zero
> value. This causes a problem when updating event->count, since we
> always accumulate (PMC value - event->hw.prev_count) into
> event->count. If event->hw.prev_count is greater than the PMC
> value, event->count goes negative. To fix this, 'prev_count' also
> needs to be re-initialised for all events while enabling them
> back. Hence read off the existing events and clear the PMC index
> (stored in event->hw.idx) for all events in mobility_pmu_disable().
> This way, the event count settings get re-initialised correctly in
> power_pmu_enable().
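
For reference, the accumulation being described is roughly the
following, a simplified sketch of the power_pmu_read() path rather
than the exact core-book3s.c code (read_pmc() stands in for the real
PMC read, and the 32-bit overflow handling is omitted):

	static void pmu_read_sketch(struct perf_event *event)
	{
		u64 prev, val;

		/* Sample the PMC and advance prev_count atomically. */
		do {
			prev = local64_read(&event->hw.prev_count);
			val = read_pmc(event->hw.idx);	/* reads 0 post-LPM */
		} while (local64_cmpxchg(&event->hw.prev_count, prev, val) != prev);

		/*
		 * The delta is val - prev. With val == 0 and prev != 0 the
		 * subtraction wraps as an unsigned value, which is where
		 * the huge bogus count in the log above comes from.
		 */
		local64_add(val - prev, &event->count);
	}
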
>
> Signed-off-by: Athira Rajeev <atrajeev at linux.vnet.ibm.com>
> [ Fixed compilation error reported by kernel test robot ]
> Reported-by: kernel test robot <lkp at intel.com>
> ---
> Changelog:
> Change from v2 -> v3:
> Addressed review comments from Nicholas Piggin.
> - Removed the "migrate" field which was added in the initial
> patch to handle updating the event count settings correctly
> in power_pmu_enable. Instead, read off the existing events in
> mobility_pmu_disable before power_pmu_enable.
> - Moved the mobility_pmu_disable/enable declarations from
> rtas.h to the perf event header file.
>
> Addressed review comments from Nathan.
> - Moved the mobility function calls out of stop_machine
> context to pseries_migrate_partition. Also, this is now a
> per-cpu invocation.
>
> Change from v1 -> v2:
> - Moved the mobility_pmu_enable and mobility_pmu_disable
> declarations under CONFIG_PPC_PERF_CTRS in rtas.h.
> Also included 'asm/rtas.h' in core-book3s to fix the
> compilation warning reported by the kernel test robot.
>
> arch/powerpc/include/asm/perf_event.h | 8 +++++
> arch/powerpc/perf/core-book3s.c | 39 +++++++++++++++++++++++
> arch/powerpc/platforms/pseries/mobility.c | 7 ++++
> 3 files changed, 54 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/perf_event.h b/arch/powerpc/include/asm/perf_event.h
> index 164e910bf654..88aab6cf840c 100644
> --- a/arch/powerpc/include/asm/perf_event.h
> +++ b/arch/powerpc/include/asm/perf_event.h
> @@ -17,6 +17,14 @@ static inline bool is_sier_available(void) { return false; }
> static inline unsigned long get_pmcs_ext_regs(int idx) { return 0; }
> #endif
>
> +#ifdef CONFIG_PPC_PERF_CTRS
> +void mobility_pmu_disable(void *unused);
> +void mobility_pmu_enable(void *unused);
> +#else
> +static inline void mobility_pmu_disable(void *unused) { }
> +static inline void mobility_pmu_enable(void *unused) { }
> +#endif
> +
> #ifdef CONFIG_FSL_EMB_PERF_EVENT
> #include <asm/perf_event_fsl_emb.h>
> #endif
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 73e62e9b179b..2e8c4c668fa3 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -1343,6 +1343,33 @@ static void power_pmu_disable(struct pmu *pmu)
> local_irq_restore(flags);
> }
>
> +/*
> + * Called from pseries_migrate_partition() function
> + * before migration, from powerpc/mobility code.
> + */
> +void mobility_pmu_disable(void *unused)
> +{
> + struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
> + struct perf_event *event;
> +
> + if (cpuhw->n_events != 0) {
> + int i;
> +
> + power_pmu_disable(NULL);
> + /*
> + * Read off any pre-existing events because the register
> + * values may not be migrated.
> + */
> + for (i = 0; i < cpuhw->n_events; ++i) {
> + event = cpuhw->event[i];
> + if (event->hw.idx) {
> + power_pmu_read(event);
> + event->hw.idx = 0;
> + }
> + }
> + }
> +}
> +
> /*
> * Re-enable all events if disable == 0.
> * If we were previously disabled and events were added, then
> @@ -1515,6 +1542,18 @@ static void power_pmu_enable(struct pmu *pmu)
> local_irq_restore(flags);
> }
>
> +/*
> + * Called from pseries_migrate_partition() function
> + * after migration, from powerpc/mobility code.
> + */
> +void mobility_pmu_enable(void *unused)
> +{
> + struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
> +
> + cpuhw->n_added = cpuhw->n_events;
> + power_pmu_enable(NULL);
> +}
> +
> static int collect_events(struct perf_event *group, int max_count,
> struct perf_event *ctrs[], u64 *events,
> unsigned int *flags)
> diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
> index e83e0891272d..3e96485ccba4 100644
> --- a/arch/powerpc/platforms/pseries/mobility.c
> +++ b/arch/powerpc/platforms/pseries/mobility.c
> @@ -22,6 +22,7 @@
> #include <linux/delay.h>
> #include <linux/slab.h>
> #include <linux/stringify.h>
> +#include <linux/perf_event.h>
>
> #include <asm/machdep.h>
> #include <asm/rtas.h>
> @@ -631,12 +632,18 @@ static int pseries_migrate_partition(u64 handle)
> if (ret)
> return ret;
>
> + /* Disable PMU before suspend */
> + on_each_cpu(&mobility_pmu_disable, NULL, 0);
Why was this moved out of stop_machine and into an IPI?
My concern would be, what are the other CPUs doing at this time? Is it
possible they could take interrupts and schedule? Could that mess up the
perf state here?
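For example, something like this hypothetical interleaving (assuming
nothing else in pseries_migrate_partition() keeps the other CPUs out
of the scheduler at this point):

	CPU0 (migration path)                 CPU1
	---------------------                 ----
	on_each_cpu(mobility_pmu_disable)
	                                      IPI: power_pmu_disable(),
	                                      events read off, hw.idx = 0
	                                      timer interrupt -> schedule()
	                                      perf sched-in ->
	                                      power_pmu_enable()
	                                      (counters re-armed before
	                                      the suspend has happened)
	pseries_suspend() ...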
Thanks,
Nick