[PATCH V3 2/2] powerpc/perf: Fix the power10 event alternatives array to have correct sort order
Athira Rajeev
atrajeev at linux.vnet.ibm.com
Tue Apr 19 21:48:28 AEST 2022
When scheduling a group of events, constraint checks are done
to make sure all events can go in the group. For example, one of
the criteria is that events in a group cannot use the same PMC.
However, the platform-specific PMU supports alternative events
for some of the event codes. During perf_event_open, if an event
group does not satisfy the constraint checks, a further lookup
is done to find an alternative event.
By current design, the array of alternative events in the PMU
code is expected to be sorted by column 0. This is because in
the find_alternative() function, the return criterion is based
on an event code comparison, i.e. "event < ev_alt[i][0]". This
optimisation exists because find_alternative() can get called
multiple times. In the power10 PMU code, the alternative event
array is not sorted, and hence finding an alternative event is
broken.
To work with the existing logic, fix the alternative event array
to be sorted by column 0 in power10-pmu.c.
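The effect of the row order on the lookup can be reproduced with
a small user-space sketch. It is modelled on the comparison
quoted above, not copied from the kernel. The function name
find_alternative_sketch, the value of MAX_ALT, and the codes used
for PM_CYC_ALT and PM_CYC are assumptions made for illustration.
Only the PM_INST_CMPL_ALT and PM_INST_CMPL codes come from this
commit message.

```c
#include <stdint.h>

/* Assumed for this sketch: two columns per row (event, alternative). */
#define MAX_ALT 2

/* Codes quoted in the commit message. */
#define PM_INST_CMPL_ALT 0x00002
#define PM_INST_CMPL     0x500fa
/* Assumed codes for the PM_CYC pair; not stated in the commit message. */
#define PM_CYC_ALT       0x0001e
#define PM_CYC           0x600f4

/* Row order before the patch: column 0 is not ascending. */
static const unsigned int alt_unsorted[][MAX_ALT] = {
	{ PM_CYC_ALT,       PM_CYC },
	{ PM_INST_CMPL_ALT, PM_INST_CMPL },
};

/* Row order after the patch: column 0 is ascending. */
static const unsigned int alt_sorted[][MAX_ALT] = {
	{ PM_INST_CMPL_ALT, PM_INST_CMPL },
	{ PM_CYC_ALT,       PM_CYC },
};

/*
 * Return the row index that contains 'event', or -1 if none is found.
 * The early break assumes column 0 is sorted ascending: once the
 * event is smaller than a row's first entry, later rows are skipped.
 */
static int find_alternative_sketch(uint64_t event,
				   const unsigned int ev_alt[][MAX_ALT],
				   int size)
{
	for (int i = 0; i < size; i++) {
		if (event < ev_alt[i][0])
			break;

		for (int j = 0; j < MAX_ALT && ev_alt[i][j]; j++)
			if (event == ev_alt[i][j])
				return i;
	}

	return -1;
}
```

Under these assumptions, looking up PM_INST_CMPL_ALT in
alt_unsorted returns -1, because 0x00002 is smaller than the
first entry of row 0 and the loop exits before reaching the
matching row. The same lookup in alt_sorted returns row 0.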
Results:
In the case where an alternative event is not chosen when it
could be, events will be multiplexed, i.e. time sliced, where
they could actually run concurrently.
For example, on power10 PM_INST_CMPL_ALT (0x00002) has an
alternative event, PM_INST_CMPL (0x500fa). Without the fix, if a
group of events using PMC1 to PMC4 is used along with
PM_INST_CMPL_ALT, the events will be time sliced since all
programmable PMCs are already consumed. With the fix, the
alternative event on PMC5 is picked and all events run
concurrently.
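As a rough illustration of why these raw codes compete for
counters, the sketch below extracts the PMC number from an event
code. The shift of 16 and the mask of 0xf are assumptions about
the raw event encoding, chosen to match the codes in this
example. The helper name event_pmc is hypothetical.

```c
#include <stdint.h>

/* Assumed encoding: the PMC number sits in bits 16-19 of the raw code. */
#define SKETCH_PMC_SHIFT 16
#define SKETCH_PMC_MASK  0xf

/*
 * Return the PMC number encoded in a raw event code. A result of 0
 * is taken here to mean that the code does not name a specific
 * counter (an assumption of this sketch).
 */
static unsigned int event_pmc(uint64_t event)
{
	return (unsigned int)((event >> SKETCH_PMC_SHIFT) & SKETCH_PMC_MASK);
}
```

With this decoding, r100fc, r200fa, r300fc and r400fc map to PMC1
through PMC4, r00002 names no specific counter, and the
alternative code 0x500fa maps to PMC5.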
<< Before Patch >>
# perf stat -e r00002,r100fc,r200fa,r300fc,r400fc
^C
Performance counter stats for 'system wide':
328668935 r00002 (79.94%)
56501024 r100fc (79.95%)
49564238 r200fa (79.95%)
376 r300fc (80.19%)
660 r400fc (79.97%)
4.039150522 seconds time elapsed
With the fix, since the alternative event is chosen to run
on PMC5, the events will run concurrently.
<< After Patch >>
# perf stat -e r00002,r100fc,r200fa,r300fc,r400fc
^C
Performance counter stats for 'system wide':
23596607 r00002
4907738 r100fc
2283608 r200fa
135 r300fc
248 r400fc
1.664671390 seconds time elapsed
Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support")
Signed-off-by: Athira Rajeev <atrajeev at linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy at linux.vnet.ibm.com>
---
Changelog:
v1 -> v2:
Added Fixes tag and reworded commit message
Added Reviewed-by from Maddy
v2 -> v3:
Added info about what is the breakage with current
code.
arch/powerpc/perf/power10-pmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
index d3398100a60f..c6d51e7093cf 100644
--- a/arch/powerpc/perf/power10-pmu.c
+++ b/arch/powerpc/perf/power10-pmu.c
@@ -91,8 +91,8 @@ extern u64 PERF_REG_EXTENDED_MASK;
/* Table of alternatives, sorted by column 0 */
static const unsigned int power10_event_alternatives[][MAX_ALT] = {
- { PM_CYC_ALT, PM_CYC },
{ PM_INST_CMPL_ALT, PM_INST_CMPL },
+ { PM_CYC_ALT, PM_CYC },
};
static int power10_get_alternatives(u64 event, unsigned int flags, u64 alt[])
--
2.35.1