[PATCH] tools/perf: Fix aggr_printout to display cpu field irrespective of core value
Athira Rajeev
atrajeev at linux.vnet.ibm.com
Thu Sep 29 18:49:21 AEST 2022
> On 28-Sep-2022, at 9:05 PM, James Clark <james.clark at arm.com> wrote:
>
>
>
Hi James,
Thanks for looking at the patch and sharing review comments.
> On 13/09/2022 12:57, Athira Rajeev wrote:
>> perf stat includes option to specify aggr_mode to display
>> per-socket, per-core, per-die, per-node counter details.
>> Also there is option -A ( AGGR_NONE, -no-aggr ), where the
>> counter values are displayed for each cpu along with "CPU"
>> value in one field of the output.
>>
>> Each of the aggregate mode uses the information fetched
>> from "/sys/devices/system/cpu/cpuX/topology" like core_id,
>
> I thought that this wouldn't apply to the cpu field because cpu is
> basically interchangeable as an index in cpumap, rather than anything
> being read from the topology file.
The cpu value is filled in this function:
Function : aggr_cpu_id__cpu
Code: util/cpumap.c
>
>> physical_package_id. Utility functions in "cpumap.c" fetches
>> this information and populates the socket id, core id, cpu etc.
>> If the platform does not expose the topology information,
>> these values will be set to -1. Example, in case of powerpc,
>> details like physical_package_id is restricted to be exposed
>> in pSeries platform. So id.socket, id.core, id.cpu all will
>> be set as -1.
>>
>> In case of displaying socket or die value, there is no check
>> done in the "aggr_printout" function to see if it points to
>> valid socket id or die. But for displaying "cpu" value, there
>> is a check for "if (id.core > -1)". In case of powerpc pSeries
>> where detail like physical_package_id is restricted to be
>> exposed, id.core will be set to -1. Hence the column or field
>> itself for CPU won't be displayed in the output.
>>
>> Result for per-socket:
>>
>> <<>>
>> perf stat -e branches --per-socket -a true
>>
>> Performance counter stats for 'system wide':
>>
>> S-1 32 416,851 branches
>> <<>>
>>
>> Here S has -1 in above result. But with -A option which also
>> expects CPU in one column in the result, below is observed.
>>
>> <<>>
>> /bin/perf stat -e instructions -A -a true
>>
>> Performance counter stats for 'system wide':
>>
>> 47,146 instructions
>> 45,226 instructions
>> 43,354 instructions
>> 45,184 instructions
>> <<>>
>>
>> If the cpu id value is pointing to -1 also, it makes sense
>> to display the column in the output to replicate the behaviour
>> or to be in precedence with other aggr options(like per-socket,
>> per-core). Remove the check "id.core" so that CPU field gets
>> displayed in the output.
>
> Why would you want to print -1 out? Seems like the if statement was a
> good one to me, otherwise the output looks a bit broken to users. Are
> the other aggregation modes even working if -1 is set for socket and
> die? Maybe we need to not print -1 in those cases or exit earlier with a
> failure.
>
> The -1 value has a specific internal meaning which is "to not
> aggregate". It doesn't mean "not set".
Currently, this check is done only for printing cpu value.
For socket/die/core values, this check is not done. Pasting an
example snippet from a powerpc system ( specifically from pseries platform where
the value is set to -1 )
./perf stat --per-core -a -C 1 true
Performance counter stats for 'system wide':
S-1-D-1-C-1 1 1.06 msec cpu-clock # 1.018 CPUs utilized
S-1-D-1-C-1 1 2 context-switches # 1.879 K/sec
S-1-D-1-C-1 1 0 cpu-migrations # 0.000 /sec
Here though the value is -1, we are displaying it. Where as in case of cpu, the first column will be
empty since we do a check before printing.
Example:
./perf stat --per-core -A -C 1 true
Performance counter stats for 'CPU(s) 1':
0.88 msec cpu-clock # 1.022 CPUs utilized
2 context-switches
0 cpu-migrations
No sure, whether there are scripts out there, which consume the current format and
not displaying -1 may break it. That is why we tried with change to remove check for cpu, similar to
other modes like socket, die, core etc.
Also perf code ie “aggr_cpu_id__empty” in util/cpumap.c initialises the
values to -1 . I was checking to see where we are mapping -1 to “to not aggregate”.
What I could find is AGGR_NONE ( which is for no-aggr ) has value as zero.
Reference: defined in util/stat.h
enum aggr_mode {
AGGR_NONE,
James, can you point me to reference for that meaning if I have missed anything.
Thanks
Athira
>
>>
>> After the fix:
>>
>> <<>>
>> perf stat -e instructions -A -a true
>>
>> Performance counter stats for 'system wide':
>>
>> CPU-1 64,034 instructions
>> CPU-1 68,941 instructions
>> CPU-1 59,418 instructions
>> CPU-1 70,478 instructions
>> CPU-1 65,201 instructions
>> CPU-1 63,704 instructions
>> <<>>
>>
>> This is caught while running "perf test" for
>> "stat+json_output.sh" and "stat+csv_output.sh".
>
> Is it possible to fix the issue by making the tests cope with the lack
> of the CPU id?
>
>>
>> Reported-by: Disha Goel <disgoel at linux.vnet.ibm.com>
>> Signed-off-by: Athira Rajeev <atrajeev at linux.vnet.ibm.com>
>> ---
>> tools/perf/util/stat-display.c | 6 ++----
>> 1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
>> index b82844cb0ce7..1b751a730271 100644
>> --- a/tools/perf/util/stat-display.c
>> +++ b/tools/perf/util/stat-display.c
>> @@ -168,10 +168,9 @@ static void aggr_printout(struct perf_stat_config *config,
>> id.socket,
>> id.die,
>> id.core);
>> - } else if (id.core > -1) {
>> + } else
>
> This should have been "id.cpu.cpu > -1". Looks like it was changed by
> some kind of bad merge or rebase in df936cadfb because there is no
> obvious justification for the change to .core in that commit.
>
>> fprintf(config->output, "\"cpu\" : \"%d\", ",
>> id.cpu.cpu);
>> - }
>> } else {
>> if (evsel->percore && !config->percore_show_thread) {
>> fprintf(config->output, "S%d-D%d-C%*d%s",
>> @@ -179,11 +178,10 @@ static void aggr_printout(struct perf_stat_config *config,
>> id.die,
>> config->csv_output ? 0 : -3,
>> id.core, config->csv_sep);
>> - } else if (id.core > -1) {
>> + } else
>> fprintf(config->output, "CPU%*d%s",
>> config->csv_output ? 0 : -7,
>> id.cpu.cpu, config->csv_sep);
>> - }
>> }
>> break;
>> case AGGR_THREAD:
More information about the Linuxppc-dev
mailing list