[PATCH V2 4/9] tools/perf: Add support to capture and parse raw instruction in objdump

Athira Rajeev atrajeev at linux.vnet.ibm.com
Wed May 22 23:58:21 AEST 2024



> On 10 May 2024, at 7:56 PM, Arnaldo Carvalho de Melo <acme at kernel.org> wrote:
> 
> On Thu, May 09, 2024 at 10:56:23PM +0530, Athira Rajeev wrote:
>> 
>> 
>>> On 7 May 2024, at 3:05 PM, Christophe Leroy <christophe.leroy at csgroup.eu> wrote:
>>> 
>>> 
>>> 
>>> On 06/05/2024 at 14:19, Athira Rajeev wrote:
>>>> Add support to capture and parse raw instruction in objdump.
>>> 
>>> What's the purpose of using 'objdump' for reading raw instructions ? 
>>> Can't they be read directly without invoking 'objdump' ? It looks odd to 
>>> me to use objdump to provide readable text and then parse it back.
>> 
>> Hi Christophe,
>> 
>> Thanks for your review comments.
>> 
>> The current implementation of data type profiling on x86 uses the "objdump" tool to get the disassembled code.
> 
> commit 6d17edc113de1e21fc66afa76be475a4f7c91826
> Author: Namhyung Kim <namhyung at kernel.org>
> Date:   Fri Mar 29 14:58:11 2024 -0700
> 
>    perf annotate: Use libcapstone to disassemble
> 
>    Now it can use the capstone library to disassemble the instructions.
>    Let's use that (if available) for perf annotate to speed up.  Currently
>    it only supports x86 architecture.  With this change I can see ~3x speed
>    up in data type profiling.
> 
>    But note that capstone cannot give the source file and line number info.
>    For now, users should use the external objdump for that by specifying
>    the --objdump option explicitly.
> 
>    Signed-off-by: Namhyung Kim <namhyung at kernel.org>
>    Tested-by: Ian Rogers <irogers at google.com>
>    Cc: Adrian Hunter <adrian.hunter at intel.com>
>    Cc: Changbin Du <changbin.du at huawei.com>
>    Cc: Ingo Molnar <mingo at kernel.org>
>    Cc: Jiri Olsa <jolsa at kernel.org>
>    Cc: Kan Liang <kan.liang at linux.intel.com>
>    Cc: Peter Zijlstra <peterz at infradead.org>
>    Link: https://lore.kernel.org/r/20240329215812.537846-5-namhyung@kernel.org
>    Signed-off-by: Arnaldo Carvalho de Melo <acme at redhat.com>
> 
> From a quick look at http://www.capstone-engine.org/compile.html it
> seems PowerPC is supported.
> 
> But since we did it first with objdump output parsing, it's good to have
> it as an alternative and sometimes a fallback:

Hi Arnaldo, Namhyung

Thanks for the suggestions. libcapstone is a good option and it is faster too.
I will address these changes in V3.

Thanks
Athira
> 
> commit f35847de2a65137e011e559f38a3de5902a5463f
> Author: Namhyung Kim <namhyung at kernel.org>
> Date:   Wed Apr 24 17:51:56 2024 -0700
> 
>    perf annotate: Fallback disassemble to objdump when capstone fails
> 
>    I found some cases that capstone failed to disassemble.  Probably my
>    capstone is an old version but anyway there's a chance it can fail.  And
>    then it silently stopped in the middle.  In my case, it didn't
>    understand "RDPKRU" instruction.
> 
>    Let's check if the capstone disassemble reached the end of the function
>    and fallback to objdump if not
> 
> ---------------
> 
> - Arnaldo
> 
>> And then the objdump result lines are parsed to get the instruction
>> name and register fields. The initial patchset I posted to enable the
>> data type profiling feature on powerpc worked the same way, getting
>> the disassembled code from objdump and parsing the disassembled
>> lines. But in V2, we are introducing a change for powerpc to use the
>> "raw instruction" and fetch the opcode and reg fields from it.
> 
>> I tried to explain below that the current objdump invocation uses the
>> option "--no-show-raw-insn", which doesn't capture the raw instruction.
>> So to capture the raw instruction, the V2 patchset has changes to use
>> the default option "--show-raw-insn" and get the raw instruction [ for
>> powerpc ] along with the human readable annotation [ which is used by
>> other archs ]. Since the perf tool already has an objdump
>> implementation in place, I went in the direction of enhancing it to
>> use "--show-raw-insn" for powerpc.
> 
>> But as you mentioned, we can directly read the raw instruction without
>> using the "objdump" tool. perf has support for reading object code. The
>> dso open/read utilities and helper functions are already present in
>> "util/dso.c", and the "dso__data_read_offset" function reads data from
>> a dso file offset. We can use these functions, and I can make changes
>> to read the binary instruction directly without using objdump.
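
To illustrate the idea only, here is a minimal sketch of reading one
4-byte instruction word at a file offset. It uses plain POSIX pread()
as a stand-in for the perf dso helpers, so it does not show how
"dso__data_read_offset" is actually called. The names insn_from_bytes
and read_raw_insn are hypothetical and are not part of perf.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for pread() under strict -std= modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: assemble a 32-bit instruction word from 4 bytes. */
static uint32_t insn_from_bytes(const unsigned char b[4], int little_endian)
{
	if (little_endian)
		return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
		       (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;

	return (uint32_t)b[0] << 24 | (uint32_t)b[1] << 16 |
	       (uint32_t)b[2] << 8 | (uint32_t)b[3];
}

/*
 * Hypothetical helper: read one instruction at 'offset' in an already
 * open file. Returns 0 on success, -1 on a failed or short read.
 */
static int read_raw_insn(int fd, off_t offset, int little_endian,
			 uint32_t *insn)
{
	unsigned char buf[4];

	assert(insn != NULL);

	if (pread(fd, buf, sizeof(buf), offset) != (ssize_t)sizeof(buf))
		return -1;

	*insn = insn_from_bytes(buf, little_endian);
	return 0;
}
```

With the bytes "38 01 81 e8" from the example in the patch description,
the little-endian interpretation gives the instruction word 0xe8810138.
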
> 
>> Namhyung, Arnaldo, Christophe
>> Looking for your valuable feedback on this approach. Please let me know whether this approach looks fine.
>> 
>> 
>> Thanks
>> Athira
>>> 
>>>> Currently, the perf tool infrastructure uses the "--no-show-raw-insn"
>>>> option with "objdump" when disassembling. An example from powerpc with
>>>> this option for an instruction address is:
>>> 
>>> Yes and that makes sense because the purpose of objdump is to provide 
>>> human readable annotations, not to perform automated analysis. Am I 
>>> missing something ?
>>> 
>>>> 
>>>> Snippet from:
>>>> objdump  --start-address=<address> --stop-address=<address>  -d --no-show-raw-insn -C <vmlinux>
>>>> 
>>>> c0000000010224b4: lwz     r10,0(r9)
>>>> 
>>>> This line "lwz r10,0(r9)" is parsed to extract the instruction name,
>>>> register names and offset. Also, to find whether there is a memory
>>>> reference in the operands, the "memory_ref_char" field of objdump is used.
>>>> For x86, "(" is used as memory_ref_char to tackle instructions of the
>>>> form "mov  (%rax), %rcx".
>>>> 
>>>> On powerpc, instructions whose operands contain "(" are not the only
>>>> memory instructions. For example, the above instruction can also be of
>>>> the extended form (X form) "lwzx r10,0,r19". In order to easily identify
>>>> the instruction category and extract the source/target registers, this
>>>> patch adds support for using the raw instruction. With the raw
>>>> instruction, macros are added to extract the opcode and register fields.
>>>> 
>>>> "struct ins_operands" and "struct ins" are updated to carry the opcode
>>>> and raw instruction binary code (raw_insn). Function "disasm_line__parse"
>>>> is updated to fill the raw instruction hex value and opcode in the newly
>>>> added fields. There are no changes to the existing code paths that parse
>>>> the disassembled code. Architectures that use the instruction name and
>>>> the present approach are not affected. Since this approach targets
>>>> powerpc, the macro implementation is added only for powerpc as of now.
>>>> 
>>>> Example:
>>>> representation using --show-raw-insn in objdump gives result:
>>>> 
>>>> 38 01 81 e8     ld      r4,312(r1)
>>>> 
>>>> Here "38 01 81 e8" is the raw instruction representation. In powerpc,
>>>> this translates to instruction form: "ld RT,DS(RA)" and binary code
>>>> as:
>>>> _____________________________________
>>>> | 58 |  RT  |  RA |      DS       | |
>>>> -------------------------------------
>>>> 0    6     11    16              30 31
>>>> 
>>>> Function "disasm_line__parse" is updated to capture:
>>>> 
>>>> line:    38 01 81 e8     ld      r4,312(r1)
>>>> opcode and raw instruction "38 01 81 e8"
>>>> Raw instruction is used later to extract the reg/offset fields.
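
As a worked example of the above, here is a small sketch that pulls the
four raw bytes out of the objdump line and decodes the DS form fields.
It assumes the bytes are printed in little-endian order, as in the
example line. The macro and function names are hypothetical and are not
the ones added by the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical field macros for the DS form "ld RT,DS(RA)". */
#define PPC_OP(insn)	(((insn) >> 26) & 0x3f)
#define PPC_RT(insn)	(((insn) >> 21) & 0x1f)
#define PPC_RA(insn)	(((insn) >> 16) & 0x1f)
/* Byte offset: the DS field with the low two bits cleared, sign-extended. */
#define PPC_DS(insn)	((int16_t)((insn) & 0xfffc))

/*
 * Parse the leading "xx xx xx xx" bytes of an objdump --show-raw-insn
 * line into an instruction word. Returns 0 on success, -1 otherwise.
 */
static int parse_raw_insn_le(const char *line, uint32_t *insn)
{
	unsigned int b0, b1, b2, b3;

	assert(line != NULL && insn != NULL);

	if (sscanf(line, "%2x %2x %2x %2x", &b0, &b1, &b2, &b3) != 4)
		return -1;

	*insn = (uint32_t)b0 | (uint32_t)b1 << 8 |
		(uint32_t)b2 << 16 | (uint32_t)b3 << 24;
	return 0;
}
```

For "38 01 81 e8     ld      r4,312(r1)" this gives the word 0xe8810138,
from which the opcode is 58, RT is 4, RA is 1 and the offset is 312,
matching the disassembly.
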
>>>> 
>>>> Signed-off-by: Athira Rajeev <atrajeev at linux.vnet.ibm.com>
>>>> ---



