[PATCH 2/3] tools/perf/util/annotate: Set register_char and memory_ref_char for powerpc

Athira Rajeev atrajeev at linux.vnet.ibm.com
Mon Mar 18 22:33:04 AEDT 2024



> On 09-Mar-2024, at 11:13 PM, Segher Boessenkool <segher at kernel.crashing.org> wrote:
> 
> All instructions with a primary opcode from 32 to 63 are memory insns,
> and no others.  It's trivial to see whether it is a load or store, too
> (just bit 3 of the insn).  Trying to parse disassembled code is much
> harder, and you easily make some mistakes here.

Hi Segher,

Thanks for checking the patch and sharing your review comments.

OK, I am looking into this part.
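
For reference, here is a minimal sketch of the binary-based check suggested above, taking the quoted rule of thumb at face value. The names PPC_PRIMARY_OPCODE, ppc_primary_opcode and ppc_opcode_in_memory_range are hypothetical and not existing perf code, and the instruction word is assumed to have been read into a host-endian uint32_t already. Note that an X-form load such as lwzx has primary opcode 31, so it falls outside the 32..63 range and would need a separate check on its extended opcode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the primary opcode is the top 6 bits of the 32-bit word. */
#define PPC_PRIMARY_OPCODE(insn) (((uint32_t)(insn) >> 26) & 0x3f)

static inline unsigned int ppc_primary_opcode(uint32_t insn)
{
	return PPC_PRIMARY_OPCODE(insn);
}

/* Rule of thumb from the review comment: primary opcodes 32..63. */
static inline bool ppc_opcode_in_memory_range(uint32_t insn)
{
	return ppc_primary_opcode(insn) >= 32;
}

/* 0x81490000 encodes "lwz r10,0(r9)"; 0x7d40982e encodes "lwzx r10,0,r19". */
static_assert(PPC_PRIMARY_OPCODE(0x81490000u) == 32, "lwz is primary opcode 32");
static_assert(PPC_PRIMARY_OPCODE(0x7d40982eu) == 31, "lwzx is primary opcode 31");
```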

> 
> On Sat, Mar 09, 2024 at 12:55:12PM +0530, Athira Rajeev wrote:
>> To identify if the instruction has any memory reference,
>> "memory_ref_char" field needs to be set for specific architecture.
>> Example memory instruction:
>> lwz     r10,0(r9)
>> 
>> Here "(" is the memory_ref_char. Set this as part of arch->objdump
> 
> What about "lwzx r10,0,r19", semantically exactly the same instruction?
> There is no paren in there.  Not all instructions using parens are
> memory insns, either, not in assembler code at least.

Yes, right, Segher.

So, for the basic foundational patches, I targeted instructions of this form (D-form).
There are still samples that come up as unknown, and for those, X-form instructions also need to be checked.
The aim was to first get these basic foundational patches in to add support for powerpc, and to address the remaining "unknowns" in a follow-up.
But yes, X-form instructions will also be covered as part of the changes needed for powerpc.
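
To illustrate the D-form versus X-form difference discussed above, here is a rough sketch that pulls the register fields straight out of the instruction word. The struct and function names are hypothetical, not existing perf code, and only the field layouts of the two forms mentioned in this thread are handled.

```c
#include <stdint.h>

/*
 * Hypothetical container for the decoded fields; not an existing perf type.
 * Bit positions use IBM numbering (bit 0 is the most significant bit).
 */
struct ppc_mem_fields {
	unsigned int rt;	/* target/source register field, bits 6..10 */
	unsigned int ra;	/* base register field, bits 11..15; 0 means "no register" for (RA|0) */
	unsigned int rb;	/* index register field, bits 16..20, X-form only */
	int16_t d;		/* signed 16-bit displacement, D-form only */
};

/* D-form, e.g. "lwz r10,0(r9)": opcode | RT | RA | D */
static inline struct ppc_mem_fields ppc_decode_d_form(uint32_t insn)
{
	struct ppc_mem_fields f = {
		.rt = (insn >> 21) & 0x1f,
		.ra = (insn >> 16) & 0x1f,
		.rb = 0,
		/* Reinterpret the low 16 bits as a signed displacement. */
		.d  = (int16_t)(insn & 0xffff),
	};

	return f;
}

/* X-form, e.g. "lwzx r10,0,r19": opcode | RT | RA | RB | XO | Rc */
static inline struct ppc_mem_fields ppc_decode_x_form(uint32_t insn)
{
	struct ppc_mem_fields f = {
		.rt = (insn >> 21) & 0x1f,
		.ra = (insn >> 16) & 0x1f,
		.rb = (insn >> 11) & 0x1f,
		.d  = 0,
	};

	return f;
}
```

With this layout, 0x81490000 ("lwz r10,0(r9)") decodes to RT=10, RA=9, D=0, and 0x7d40982e ("lwzx r10,0,r19") decodes to RT=10, RA=0, RB=19, without depending on parentheses in the disassembly.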

> 
>> To get register number and access offset from the given instruction,
>> arch->objdump.register_char is used. In case of powerpc, the register
>> char is "r" (since reg names are r0 to r31). Hence set register_char
>> as "r".
> 
> cr0..cr7
> r0..r31
> f0..f31
> v0..v31
> vs0..vs63
> and many other spellings.  Plain "0..63" is also fine.

Ok.
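
As a rough illustration of why a single register_char does not cover these spellings, the sketch below accepts the ones listed above. The function name and the prefix table are hypothetical; this is not how perf currently parses operands, and the prefix list is only the handful quoted in this thread.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: parse a register operand such as "r10", "cr7",
 * "vs63" or a plain "19". Returns the register number, or -1 if the
 * operand does not look like a register.
 */
static int ppc_parse_reg_operand(const char *s)
{
	/* Two-letter prefixes first so that "vs" is tried before "v". */
	static const char *const prefixes[] = { "cr", "vs", "r", "f", "v" };
	char *end;
	long num;
	size_t i;

	for (i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++) {
		size_t len = strlen(prefixes[i]);

		/* Only strip a prefix that is directly followed by a digit. */
		if (!strncmp(s, prefixes[i], len) && isdigit((unsigned char)s[len])) {
			s += len;
			break;
		}
	}

	if (!isdigit((unsigned char)*s))
		return -1;

	num = strtol(s, &end, 10);

	/* Accept only end of string or an operand delimiter after the number. */
	if (*end != '\0' && *end != ',' && *end != ')')
		return -1;
	if (num > 63)
		return -1;

	return (int)num;
}
```

A bare number remains ambiguous in text form: the "0" in "lwzx r10,0,r19" parses as register 0 here, and nothing in the string says whether it means r0 or "no register".
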
> 
> The "0" in my lwzx example is a register field as well (and it stands
> for "no register", not for "r0").  Called "(RA|0)" usually (incidentally,
> see the parens there as well, oh joy).
> 
> Don't you have the binary code here as well, not just a disassembled
> representation of it?  It is way easier to just use that, and you'll get
> much better results that way :-)
> 

Thanks, Segher, for the suggestion. I will look into this as well.

Thanks
Athira Rajeev

> 
> Segher


