PowerPC ftrace function trace optimisation

Benjamin Herrenschmidt benh at kernel.crashing.org
Thu Apr 29 11:02:47 EST 2010


> The option Alan added reduces the footprint to 3 instructions which can
> be noped out completely. The rest of the function does not rely on the first
> three instructions. No stack spill is forced either:
> 
> # gcc -pg -mprofile-kernel

From a quick test, it appears that this only works with -m64, not -m32.
Alan, is that correct? Any chance you can fix that in future gcc
versions?

Also, should we implement support for both types of mcount, or only
allow enabling ftrace with gccs that support this?

Cheers,
Ben.

> 0000000000000000 <.foo>:
>    0:   7c 08 02 a6     mflr    r0
>    4:   f8 01 00 10     std     r0,16(r1)
>    8:   48 00 00 01     bl      8 <.foo+0x8>	<--- call to mcount
> 
>    c:   7c 08 02 a6     mflr    r0
>   10:   f8 01 00 10     std     r0,16(r1)
>   14:   f8 21 ff d1     stdu    r1,-48(r1)
>   18:   e9 22 00 00     ld      r9,0(r2)
>   1c:   e8 69 00 02     lwa     r3,0(r9)
>   20:   38 21 00 30     addi    r1,r1,48
>   24:   e8 01 00 10     ld      r0,16(r1)
>   28:   7c 08 03 a6     mtlr    r0
>   2c:   4e 80 00 20     blr
> 
> 
> This means we could support ftrace function trace with very little overhead.
> 
> In fact if we are careful when switching to the new mcount ABI and don't
> rely on the store of r0, we could probably optimise this even further in a
> future gcc and remove the store completely. mcount would be 2 instructions:
> 
>    mflr    r0              
>    bl      8 <.foo+0x8>
> 
> Anton
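
For illustration, the "noped out completely" idea quoted above can be
sketched in plain C. This is only a sketch that works on an array of
instruction words, not the kernel's ftrace code. The helper name
nop_out_mcount() and the macro names are made up for the example. The
encodings for mflr r0 and std r0,16(r1) are taken from the disassembly
above, and 0x60000000 is the usual PowerPC nop (ori r0,r0,0).

```c
#include <stddef.h>
#include <stdint.h>

/* Encodings copied from the disassembly in this thread. */
#define INSN_MFLR_R0       0x7c0802a6u  /* mflr r0        */
#define INSN_STD_R0_16_R1  0xf8010010u  /* std  r0,16(r1) */
/* Standard PowerPC nop: ori r0,r0,0 */
#define INSN_NOP           0x60000000u

/* True if insn is a relative branch-and-link (bl): opcode 18, AA=0, LK=1. */
static int is_bl(uint32_t insn)
{
    return (insn & 0xfc000003u) == 0x48000001u;
}

/*
 * Hypothetical helper: if text[] starts with the 3-instruction
 * -mprofile-kernel sequence (mflr r0; std r0,16(r1); bl <mcount>),
 * overwrite all three words with nops and return 0.
 * Return -1 and leave text[] untouched if the sequence does not match.
 */
static int nop_out_mcount(uint32_t *text, size_t n_insns)
{
    if (n_insns < 3)
        return -1;
    if (text[0] != INSN_MFLR_R0 ||
        text[1] != INSN_STD_R0_16_R1 ||
        !is_bl(text[2]))
        return -1;

    text[0] = INSN_NOP;
    text[1] = INSN_NOP;
    text[2] = INSN_NOP;
    return 0;
}
```

Called on the first words of .foo above, this would replace offsets 0, 4
and 8 with nops and leave the real prologue at offset c alone. Real code
would also have to patch live kernel text safely and flush the icache,
which this sketch ignores.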



