PowerPC ftrace function trace optimisation
Benjamin Herrenschmidt
benh at kernel.crashing.org
Thu Apr 29 11:02:47 EST 2010
> The option Alan added reduces the footprint to 3 instructions which can
> be noped out completely. The rest of the function does not rely on the first
> three instructions. No stack spill is forced either:
>
> # gcc -pg -mprofile-kernel
From a quick test it appears that this only works with -m64, not -m32.
Alan, is that correct? Any chance you can fix that in future gcc
versions?
Also, should we implement support for both types of mcount, or only
allow enabling ftrace with gccs that support this?
Cheers,
Ben.
> 0000000000000000 <.foo>:
> 0: 7c 08 02 a6 mflr r0
> 4: f8 01 00 10 std r0,16(r1)
> 8: 48 00 00 01 bl 8 <.foo+0x8> <--- call to mcount
>
> c: 7c 08 02 a6 mflr r0
> 10: f8 01 00 10 std r0,16(r1)
> 14: f8 21 ff d1 stdu r1,-48(r1)
> 18: e9 22 00 00 ld r9,0(r2)
> 1c: e8 69 00 02 lwa r3,0(r9)
> 20: 38 21 00 30 addi r1,r1,48
> 24: e8 01 00 10 ld r0,16(r1)
> 28: 7c 08 03 a6 mtlr r0
> 2c: 4e 80 00 20 blr
>
>
> This means we could support ftrace function trace with very little overhead.
>
> In fact if we are careful when switching to the new mcount ABI and don't
> rely on the store of r0, we could probably optimise this even further in a
> future gcc and remove the store completely. mcount would be 2 instructions:
>
> mflr r0
> bl 8 <.foo+0x8>
>
> Anton
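
For illustration, here is a rough userspace sketch in C of the "nop out the first three instructions" idea from the quoted text. It is not the kernel's ftrace code. The names is_bl, is_profile_kernel_prologue and nop_out_prologue are made up for this example, and the instruction words are taken straight from the disassembly above (the nop is the usual PowerPC ori r0,r0,0). A real implementation would also have to deal with icache flushing and patching live code safely, which this ignores.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Instruction words as they appear in the disassembly of .foo above. */
#define PPC_INST_MFLR_R0      0x7c0802a6u /* mflr r0 */
#define PPC_INST_STD_R0_16_R1 0xf8010010u /* std r0,16(r1) */
#define PPC_INST_NOP          0x60000000u /* ori r0,r0,0 */

/* bl is primary opcode 18 with AA=0 and LK=1; the offset bits are ignored. */
static bool is_bl(uint32_t insn)
{
    return (insn & 0xfc000003u) == 0x48000001u;
}

/* Hypothetical helper: does the function start with mflr/std/bl? */
static bool is_profile_kernel_prologue(const uint32_t *insns, size_t n)
{
    return n >= 3 &&
           insns[0] == PPC_INST_MFLR_R0 &&
           insns[1] == PPC_INST_STD_R0_16_R1 &&
           is_bl(insns[2]);
}

/*
 * Hypothetical helper: overwrite the three prologue instructions with nops.
 * Returns false and leaves the buffer untouched if the pattern is not found.
 */
static bool nop_out_prologue(uint32_t *insns, size_t n)
{
    if (!is_profile_kernel_prologue(insns, n))
        return false;

    for (size_t i = 0; i < 3; i++)
        insns[i] = PPC_INST_NOP;

    return true;
}
```

Since the rest of the function does not depend on those three instructions, replacing them with nops leaves the remaining body (starting at offset 0xc in the example) unchanged.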