[PATCH 2/2] powerpc: Align hot loops of some string functions

Segher Boessenkool segher at kernel.crashing.org
Fri May 27 16:26:53 AEST 2016


On Fri, May 27, 2016 at 07:45:18AM +0200, Christophe Leroy wrote:
> >>Wouldn't it be better to add nops before the function entry in order to
> >>get the hot loop aligned, instead of adding nops in the middle of the
> >>function ?
> >Why would that be better?  The nops are executed once per function call
> >in either case, there are the same number of nops in either case, and
> >on most CPUs nops aren't actually executed anyway (they are decoded and
> >then thrown away).
> >
> The idea was to not execute them:
> 
> 	.balign 16
> 	nop
> 	nop
> _GLOBAL(strcpy)
> 	addi	r5,r3,-1
> 	addi	r4,r4,-1
> 1:	lbzu	r0,1(r4)
> 	cmpwi	0,r0,0
> 	stbu	r0,1(r5)
> 	bne	1b
> 	blr

That performs _worse_ on most modern CPUs (the first decode will decode
fewer instructions, so they become available for execution later).
That's why functions are aligned in the first place!
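
To put rough numbers on that, here is a toy model, not a description of
any particular core.  It assumes 4-byte instructions and a front end that
fetches and decodes one naturally aligned 16-byte block at a time; the
block size and the function name first_decode_count are made up for
illustration.  Under those assumptions the number of instructions decoded
in the first group depends only on the entry offset within the block:

```python
# Toy model only. Assumptions (not taken from any CPU manual):
#   - every instruction is 4 bytes
#   - the front end decodes one naturally aligned 16-byte block at a time
INSN_BYTES = 4
FETCH_BYTES = 16  # assumed fetch/decode block size


def first_decode_count(entry_addr):
    """Instructions decoded in the first block when execution enters at entry_addr."""
    offset = entry_addr % FETCH_BYTES
    return (FETCH_BYTES - offset) // INSN_BYTES


# 0x1000 is a hypothetical 16-byte-aligned address, as .balign 16 would give.
for offset in (0, 4, 8, 12):
    count = first_decode_count(0x1000 + offset)
    print(f"entry at block offset {offset:2d}: {count} insns in first decode")
```

In the quoted layout the two nops after .balign 16 put the entry point at
block offset 8, so this model decodes only two instructions in the first
group, versus four for an entry point at offset 0.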


Segher

