[PATCH] powerpc: Align hot loops of memset() and backwards_memcpy()

Nicholas Piggin npiggin at gmail.com
Fri Aug 5 21:00:52 AEST 2016


On Thu,  4 Aug 2016 16:53:22 +1000
Anton Blanchard <anton at ozlabs.org> wrote:

> From: Anton Blanchard <anton at samba.org>
> 
> Align the hot loops in our assembly implementation of memset()
> and backwards_memcpy().
> 
> backwards_memcpy() is called from tcp_v4_rcv(), so we might
> want to optimise this a little more.
> 
> Signed-off-by: Anton Blanchard <anton at samba.org>
> ---
>  arch/powerpc/lib/mem_64.S | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/lib/mem_64.S b/arch/powerpc/lib/mem_64.S
> index 43435c6..eda7a96 100644
> --- a/arch/powerpc/lib/mem_64.S
> +++ b/arch/powerpc/lib/mem_64.S
> @@ -37,6 +37,7 @@ _GLOBAL(memset)
>  	clrldi	r5,r5,58
>  	mtctr	r0
>  	beq	5f
> +	.balign 16
>  4:	std	r4,0(r6)
>  	std	r4,8(r6)
>  	std	r4,16(r6)

Hmm. If we execute this loop only once, all the alignment does is make us
fetch additional nops. At two iterations we make up for them by not fetching
unused instructions. Beyond two iterations we may start winning.
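
The break-even argument above can be sketched as a toy accounting model. This
is only an illustration under stated assumptions, not a description of any
particular core: it assumes aligned 16-byte fetch blocks (four 4-byte
instructions), counts only the slots wasted in front of the loop head, and
ignores branch prediction, the loop tail and everything else a real front end
does. The names `SLOTS_PER_BLOCK`, `wasted_slots_unaligned()` and
`wasted_slots_aligned()` are hypothetical.

```c
#include <assert.h>

/* Assumption: the front end fetches aligned 16-byte blocks, i.e. four
 * 4-byte instructions per fetch. Real cores differ. */
#define SLOTS_PER_BLOCK 4u

/* Unaligned loop head sitting head_offset slots into a fetch block.
 * The first pass falls through from useful code, so nothing is wasted.
 * Each branch back to the head refetches the head_offset slots that
 * precede it in the block, and those are not executed. */
unsigned wasted_slots_unaligned(unsigned head_offset, unsigned iterations)
{
	assert(head_offset < SLOTS_PER_BLOCK);
	assert(iterations >= 1);
	return head_offset * (iterations - 1);
}

/* Loop head padded up to the next 16-byte boundary. The padding nops
 * are fetched once on entry; branches back to the head waste nothing
 * in this model, so the iteration count does not matter. */
unsigned wasted_slots_aligned(unsigned head_offset, unsigned iterations)
{
	assert(head_offset < SLOTS_PER_BLOCK);
	assert(iterations >= 1);
	return (SLOTS_PER_BLOCK - head_offset) % SLOTS_PER_BLOCK;
}
```

With an assumed head offset of two slots, the model reproduces the reasoning:
one iteration costs 0 wasted slots unaligned versus 2 aligned, two iterations
cost 2 versus 2, and three iterations cost 4 versus 2. Other offsets move the
break-even point, which is one more reason to look at real memset() sizes.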

For large sizes it probably helps, but I'd like to see what sizes memset()
actually sees in practice.



More information about the Linuxppc-dev mailing list