[PATCH] powerpc: Align hot loops of memset() and backwards_memcpy()
npiggin at gmail.com
Fri Aug 5 21:00:52 AEST 2016
On Thu, 4 Aug 2016 16:53:22 +1000
Anton Blanchard <anton at ozlabs.org> wrote:
> From: Anton Blanchard <anton at samba.org>
> Align the hot loops in our assembly implementation of memset()
> and backwards_memcpy().
> backwards_memcpy() is called from tcp_v4_rcv(), so we might
> want to optimise this a little more.
> Signed-off-by: Anton Blanchard <anton at samba.org>
> arch/powerpc/lib/mem_64.S | 2 ++
> 1 file changed, 2 insertions(+)
> diff --git a/arch/powerpc/lib/mem_64.S b/arch/powerpc/lib/mem_64.S
> index 43435c6..eda7a96 100644
> --- a/arch/powerpc/lib/mem_64.S
> +++ b/arch/powerpc/lib/mem_64.S
> @@ -37,6 +37,7 @@ _GLOBAL(memset)
>  	clrldi	r5,r5,58
>  	mtctr	r0
>  	beq	5f
> +	.balign 16
>  4:	std	r4,0(r6)
>  	std	r4,8(r6)
>  	std	r4,16(r6)
Hmm. If we execute this loop once, we'll only fetch additional nops. Twice, and
we make up for them by not fetching unused instructions. More than twice and we
may start winning.
For large sizes it probably helps, but I'd like to see what sizes memset sees.
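The break-even reasoning above can be put into a toy cost model. This is only a sketch under stated assumptions, not a description of any particular POWER core: it assumes 4-byte instructions and a front end that fetches aligned 16-byte blocks, so a backward branch to a target `off` bytes into a block wastes the `off / 4` slots ahead of it. The names `pad_nops` and `wasted_slots` are hypothetical helpers introduced for illustration.

```c
#include <assert.h>

/* Assumptions for illustration only: 4-byte instructions and a front
 * end that fetches aligned 16-byte blocks. Real cores differ. */
#define INSN_BYTES  4u
#define FETCH_BYTES 16u

/* Nops that ".balign 16" inserts when the loop would otherwise start
 * `off` bytes into a fetch block. They are fetched once, on loop entry. */
static unsigned pad_nops(unsigned off)
{
	assert(off < FETCH_BYTES && off % INSN_BYTES == 0);
	return ((FETCH_BYTES - off) % FETCH_BYTES) / INSN_BYTES;
}

/* Slots fetched but never used by the unaligned loop: every backward
 * branch (iters - 1 of them) lands `off` bytes into a block, so the
 * off / 4 slots ahead of the branch target are wasted each time. */
static unsigned wasted_slots(unsigned off, unsigned iters)
{
	assert(off < FETCH_BYTES && off % INSN_BYTES == 0);
	if (iters == 0)
		return 0;
	return (iters - 1) * (off / INSN_BYTES);
}
```

With a hypothetical `off` of 8, `pad_nops(8)` is 2, while `wasted_slots(8, n)` is 0, 2 and 4 for n = 1, 2 and 3. In this model, alignment loses for one iteration, breaks even at two, and wins from three onwards. The crossover moves with the assumed offset, which is why the real distribution of memset sizes matters.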