[PATCH 2/2] crypto: powerpc: Add POWER8 optimised crc32c
segher at kernel.crashing.org
Mon Jul 4 15:57:44 AEST 2016
On Fri, Jul 01, 2016 at 08:19:45AM +1000, Anton Blanchard wrote:
> +#ifdef BYTESWAP_DATA
> + addis r3,r2,.byteswap_constant@toc@ha
> + addi r3,r3,.byteswap_constant@toc@l
> + lvx byteswap,0,r3
> + addi r3,r3,16
You already have r0=0, so you can just do
(the top bits of the permute vector bytes aren't used after all).
Or if you find that distasteful,
Btw, the value in r3 isn't used after this, so that last addi looks useless?
> + /*
> + * The reflected version of Barrett reduction. Instead of bit
> + * reflecting our data (which is expensive to do), we bit reflect our
> + * constants and our algorithm, which means the intermediate data in
> + * our vector registers goes from 0-63 instead of 63-0. We can reflect
> + * the algorithm because we don't carry in mod 2 arithmetic.
> + */
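To illustrate the last sentence of that comment, here is a minimal Python
sketch (not part of the patch; the helper names clmul and reflect are made
up for this example) of why reflection commutes with carry-less
multiplication. With no carries, bit i of one operand and bit j of the other
only ever contribute to bit i+j of the product, so reversing both n-bit
operands just reverses the (2n-1)-bit product.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less (GF(2)) product: shift-and-XOR instead of shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # XOR, so nothing ever carries into a higher bit
        a <<= 1
        b >>= 1
    return result


def reflect(value: int, width: int) -> int:
    """Reverse the order of the low `width` bits of `value`."""
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out


def reflection_commutes(a: int, b: int, n: int = 32) -> bool:
    """Check clmul(rev_n(a), rev_n(b)) == rev_(2n-1)(clmul(a, b))."""
    lhs = clmul(reflect(a, n), reflect(b, n))
    rhs = reflect(clmul(a, b), 2 * n - 1)
    return lhs == rhs
```

For n = 32 the product occupies 63 bits, and reflect(0x1EDC6F41, 32) gives
0x82F63B78, the usual reflected form of the CRC32C polynomial. The same
identity does not hold for ordinary integer multiplication, where carries
move information between bit positions.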
(or fold these last two together, needs another constant though).
> + lvx v0,0,r4
> + lvx v16,0,r3
> + VPERM(v0,v0,v16,byteswap)
> + vxor v0,v0,v8 /* xor in initial value */
> + VPMSUMW(v0,v0,v16)
> + bdz .Lv0
That VPERM looks strange... You probably want v0 instead of v16. Not
that it matters here.
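A rough Python model of vperm, to show why the second source operand does
not matter here. This is a sketch under assumptions: the byteswap constant
itself is not in the quoted hunk, so the control vector below (indices 15
down to 0) is hypothetical, and bytes are numbered in big-endian element
order.

```python
def vperm(va: bytes, vb: bytes, vc: bytes) -> bytes:
    """Model of vperm vt,va,vb,vc: each control byte picks one source byte.

    Only the low 5 bits of each control byte are used, as an index into the
    32-byte concatenation va || vb.
    """
    if not (len(va) == len(vb) == len(vc) == 16):
        raise ValueError("all operands must be 16 bytes")
    src = va + vb
    return bytes(src[c & 0x1F] for c in vc)


# Hypothetical byte-reversing control vector: indices 15, 14, ..., 0.
BYTESWAP = bytes(range(15, -1, -1))

data = bytes(range(0x10, 0x20))   # stands in for v0
other = bytes([0xAA] * 16)        # stands in for v16

# Every index is below 16, so only the first source is ever read.
swapped_with_other = vperm(data, other, BYTESWAP)
swapped_with_self = vperm(data, data, BYTESWAP)
```

Both calls return the bytes of data in reverse order. Setting the top three
bits of every control byte changes nothing either, since only the low five
bits are decoded.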