[PATCH 1/3] crypto: X25519 low-level primitives for ppc64le.

Andy Polyakov appro at cryptogams.org
Wed May 15 19:06:58 AEST 2024


Hi,

> +SYM_FUNC_START(x25519_fe51_sqr_times)
> ...
> +
> +.Lsqr_times_loop:
> ...
> +
> +	std	9,16(3)
> +	std	10,24(3)
> +	std	11,32(3)
> +	std	7,0(3)
> +	std	8,8(3)
> +	bdnz	.Lsqr_times_loop

I see no reason why the stores can't be moved out of the loop in 
question.
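
To illustrate, here is a minimal C model of the two loop shapes. It is a
sketch, not the patch's code: toy_step() is a hypothetical stand-in for
the real squaring and reduction, the function names are made up, and it
assumes the loop carries its five limbs in registers from one pass to
the next.

```c
#include <assert.h>
#include <stdint.h>

#define LIMB_MASK 0x7ffffffffffffULL	/* 51-bit limb mask */

/* Hypothetical stand-in for one squaring-and-reduction pass. */
static void toy_step(uint64_t h[5])
{
	for (int i = 0; i < 5; i++)
		h[i] = (h[i] * h[i] + (uint64_t)i) & LIMB_MASK;
}

/* Shape of the loop as posted: the result is written out on every pass. */
static void sqr_times_store_each_pass(uint64_t out[5], const uint64_t in[5],
				      unsigned int n)
{
	uint64_t h[5] = { in[0], in[1], in[2], in[3], in[4] };

	assert(n >= 1);		/* the body runs before the count is tested */
	do {
		toy_step(h);
		for (int i = 0; i < 5; i++)
			out[i] = h[i];
	} while (--n);
}

/* Suggested shape: limbs stay in locals, one set of stores after the loop. */
static void sqr_times_store_once(uint64_t out[5], const uint64_t in[5],
				 unsigned int n)
{
	uint64_t h[5] = { in[0], in[1], in[2], in[3], in[4] };

	assert(n >= 1);
	do {
		toy_step(h);
	} while (--n);

	for (int i = 0; i < 5; i++)
		out[i] = h[i];
}
```

Both functions leave the same values in out[]; the second one simply
issues five stores in total instead of five per pass.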

> +SYM_FUNC_START(x25519_fe51_frombytes)
> +.align	5
> +
> +	li	12, -1
> +	srdi	12, 12, 13	# 0x7ffffffffffff
> +
> +	ld	5, 0(4)
> +	ld	6, 8(4)
> +	ld	7, 16(4)
> +	ld	8, 24(4)

Is there an actual guarantee that the byte input is 64-bit aligned? While 
it is true that the ISA specification obliges the processor to handle 
misaligned loads and stores, nothing in it prevents them from being 
inefficient. The inefficiency is most likely to show at page 
boundaries. What I'm trying to say is that it would be more appropriate 
to avoid the unaligned loads (and stores).
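
One way to do that, sketched in C rather than assembly: assemble each
64-bit word from single bytes, so that no access is wider than a byte and
the alignment of the input no longer matters. The function names are
hypothetical, and the limb split assumes the usual five 51-bit limb
layout suggested by the 0x7ffffffffffff mask in the quoted code. Byte-wise
loading is only one option; it trades a few extra instructions for
independence from alignment.

```c
#include <stdint.h>

#define LIMB_MASK 0x7ffffffffffffULL	/* 51-bit limb mask */

/* Build a little-endian 64-bit value from byte loads only. */
static uint64_t load_le64_bytewise(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Split 32 little-endian input bytes into five 51-bit limbs. */
static void fe51_frombytes_bytewise(uint64_t h[5], const unsigned char in[32])
{
	uint64_t t0 = load_le64_bytewise(in);
	uint64_t t1 = load_le64_bytewise(in + 8);
	uint64_t t2 = load_le64_bytewise(in + 16);
	uint64_t t3 = load_le64_bytewise(in + 24);

	h[0] = t0 & LIMB_MASK;				/* bits   0..50  */
	h[1] = ((t0 >> 51) | (t1 << 13)) & LIMB_MASK;	/* bits  51..101 */
	h[2] = ((t1 >> 38) | (t2 << 26)) & LIMB_MASK;	/* bits 102..152 */
	h[3] = ((t2 >> 25) | (t3 << 39)) & LIMB_MASK;	/* bits 153..203 */
	h[4] = (t3 >> 12) & LIMB_MASK;			/* bits 204..254 */
}
```

The result is the same whether the caller passes an aligned buffer or
one that starts at an odd address.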

Cheers.


