[PATCH 1/3] crypto: X25519 low-level primitives for ppc64le.
Danny Tsen
dtsen at linux.ibm.com
Wed May 15 23:04:03 AEST 2024
See inline.
On 5/15/24 4:06 AM, Andy Polyakov wrote:
> Hi,
>
>> +SYM_FUNC_START(x25519_fe51_sqr_times)
>> ...
>> +
>> +.Lsqr_times_loop:
>> ...
>> +
>> + std 9,16(3)
>> + std 10,24(3)
>> + std 11,32(3)
>> + std 7,0(3)
>> + std 8,8(3)
>> + bdnz .Lsqr_times_loop
>
> I see no reason why the stores can't be moved outside the loop in
> question.
>
Yeah. I'll fix it.
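
For illustration only, here is the idea in C rather than ppc64le assembly. This is a minimal sketch, not the patch: toy_step() is a hypothetical stand-in for the loop body (it is not the fe51 squaring), and the function names are made up. It assumes n >= 1, as a bdnz-style counted loop does.

```c
#include <stdint.h>

#define MASK51 0x7ffffffffffffULL /* low 51 bits of a limb */

/* Hypothetical stand-in for one iteration of work on five limbs.
 * It is NOT the real fe51 squaring; it only gives the loop a body. */
static void toy_step(uint64_t r[5])
{
    uint64_t t = r[4];

    r[4] = (r[3] + r[0]) & MASK51;
    r[3] = (r[2] ^ t) & MASK51;
    r[2] = (r[1] + 19 * t) & MASK51;
    r[1] = (r[0] + 1) & MASK51;
    r[0] = (t + r[1]) & MASK51;
}

/* Shape of the quoted loop: all five limbs are stored on every pass. */
static void times_store_in_loop(uint64_t out[5], const uint64_t in[5],
                                unsigned n)
{
    uint64_t r[5] = { in[0], in[1], in[2], in[3], in[4] };

    do {
        toy_step(r);
        out[0] = r[0];
        out[1] = r[1];
        out[2] = r[2];
        out[3] = r[3];
        out[4] = r[4];
    } while (--n); /* assumes n >= 1 */
}

/* Suggested shape: limbs stay in locals, one set of stores at the end. */
static void times_store_after_loop(uint64_t out[5], const uint64_t in[5],
                                   unsigned n)
{
    uint64_t r[5] = { in[0], in[1], in[2], in[3], in[4] };

    do {
        toy_step(r);
    } while (--n); /* assumes n >= 1 */

    out[0] = r[0];
    out[1] = r[1];
    out[2] = r[2];
    out[3] = r[3];
    out[4] = r[4];
}
```

Both functions leave the same values in out, since only the last set of stores is ever observed by the caller.
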
>> +SYM_FUNC_START(x25519_fe51_frombytes)
>> +.align 5
>> +
>> + li 12, -1
>> + srdi 12, 12, 13 # 0x7ffffffffffff
>> +
>> + ld 5, 0(4)
>> + ld 6, 8(4)
>> + ld 7, 16(4)
>> + ld 8, 24(4)
>
> Is there an actual guarantee that the byte input is 64-bit aligned? While
> it is true that the processor is obliged by the ISA specification to
> handle misaligned loads and stores, handling them inefficiently doesn't
> violate it. The inefficiency is most likely to show up at page
> boundaries. What I'm trying to say is that it would be more
> appropriate to avoid the unaligned loads (and stores).
Good point. Maybe I can handle it by making sure the input is 64-bit aligned.
Thanks.
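
One possible way to avoid depending on alignment, sketched in C for illustration only. The names load_le64_bytes() and fe51_frombytes_sketch() are hypothetical, and the limb split is the conventional radix-2^51 unpacking, which I am assuming matches what the elided part of the assembly does. Each 64-bit word is assembled from single bytes, so no load is wider than a byte and the pointer may have any alignment.

```c
#include <stdint.h>

#define MASK51 0x7ffffffffffffULL /* same constant as the srdi result */

/* Build a little-endian 64-bit word from 8 bytes at any alignment. */
static uint64_t load_le64_bytes(const unsigned char *p)
{
    uint64_t v = 0;
    int i;

    for (i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Unpack 32 input bytes into five 51-bit limbs (bit 255 is dropped).
 * Assumption: conventional radix-2^51 layout, limb k holds bits
 * 51*k .. 51*k+50 of the 255-bit value. */
static void fe51_frombytes_sketch(uint64_t h[5], const unsigned char *s)
{
    uint64_t w0 = load_le64_bytes(s);
    uint64_t w1 = load_le64_bytes(s + 8);
    uint64_t w2 = load_le64_bytes(s + 16);
    uint64_t w3 = load_le64_bytes(s + 24);

    h[0] = w0 & MASK51;
    h[1] = ((w0 >> 51) | (w1 << 13)) & MASK51;
    h[2] = ((w1 >> 38) | (w2 << 26)) & MASK51;
    h[3] = ((w2 >> 25) | (w3 << 39)) & MASK51;
    h[4] = (w3 >> 12) & MASK51;
}
```

In kernel C code the existing get_unaligned_le64() helper serves the same purpose as the byte loop above; in assembly the choice is between byte loads like this or checking the pointer and taking an aligned path.
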
>
> Cheers.
>