[Skiboot] [PATCH 05/15] libc/string: Add memcpy_ci
stewart at linux.vnet.ibm.com
Wed Sep 14 19:12:27 AEST 2016
Claudio Carvalho <cclaudio at linux.vnet.ibm.com> writes:
>>> This function doesn't check if srcpp and dstpp are multiples of
>>> sizeof(uint32_t), but it assumes that the processor is able to read a
>>> char and a uint32_t from src. Thus, the first loop copies at most
>>> len % sizeof(uint32_t) bytes to optimize the number of reads.
>> The length checking is fine, the bigger issue here is that the first
>> loop will leave srcp unaligned. If we wanted to copy 7 bytes starting
>> from 0x0 the first loop would move srcp to 0x3 before attempting to
>> finish the copy with a single 32 bit load. On P8 unaligned CI loads
>> will cause an alignment interrupt so the assumption that the processor
>> can read a uint32_t from srcp is faulty. If you want to keep this
>> optimisation the first loop needs to align srcp so that (srcp % block)
>> == 0. The bulk of the copy can then be done with 32-bit (or 64-bit)
>> loads and finished with a byte-by-byte copy. You could move the byte
>> copies into a helper function to eliminate all the int <-> pointer
>> casting too.
> Is this better? I am not sure if I understand what you mean about the
> helper function.
A helper to do the byte-by-byte copy, used first to get to alignment and
then for whatever is left at the end. So: an in_8() loop to get to
alignment, then in_be32() to copy in blocks, then in_8() for any
remaining bytes.
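
Roughly, the three phases could look like the sketch below. This is only an
illustration, not the patch itself: ci_read8() and ci_read32() are
hypothetical stand-ins for in_8() and in_be32() so that it builds outside the
skiboot tree, and copy_bytes() and memcpy_ci_sketch() are made-up names. The
stand-ins are plain volatile loads, which assumes a raw 32-bit load keeps the
bytes in memory order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for skiboot's in_8()/in_be32(); plain volatile
 * loads here so the sketch compiles on its own. */
static inline uint8_t ci_read8(const volatile uint8_t *addr)
{
	return *addr;
}

static inline uint32_t ci_read32(const volatile uint32_t *addr)
{
	return *addr;
}

/* Byte-by-byte helper: used to reach alignment and to copy the tail. */
static void copy_bytes(uint8_t **dst, const volatile uint8_t **src, size_t n)
{
	while (n--)
		*(*dst)++ = ci_read8((*src)++);
}

void *memcpy_ci_sketch(void *dest, const void *src, size_t n)
{
	uint8_t *d = dest;
	const volatile uint8_t *s = src;
	size_t head, words;

	/* 1. Byte loads until the source is 4-byte aligned (or n runs out). */
	head = (4 - ((uintptr_t)s & 3)) & 3;
	if (head > n)
		head = n;
	copy_bytes(&d, &s, head);
	n -= head;

	/* 2. Aligned 32-bit loads for the bulk of the copy. The destination
	 *    is ordinary memory and may be unaligned, so store via memcpy(). */
	for (words = n / 4; words; words--) {
		uint32_t w = ci_read32((const volatile uint32_t *)s);

		memcpy(d, &w, sizeof(w));
		d += sizeof(w);
		s += sizeof(w);
	}

	/* 3. Byte loads for whatever is left (at most 3 bytes). */
	copy_bytes(&d, &s, n % 4);

	return dest;
}
```

With the 7-byte example from earlier in the thread, a source at 0x0 is
already aligned, so this does one 32-bit load at 0x0 followed by three byte
loads at 0x4..0x6, and never issues a 32-bit load from 0x3.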
OPAL Architect, IBM.