[PATCH] powerpc: memcpy optimization for 64bit LE

Michael Neuling mikey at neuling.org
Thu Nov 7 13:10:38 EST 2013


OK, can you add that and/or maybe Anton's description to the patch changelog?

Mikey

On Wed, Nov 6, 2013 at 9:21 PM, Philippe Bergheaud
<felix at linux.vnet.ibm.com> wrote:
> Michael Neuling wrote:
>>
>> Philippe Bergheaud <felix at linux.vnet.ibm.com> wrote:
>>
>>
>>> Unaligned stores take alignment exceptions on POWER7 running in
>>> little-endian mode.
>>> This is a dumb little-endian base memcpy that prevents unaligned stores.
>>> It is replaced by the VMX memcpy at boot.
>>
>>
>>
>> Is this any faster than the generic version?
>
>
> The little-endian assembly code of the base memcpy is similar to the code
> emitted by gcc when compiling the generic memcpy in lib/string.c, and runs
> at the same speed.
> However, a little-endian assembly version of the base memcpy is required (as
> opposed to a C version), in order to use the self-modifying code
> instrumentation system.
> After the CPU feature CPU_FTR_ALTIVEC is detected at boot, the slow base
> memcpy is nop'ed out, and the fast memcpy_power7 is used instead.
>
> Philippe
>
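
The scheme described in the quoted text can be sketched in userspace C. This is an illustration under stated assumptions, not the patch itself: base_memcpy, fast_memcpy, active_memcpy and select_memcpy are hypothetical names, and the function pointer only stands in for the kernel's boot-time patching of the instruction stream, which is the reason the real base memcpy has to be written in assembly.

```c
#include <stddef.h>
#include <string.h>

/*
 * Byte-at-a-time copy. Every store is one byte wide, so no store can
 * be unaligned. This has the same shape as a generic C memcpy loop.
 */
static void *base_memcpy(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dest;
}

/* Hypothetical stand-in for the fast copy (memcpy_power7 in the thread). */
static void *fast_memcpy(void *dest, const void *src, size_t n)
{
	return memcpy(dest, src, n);
}

/* Assumption: a pointer swap models what the kernel does by code patching. */
static void *(*active_memcpy)(void *, const void *, size_t) = base_memcpy;

/* Called once after feature detection; has_altivec is a hypothetical flag. */
static void select_memcpy(int has_altivec)
{
	if (has_altivec)
		active_memcpy = fast_memcpy;
}
```

Callers would go through active_memcpy(dst, src, len) and get the slow but safe copy until select_memcpy(1) has run. An indirect call adds a cost on every invocation, whereas patching the code in place does not, which is one plausible reason to prefer the patching approach.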

