[PATCH] powerpc: memcpy optimization for 64bit LE

Philippe Bergheaud felix at linux.vnet.ibm.com
Wed Nov 6 21:21:51 EST 2013


Michael Neuling wrote:
> Philippe Bergheaud <felix at linux.vnet.ibm.com> wrote:
> 
> 
>>Unaligned stores take alignment exceptions on POWER7 running in little-endian.
>>This is a dumb little-endian base memcpy that prevents unaligned stores.
>>It is replaced by the VMX memcpy at boot.
> 
> 
> Is this any faster than the generic version?

The little-endian assembly for the base memcpy is similar to the code gcc emits for the generic memcpy in lib/string.c, and it runs at the same speed.
However, the base memcpy has to be written in assembly rather than in C so that it can use the self-modifying code instrumentation system.
Once the CPU feature CPU_FTR_ALTIVEC is detected at boot, the slow base memcpy is nop'ed out and the fast memcpy_power7 is used instead.
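
To illustrate the idea, here is a rough C sketch, not the code from the patch. The byte-wise loop follows the style of the generic memcpy in lib/string.c: every store is one byte wide, so no store can be unaligned. The function pointer only models the boot-time switch. The kernel patches instructions in place rather than calling through a pointer, which is why the real base memcpy has to be assembly. The names base_memcpy_sketch, fast_memcpy_sketch, active_memcpy and select_memcpy_sketch are made up for this example, and has_altivec stands in for the CPU_FTR_ALTIVEC check.

```c
#include <stddef.h>
#include <string.h>

/* Byte-at-a-time copy in the style of the generic memcpy in lib/string.c.
 * Each store writes a single byte, so no store is ever unaligned. */
static void *base_memcpy_sketch(void *dest, const void *src, size_t count)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	while (count--)
		*d++ = *s++;
	return dest;
}

/* Hypothetical stand-in for memcpy_power7; it simply defers to libc here. */
static void *fast_memcpy_sketch(void *dest, const void *src, size_t count)
{
	return memcpy(dest, src, count);
}

typedef void *(*memcpy_fn)(void *, const void *, size_t);

/* Assumption: a function pointer models the switch. The kernel instead
 * rewrites the instructions at the memcpy entry point at boot. */
static memcpy_fn active_memcpy = base_memcpy_sketch;

/* has_altivec is a hypothetical flag standing in for CPU_FTR_ALTIVEC. */
static void select_memcpy_sketch(int has_altivec)
{
	if (has_altivec)
		active_memcpy = fast_memcpy_sketch;
}
```

Before select_memcpy_sketch(1) is called, all copies go through the slow but safe byte loop. Afterwards they go through the fast routine, which mirrors the base memcpy being nop'ed out once the feature is detected.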

Philippe


