memcpy regression

David Laight David.Laight at ACULAB.COM
Tue Sep 8 18:59:45 AEST 2015


> > What about run-time patching memcpy() after the caches are initialised?
> 
> Yeah, that's the solution we use on 64-bit.
> 
> It also means you can have cpu specific optimisations, which can be patched in
> or out using the cpu feature patching.

I've noticed x86 doing that.
For newer Intel parts it patches in 'rep movsb' but unfortunately
memcpy_toio()/memcpy_fromio() are always #defined to memcpy.

For uncached targets the hardware can't optimise rep movsb - so you
end up with byte accesses.
These can be rather slower than expected.

This also affects userspace copies to mmap()ed PCIe space.
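
To illustrate the byte-access problem, here is a minimal sketch of a copy
loop that only falls back to byte stores for the unaligned head and tail,
and uses 64-bit stores for the rest. The name copy_to_uncached() is
hypothetical, it is not a kernel or libc function, and the 8-byte store
width is an assumption about what the target bus handles well. It is not
how any particular architecture implements memcpy_toio().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): copy to a device-like
 * destination using naturally aligned 64-bit stores where possible, so
 * an uncached target sees 8-byte writes rather than one access per byte.
 */
static void copy_to_uncached(volatile void *dst, const void *src, size_t len)
{
	volatile uint8_t *d = dst;
	const uint8_t *s = src;

	/* Byte stores until the destination is 8-byte aligned. */
	while (len && ((uintptr_t)d & 7)) {
		*d++ = *s++;
		len--;
	}

	/* Bulk of the copy: one 64-bit store per 8 bytes. */
	while (len >= 8) {
		uint64_t v;

		memcpy(&v, s, sizeof(v));	/* source may be unaligned */
		*(volatile uint64_t *)d = v;
		d += 8;
		s += 8;
		len -= 8;
	}

	/* Remaining tail bytes. */
	while (len--)
		*d++ = *s++;
}
```

The volatile qualifier is there to stop the compiler from merging or
splitting the stores. In-kernel code would normally go through
memcpy_toio() or writeq() rather than an open-coded loop like this.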

	David


More information about the Linuxppc-dev mailing list