David.Laight at ACULAB.COM
Tue Sep 8 18:59:45 AEST 2015
> > What about run-time patching memcpy() after the caches are initialised?
> Yeah, that's the solution we use on 64-bit.
> It also means you can have cpu specific optimisations, which can be patched in
> or out using the cpu feature patching.
I've noticed x86 doing that.
For newer Intel parts it patches in 'rep movsb' but unfortunately
memcpy_io is always #defined to memcpy.
For uncached targets the hardware can't optimise rep movsb - so you
end up with byte accesses.
These can be rather slower than expected.
This also affects userspace copies to mmap()ed PCIe space.
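
As a rough illustration of the alternative, here is a minimal userspace sketch
of a copy loop that issues 8-byte stores to the destination instead of relying
on memcpy(). The name copy_to_uncached() is hypothetical and is not a kernel or
libc API. Whether each store actually reaches the bus as a single 8-byte write
depends on the CPU and on how the region is mapped, so treat this as a sketch
under those assumptions rather than a drop-in fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not a kernel or libc API: copy n bytes to a
 * destination that is assumed to be uncached, using naturally aligned
 * 64-bit stores where possible instead of one store per byte.
 */
static void copy_to_uncached(volatile void *dst, const void *src, size_t n)
{
    volatile uint8_t *d = dst;
    const uint8_t *s = src;

    assert(n == 0 || (dst != NULL && src != NULL));

    /* Byte stores until the destination is 8-byte aligned. */
    while (n > 0 && ((uintptr_t)d & 7) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Bulk of the copy: one 64-bit store per 8 bytes. */
    while (n >= 8) {
        uint64_t w;

        memcpy(&w, s, sizeof(w));       /* source may be unaligned */
        *(volatile uint64_t *)d = w;
        d += 8;
        s += 8;
        n -= 8;
    }

    /* Remaining tail, byte by byte. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
}
```

The volatile qualifier on the destination is there to discourage the compiler
from merging the stores or turning the loop back into a memcpy() call.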