performance: memcpy vs. __copy_tofrom_user

Scott Wood scottwood at freescale.com
Tue Oct 14 02:06:13 EST 2008


On Sun, Oct 12, 2008 at 09:32:07AM +1100, Benjamin Herrenschmidt wrote:
> On Wed, 2008-10-08 at 12:40 -0500, Scott Wood wrote:
> > 
> > The performance difference most likely comes from the fact that copy 
> > to/from user can assume that the memory is cacheable, while memcpy is 
> > occasionally used on cache-inhibited memory -- so dcbz isn't used.  We 
> > may be better off handling the alignment fault on those occasions, and 
> > we should use dcba on chips that support it.
> 
> Note that the kernel memcpy isn't supposed to be used for non-cacheable
> memory. That's what memcpy_to/fromio are for.

I agree that it *shouldn't*, but the presence of cacheable_memcpy (used
only by the EMAC driver, AFAICT) suggests that it was a concern.

> But Paul has a point that for small copies especially, the cost of
> the cache instructions outweighs their benefit.

Possibly, but what is the overall effect on the system of using them,
even if it hurts small copies slightly?  How many small copies are of
constant size, which could be diverted to another implementation at
compile-time?  Even run-time diversion may help, as the cost of a small
memcpy is only important if you do it many times, in which case the
branch will probably be correctly predicted.
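
To make the diversion idea concrete, here is a rough sketch in plain C.
It is not the kernel's memcpy: SMALL_COPY_MAX, small_copy, big_copy,
copy_runtime and my_memcpy are all made-up names, the threshold of 32
is a guess that would need measuring, and big_copy merely stands in for
a loop that uses dcbz/dcba.  It assumes GCC's __builtin_constant_p.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cutoff; the right value would have to be benchmarked. */
#define SMALL_COPY_MAX 32

/* Stand-in for a cache-instruction (dcbz/dcba) copy loop; plain memcpy here. */
static void *big_copy(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Simple byte loop with no cache instructions, meant for short copies. */
static inline void *small_copy(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dst;
}

/* Run-time diversion: a single branch on the length, cheap if well predicted. */
static inline void *copy_runtime(void *dst, const void *src, size_t n)
{
	if (n <= SMALL_COPY_MAX)
		return small_copy(dst, src, n);
	return big_copy(dst, src, n);
}

/*
 * Compile-time diversion: when n is a small constant, the compiler can
 * pick small_copy directly and drop the branch.  Note that n is
 * evaluated more than once, so it must not have side effects.
 */
#define my_memcpy(dst, src, n)					\
	(__builtin_constant_p(n) && (n) <= SMALL_COPY_MAX	\
		? small_copy((dst), (src), (n))			\
		: copy_runtime((dst), (src), (n)))
```

A call like my_memcpy(dst, src, 8) would take the constant-size path,
while a variable length falls through to the run-time check.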

Given the networking results Dominik posted, I think it's worth a look.

-Scott


More information about the Linuxppc-embedded mailing list