performance: memcpy vs. __copy_tofrom_user

Dominik Bozek domino at mikroswiat.pl
Thu Oct 9 21:12:21 EST 2008


Paul Mackerras wrote:

> When I looked at this last (which was a few years ago, I'll admit), I
> found that the vast majority of memcpy calls were for small copies,
> i.e. less than 128 bytes, whereas __copy_tofrom_user was often used
> for larger copies (usually 1 page).  So with memcpy the focus was more
> on keeping the startup costs low, while __copy_tofrom_user was
> optimized more for bandwidth.
>
> The other point is that the kernel memcpy doesn't consume a noticeable
> amount of CPU time (at least not on any workload I've seen), so it
> hasn't been a target for aggressive optimization.
>   


Actually, I ran a couple of other tests on that mpc8313. Most of them
are too ugly to publish, but... My problem is that I have to boost the
gigabit interface on the mpc8313. I made a simple substitution so that
__copy_tofrom_user was used instead of memcpy. I know it's wrong, but it
sped up the network interface by about 10%.

I also did some calculations based on the results I had sent. The time
saved by one __copy_tofrom_user of 1500B makes up for the advantage
memcpy has over 258 copies of 8B. But of course this holds for the
mpc8313 (333MHz core, DDR2 at 266MHz). On other hardware it may work
differently, and to draw any binding conclusion we need to see results
from other CPUs. Unfortunately, right now I don't have any other ppc to
run such a test on and compare.
On the other hand, my test does not cover all cases. I believe most
small transfers involve data that is already cached. This is a big point
in favor of the current memcpy.

Maybe there is another solution: the method, aggressive or "low setup
cost", would be chosen depending on the size of the copied block and a
fixed limit. The limit would be known at compile time and tied to the
chosen cpu/platform.
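
To make the idea concrete, here is a rough userspace sketch, not a
kernel patch. The names copy_small, copy_bulk, copy_dispatch and
COPY_BULK_THRESHOLD are made up for illustration, and the value 128 is
only a placeholder taken from the "small copy" size Paul mentioned; a
real limit would have to be measured per platform.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-platform limit, fixed at compile time.
 * 128 is a placeholder, not a measured value. */
#ifndef COPY_BULK_THRESHOLD
#define COPY_BULK_THRESHOLD 128
#endif

/* Stand-in for a "low setup cost" copy: a plain byte loop. */
static void *copy_small(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dst;
}

/* Stand-in for a bandwidth-optimized copy. In the kernel this would be
 * the aggressive routine; here it just calls the C library memcpy. */
static void *copy_bulk(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Pick the method from the block size and the fixed limit. */
static inline void *copy_dispatch(void *dst, const void *src, size_t n)
{
	if (n < COPY_BULK_THRESHOLD)
		return copy_small(dst, src, n);
	return copy_bulk(dst, src, n);
}
```

The extra compare and branch is itself a setup cost for the small
copies, so whether this wins at all would need measuring.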

In case someone asks about other tests: I had optimized memcpy for the
cases where the transfer size is known at compile time and... it's hard
to say whether the system was "faster", but I certainly didn't notice
any boost on the network interface. Maybe my optimization was bad.
That's possible with me.
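
For reference, one common way to special-case a compile-time size looks
roughly like the sketch below. This is not the change I tested, just an
illustration; copy_fixed8 and copy_maybe_const are hypothetical names,
and it relies on the GCC/Clang builtin __builtin_constant_p.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed 8-byte copy. With a constant size the compiler can usually
 * turn these memcpy calls into plain loads and stores. */
static inline void *copy_fixed8(void *dst, const void *src)
{
	uint64_t tmp;

	memcpy(&tmp, src, sizeof tmp);
	memcpy(dst, &tmp, sizeof tmp);
	return dst;
}

/* Use the fixed-size path only when n is a compile-time constant equal
 * to 8; otherwise fall back to the generic memcpy.
 * Note: n appears more than once, so it must have no side effects. */
#define copy_maybe_const(dst, src, n)				\
	((__builtin_constant_p(n) && (n) == 8)			\
		? copy_fixed8((dst), (src))			\
		: memcpy((dst), (src), (n)))
```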

Dominik



