Efficient memcpy()/memmove() for G2/G3 cores...

prodyut hazarika prodyuth at gmail.com
Thu Sep 4 06:33:07 EST 2008


Hi all,

>  These could probably go to glibc
> as new general purpose memxxx() routines. You will probably see
> a big increase once dcbz is added to the copy/memset functions.

glibc memxxx for powerpc are horribly inefficient. For optimal performance,
we should use the dcbt instruction to establish the source address in cache, and
dcbz to establish the destination address in cache. We should issue
dcbt and dcbz such that the touches happen a line ahead of the actual copy.
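
To make the "one line ahead" idea concrete, here is a rough sketch in C. It is
not the glibc or kernel code. The 32-byte line size, the function name
cached_copy_sketch() and the helpers are my own assumptions for illustration,
and the non-PowerPC branch is only a stand-in so the sketch builds elsewhere.
Since dcbz zeroes a whole line, it is only issued on destination lines that the
copy overwrites completely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 32-byte cache lines, as on G2/G3 class cores. */
#define CACHE_LINE 32u

#if defined(__powerpc__) || defined(__PPC__)
/* dcbt: hint that the line containing p will be read soon. */
static inline void touch_src_line(const void *p)
{
    __asm__ volatile("dcbt 0,%0" : : "r"(p));
}
/* dcbz: establish the line containing p in cache, zeroed, without reading memory. */
static inline void establish_dst_line(void *p)
{
    __asm__ volatile("dcbz 0,%0" : : "r"(p) : "memory");
}
#else
/* Stand-ins for other targets: they only mimic the visible effect. */
static inline void touch_src_line(const void *p) { (void)p; }
static inline void establish_dst_line(void *p) { memset(p, 0, CACHE_LINE); }
#endif

/* Simple byte copy; a real routine would use word or multi-word moves. */
static void copy_bytes(unsigned char *d, const unsigned char *s, size_t n)
{
    while (n--)
        *d++ = *s++;
}

/* Hypothetical name. Buffers must be cacheable and must not overlap. */
void *cached_copy_sketch(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t head, lines, i;

    assert(n == 0 || (dst != NULL && src != NULL));

    /* Head: copy bytes until the destination is line-aligned. */
    head = (CACHE_LINE - ((uintptr_t)d & (CACHE_LINE - 1))) & (CACHE_LINE - 1);
    if (head > n)
        head = n;
    copy_bytes(d, s, head);
    d += head; s += head; n -= head;

    /* Only whole destination lines are eligible for dcbz. */
    lines = n / CACHE_LINE;
    if (lines > 0) {
        touch_src_line(s);
        establish_dst_line(d);
    }
    for (i = 0; i < lines; i++) {
        if (i + 1 < lines) {
            /* Touch one line ahead of the line being copied now. */
            touch_src_line(s + CACHE_LINE);
            establish_dst_line(d + CACHE_LINE);
        }
        copy_bytes(d, s, CACHE_LINE);
        d += CACHE_LINE; s += CACHE_LINE;
    }

    /* Tail: leftover bytes, never touched by dcbz. */
    copy_bytes(d, s, n - lines * CACHE_LINE);
    return dst;
}
```

The head and tail are copied without any cache instructions, so a partially
covered line is never zeroed behind the caller's back.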

The problem I see is that the dcbt and dcbz instructions don't work on
non-cacheable memory (obviously!). But the memxxx functions are used for both
cached and non-cached memory. Thus this optimized memcpy should be smart enough
to figure out that both the source and destination addresses fall in
cacheable space, and only then use the optimized dcbt/dcbz instructions.
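
A rough sketch of that check in C follows. How cacheability is really
determined depends on the environment (page or BAT attributes, for instance),
so the table of cache-inhibited windows, its example address range, and the
names range_is_cacheable() and choose_copy_path() are all hypothetical
stand-ins, not an existing API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address window. */
struct addr_window {
    uintptr_t first;
    uintptr_t last;
};

/* Hypothetical example: a single cache-inhibited I/O window. */
static const struct addr_window uncached_windows[] = {
    { 0xF0000000u, 0xFFFFFFFFu },
};

enum copy_path {
    COPY_PLAIN,        /* no dcbt/dcbz, safe for any memory */
    COPY_CACHE_HINTS   /* dcbt/dcbz path, cacheable memory only */
};

/* True if [addr, addr + len) overlaps none of the uncached windows. */
static bool range_is_cacheable(uintptr_t addr, size_t len)
{
    size_t i;
    uintptr_t last;

    if (len == 0)
        return true;
    last = addr + (len - 1);
    for (i = 0; i < sizeof(uncached_windows) / sizeof(uncached_windows[0]); i++) {
        if (addr <= uncached_windows[i].last && last >= uncached_windows[i].first)
            return false;
    }
    return true;
}

/* Use the cache-hint path only when BOTH buffers are fully cacheable. */
static enum copy_path choose_copy_path(uintptr_t dst, uintptr_t src, size_t n)
{
    if (range_is_cacheable(dst, n) && range_is_cacheable(src, n))
        return COPY_CACHE_HINTS;
    return COPY_PLAIN;
}
```

The check has to be cheap, since it runs on every call; anything it cannot
prove cacheable falls back to the plain copy.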

You can expect to see a significant jump in memxxx performance after
using dcbt/dcbz.

Thanks,
Prodyut Hazarika


