Efficient memcpy()/memmove() for G2/G3 cores...
David Jander
david.jander at protonic.nl
Mon Aug 25 19:31:01 EST 2008
Hello,
I was wondering if there is a good replacement for the glibc memcpy() family
of functions that doesn't have the horrendous performance glibc's own versions
show on embedded PowerPC processors.
I did some simple benchmarks with this implementation on our custom MPC5121
based board (Freescale e300 core, something like a PPC603e, G2, without VMX):
...
unsigned long int a,b,c,d;
unsigned long int a1,b1,c1,d1;
...
while (len >= 32)
{
    a = plSrc[0];
    b = plSrc[1];
    c = plSrc[2];
    d = plSrc[3];
    a1 = plSrc[4];
    b1 = plSrc[5];
    c1 = plSrc[6];
    d1 = plSrc[7];
    plSrc += 8;
    plDst[0] = a;
    plDst[1] = b;
    plDst[2] = c;
    plDst[3] = d;
    plDst[4] = a1;
    plDst[5] = b1;
    plDst[6] = c1;
    plDst[7] = d1;
    plDst += 8;
    len -= 32;
}
...
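For readers who want to try the idea, here is a minimal, self-contained sketch built around the same 8-word unrolled loop. The parts elided with "..." above are not known, so everything outside the loop is an assumption: the function name copy_unrolled32 is made up, uint32_t stands in for the 32-bit unsigned long of this platform, and the alignment prologue and byte-wise tail are just one plausible way to handle them. It does not handle overlapping buffers, so it is a memcpy()-style routine only, not memmove().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; a sketch, not the implementation from the post.
 * Like the original loop, it reads and writes memory through word
 * pointers, so buffers must not overlap. */
void *copy_unrolled32(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Use the word loop only if src and dst share the same misalignment,
     * so that aligning one also aligns the other (assumed policy). */
    if ((((uintptr_t)d ^ (uintptr_t)s) & 3u) == 0) {
        /* Byte-copy until both pointers are 4-byte aligned. */
        while (len > 0 && ((uintptr_t)d & 3u) != 0) {
            *d++ = *s++;
            len--;
        }

        uint32_t *wd = (uint32_t *)d;
        const uint32_t *ws = (const uint32_t *)s;

        /* Same pattern as above: eight loads, then eight stores. */
        while (len >= 32) {
            uint32_t a = ws[0], b = ws[1], c = ws[2], e = ws[3];
            uint32_t a1 = ws[4], b1 = ws[5], c1 = ws[6], e1 = ws[7];
            ws += 8;
            wd[0] = a;  wd[1] = b;  wd[2] = c;  wd[3] = e;
            wd[4] = a1; wd[5] = b1; wd[6] = c1; wd[7] = e1;
            wd += 8;
            len -= 32;
        }

        d = (unsigned char *)wd;
        s = (const unsigned char *)ws;
    }

    /* Remaining bytes (or the whole buffer if alignment differs). */
    while (len > 0) {
        *d++ = *s++;
        len--;
    }
    return dst;
}
```

Grouping the loads ahead of the stores is what the loop above does as well; whether that ordering is the reason for the speedup on the e300 is not something this sketch can show.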
And the results are more than telling: when this is linked in with LD_PRELOAD,
some programs get an enormous performance boost.
For example a small test program that copies frames into video memory (just
RAM) improved throughput from 13.2 MiB/s to 69.5 MiB/s.
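The test program behind these numbers is not shown, so the following is only a rough sketch of how such a MiB/s figure could be measured. The names copy_fn and copy_throughput_mib_s are hypothetical, and clock() reports CPU time rather than wall-clock time, which is an assumption that only holds up for a tight copy loop like this one.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Any routine with the memcpy() signature can be measured. */
typedef void *(*copy_fn)(void *, const void *, size_t);

/* Hypothetical helper: copies `size` bytes `iterations` times and returns
 * the throughput in MiB/s. Returns -1.0 on bad input or allocation
 * failure, and 0.0 if the run was too short for clock() to resolve. */
double copy_throughput_mib_s(copy_fn fn, size_t size, unsigned iterations)
{
    if (size == 0 || iterations == 0)
        return -1.0;

    unsigned char *src = malloc(size);
    unsigned char *dst = malloc(size);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0xA5, size);
    memset(dst, 0, size);

    /* Reading a copied byte into a volatile keeps the copies observable. */
    volatile unsigned char sink = 0;

    clock_t start = clock();
    for (unsigned i = 0; i < iterations; i++) {
        fn(dst, src, size);
        sink ^= dst[i % size];
    }
    clock_t end = clock();
    (void)sink;

    free(src);
    free(dst);

    double seconds = (double)(end - start) / CLOCKS_PER_SEC;
    if (seconds <= 0.0)
        return 0.0;

    double mib = ((double)size * iterations) / (1024.0 * 1024.0);
    return mib / seconds;
}
```

Calling it once with memcpy and once with a replacement routine, using the same size and iteration count, gives two figures that can be compared directly. The buffer size matters: a buffer that fits in the data cache measures something quite different from a frame-sized copy to main memory.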
I have googled for this issue, but most optimized versions of memcpy() and
friends seem to focus on AltiVec/VMX, which this processor does not have.
Now I am certain that most of the G2/G3 users on this list _must_ have a
better solution for this. Any suggestions?
Btw, the tests were done on Ubuntu/PowerPC 7.10; I don't know if that matters,
though...
Best regards,
--
David Jander
More information about the Linuxppc-dev
mailing list