MPC5200B memory performance

Daniel Schnell daniel.schnell at marel.com
Tue May 15 21:22:24 EST 2007


Hi,


I am doing some memory performance measurements on our custom MPC5200B
board, which runs at 396 MHz internally and is connected to DDR RAM. The
RAM is clocked at 132 MHz.

With the attached program (compile with -lrt) I am testing the memcpy()
throughput. In theory the memory throughput should be twice the
memcpy() throughput, because every byte is read once and written once,
provided source and destination buffers are the same size and both
reside in the DDR RAM.

So one could make the simple calculation:

132 MHz * 32 bit (data bus width) * 2 (DDR) ~ 1 GByte/sec gross memory
throughput.

For a memcpy this should then be ~500 MB/sec.
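
The calculation above can be written down as a small C sketch. The
function names are made up for illustration and are not taken from
memcpy_perf.c; the 32 bit bus width and the factor 2 for DDR are the
assumptions stated above.

```c
#include <stdio.h>

/* Gross DDR bandwidth in bytes/sec:
 * bus clock * bus width in bytes * 2 transfers per clock (DDR). */
static double ddr_peak_bytes_per_sec(double bus_hz, unsigned bus_bits)
{
    return bus_hz * (bus_bits / 8.0) * 2.0;
}

/* memcpy() reads every byte once and writes it once, so the copied
 * payload can be at most half of the gross bus traffic. */
static double memcpy_peak_bytes_per_sec(double bus_hz, unsigned bus_bits)
{
    return ddr_peak_bytes_per_sec(bus_hz, bus_bits) / 2.0;
}

/* Print both limits in MB/sec (1 MB = 10^6 bytes). */
static void print_limits(double bus_hz, unsigned bus_bits)
{
    printf("gross bus limit: %.0f MB/sec\n",
           ddr_peak_bytes_per_sec(bus_hz, bus_bits) / 1e6);
    printf("memcpy limit:    %.0f MB/sec\n",
           memcpy_peak_bytes_per_sec(bus_hz, bus_bits) / 1e6);
}
```

Calling print_limits(132e6, 32) gives 1056 MB/sec for the bus and
528 MB/sec for memcpy, i.e. the rounded ~1 GByte/sec and ~500 MB/sec
figures.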

Of course in real-world scenarios we cannot reach the theoretical limit,
but I would expect to get within about 30 % of it.


I get the following values on my board:

bash-2.05b# ./memcpy_perf
Test (10000) memcpy of sizes (1024) ....
10000 memcpy. Time per memcpy: 1567 [nsec] (653 MB/sec)
 finished.
Test (10000) memcpy of sizes (2048) ....
10000 memcpy. Time per memcpy: 2939 [nsec] (696 MB/sec)
 finished.
Test (10000) memcpy of sizes (4096) ....
10000 memcpy. Time per memcpy: 5706 [nsec] (717 MB/sec)
 finished.
Test (10000) memcpy of sizes (8192) ....
10000 memcpy. Time per memcpy: 17077 [nsec] (479 MB/sec)
 finished.
Test (10000) memcpy of sizes (16384) ....
10000 memcpy. Time per memcpy: 133314 [nsec] (122 MB/sec)
 finished.
Test (1000) memcpy of sizes (32768) ....
1000 memcpy. Time per memcpy: 243417 [nsec] (134 MB/sec)
 finished.
Test (1000) memcpy of sizes (51200) ....
1000 memcpy. Time per memcpy: 403455 [nsec] (126 MB/sec)
 finished.
Test (1000) memcpy of sizes (102400) ....
1000 memcpy. Time per memcpy: 713316 [nsec] (143 MB/sec)
 finished.
Test (100) memcpy of sizes (1048576) ....
100 memcpy. Time per memcpy: 7210570 [nsec] (145 MB/sec)
 finished.
Test (10) memcpy of sizes (10485760) ....
10 memcpy. Time per memcpy: 78162400 [nsec] (134 MB/sec)
 finished.
Test (5) memcpy of sizes (52428800) ....
5 memcpy. Time per memcpy: 425281800 [nsec] (123 MB/sec)
 finished.



The first 4 values are explained by the data cache, so there we are
testing cache performance. All other values test the memory controller
interface.
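
For readers who cannot get at the attachment, a measurement of this kind
can be sketched roughly as follows. This is not memcpy_perf.c itself,
only a minimal sketch under my own assumptions: the function names are
hypothetical, CLOCK_MONOTONIC is used as the time base, and MB means
10^6 bytes, which matches the figures printed above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() */
#endif
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Average time of one memcpy() of `size` bytes in nanoseconds,
 * or a negative value if the buffers could not be allocated. */
static double memcpy_ns_per_copy(size_t size, unsigned iterations)
{
    /* Call through a volatile pointer so the compiler cannot
     * optimize the copies away. */
    void *(*volatile copy)(void *, const void *, size_t) = memcpy;
    struct timespec t0, t1;
    unsigned char *src = malloc(size);
    unsigned char *dst = malloc(size);
    double ns = -1.0;
    unsigned i;

    if (src && dst && iterations > 0) {
        memset(src, 0xA5, size);  /* touch the pages first so that */
        memset(dst, 0x5A, size);  /* page faults are not timed     */
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (i = 0; i < iterations; i++)
            copy(dst, src, size);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        ns = ((t1.tv_sec - t0.tv_sec) * 1e9
              + (t1.tv_nsec - t0.tv_nsec)) / iterations;
    }
    free(src);
    free(dst);
    return ns;
}

/* Throughput in MB/sec: bytes per nanosecond times 1000. */
static double mb_per_sec(size_t size, double ns_per_copy)
{
    return (double)size * 1e3 / ns_per_copy;
}
```

With the numbers from the first test above, mb_per_sec(1024, 1567)
comes out at about 653, the same value the program printed. On older
glibc versions clock_gettime() lives in librt, hence the -lrt.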

All in all, I am not sure why the memory access is so much slower than
I expected.
Which factors did I miss in my calculation? Can anybody run this
program on their 5200B-based board for comparison?


Best regards,

Daniel Schnell.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: memcpy_perf.c
Type: application/octet-stream
Size: 1979 bytes
Desc: memcpy_perf.c
Url : http://ozlabs.org/pipermail/linuxppc-embedded/attachments/20070515/b50fee50/attachment.obj 

