<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-15"
http-equiv="Content-Type">
<title></title>
</head>
<body bgcolor="#ffffff" text="#000000">
<br>
<blockquote cite="mid200811131047.40542.arnd@arndb.de" type="cite">
<blockquote type="cite">
<blockquote type="cite">
<pre wrap="">spu-top: Context View
Cpu(s) load avg: 0.09, 0.13, 0.22
Spu(s) load avg: 3.43, 1.60, 1.18
Cpu(s): 0.1%us, 0.3%sys, 0.2%wait, 0.0%nice, 99.4%idle
Spu(s): 49.9%us, 0.1%sys, 0.0%wait, 50.0%idle
PID TID USERNAME S F %SPU SPE TIME BINARY
24429 24443 user002 U 0.2 4 14.230 mono
24429 24442 user002 U 0.2 5 14.231 mono
24429 24441 user002 U 0.2 6 14.232 mono
24429 24440 user002 U 0.2 7 14.232 mono
24429 24439 user002 L 0.0 -1 14.305 mono
24429 24438 user002 L 0.0 -1 14.305 mono
24429 24437 user002 L 0.0 -1 14.306 mono
24429 24436 user002 L 0.0 -1 14.308 mono
24429 24435 user002 U 0.2 0 14.342 mono
24429 24434 user002 U 0.2 1 14.343 mono
24429 24433 user002 U 0.2 2 14.345 mono
24429 24432 user002 U 0.2 3 14.347 mono
</pre>
</blockquote>
<pre wrap="">In fact, the real workload is 100% for each SPU (except the contexts with "-1"),
and the total workload (for 8 SPUs) is near 50%:
</pre>
<blockquote type="cite">
<pre wrap="">Spu(s): 49.9%us, 0.1%sys, 0.0%wait, 50.0%idle
</pre>
</blockquote>
</blockquote>
<pre wrap=""><!---->
That still sounds reasonable, you probably do a lot of context switches.</pre>
</blockquote>
I don't think context switches explain it, for the following reason.<br>
I ran a well-known matrix multiplication program <br>
(<a class="moz-txt-link-freetext" href="http://tu-dresden.de/die_tu_dresden/zentrale_einrichtungen/zih/forschung/architektur_und_leistungsanalyse_von_hochleistungsrechnern/cell/matmul/index_html/document_view?body_language=en">http://tu-dresden.de/die_tu_dresden/zentrale_einrichtungen/zih/forschung/architektur_und_leistungsanalyse_von_hochleistungsrechnern/cell/matmul/index_html/document_view?body_language=en</a>)<br>
on boxes with different SDK versions.<br>
<br>
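Here are the results. Note the %SPU column: for the same run with a single SPU context, it reads 100.0 under SDK 3.0 but 0.0 under SDK 3.1:<br>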
1) QS22 blade, Fedora Core 7, SDK 3.0<br>
<blockquote type="cite">$ uname -a<br>
Linux cell8i 2.6.22-5.20070920bsc #1 SMP Tue Sep 25 10:49:16 CEST 2007
ppc64 ppc64 ppc64 GNU/Linux</blockquote>
An application run:<br>
<blockquote type="cite">@cell8i matmul]$ ./matmul -m 6144 -s 1</blockquote>
spu-top during the run:<br>
<blockquote type="cite">spu-top: Context View<br>
Cpu(s) load avg: 0.3%, 0.1%, 0.0%<br>
spu-top: Context View<br>
Cpu(s) load avg: 0.4%, 0.1%, 0.0%<br>
Spu(s) load avg: 20.2%, 9.8%, 3.8%<br>
Cpu(s): 24.9%us, 0.4%sys, 0.0%wait, 0.0%nice, 74.7%idle<br>
Spu(s): 6.2%us, 0.0%sys, 0.0%wait, 93.8%idle<br>
<br>
PID TID USERNAME S F %SPU SPE TIME BINARY<br>
30078 30079 user002 U 100.0 7 3.234 matmul</blockquote>
2) QS22 blade, Fedora Core 9, SDK 3.1:<br>
<blockquote type="cite">$ uname -a<br>
Linux cell8i-3 2.6.25.14-108.20080910bsc.ppc64 #1 SMP Fri Sep 12
11:44:36 CEST 2008 ppc64 ppc64 ppc64 GNU/Linux</blockquote>
The same application run:<br>
<blockquote type="cite">cell8i-3 matmul]$ ./matmul -m 6144 -s 1</blockquote>
spu-top:<br>
<blockquote type="cite">spu-top: Context View<br>
Cpu(s) load avg: 0.56, 0.18, 0.06<br>
Spu(s) load avg: 0.10, 0.03, 0.01<br>
Cpu(s): 16.4%us, 0.3%sys, 0.0%wait, 0.0%nice, 83.3%idle<br>
Spu(s): 6.2%us, 0.0%sys, 0.0%wait, 93.8%idle<br>
<br>
PID TID USERNAME S F %SPU SPE TIME BINARY<br>
12887 12888 user002 U 0.0 7 4.328 matmul</blockquote>
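As a rough sanity check, the aggregate line can be compared with the per-context column. The sketch below assumes the QS22 exposes 16 SPEs (two Cell processors with 8 SPEs each); that count is my assumption, spu-top does not print it. Under that assumption, 6.2%us is about what one fully busy SPE would produce, so the 0.0 in the %SPU column under SDK 3.1 looks inconsistent with the aggregate figure:<br>
<pre wrap="">
```python
# Assumption (not printed by spu-top): a QS22 blade exposes 16 SPEs,
# i.e. two Cell processors with 8 SPEs each.
TOTAL_SPES = 16


def aggregate_spu_user_percent(busy_spes, total_spes=TOTAL_SPES):
    """Aggregate SPU user-time percentage if busy_spes SPEs are 100% busy."""
    return 100.0 * busy_spes / total_spes


# One matmul context fully loading a single SPE
# (compare with the "Spu(s): 6.2%us" line in both runs).
print(aggregate_spu_user_percent(1))

# Eight fully loaded contexts, as in the mono run quoted earlier
# (compare with "Spu(s): 49.9%us").
print(aggregate_spu_user_percent(8))
```
</pre>
If the SPE count assumption holds, both aggregate lines agree with fully loaded SPUs, and only the per-context %SPU values under SDK 3.1 are off.<br>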
Thanks.<br>
<br>
Yury<br>
<br>
PS <br>
There are also some kernel issues that show up on QS blades but not on the <br>
PlayStation 3 (please see my following message).<br>
<br>
</body>
</html>