<span class="q">From dmabench's README.txt:<br></span><div style="direction: ltr;"><span class="q"> --numspes n - specifies that n SPEs should concurrently execute<br> the benchmark. The default is to execute the benchmark on a
<br> single SPE. When the benchmark is executed on more than one<br> SPE, the SPEs are synchronized so that the benchmark code<br> starts at roughly the same time on all SPEs.<br>
<br></span></div><div style="direction: ltr;"><span class="q">This paragraph is also basically telling you not to set n &gt; 6<br>on a PLAYSTATION 3 (the CELL BE in the PS3 has its 8th SPE disabled for<br>redundancy reasons, and the 7th SPE is not accessible to the guest OS).
<br><br>What's the fix? Another debugging-related environment variable?<br><br>No, a simple change in the dmabench.c source file:<br><br></span></div><div style="direction: ltr;"><span class="q">#define MAX_SPES 6
<br><br></span></div><div style="direction: ltr;"><span class="q">Originally the value was 16, which would be the maximum you could<br>target with a dual CELL BE system like the ones you find in IBM's CELL<br>blades.<br><br>
With libspe2 and POSIX threads, the number of SPU threads is not<br>limited by the number of physical SPUs. All the dmabench program<br>does to validate the number read through the --numspes<br>option is take the minimum of that value and the
<br>MAX_SPES value:<br><br></span></div><div style="direction: ltr;"><span class="q"> case 'n': /* numspes */<br> num_spes = atoi(optarg);<br> num_spes = MIN(num_spes, MAX_SPES);
<br> break;<br><br></span></div><div style="direction: ltr;"><span class="q">MAX_SPES is used to set up enough memory and SPU contexts<br>to allow the program to run even if you wanted to test all 16
<br>SPEs in your system (if you had such a system, of course).<br><br>No matter whether you set --numspes to 1, 2, 3, 4, etc., the program will<br>allocate memory based on the value specified by MAX_SPES:<br><br></span></div>
<div style="direction: ltr;"><span class="q"> /* Allocate space for parameters for each SPU */<br> static dmabench_parms parms[MAX_SPES] __attribute__ ((aligned (16)));<br><br> /* Cache-line sized block for use in barrier calls. */
<br> static unsigned int bar[CACHE_LINE_SIZE/sizeof(unsigned int)]<br> __attribute__ ((aligned (128)));<br><br> /* Buffers in main memory that will be read or written by DMAs */<br> static uint64_t tgt_buf[MAX_SPES][NUM_ITER*NREQS*BUFSIZE]
<br> __attribute__ ((aligned (4096)));<br><br>[...]<br><br> int main(int argc, char *argv[])<br> {<br> spe_gang_context_ptr_t gang = NULL;<br> spe_context_ptr_t ctx[MAX_SPES];<br>
 void *ls[MAX_SPES];<br> pthread_t thread[MAX_SPES];<br><br>[...]<br><br></span><span class="q" id="q_114bcdc5da7add27_16">If you can run it with a single SPE, you have enough memory to run<br>this program with 5-6 SPEs. We are not THAT low on memory on the PS3.
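<br><br>To put rough numbers on that claim, here is a minimal sketch. The NUM_ITER, NREQS and BUFSIZE values are placeholders made up for illustration, not the real ones from dmabench.c, and tgt_buf_bytes() and tgt_buf_bytes_for() are hypothetical helpers. The only point is that the size of a tgt_buf-style array depends on MAX_SPES and never on the value passed to --numspes:<br><br>

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder sizes chosen for illustration; NOT dmabench's real values. */
#define NUM_ITER 4
#define NREQS    16
#define BUFSIZE  2048
#define MAX_SPES 6

/* Same shape as the tgt_buf declaration quoted from dmabench.c. */
static uint64_t tgt_buf[MAX_SPES][NUM_ITER * NREQS * BUFSIZE];

/* Bytes reserved for tgt_buf. The requested SPE count is deliberately
 * ignored: the array is sized at compile time from MAX_SPES alone. */
static size_t tgt_buf_bytes(int num_spes)
{
    (void)num_spes;
    return sizeof(tgt_buf);
}

/* Bytes the same array would need if MAX_SPES had another value. */
static size_t tgt_buf_bytes_for(size_t max_spes)
{
    return max_spes * sizeof(tgt_buf[0]);
}
```

<br>With these placeholder numbers each row is 1 MiB, so MAX_SPES = 6 reserves 6 MiB and MAX_SPES = 16 would reserve 16 MiB, whatever --numspes says.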
<br><br>I am still not clear (read: I have no clue, really) why 4 SPEs is the<br>upper limit before the program hangs the system, why just using 5 SPEs<br>is enough to do the "annoying" trick, and why this error goes away when we
<br>reduce the MAX_SPES value to 6: all the memory we allocate has a size<br>fixed at compile time (the file-scope arrays are static storage, and the<br>arrays declared in main() live on the stack). None of it is dynamically<br>allocated memory.<br><br>Thanks for listening to all my rants so far :).<br><br>Have a nice day,
<br><br>Goffredo Marocchi</span><br></div>
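<div style="direction: ltr;"><span class="q">P.S. One side note on the --numspes handling quoted above: at least in the lines shown, MIN() only caps the value from above, and atoi() returns 0 for text it cannot parse, so a zero or negative count would pass straight through. Below is a minimal sketch of a stricter parser, assuming we want to fall back to the README's default of a single SPE on bad input. parse_numspes() is a hypothetical helper, not something that exists in dmabench.c:<br><br>

```c
#include <errno.h>
#include <stdlib.h>

#define MAX_SPES 6  /* the value used in this post for a PS3 */

/* Hypothetical helper (not part of dmabench): parse a --numspes argument
 * and clamp it to the range [1, MAX_SPES]. Falls back to 1, the default
 * documented in the README, when the text is not a clean decimal number
 * or the number is smaller than 1. */
static int parse_numspes(const char *text)
{
    char *end = NULL;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);

    /* Reject overflow, empty input, and trailing garbage such as "4x". */
    if (errno != 0 || end == text || *end != '\0')
        return 1;
    if (value < 1)
        return 1;
    if (value > MAX_SPES)
        return MAX_SPES;
    return (int)value;
}
```

<br>In the option switch this would replace the atoi()/MIN() pair with a single call, num_spes = parse_numspes(optarg), so that every per-SPE array is only ever indexed from 0 to MAX_SPES - 1.</span><br></div>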