[Cbe-oss-dev] [RFC, PATCH] CELL Oprofile SPU profiling updated patch

Carl Love cel at us.ibm.com
Sat Feb 10 08:52:45 EST 2007


On Fri, 2007-02-09 at 13:36 -0600, Milton Miller wrote:
> On Feb 9, 2007, at 11:42 AM, Carl Love wrote:
> 
> > On Thu, 2007-02-08 at 09:48 -0600, Milton Miller wrote:
> >> This is part 2, with specific comments to the existing patch.
> >> I'm not subscribed to the list, so please cc.
> >>
> >> Thanks,
> >> milton
> >>
> >> miltonm at bga.com  Milton Miller
> 
> >>
> >> so samples[spu][entry] = spu_pc_lower;
> >>     samples[spu + SPUS_PER_ENTRY][entry] = spu_pc_upper
> >>
> >> hmm... upper and lower seem to be confused here ... the
> >> data is naturally big endian at the source, so upper should
> >> be 0 and lower 1 .... or just make that spu_pc[2], use 0
> >> and 1 ... yes, a for (j = 0; j < WORDS_PER_ENTRY; j++) loop
> >> assigning to samples[spu + j * spus_per_word][entry] --- the
> >> compiler should unroll a fixed count of 2.
> 
> 
> >
> > I think you missed the layout here.  This has nothing to do with big or
> > little endian.
> 
> Actually, I think I have it exactly right, but you are missing
> my point.   If you consider 0 to be the first, and 7 to be
> the last spu, then the order of the SPUs in the trace array
> is big endian.   I was referring to the order of the elements
> in the two words.
> 
> 
> > The trace buffer is a 128 bit wide, 1024 entry hardware
> > buffer.  It takes two 64 bit reads to get the entire 128 bits.  Hence,
> > one read gives you the lower 64 bits of the trace buffer.  The other
> > read gives you the upper 64 bits of the trace buffer.  The SPU PCs are
> > 18 bits wide but the lower 2 bits are always zero.  Only the upper 16
> > bits of the 18 bit PC are stored in the trace buffer.  The hardware
> > stores the eight 16 bit PC values into the 128 bit entry.  The
> > spu_pc_lower variable holds the program counters for the lower 4 SPUs,
> > 0 to 3.  The spu_pc_upper variable holds the program counters for the
> > upper 4 SPUs, 4 to 7.
> >
> > In IBM notation the MSB of a word is bit 0, the LSB is n-1 where n is
> > the size of the word.
> >
> > The cbe_read_trace_buffer() function puts trace buffer bits 0:63 into
> > trace[0] and bits 64:127 into trace[1].  So the trace[0] is: bits 0:15
> > is SPU PC 0, 16:31 is SPU PC 1, 32:47 is SPU PC 2 and bits 48:63 is SPU
> > PC 3.  The layout of the trace[1] variable is the same except it has
> > SPU PC 4 through 7.
> 
> Yes, and trace[0] would be considered the most significant part of the
> trace word, which would normally be called the upper word.   If you
> think of the spu number as big endian it makes sense.  If you think of
> it as little endian, then you have this upper vs lower conflict.
> 

OK, it sounds like when you see the variable names spu_pc_lower and
spu_pc_upper in the above context you are thinking of the trace buffer
words.  That was not the intention of the variable names.  The
lower/upper in the names refers to the range of SPU numbers, 0-3 and 4-7.
It was never meant to say anything about the word order or how the SPU
numbers map to the trace[0] and trace[1] words.  If it were, I would
have named the variables something like upper_word and lower_word.
So, given that the use of upper and lower in the variable names seems to
be the issue, I need to change the code to use different names that do
not contain the upper and lower terms.  Do you agree?
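
For illustration only, here is a minimal sketch of how the two trace
words could be unpacked without any upper/lower naming.  It is not the
code from the patch: decode_trace_entry(), spu_pc[], SPUS_PER_WORD and
NUM_SPUS are hypothetical names, WORDS_PER_ENTRY is borrowed from your
comment, and shifting the stored value left by 2 to rebuild the 18 bit
PC is an assumption about what the caller wants.

```c
#include <stdint.h>

#define WORDS_PER_ENTRY 2   /* two 64 bit reads per 128 bit trace entry */
#define SPUS_PER_WORD   4   /* four 16 bit PC fields per 64 bit word (hypothetical name) */
#define NUM_SPUS        (WORDS_PER_ENTRY * SPUS_PER_WORD)

/*
 * Hypothetical helper, not part of the patch under review.
 * trace[0] holds trace buffer bits 0:63 (SPU PC 0..3) and trace[1]
 * holds bits 64:127 (SPU PC 4..7).  In IBM notation bit 0 is the MSB,
 * so SPU PC 0 sits in the most significant 16 bits of trace[0].
 */
static void decode_trace_entry(const uint64_t trace[WORDS_PER_ENTRY],
                               uint32_t spu_pc[NUM_SPUS])
{
    for (int word = 0; word < WORDS_PER_ENTRY; word++) {
        for (int i = 0; i < SPUS_PER_WORD; i++) {
            /* field i of a word starts 16 * i bits below the MSB */
            unsigned int shift = 16 * (SPUS_PER_WORD - 1 - i);
            uint32_t stored = (uint32_t)((trace[word] >> shift) & 0xFFFF);

            /* Assumption: restore the two low bits that are always zero. */
            spu_pc[word * SPUS_PER_WORD + i] = stored << 2;
        }
    }
}
```

With this indexing the SPU number is simply word * SPUS_PER_WORD + i, so
neither the variable names nor the loop need to say "upper" or "lower".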

              Carl Love



