[Cbe-oss-dev] The use of __spe_context_update_event() is inefficient for the debugger

Fri Feb 8 08:51:18 EST 2008

Hi,

Thanks very much for the reply. My comments are inline below...

Ulrich Weigand wrote:
> John DelSignore wrote:
> 
>> The libspe1 and libspe2 libraries in SDK 2.1 (and in other SDK 
>> versions too) contain a function named __spe_context_update_event() 
>> that I believe is designed to help the debugger know when an spe 
>> context has been updated. The interface seems to be quite simple: 
>> Whenever an image is loaded into an SPU context, or whenever an SPU 
>> context is destroyed, libspe calls a function named 
>> "__spe_context_update_event".  As far as I can tell, this event is 
>> not specific to a particular SPU context, and simply means that 
>> "some" SPU context has changed.  It's up to the debugger to figure 
>> out which one.
> 
> That's right.  GDB uses this interface to keep its list of active
> SPE contexts up to date.
> 
>> An easy fix for this would be to pass the spe directory file 
>> descriptor (spe->base_private->fd_spe_dir) to the spe context update
>> event function. To preserve compatibility with older debuggers and 
>> allow for a transition to the new scheme, we'd probably want to add 
>> a function instead of replacing __spe_context_update_event() and 
>> call both functions.
> 
> That would be fine with me.  In fact, GDB could probably use that
> as an optimization, too.
> 
> As a minor nit, I'd prefer to just define two symbols pointing to
> the same routine instead of actually executing two calls:
> 
> __attribute__ ((noinline)) void  __spe_context_update_event_id (int id)
> {
> }
> 
> /* Compatibility with older debuggers */
> void  __spe_context_update_event (void)
>   __attribute__ ((alias ("__spe_context_update_event_id")));

Yes, that seems better, and I don't see a problem with potentially two breakpoints at the same address since the debugger should be able to arrange to plant at most one magic breakpoint.

The debugger would look for "__spe_context_update_event_id" and, if it is defined, set a breakpoint there; when the target process hits the breakpoint, the debugger can read the id from R3. Otherwise, for compatibility, if "__spe_context_update_event_id" is *not* defined and "__spe_context_update_event" is defined, then when the breakpoint is hit, the debugger cannot rely on the contents of R3 and must inspect all of the SPU contexts.
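
To make that concrete, here is a rough sketch in C of the decision logic I have in mind. The names event_mode, symbol_defined(), choose_event_mode(), context_to_inspect(), and RESCAN_ALL are made up for illustration (they are not TotalView or GDB interfaces), and the symbol table is just an array of strings standing in for a real symbol lookup:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sentinel: "no usable id, rescan every SPU context". */
#define RESCAN_ALL (-1)

/* Hypothetical handling modes for the magic breakpoint. */
enum event_mode {
    EVENT_NONE,      /* neither symbol found: no event breakpoint planted */
    EVENT_SCAN_ALL,  /* old libspe: R3 is not meaningful at the breakpoint */
    EVENT_USE_ID     /* new libspe: R3 holds the id of the changed context */
};

/* Stand-in for the debugger's symbol lookup (assumption: a plain
   array of symbol names is enough for the purposes of this sketch). */
static bool symbol_defined(const char *const *symbols, size_t count,
                           const char *name)
{
    for (size_t i = 0; i < count; i++)
        if (strcmp(symbols[i], name) == 0)
            return true;
    return false;
}

/* Prefer the new symbol; fall back to the old one for compatibility.
   Either way, at most one breakpoint is planted. */
static enum event_mode choose_event_mode(const char *const *symbols,
                                         size_t count)
{
    assert(symbols != NULL || count == 0);
    if (symbol_defined(symbols, count, "__spe_context_update_event_id"))
        return EVENT_USE_ID;
    if (symbol_defined(symbols, count, "__spe_context_update_event"))
        return EVENT_SCAN_ALL;
    return EVENT_NONE;
}

/* Called when the breakpoint is hit: returns the id to inspect,
   or RESCAN_ALL when the contents of R3 cannot be trusted. */
static int context_to_inspect(enum event_mode mode, int r3)
{
    return mode == EVENT_USE_ID ? r3 : RESCAN_ALL;
}
```

With a new libspe, choose_event_mode() picks EVENT_USE_ID even though both symbols resolve to the same address, so only one breakpoint is ever planted.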

>> In libspe2 in SDK 2.1, spe_context_destroy() calls 
>> _base_spe_context_destroy(), which calls 
>> __spe_context_update_event(), *after* the spe context has been 
>> destroyed; the spufs "object-id" file is not reset to zero.  Note 
>> that by the time __spe_context_update_event() is called, the spe 
>> context has already been torn down, so there's nothing useful the 
>> debugger can do with this event.
> 
> I don't quite see that; in fact GDB *does* make use of this case.
> When the context goes away, GDB removes the SPE executable
> image from its list of loaded objects, and performs appropriate
> cleanup (e.g. disabling breakpoints/watchpoints that might still point
> there).  This is completely analogous to the case of unloading a 
> dynamically loaded shared library ...

I see, that makes sense for GDB, but TotalView handles SPU thread address spaces differently. In the TotalView debugger, each SPU thread is modeled as a "thread with a discrete address space" separate from the PPU process's address space, so when the thread terminates, the address space object is destroyed along with the thread. Destroying the thread-with-discrete-address-space object has the side effect of disabling action points and decrementing the reference counts on the SPU ELF image objects that were loaded into the SPU thread's address space. I can now see why GDB would want that event, and why TotalView would not.

>> In particular, as far as I can 
>> tell, there is no way to "unload" an spe context such that the spe 
>> context continues to exist, but the address space of the spe context
>> is reset.  Calling __spe_context_update_event() when a context is 
>> destroyed just slows down the debugger, so I would suggest not 
>> calling __spe_context_update_event() or if adopted, 
>> __spe_context_update_event_fd_spe_dir(). But, if someone thinks that
>> the event is useful, calling __spe_context_update_event_fd_spe_dir()
>> with the fd_spe_dir of the context that was just destroyed would be OK.
> 
> That seems reasonable.

Yes, that would help speed things up because TotalView could then figure out it has nothing to do.

So, given that this is not a totally nutty request, is there a formal "enhancement request" procedure that must be followed, or is this something that this group will simply include in a future release?

Cheers, John D.