[Cbe-oss-dev] [RFC] [PATCH 2:8] SPU Gang Scheduling - deactivation

Luke Browning lukebr at linux.vnet.ibm.com
Wed Mar 5 03:50:42 EST 2008


On Tue, 2008-03-04 at 05:53 +0100, Arnd Bergmann wrote:
> On Monday 03 March 2008, Luke Browning wrote:
> > Fix problem with time slicing SPUs that are executing remote library routines
> > 
> > If time slicing catches a context while it is executing library code (ie. lazily
> > loaded), the context is unloaded and put on the runqueue.  Later, when the library 
> > routine completes, the controlling thread tries to restart the context by invoking
> > spu_run() which asserts that the context is not on run queue.  
> > 
> > This patch removes the BUG_ON() in __spu_update_sched_info() and adds logic to 
> > remove the context from the runqueue before activating it.
> > 
> > This fixes a pre-existing bug in the spufs scheduler and is not related to gangs.
> 
> As discussed before, this shouldn't happen, as a gang should not
> become scheduled if it has one thread in it that is not actually
> trying to run.

This was a problem that I encountered early on in the development before
Jeremy's patch, which I re-reviewed today.  I agree this is not a
problem in the existing code.  

With gangs, it is a little more complicated, as it is not an
all-or-nothing proposition, or at least it doesn't have to be.  I
implemented a gang variable 'nrunnable' that ensures there is at least
one ctxt in spu_run() before the gang can be added to the runqueue, but
I allow the gang to run with some of its contexts in user mode.
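
To illustrate the rule described above, here is a minimal sketch of an
'nrunnable' gate.  It is a simplified user-space model, not the code from
the patch: the struct and function names are hypothetical, and only the
'nrunnable' counter itself comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model; these are not the spufs structures. */
struct gang_model {
	int ncontexts;	/* contexts that belong to the gang */
	int nrunnable;	/* contexts currently inside spu_run() */
	bool queued;	/* gang has been placed on the run queue */
};

/* A controlling thread enters spu_run() for one of the gang's contexts. */
static void gang_enter_run(struct gang_model *g)
{
	assert(g->nrunnable < g->ncontexts);
	g->nrunnable++;
}

/* A context returns to user mode, e.g. to execute a library routine. */
static void gang_leave_run(struct gang_model *g)
{
	assert(g->nrunnable > 0);
	g->nrunnable--;
}

/*
 * The gang may be queued as soon as one context is in spu_run().
 * The remaining contexts are allowed to stay in user mode.
 */
static bool gang_try_enqueue(struct gang_model *g)
{
	if (g->nrunnable < 1)
		return false;
	g->queued = true;
	return true;
}
```

In this model a gang with four contexts becomes eligible for the run queue
after the first gang_enter_run() call, and a later gang_leave_run() by
another context does not force the gang off the queue.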

I believe this is the right trade-off.  If we required a context switch
each time that a context returned to user mode, there would be a lot
more context switches, system overhead would increase, and physical spu
utilization would be lower - a lot lower.  We would, in effect, be
maximizing the load on the weakest part of the system - the PPUs.

I think this would place a great burden on the application developer to
fine-tune his application to minimize the use of PPUs.  He would have to
be a lot more careful about how he utilized gangs, as certain instruction
sequences would always cause the gang to be de-scheduled.  He couldn't
just use gangs as a way to improve the locality of shared data.  I think
this would greatly limit the appeal of gangs.

There is also an advantage in this weaker approach.  It allows us to
load a context before spu_run() is invoked.  In this case, spu_run()
completes in a fraction of the time, as it just needs to set the npc and
start the context.  No loading is required.
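
The fast path described above can be sketched as follows.  This is a toy
model under stated assumptions, not the spufs implementation: the struct,
its fields, and the model_* functions are hypothetical, and the 'loads'
counter exists only to make the skipped load visible.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of a context; not the spufs structure. */
struct ctx_model {
	bool loaded;	/* state is already resident on a physical SPU */
	bool running;	/* context has been started */
	uint32_t npc;	/* next program counter */
	int loads;	/* number of (expensive) loads performed */
};

/* Stand-in for the expensive step of loading a context onto an SPU. */
static void model_load(struct ctx_model *c)
{
	c->loaded = true;
	c->loads++;
}

/*
 * Stand-in for spu_run(): load only if the context is not yet resident,
 * then set the npc and start the context.
 */
static void model_spu_run(struct ctx_model *c, uint32_t npc)
{
	if (!c->loaded)
		model_load(c);	/* slow path */
	c->npc = npc;
	c->running = true;
}
```

If the context was loaded ahead of time along with the rest of its gang,
model_spu_run() performs no additional load; a context that was not
preloaded pays for the load inside the call.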

Finally, this is consistent with the execution state of the gang when it
is lazily loaded.  We allow it to occupy the spus while some of the
contexts are in user mode and others are in kernel mode.

regards,
Luke



