[Cbe-oss-dev] [RFC] [PATCH 0:8] SPU Gang Scheduling

Tue Mar 4 06:20:56 EST 2008

Implement spu gang scheduling.

Change spu activation, deactivation, and time slicing to be gang-level
actions.  All contexts within a gang are scheduled at the same
time.  Placement is performed in a couple of steps: first the spus are
reserved for the gang, then spu_bind_context is called multiple times,
with a different reserved spu each time.  The reservation is necessary
to ensure that another activation doesn't steal one of your spus.
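
To make the reserve-then-bind idea concrete, here is a minimal
user-space sketch.  It is not the patch code: NR_SPUS, MAX_GANG and
every toy_* name are hypothetical, and toy_bind_context only stands in
for spu_bind_context.  The point it illustrates is that a gang either
reserves all of the spus it needs or none of them.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SPUS  8   /* hypothetical machine size */
#define MAX_GANG 8   /* hypothetical upper bound on gang size */

struct toy_spu  { bool reserved; int ctx_id; /* ctx_id < 0 means idle */ };
struct toy_gang { int nctx; int ctx_ids[MAX_GANG]; };

/* Step 1: reserve one idle spu per context, or reserve nothing at all. */
static int toy_reserve_spus(struct toy_spu *spus, const struct toy_gang *g,
                            int *picked)
{
    int found = 0;

    for (int i = 0; i < NR_SPUS && found < g->nctx; i++)
        if (!spus[i].reserved && spus[i].ctx_id < 0)
            picked[found++] = i;
    if (found < g->nctx)
        return -1;                    /* not enough room, touch nothing */
    for (int i = 0; i < found; i++)
        spus[picked[i]].reserved = true;
    return 0;
}

/* Step 2: bind one context to one spu that was reserved for its gang. */
static void toy_bind_context(struct toy_spu *spu, int ctx_id)
{
    assert(spu->reserved && spu->ctx_id < 0);
    spu->ctx_id = ctx_id;
    spu->reserved = false;
}

/* Gang-level activation: reserve everything, then bind once per context. */
static int toy_gang_activate(struct toy_spu *spus, const struct toy_gang *g)
{
    int picked[MAX_GANG];

    if (g->nctx > MAX_GANG || toy_reserve_spus(spus, g, picked) < 0)
        return -1;
    for (int i = 0; i < g->nctx; i++)
        toy_bind_context(&spus[picked[i]], g->ctx_ids[i]);
    return 0;
}
```

In this sketch a gang that does not fit leaves every spu untouched, so
a competing activation never sees a half-placed gang.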

Timeslicing follows the same basic algorithm, except all of the 
gangs are time sliced first to make room for the gang(s) to be run
next.  These gangs are of indeterminate size, so you can't do it in
place like you could with traditional schedulers that rely on a 1:1
replacement algorithm.  There is a simple heuristic to prevent
over-preemption.  A new runqueue statistic has been added to keep track
of the total number of contexts on the runqueue.  Once we have
preempted that many spus, we stop preempting and only decrement the
context tick, which is still implemented at the context level.  When
a context's tick goes to zero, the gang is preempted.
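
One way to read the over-preemption heuristic is sketched below.  This
is an illustration under my own assumptions, not the patch itself: the
slice_* names are hypothetical, nr_waiting stands in for the new
runqueue statistic, and locking is left out.  Ticks are always
decremented per context, but a gang whose tick has expired is only
preempted while fewer spus have been freed than there are contexts
waiting.

```c
#include <stdbool.h>

#define MAX_GANG 8   /* hypothetical upper bound on gang size */

struct slice_ctx  { int ticks; };
struct slice_gang { int nctx; struct slice_ctx ctx[MAX_GANG]; bool running; };

/*
 * One scheduler tick over the running gangs.  Returns the number of
 * spus freed by preemption.
 */
static int slice_tick(struct slice_gang *gangs, int ngangs, int nr_waiting)
{
    int freed = 0;

    for (int g = 0; g < ngangs; g++) {
        bool expired = false;

        if (!gangs[g].running)
            continue;
        /* The tick is still kept per context. */
        for (int c = 0; c < gangs[g].nctx; c++) {
            if (gangs[g].ctx[c].ticks > 0)
                gangs[g].ctx[c].ticks--;
            if (gangs[g].ctx[c].ticks == 0)
                expired = true;
        }
        /* Preempt the whole gang, but only while room is still needed. */
        if (expired && freed < nr_waiting) {
            gangs[g].running = false;
            freed += gangs[g].nctx;
        }
    }
    return freed;
}
```

Because whole gangs are preempted, the freed count can overshoot
nr_waiting by up to one gang; the sketch only stops further preemption
after that point.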

SPU affinity has not been implemented yet.

A new runnable counter has been added to the gang structure to create
a synchronization point for gang start.  The counter is incremented
when a context calls spu_run().  When all of the contexts have been
started, the gang is considered runnable.  Thereafter, it is only
considered non-runnable if it blocks on a major page fault.

The start synchronization point is implemented by passing the first
N-1 contexts directly through spu_run to the spufs_wait() critical 
section where they are waiting on an spe event. They update their 
csa area, but they don't call spu_activate. The last thread through
calls spu_activate, which performs the activation for all of the
contexts in the gang.
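
The runnable counter and the start synchronization point can be
pictured roughly as follows.  This is a single-threaded sketch under
stated assumptions: the start_* names and the nfaulting field are
hypothetical, the csa update is reduced to a flag, and the locking
that would protect the counter is omitted.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_GANG 8   /* hypothetical upper bound on gang size */

struct start_gang {
    int  nctx;                 /* contexts in the gang */
    int  nrunnable;            /* contexts that have entered spu_run() */
    int  nfaulting;            /* hypothetical: blocked on a major fault */
    bool activated;
    bool csa_ready[MAX_GANG];  /* stands in for the csa update */
};

/* Runnable once every context has started and none is blocked on a fault. */
static bool start_gang_runnable(const struct start_gang *g)
{
    return g->nrunnable == g->nctx && g->nfaulting == 0;
}

/* Stand-in for the gang-wide activation performed by the last thread. */
static void start_gang_activate(struct start_gang *g)
{
    assert(g->nrunnable == g->nctx);
    g->activated = true;
}

/*
 * Entry path of a toy spu_run().  The first N-1 callers update their
 * csa and return false, meaning "go wait for an event".  The last
 * caller activates the whole gang and returns true.
 */
static bool start_spu_run(struct start_gang *g, int ctx)
{
    g->csa_ready[ctx] = true;
    g->nrunnable++;
    if (g->nrunnable < g->nctx)
        return false;
    start_gang_activate(g);
    return true;
}
```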

Nearly all of the spu_run() critical section is unchanged.  It is still
context based and runs almost entirely under the context lock.  The gang 
lock is only taken when the context is in the SPU_SCHED_STATE, signifying 
that the context needs to be activated.  This is an important optimization
that avoids lock contention in the controlling thread.
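
A toy model of that locking pattern is below.  Everything in it is
hypothetical: the run_* and toy_lock names, the two-value state enum,
and in particular the nesting of the gang lock inside the context
lock, which is an assumption made for brevity and not a statement
about the lock ordering in the patch.  The locks just count
acquisitions so the fast path is visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states; only "needs activation" vs. "running" matters. */
enum run_state { RUN_STATE_RUNNING, RUN_STATE_SCHED };

/* Toy lock that counts how often it is taken. */
struct toy_lock { bool held; int acquisitions; };

static void toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);
    l->held = true;
    l->acquisitions++;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

struct run_gang { struct toy_lock lock; int activations; };
struct run_ctx  { struct toy_lock lock; enum run_state state;
                  struct run_gang *gang; };

/* One pass through a toy spu_run() critical section. */
static void run_iteration(struct run_ctx *ctx)
{
    toy_lock_acquire(&ctx->lock);
    if (ctx->state == RUN_STATE_SCHED) {
        /* Slow path: only here is the gang lock taken. */
        toy_lock_acquire(&ctx->gang->lock);
        ctx->gang->activations++;
        ctx->state = RUN_STATE_RUNNING;
        toy_lock_release(&ctx->gang->lock);
    }
    /* ... context-level work under the context lock only ... */
    toy_lock_release(&ctx->lock);
}
```

Running run_iteration() several times on a context that starts in
RUN_STATE_SCHED takes the context lock on every pass but the gang lock
only on the first one.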

regards, Luke