[Cbe-oss-dev] [CVS sw] * Updated version of the 6 previous affinity patches, with changes related

Luke Browning lukebr at linux.vnet.ibm.com
Fri Apr 6 04:12:45 EST 2007


On Thu, 2007-04-05 at 11:13 +0200, Christoph Hellwig wrote:
> On Sun, Mar 25, 2007 at 11:20:31PM -0300, Luke Browning wrote:
> > I don't really see an alternative.  You could change the code to scan
> > all spu_contexts structures in the system to find the active ones and
> > then force a context switch in a sort of top-down approach, but that
> > would introduce scalability problems.  The scheduler does it from the
> > bottom-up, so that it scales with the number of physical spus in the
> > system.  
> 
> Agreed.
> 
> > We should be able to manage the problem with a clearly defined locking
> > strategy.  This patch takes a simple approach.  It guarantees that an
> > spu->ctx pointer will not change while the spu_prio->active_mutex[node]
> > is held.  One can safely reference the context pointer while the lock
> > is held.
> 
> I think this patch is the right way to go short term.  All other
> spu->ctx dereferences come from walking the active list and already
> have the active_mutex (or active_lock after my latest patches).
> 
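Right.  For reference, the pattern the patch relies on is roughly the
sketch below.  The active_list name, the list member, and the callback
are stand-ins rather than the real code; the point is just that
spu->ctx is only dereferenced while the node's active_mutex is held:

/*
 * Sketch only: walk the per-node active list under
 * spu_prio->active_mutex[node].  Holding the mutex is what keeps
 * spu->ctx from changing underneath us.
 */
static void for_each_active_ctx(int node,
				void (*fn)(struct spu_context *ctx))
{
	struct spu *spu;

	mutex_lock(&spu_prio->active_mutex[node]);
	list_for_each_entry(spu, &spu_prio->active_list[node], list) {
		/* safe: the context cannot be unbound while we hold the mutex */
		fn(spu->ctx);
	}
	mutex_unlock(&spu_prio->active_mutex[node]);
}
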
> > However, I do think that we can simplify the code by implementing an spu
> > attribute for "active" and "idle".  Currently, this is represented
> > through separate lists which are protected by separate locks, which adds
> > to the complexity of the code.  We actually have three global locks!  We
> > should just be changing a state field in the spu to indicate "active"
> > and "idle", instead of dequeuing it from list A with lock A1 and adding
> > it to list B with lock B1.  The two-list scheme makes asynchronous
> > references like the one managed by this patch more complicated.
> > 
> > Let me know if you want me to prototype it and I will even throw in your
> > non-spufs spu allocations for you.
> 
> I also agree with this statement.  But before we start to deal with this
> issue we should make our minds up about what non-spufs spu users we
> want to support.  Currently the split of the spu list management between
> spu_base and the scheduler in spufs is a real PITA, and complicates both
> the list management and the locking for it.  In an ideal world I'd
> declare spufs the owner of all spu lists, and an imaginary non-spufs
> user would request it through the scheduler, probably in a similar way
> to how we currently handle isolated spus.

I agree.  I think this is a better structure, as it makes it easier to
upgrade and deliver new functions when the base primitives are provided
by the module.  Users don't have to reboot.
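Something as small as a reserve/release pair exported by the scheduler
would probably cover that imaginary non-spufs user.  The names below are
purely made up, just to illustrate the shape of the interface:

/* request an idle spu on a given node from the spufs scheduler */
struct spu *spu_sched_reserve_spu(int node);

/* hand a reserved spu back to the scheduler */
void spu_sched_release_spu(struct spu *spu);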
 
> 
> Given the small numbers of spes per node having just one list and a flag
> makes a lot of sense.  So we'd have
> 
> struct spu {
> 	...
> 	struct list_head list;
> 	enum { SPU_FREE, SPU_ACTIVE, SPU_RESERVED } state;
> 	...
> }
> 
> struct be_spu_info {
> 	struct list_head spu_list;
> 	spinlock_t spu_list_lock;
> 	...
> }
> 
>  - all users of the current per-node list of all spus stay the same
>    (currently they seem to lack locking because they never expect
>    spus to go away; maybe we should fix that assumption while we're at
>    it)
>  - all users of spu_full_list iterate over the nodes instead of using
>    the global list
>  - spu_alloc_node needs to grow a loop over spu_list and check for
>    a free one.  This means more code in the hot path, but I have patches
>    pending to never call spu_free/spu_alloc_node in the context switch
>    path, which would more than mitigate this issue.
>  - spu_alloc_spu already has the loop and needs the additional state
>    check.
>  - spu_free just sets the state to FREE
> 
>  - notify_spus_active needs an additional check for the active state
>  - same for find_victim
>  - same for spu_sched_exit
> 
>  - we'll probably need another state on top of the ones above for spus
>    that are in the process of doing spu_{un,}bind_context.

Looks good to me.  
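Just to check that I'm reading the proposal right: with the single
per-node list and the state field, spu_alloc_node() and spu_free()
would end up looking roughly like the sketch below.  The be_spu_info[node]
array and the exact lock usage are my guesses based on the structures
above, not real code:

/*
 * Sketch: allocation becomes a scan for a FREE spu on the node's
 * list, and freeing is just a state change -- no list moves, no
 * second lock.
 */
static struct spu *spu_alloc_node(int node)
{
	struct be_spu_info *info = &be_spu_info[node];
	struct spu *spu, *ret = NULL;

	spin_lock(&info->spu_list_lock);
	list_for_each_entry(spu, &info->spu_list, list) {
		if (spu->state == SPU_FREE) {
			/* or the transitional bind state from your last point */
			spu->state = SPU_ACTIVE;
			ret = spu;
			break;
		}
	}
	spin_unlock(&info->spu_list_lock);

	return ret;
}

static void spu_free(struct spu *spu)
{
	struct be_spu_info *info = &be_spu_info[spu->node];

	spin_lock(&info->spu_list_lock);
	spu->state = SPU_FREE;
	spin_unlock(&info->spu_list_lock);
}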

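And the extra active-state check for notify_spus_active (and likewise
find_victim and spu_sched_exit) presumably reduces to a state test in
the same walk.  Again just a sketch, with the per-context work hidden
behind a made-up helper:

/*
 * Sketch: with a single per-node list, "active" becomes a state
 * test instead of membership of a second list.
 * spu_sched_notify_ctx() stands in for the real per-context work.
 */
static void notify_spus_active(void)
{
	struct spu *spu;
	int node;

	for (node = 0; node < MAX_NUMNODES; node++) {
		struct be_spu_info *info = &be_spu_info[node];

		spin_lock(&info->spu_list_lock);
		list_for_each_entry(spu, &info->spu_list, list) {
			if (spu->state != SPU_ACTIVE)
				continue;
			spu_sched_notify_ctx(spu->ctx);
		}
		spin_unlock(&info->spu_list_lock);
	}
}
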
I think I will update my original patch and re-submit it, as I forgot
the Signed-off-by line.

Regards, Luke



