[Cbe-oss-dev] [CVS sw] * Updated version of the 6 previous affinity patches, with changes related
Luke Browning
lukebr at linux.vnet.ibm.com
Fri Apr 6 04:12:45 EST 2007
On Thu, 2007-04-05 at 11:13 +0200, Christoph Hellwig wrote:
> On Sun, Mar 25, 2007 at 11:20:31PM -0300, Luke Browning wrote:
> > I don't really see an alternative. You could change the code to scan
> > all spu_contexts structures in the system to find the active ones and
> > then force a context switch in a sort of top-down approach, but that
> > would introduce scalability problems. The scheduler does it from the
> > bottom-up, so that it scales with the number of physical spus in the
> > system.
>
> Agreed.
>
> > We should be able to manage the problem with a clearly defined locking
> > strategy. This patch takes a simple approach. It guarantees that an spu->ctx
> > pointer will not change while the spu_prio->active_mutex[node] is held.
> > One can safely reference the context pointer while the lock is held.
>
> I think this patch is the right way to go short term. All other
> spu->ctx dereferences come from walking the active list and already
> have the active_mutex (or active_lock after my latest patches).
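Yes. Just to spell out the pattern I mean, something along these lines
(the helper is purely illustrative, and I'm assuming the active_list /
list field names from the current scheduler code, so treat this as an
untested sketch):

	static void for_each_active_ctx(int node,
					void (*fn)(struct spu_context *ctx))
	{
		struct spu *spu;

		mutex_lock(&spu_prio->active_mutex[node]);
		list_for_each_entry(spu, &spu_prio->active_list[node], list) {
			/*
			 * spu->ctx cannot change while active_mutex[node]
			 * is held, so it is safe to dereference it here.
			 */
			fn(spu->ctx);
		}
		mutex_unlock(&spu_prio->active_mutex[node]);
	}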
>
> > However, I do think that we can simplify the code by implementing an spu
> > attribute for "active" and "idle". Currently, this is represented
> > through separate lists which are protected by separate locks, which adds
> > to the complexity of the code. We actually have three global locks! We
> > should be just changing a state field in the spu to indicate "active"
> > and "idle", instead of dequeueing it from List A with lock A1 and adding
> > it to list B with lock B1. This makes asynchronous references like the
> > one managed by this patch more complicated.
> >
> > Let me know if you want me to prototype it and I will even throw in your
> > non-spufs spu allocations for you.
>
> I also agree with this statement. But before we start to deal with this
> issue we should make our minds up about what non-spufs spu users we
> want to support. Currently the split of the spu list management between
> spu_base and the scheduler in spufs is a real PITA, and complicates both
> the list management and the locking for it. In an ideal world I'd
> declare spufs the owner of all spu lists, and an imaginary non-spufs
> user would request it through the scheduler, probably in a similar way
> to how we currently handle isolated spus.
I agree. I think this is a better structure, as it makes it easier to
upgrade and deliver new functions when the base primitives are provided
by the module. Users don't have to reboot.
>
> Given the small number of spes per node, having just one list and a flag
> makes a lot of sense. So we'd have
>
> struct spu {
>         ...
>         struct list_head list;
>         enum { SPU_FREE, SPU_ACTIVE, SPU_RESERVED } state;
>         ...
> };
>
> struct be_spu_info {
>         struct list_head spu_list;
>         spinlock_t spu_list_lock;
>         ...
> };
>
> - all users of the current per-node list of all spus stay the same
> (currently they seem to lack locking because they never expect
> spus to go away; maybe we should fix that assumption while we're at
> it)
> - all users of spu_full_list iterate over the nodes instead of using
> the global list
> - spu_alloc_node needs to grow a loop over spu_list and check for
> a free one. This means more code in the hot path, but I have patches
> pending to never call spu_free/spu_alloc_node in the context switch
> path, which would more than mitigate this issue.
> - spu_alloc_spu already has the loop and needs the additional state
> check.
> - spu_free just sets the state to FREE
>
> - notify_spus_active needs an additional check for the active state
> - same for find_victim
> - same for spu_sched_exit
>
> - we'll probably need another state on top of the ones above for spus
> that are in the process of doing spu_{un,}bind_context
Looks good to me.
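To make sure I'm reading the list above correctly, this is roughly how I
picture spu_alloc_node()/spu_free() with the state field, plus the extra
check in the walkers (untested sketch; I'm treating be_spu_info as a
per-node array, and the names are just taken from your outline):

	struct spu *spu_alloc_node(int node)
	{
		struct spu *spu, *found = NULL;

		spin_lock(&be_spu_info[node].spu_list_lock);
		list_for_each_entry(spu, &be_spu_info[node].spu_list, list) {
			if (spu->state == SPU_FREE) {
				/*
				 * claim it under the list lock; whether it
				 * becomes ACTIVE here or only after
				 * spu_bind_context() is your last point above
				 */
				spu->state = SPU_ACTIVE;
				found = spu;
				break;
			}
		}
		spin_unlock(&be_spu_info[node].spu_list_lock);

		return found;
	}

	void spu_free(struct spu *spu)
	{
		spin_lock(&be_spu_info[spu->node].spu_list_lock);
		spu->state = SPU_FREE;
		spin_unlock(&be_spu_info[spu->node].spu_list_lock);
	}

and notify_spus_active/find_victim/spu_sched_exit would just skip the
non-active entries:

	list_for_each_entry(spu, &be_spu_info[node].spu_list, list) {
		if (spu->state != SPU_ACTIVE)
			continue;
		/* same per-spu work as today */
	}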
I think I will update my original patch and re-submit it, as I forgot
the Signed-off-by line.
Regards, Luke