[Cbe-oss-dev] [RFC] [PATCH 6:8] SPU Gang Scheduling - base functionality for gang scheduling

Christoph Hellwig hch at lst.de
Wed Mar 12 08:23:15 EST 2008


On Mon, Mar 03, 2008 at 04:50:06PM -0300, Luke Browning wrote:
> Index: spufs/include/asm-powerpc/spu.h
> ===================================================================
> --- spufs.orig/include/asm-powerpc/spu.h
> +++ spufs/include/asm-powerpc/spu.h
> @@ -136,6 +136,7 @@ struct spu {
>  	unsigned int slb_replace;
>  	struct mm_struct *mm;
>  	struct spu_context *ctx;
> +	struct spu_gang *gang;

Is there any codepath where spu->ctx->gang can't be used but
spu->gang would be valid?

> -/* ctx->sched_flags */
> +/* ctx->sched_flags and gang->sched_flags */
>  enum {
>  	SPU_SCHED_NOTIFY_ACTIVE,
>  	SPU_SCHED_WAS_ACTIVE,	/* was active upon spu_acquire_saved()  */
> +	SPU_SCHED_JUST_CREATED,
> +	SPU_SCHED_DEACTIVATED,
>  };

> +#define SPU_GANG_RUNNABLE(g) \
> +	(g->contexts && (atomic_read(&g->nstarted) == g->contexts))

In Linux, inline functions are preferred over macros, so make this
a little spu_gang_runnable() helper instead.
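A minimal userspace sketch of what such a helper could look like (the
atomic_t stand-in and struct layout here are mocked for illustration;
the real patch would use the kernel's atomic_t and the fields from
spu.h):

```c
#include <stdbool.h>

/* Userspace stand-ins for the kernel types, for illustration only. */
typedef struct { int counter; } atomic_t;
static inline int atomic_read(const atomic_t *v) { return v->counter; }

struct spu_gang {
	int contexts;		/* contexts created in the gang */
	atomic_t nstarted;	/* contexts that have been started */
};

/*
 * Inline helper replacing the SPU_GANG_RUNNABLE() macro: a gang is
 * runnable once every context created in it has been started.
 */
static inline bool spu_gang_runnable(struct spu_gang *g)
{
	return g->contexts && atomic_read(&g->nstarted) == g->contexts;
}
```

Unlike the macro, the helper type-checks its argument and evaluates it
exactly once.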

> +struct spu_sched_stats {
> +	int inode;
> +	int inode_icnt;
> +	int inode_pcnt;
> +	int pnode;
> +	int pnode_icnt;
> +	int pnode_pcnt;
> +	int total_icnt;
> +	int total_pcnt;
> +};

Please move the documentation of these fields from spu_get_run_stats to
here.  Also, slightly more verbose and descriptive field names would be
nice.

> -	spin_lock(&spu_prio->runq_lock);
> -	rval = __node_allowed(ctx, node);
> -	spin_unlock(&spu_prio->runq_lock);
> +	spin_lock(&cbe_spu_info[node].list_lock);
> +	rval = __node_allowed(gang, node);
> +	spin_unlock(&cbe_spu_info[node].list_lock);

We could probably change this to the list_lock already in the patch
where you change that one from a mutex to the spinlock.

> +static int spu_get_idle(struct spu_gang *gang, int node)
>  {
> +	struct spu *spu;
> +	int n, lnode, count, mode;
> +	int ret = 0, found = 0;
>  
> +	spu_context_nospu_trace(spu_get_idle__enter, gang);
>  
> +	/* TO DO: SPU affinity scheduling. */
>  
> +	mode = SPU_RESERVE;
>  
> +spu_get_idle_top:
>  
> +	if (node == -1)
> +		lnode = cpu_to_node(raw_smp_processor_id());
> +	else
> +		lnode = node;
>  
> +	for (n = 0, count = 0;
> +	     n < MAX_NUMNODES && count < gang->contexts;
> +	     n++, lnode++) {
> +
> +		lnode = (lnode < MAX_NUMNODES) ? lnode : 0;
> +		if (!node_allowed(gang, lnode))
>  			continue;
>  
> +		spin_lock(&cbe_spu_info[lnode].list_lock);
> +		list_for_each_entry(spu, &cbe_spu_info[lnode].spus, cbe_list) {
> +			switch (mode) {
> +			case SPU_RESERVE :
> +				if ((spu->alloc_state == SPU_FREE) &&
> +				    (spu->gang == NULL)) {
> +					spu_init_channels(spu);
> +					spu->gang = gang;
> +					if (++count == gang->contexts)
> +						goto spu_get_idle_found;
> +				}
> +				break;
> +			case SPU_RESERVE_UNDO :
> +				if ((spu->alloc_state == SPU_FREE) &&
> +				    (spu->gang == gang)) {
> +					spu->gang = NULL;
> +					if (++count == found)
> +						goto spu_get_idle_found;
> +				}
> +				break;

We don't really need all this NUMA-affinity stuff for the undo
case, so it could be moved into a little helper of its own
instead of using the mode variable.  That should help make the code a
little less complex.
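A sketch of what splitting out the undo case might look like.  This is
a userspace mock-up: the kernel's list_for_each_entry walk over
cbe_spu_info[node].spus under list_lock is replaced by a plain array
walk, and the struct definitions are simplified stand-ins, but the
release logic mirrors the quoted SPU_RESERVE_UNDO case:

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures, for illustration. */
enum spu_alloc_state { SPU_FREE, SPU_USED };

struct spu_gang {
	int contexts;
};

struct spu {
	enum spu_alloc_state alloc_state;
	struct spu_gang *gang;
};

/*
 * Release up to 'found' reservations that the SPU_RESERVE pass made
 * for this gang; returns the number actually released.  No NUMA node
 * walk is needed here -- we only undo what was already reserved.
 */
static int spu_gang_undo_reserve(struct spu *spus, int nspus,
				 struct spu_gang *gang, int found)
{
	int i, count = 0;

	/* In the kernel: list_for_each_entry under list_lock. */
	for (i = 0; i < nspus && count < found; i++) {
		if (spus[i].alloc_state == SPU_FREE &&
		    spus[i].gang == gang) {
			spus[i].gang = NULL;
			count++;
		}
	}
	return count;
}
```

With the undo path in its own helper, spu_get_idle() keeps only the
reserve case and loses both the mode variable and one goto target.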

Complexity in general is my only worry with this otherwise nice patch.
There are a lot of functions growing very big and full of gotos that
should probably be split into helpers.  Also, we're now getting a rather
complex locking hierarchy that probably should be documented at the top
of sched.c.
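Such a block comment might look something like the sketch below.  The
ordering shown is purely illustrative, inferred from the locks visible
in the quoted hunks (list_lock, runq_lock) plus the existing
ctx->state_mutex; the actual hierarchy would need to be verified
against the code before writing it down:

```c
/*
 * Lock ordering (illustrative sketch only -- verify against the code):
 *
 *   ctx->state_mutex
 *     -> cbe_spu_info[node].list_lock
 *          protects cbe_spu_info[node].spus and per-spu fields such
 *          as spu->alloc_state and spu->gang
 *     -> spu_prio->runq_lock
 *          protects the run queue
 */
```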
