[Cbe-oss-dev] [CVS sw] * Updated version of the 6 previous affinity patches, with changes related

Luke Browning lukebr at linux.vnet.ibm.com
Sat Mar 24 08:17:38 EST 2007


On Fri, 2007-03-16 at 16:17 +0100, Arnd Bergmann wrote:
> On Thursday 22 February 2007, Andre Detsch <adetsch at br.ibm.com> wrote:
> > ++static inline int sched_spu(struct spu *spu)
> > ++{
> > ++      return (!spu->ctx || !(spu->ctx->flags & SPU_CREATE_NOSCHED));
> > ++}
> > ++
> 
> While looking through the code in 2.6.21-rc3-arnd1, I stumbled over this
> chunk of code which was never reviewed afaict. Unfortunately, it
> introduced a nasty bug, since it accesses the context of another spu,
> without holding any lock whatsoever.

Here's a solution that may be more than you want.  It provides locking
around the assignment of the context to the spu structure.  It doesn't
provide a full solution to the NOSCHED / affinity problem, because there
is still a race between the allocation of a NOSCHED context and the
assignment of that context to an spu for scheduling purposes.  It is a
first step towards a solution; I don't think the larger problem can be
solved with atomic primitives alone.  This patch avoids the immediate
problem of taking a data storage interrupt on an unprotected reference.
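
To make the new rule concrete, here is a rough, untested sketch (not
part of the patch; the helper name is made up for illustration) of how
a reader outside the affinity walk would have to look at a neighbouring
spu's context.  Callers that are already inside the walk, like the hunk
at the bottom of the patch, just take the mutex once around the whole
loop instead.

/*
 * Sketch only: with spu->ctx now set and cleared under
 * active_mutex[node], a cross-spu read such as sched_spu() is only
 * safe while that same mutex is held.
 */
static int sched_spu_safe(struct spu *spu)
{
	int ret;

	mutex_lock(&spu_prio->active_mutex[spu->node]);
	ret = !spu->ctx || !(spu->ctx->flags & SPU_CREATE_NOSCHED);
	mutex_unlock(&spu_prio->active_mutex[spu->node]);

	return ret;
}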

Haven't tested it yet.  Wanted to get feedback first.

Arnd, am I supposed to put my Signed-off-by here now?

Luke


Index: linux-2.6.21-rc4/arch/powerpc/platforms/cell/spufs/sched.c
===================================================================
--- linux-2.6.21-rc4.orig/arch/powerpc/platforms/cell/spufs/sched.c	2007-03-23 16:50:04.000000000 -0300
+++ linux-2.6.21-rc4/arch/powerpc/platforms/cell/spufs/sched.c	2007-03-23 17:52:17.000000000 -0300
@@ -135,9 +135,10 @@
  * spu_add_to_active_list - add spu to active list
  * @spu:	spu to add to the active list
  */
-static void spu_add_to_active_list(struct spu *spu)
+static void spu_add_to_active_list(struct spu *spu, struct spu_context *ctx)
 {
 	mutex_lock(&spu_prio->active_mutex[spu->node]);
+	spu->ctx = ctx;
 	list_add_tail(&spu->list, &spu_prio->active_list[spu->node]);
 	mutex_unlock(&spu_prio->active_mutex[spu->node]);
 }
@@ -151,6 +152,7 @@
 	int node = spu->node;
 
 	mutex_lock(&spu_prio->active_mutex[node]);
+	spu->ctx = NULL;
 	list_del_init(&spu->list);
 	mutex_unlock(&spu_prio->active_mutex[node]);
 }
@@ -212,7 +214,6 @@
 		atomic_inc(&be_spu_info[spu->node].reserved_spus);
 	if (!list_empty(&ctx->aff_list))
 		atomic_inc(&ctx->gang->aff_sched_count);
-	spu->ctx = ctx;
 	spu->flags = 0;
 	ctx->spu = spu;
 	ctx->ops = &spu_hw_ops;
@@ -230,7 +231,7 @@
 	spu->timestamp = jiffies;
 	spu_cpu_affinity_set(spu, raw_smp_processor_id());
 	spu_switch_notify(spu, ctx);
-	spu_add_to_active_list(spu);
+	spu_add_to_active_list(spu, ctx);
 	ctx->state = SPU_STATE_RUNNABLE;
 }
 
@@ -266,7 +267,6 @@
 	ctx->ops = &spu_backing_ops;
 	ctx->spu = NULL;
 	spu->flags = 0;
-	spu->ctx = NULL;
 }
 
 /**
@@ -652,6 +652,8 @@
 	struct spu *spu;
 
 	spu = NULL;
+
+	mutex_lock(&spu_prio->active_mutex[ref->node]);
 	if (offset >= 0) {
 		list_for_each_entry(spu, ref->aff_list.prev, aff_list) {
 			if (offset == 0)
@@ -667,6 +669,7 @@
 				offset++;
 		}
 	}
+	mutex_unlock(&spu_prio->active_mutex[ref->node]);
 	return spu;
 }
 




