[Cbe-oss-dev] [PATCH] spusched: extend list_mutex coverage

Wed Jul 23 16:47:05 EST 2008

On Wed, Jul 23, 2008 at 10:48:19AM +1000, Jeremy Kerr wrote:
> Christoph,
> 
> > spu->ctx is only stable while we hold the list_mutex is held.
> 
> We only update spu->ctx with both list_mutex and ctx->state mutex, so 
> I'm a little unclear about how the dangling reference could happen as 
> we hold the victim's state_mutex in find_victim(), and the target 
> context's state_mutex in spusched_tick().

In both cases we take list_mutex check for spu->ctx, the unlock it
<-- race happens here --> then take the state_mutex. 

> > @@ -659,16 +659,15 @@ static struct spu *find_victim(struct sp
> >  				 * restart the search.
> >  				 */
> >  				mutex_unlock(&victim->state_mutex);
> > +				mutex_unlock(&cbe_spu_info[node].list_mutex);
> >  				victim = NULL;
> >  				goto restart;
> >  			}
> 
> This entire if-block should no longer be possible if we hold list_mutex

Indeed.

> > @@ -875,8 +880,13 @@ static noinline void spusched_tick(struc
> >  	struct spu_context *new = NULL;
> >  	struct spu *spu = NULL;
> >
> > -	if (spu_acquire(ctx))
> > -		BUG();	/* a kernel thread never has signals pending */
> > +	/*
> > +	 * If we can't lock the context give it another timeslice.
> > +	 */
> > +	if (!mutex_trylock(&ctx->state_mutex)) {
> > +		ctx->time_slice++;
> 
> No need to increment time_slice, as this will give the context two extra 
> timeslices.

Indeed, this is before we even decrement the time slice.