[Cbe-oss-dev] [PATCH 2/2] [POWERPC] spufs: fix invalid scheduling of forgotten contexts
Luke Browning
lukebr at linux.vnet.ibm.com
Fri Mar 7 01:12:48 EST 2008
On Thu, 2008-03-06 at 10:56 -0300, Luke Browning wrote:
> On Thu, 2008-03-06 at 10:00 +1100, Jeremy Kerr wrote:
> > Hi Luke,
> >
> > > If it could be lazily loaded, it can be on the runqueue as described
> > > above. Therefore, it is not a bug for the context to be on the
> > > runqueue at the time spu_forget() is called.
> >
> > Yep, we've since removed this possibility with the (initial) SCHED_IDLE
> > change.
> >
> > But I think I see where you're coming from here.
> >
> > A context that is not in spu_run but in the run queue has been (by
> > definition) a bug - this was due to our previous scheduler's
> > requirement that only running contexts are loaded, which you've since
> > improved with your asynchronous scheduler implementation. We still make
> > assumptions based on this invariant (eg, the BUG_ON() in
> > spu_update_sched_info()).
> >
> > So, it's possible that we can now remove this assumption, but we need to
> > do some careful checking of the code first, and remove the recent
> > additions which check that a context is within spu_run before adding it
> > to the run queue.
> >
> > The down-side to this is that non-running contexts will still use up
> > scheduler timeslices for no good reason, as they may be re-loaded by
> > spusched_tick, excluding a context that has actual work to do.
>
> I think we have lost sight of my original comment, and I confused things
> somewhat by not reading your SCHED_IDLE patch correctly the first time.
> Let me try again.
>
> My original point was that you are optimizing the call to
> spu_deactivate() based on the state of the context. If it is not
> SPU_STATE_RUNNING, you don't call spu_deactivate. My point is that this
> optimization is not important from a performance perspective and it is
> better to just invoke spu_deactivate unconditionally. It makes the code
> easier to change in the future; in a sense, it is less fragile.
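To make this concrete, here is roughly the difference I mean, in sketch
form. This is from memory and not a quote of the current spufs source;
spu_forget, spu_deactivate and the SPU_STATE_* names are the real ones,
the surrounding details are illustrative only:

    /* current shape: the caller second-guesses the scheduler */
    void spu_forget(struct spu_context *ctx)
    {
        mutex_lock(&ctx->state_mutex);
        if (ctx->state == SPU_STATE_RUNNING)
            spu_deactivate(ctx);
        /* ... drop the mm reference, release the context ... */
    }

    /* shape I am arguing for: always ask, let the scheduler decide */
    void spu_forget(struct spu_context *ctx)
    {
        mutex_lock(&ctx->state_mutex);
        spu_deactivate(ctx);    /* cheap no-op if there is nothing to undo */
        /* ... drop the mm reference, release the context ... */
    }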
>
> Consider that one line of code by itself: when the state of the context
> is SPU_STATE_SAVED, the context might or might not be on the run queue.
> It depends on where the controlling thread is. If the caller is the
> controlling thread, it is obviously not in spu_run() and the context
> cannot be on the run queue.
>
> This assertion is based on one line of code that currently exists in the
> spusched_tick code, which does not put the context back on the run queue
> if the context is in user mode. So, you are asserting in spu_forget that
> the caller is the controlling thread and is therefore not in spu_run(),
> and you are making this assertion based on one line of code in the
> scheduler.
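For reference, the one line I mean is the guard in spusched_tick() when
it kicks a context off its spu. Quoting roughly from memory, so treat the
exact helper and flag names as approximate:

    new = grab_runnable_context(ctx->prio + 1, spu->node);
    if (new) {
        spu_unschedule(spu, ctx);
        /* the one line: only requeue contexts that are inside spu_run */
        if (test_bit(SPU_SCHED_SPU_RUN, &ctx->sched_flags))
            spu_add_to_rq(ctx);
    }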
>
> This one line of code is a design choice. We could easily have chosen
> to put the context back on the run queue when the context is in user
> mode. Note that it is put to the tail of the run queue, which has the
> effect of delaying the re-load of the context, as we are going to
> schedule something else. Clearly, if there is nothing else on the run
> queue, there is no harm in putting it back on as we have nothing to
> lose and something to gain. The subsequent reload occurs in
I meant restart, as the scheduler first loads it and the controlling
thread just writes a few bits to the hardware registers. This eliminates
the cost of loading the context, as that is done by an idle PPE thread
that has nothing better to do.
> a few instructions, as we only need to write to a couple of registers.
> On the other hand, if we do schedule something else, time is on our side
> as the controlling thread will likely start up again before we attempt
> to schedule the context from the run queue. In this case, it will have
> less work to do as it would just put it on the run queue anyway. The
> only time it is not optimal is when the scheduler loads it instead of
> another context on the run queue that is truly runnable, i.e. one that
> has a controlling thread in spu_run.
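By "a couple of registers" above I mean the restart path in spu_run: if
the scheduler has already loaded the context, the controlling thread
essentially just has to poke the next program counter and the run control
register. Again a rough sketch from memory, not a quote:

    /* context is already loaded: just tell the SPU to go */
    ctx->ops->npc_write(ctx, *npc);
    ctx->ops->runcntl_write(ctx, SPU_RUNCNTL_RUNNABLE);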
>
> This latter problem could be worked around in the scheduler. It could
> look at the same context flag that you implemented to handle, in effect,
> the reverse case. So, now you have an algorithm that on the surface
> seems more promising than what we do now.
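In sketch form, the run-queue walk (grab_runnable_context() today) would
just skip entries whose controlling thread is not in spu_run. The flag
check below is hypothetical, mirroring the existing spusched_tick guard;
the surrounding loop is paraphrased:

    list_for_each_entry(ctx, rq, rq) {
        /* hypothetical extra check: skip contexts nobody is waiting to run */
        if (!test_bit(SPU_SCHED_SPU_RUN, &ctx->sched_flags))
            continue;
        /* ... existing node/affinity checks, then take it off the queue ... */
    }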
>
> Now, we are back to my original point. The existing logic in spu_forget
> is based on a number of assumptions: 1) the caller is not in spu_run.
This should be 1) the controlling thread is not in spu_run, since the
caller either is the controlling thread or the controlling thread is
somehow prevented from invoking spu_run().
This model is really just based on the way our library works. Could we
write a testcase where one thread calls spu_run and the spu stops,
waiting for some kind of input operation, while a second thread closes
the file descriptor associated with the context? Would this drive
spu_forget while the controlling thread was in spu_run?
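Something along these lines, perhaps. This is an untested sketch,
powerpc only, using the raw spu_create/spu_run syscalls rather than
libspe; the spufs path, the SPU program loading, and all error handling
are elided or hypothetical:

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    static int ctx_fd;

    static void *runner(void *arg)
    {
        unsigned int npc = 0, status = 0;

        (void)arg;
        /* should block while the SPU program waits for input */
        syscall(__NR_spu_run, ctx_fd, &npc, &status);
        return NULL;
    }

    int main(void)
    {
        pthread_t t;

        ctx_fd = syscall(__NR_spu_create, "/spu/forget-test", 0, 0755);
        /* ... load an SPU program that stalls, e.g. on a mailbox read ... */

        pthread_create(&t, NULL, runner, NULL);
        sleep(1);           /* crude: let the runner block inside spu_run */

        close(ctx_fd);      /* does this drive spu_forget while the
                               controlling thread is still in spu_run? */

        pthread_join(t, NULL);
        return 0;
    }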
> 2) the scheduler performs its internal queueing a certain way.
>
> Now we know that 2 is going to change with gang scheduling, but that is
> probably not important. It would have made my job easier if this had
> been coded to unconditionally call spu_deactivate, but that doesn't
> matter.
>
> Now this raises the question of spu_acquire_saved. It can't really make
> the same assumptions as spu_forget. It must unconditionally invoke
> spu_deactivate, as it is an asynchronous reference and the assumption is
> that the context will be started again. So, spu_forget is fundamentally
> different from spu_acquire_saved, in spite of the comment in spu_forget
> that it is basically the same except that it doesn't acquire the mutex
> lock interruptibly. The assumptions are completely different.
>
> So only spu_acquire_saved is really wrong. It must unconditionally
> invoke spu_deactivate because it can't make any assertions about where
> the controlling thread is. But my original point was, why the trickery?
> It doesn't matter in these two cases. spu_deactivate() can return
> immediately if it doesn't need to do anything. Let's try to provide some
> encapsulation around the scheduler.
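In other words, something like this shape. This is a sketch of the
direction, not a patch; the early-return condition in spu_deactivate is
paraphrased, and spu_on_rq() is a stand-in for however the scheduler
tracks run-queue membership:

    int spu_acquire_saved(struct spu_context *ctx)
    {
        int ret = spu_acquire(ctx);

        if (ret)
            return ret;
        /* no state check here: the scheduler knows best */
        spu_deactivate(ctx);
        return 0;
    }

    void spu_deactivate(struct spu_context *ctx)
    {
        /* not loaded and not queued: nothing to do, return at once */
        if (ctx->state == SPU_STATE_SAVED && !spu_on_rq(ctx))
            return;
        /* ... otherwise unload it and/or take it off the run queue ... */
    }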
>
> Luke