[Cbe-oss-dev] [RFC] [PATCH 4:8] SPU Gang Scheduling - Add gang structure for standalone ctxts

Wed Mar 5 05:21:10 EST 2008

On Tue, 2008-03-04 at 06:40 +0100, Arnd Bergmann wrote:
> On Monday 03 March 2008, Luke Browning wrote:
> > +
> > +       for (n = 0, nspus = 0; n < MAX_NUMNODES; n++)
> > +               nspus += cbe_spu_info[n].n_spus -
> > +                       atomic_read(&cbe_spu_info[n].reserved_spus);
> > +       if (nspus < gang->contexts + 1)
> > +               goto out_free_gang;
> 
> In your description, you mentioned that the whole gang should
> get loaded on a single NUMA node, so I'd expect this check to
> test for the maximum number of SPUs on one node, not the total
> number of them in the system.

That is the preference implemented by the placement code.  First,
placement tries to load the gang on a single node, then it looks for
available SPUs across the system, then it seeks to preempt, ...  This
code doesn't go far enough, as I forgot to implement the other part of
the algorithm: we are missing the check that limits the size of NOSCHED
gangs.  I would probably place the burden on the reservation case, since
it is the minority case: it would perform an exhaustive scan of all
normal gangs, even those in user mode, to ensure that they can run.
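
To make the placement order above concrete, here is a rough sketch of
the fallback sequence (single node, then system-wide, then preemption).
It is not the code from the patch: struct node_spus, its idle_spus
field, enum placement and gang_placement() are all made up for
illustration, and only n_spus and reserved_spus mirror fields used in
the quoted hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for cbe_spu_info[]; idle_spus is an assumption. */
struct node_spus {
	int n_spus;        /* SPUs present on the node */
	int reserved_spus; /* SPUs held by NOSCHED contexts */
	int idle_spus;     /* schedulable SPUs that are currently free */
};

/* Hypothetical result type, ordered from most to least preferred. */
enum placement {
	PLACE_SINGLE_NODE,  /* whole gang fits in the idle SPUs of one node */
	PLACE_ACROSS_NODES, /* gang fits in idle SPUs summed over all nodes */
	PLACE_PREEMPT,      /* gang fits only if running contexts are preempted */
	PLACE_IMPOSSIBLE    /* more contexts than non-reserved SPUs */
};

static enum placement gang_placement(const struct node_spus *nodes,
				     size_t nr_nodes, int gang_size)
{
	int idle_total = 0, usable_total = 0;
	size_t n;

	assert(gang_size > 0);

	for (n = 0; n < nr_nodes; n++) {
		/* First preference: everything on one node. */
		if (nodes[n].idle_spus >= gang_size)
			return PLACE_SINGLE_NODE;
		idle_total += nodes[n].idle_spus;
		usable_total += nodes[n].n_spus - nodes[n].reserved_spus;
	}

	/* Second preference: spread over idle SPUs system-wide. */
	if (idle_total >= gang_size)
		return PLACE_ACROSS_NODES;

	/* Last resort: preempt, which only helps if enough SPUs exist. */
	if (usable_total >= gang_size)
		return PLACE_PREEMPT;

	return PLACE_IMPOSSIBLE;
}
```

The PLACE_IMPOSSIBLE case is the one the creation-time check is meant
to catch; the missing NOSCHED size limit is not modelled here.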

> I'm also not sure how much we can trust the logic for the number
> of reserved SPUs, as it can change at any time when we get
> a new reserved SPU later, preventing a large gang from being
> loaded. Maybe we should rather ignore that number and let the
> user do stupid things like creating a gang of dozens of contexts
> that can never possibly get scheduled?

I am not sure what we should do either, but the key is that we have to
detect the failure at creation time, and not at spu_run time.  Anyway,
this is a good candidate to be split out into a separate patch, which I
will do in the next round.
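
For reference when that patch gets split out, here is a sketch of the
two capacity rules under discussion: the system-wide sum from the quoted
hunk and the per-node maximum Arnd suggests.  The struct and function
names are hypothetical, and the -1 return is only a placeholder, since
the actual error code is not shown in the hunk.  As Arnd points out, the
reserved count can change after the check, so either rule is only a
snapshot.

```c
#include <stddef.h>

/* Hypothetical per-node accounting, mirroring the fields in the hunk. */
struct spu_node_info {
	int n_spus;        /* SPUs present on the node */
	int reserved_spus; /* snapshot of the reserved count; may go stale */
};

/* System-wide capacity, as computed in the quoted hunk. */
static int usable_spus_total(const struct spu_node_info *nodes, size_t nr)
{
	int total = 0;
	size_t n;

	for (n = 0; n < nr; n++)
		total += nodes[n].n_spus - nodes[n].reserved_spus;
	return total;
}

/* Capacity of the best single node, the stricter alternative. */
static int usable_spus_max_node(const struct spu_node_info *nodes, size_t nr)
{
	int best = 0;
	size_t n;

	for (n = 0; n < nr; n++) {
		int usable = nodes[n].n_spus - nodes[n].reserved_spus;

		if (usable > best)
			best = usable;
	}
	return best;
}

/*
 * Creation-time check: 0 if one more context fits within capacity,
 * -1 (placeholder error) otherwise.  Same comparison as the hunk's
 * "nspus < gang->contexts + 1".
 */
static int gang_may_add_context(int capacity, int contexts)
{
	return (capacity < contexts + 1) ? -1 : 0;
}
```

With two 8-SPU nodes and one reserved SPU, the first rule admits gangs
of up to 15 contexts and the second only up to 8.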

Luke