[Cbe-oss-dev] [PATCH 2/2] spu sched: static timeslicing for SCHED_RR contexts
Arnd Bergmann
arnd at arndb.de
Sat Feb 10 05:53:41 EST 2007
On Friday 09 February 2007 19:10, Christoph Hellwig wrote:
> -#define SPU_MIN_TIMESLICE (100 * HZ / 1000)
> +#define SPU_TIMESLICE (HZ)
>
Ah, so it's using a fixed time slice of one second?
I guess this is good enough for a start, but it's an
awfully long time if you want to see interactive behavior.
I did not follow the discussion about the time slices as
much as I probably should have, but for SCHED_OTHER, I think
we will need a dynamic time slice of some sort. My idea
would be to derive the slice length from the static priority
of the SCHED_OTHER task and then simply round-robin through
the contexts, with some adaptation for tasks that give up
their slice voluntarily, roughly like the sketch below.
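To make this concrete, something along these lines is what I have
in mind; the helper name and the scaling factors are made up, and
ctx->prio is assumed to hold the task's static priority:

        /*
         * Rough sketch only: scale the slice with the nice level,
         * similar to what the cpu scheduler does.
         */
        #define SPU_DEF_TIMESLICE       (100 * HZ / 1000)

        static unsigned int spu_timeslice(struct spu_context *ctx)
        {
                /* static priority 100..139, lower means more cpu time */
                if (ctx->prio < 120)
                        return SPU_DEF_TIMESLICE * 4 * (140 - ctx->prio) / 20;
                return SPU_DEF_TIMESLICE * (140 - ctx->prio) / 20;
        }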
> static struct spu_prio_array *spu_prio;
> +static struct workqueue_struct *spu_sched_wq;
Having a single workqueue like this is _probably_ enough,
but I wonder whether we would get better behaviour from
per-NUMA-node workqueues. Maybe someone from our performance
team wants to analyse this.
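Completely untested, but something like this is what I mean; error
unwinding and binding the worker threads to their node are left out:

        static struct workqueue_struct *spu_sched_wq[MAX_NUMNODES];

        static int __init spu_sched_wq_init(void)
        {
                int node;

                /* one single-threaded queue per node */
                for_each_online_node(node) {
                        spu_sched_wq[node] =
                                create_singlethread_workqueue("spusched");
                        if (!spu_sched_wq[node])
                                return -ENOMEM;
                }
                return 0;
        }

spu_start_tick() would then queue onto spu_sched_wq[ctx->spu->node],
assuming ctx->spu is always valid at that point.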
> +void spu_start_tick(struct spu_context *ctx)
> +{
> + if (ctx->policy == SCHED_RR)
> + queue_delayed_work(spu_sched_wq, &ctx->sched_work, SPU_TIMESLICE);
> +}
> +
> +void spu_stop_tick(struct spu_context *ctx)
> +{
> + if (ctx->policy == SCHED_RR)
> + cancel_delayed_work(&ctx->sched_work);
> +}
Hmm, this means we wake up at a separate time for each of the SPUs.
With a one-second time slice this shouldn't hurt, but since they all
have time slices of the same length, maybe we can find a way to handle
them simultaneously. Then again, given how long a single context
switch takes, they may end up bunched together over time anyway,
simply because later time slices expire while we are still busy in
the workqueue.
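One cheap way to line them up might be to round the expiry to a
full second with round_jiffies_relative(), e.g.:

        void spu_start_tick(struct spu_context *ctx)
        {
                if (ctx->policy == SCHED_RR)
                        /* round so all contexts tick in the same batch */
                        queue_delayed_work(spu_sched_wq, &ctx->sched_work,
                                        round_jiffies_relative(SPU_TIMESLICE));
        }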
> +void spu_sched_tick(struct work_struct *work)
> +{
> + struct spu_context *ctx =
> + container_of(work, struct spu_context, sched_work.work);
> + struct spu *spu;
> + int rearm = 1;
> +
> + mutex_lock(&ctx->state_mutex);
> + spu = ctx->spu;
Do we guarantee at this point that the context is still in
SPU_STATE_RUNNABLE? I think you need to check that before looking
at ctx->spu, and bail out if it is not.
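I am thinking of something like this right after taking state_mutex,
assuming ctx->state is still what tracks RUNNABLE vs. SAVED:

        mutex_lock(&ctx->state_mutex);
        if (ctx->state != SPU_STATE_RUNNABLE) {
                /* the context was unloaded under us, nothing to
                 * deactivate and no point in rearming the tick */
                mutex_unlock(&ctx->state_mutex);
                return;
        }
        spu = ctx->spu;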
> + if (spu) {
> + int best = sched_find_first_bit(spu_prio->bitmap);
> + if (best <= ctx->prio) {
> + spu_deactivate(ctx);
> + rearm = 0;
> + }
> + }
> + mutex_unlock(&ctx->state_mutex);
> +
> + if (rearm)
> + spu_start_tick(ctx);
> +}
> +
If I understand this right, this is the only place where a context
gets preempted, and that could be up to a full time slice after a
higher-priority one becomes runnable, right? I really like how the
preemption is done automatically like this, but it would be nicer
to the high-priority tasks if we found a way to run the tick
earlier in that case.
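E.g. when a higher-priority context shows up and no spu is free,
we could force an early tick on a victim context instead of waiting
for its slice to run out. Untested, and spu_kick_tick() as well as
the way the victim gets picked are made up:

        static void spu_kick_tick(struct spu_context *victim)
        {
                /* if the work was still pending, requeue it with no
                 * delay; if it is already running it will reevaluate
                 * the priorities anyway */
                if (victim->policy == SCHED_RR &&
                    cancel_delayed_work(&victim->sched_work))
                        queue_delayed_work(spu_sched_wq,
                                           &victim->sched_work, 0);
        }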
Arnd <><