[Cbe-oss-dev] another context switch question
Bob Nelson
rrnelson at linux.vnet.ibm.com
Wed Oct 10 06:42:29 EST 2007
On Tuesday 11 September 2007 03:22:51 pm Arnd Bergmann wrote:
> On Tuesday 11 September 2007, Luke Browning wrote:
> > long spufs_run_spu(struct file *file, struct spu_context *ctx,
> > u32 *npc, u32 *event)
> > {
> >
> > <...>
> >
> > do {
> > ret = spufs_wait(ctx->stop_wq, spu_stopped(ctx, &status));
> > spu = ctx->spu;
> >
> > spu will be NULL after a context switch. I suspect that this is not expected
> > and it should be initialized before the spufs_wait(). I believe the intent is
> > to identify the spu where the context last ran.
>
> Right, this is a bug, though it's very hard to hit in practice. I think the
> patch below is the right solution for this.
>
> Bob, can you try out that patch and make sure that oprofile still behaves
> the right way when it's applied?
OProfile seems to run fine with this patch installed on the 2.6.23-rc9 kernel. I ran through the normal test bucket and tried a few different things and everything appears normal.
>
> > if (unlikely(ret))
> > break;
> >
> > spuctx_switch_state(ctx, SPUCTX_UTIL_SYSTEM);
> >
> > if (unlikely(test_bit(SPU_SCHED_NOTIFY_ACTIVE,
> > &ctx->sched_flags))) {
> > clear_bit(SPU_SCHED_NOTIFY_ACTIVE, &ctx->sched_flags);
> > if (!(status & SPU_STATUS_STOPPED_BY_STOP)) {
> > spu_switch_notify(spu, ctx);
> > continue;
> > }
> > }
> >
> > More importantly, the context lock is held over spu_switch_notify() meaning
> > that the context cannot be saved or restored. I don't think the scheduler
> > should be dependent on another module. We should be able to context switch
> > out the context during these calls.
> >
> > Note that the callout occurs for every stop and context switch and that the
> > callout occurs before spe exceptions are handled. I assume that the handler
> > is not allowed to read / write the spe register state. The context may or
> > may not be runnable as you can have an event and a context switch occur at
> > the same time, so the handler doesn't know how to access the data. And what
> > would it do with the data anyway? So, I think we should change the code so
> > that the callout is only made for context switches. In that case, I would
> > move it after processing of the class 0 and 1 interrupts.
>
> Note the test for SPU_SCHED_NOTIFY_ACTIVE in there. The only time we get into
> this path is when a new consumer registers for schedule notifier events,
> e.g. when you load the oprofile kernel module. This is such a rare condition
> that I feel we should not optimize for that, but instead try to keep this
> code path as readable as possible.
>
> Arnd <><
> ---
> Subject: spufs: fix potential NULL pointer dereference in spufs_run_spu
>
> We must never use the ctx->spu member before checking for
> ctx->state == SPU_STATE_RUNNABLE, because it will be NULL
> for saved states. In the rare case that a context has been
> saved at the exact same time that we call the
> spu_switch_event_register() function, we should also not
> continue the loop in spufs_run_spu.
>
> Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>
>
> --- a/arch/powerpc/platforms/cell/spufs/run.c
> +++ b/arch/powerpc/platforms/cell/spufs/run.c
> @@ -296,7 +296,6 @@ static inline int spu_process_events(struct spu_context *ctx)
> long spufs_run_spu(struct spu_context *ctx, u32 *npc, u32 *event)
> {
> int ret;
> - struct spu *spu;
> u32 status;
>
> if (mutex_lock_interruptible(&ctx->run_mutex))
> @@ -336,11 +335,11 @@ long spufs_run_spu(struct spu_context *ctx, u32 *npc, u32 *event)
> ret = spufs_wait(ctx->stop_wq, spu_stopped(ctx, &status));
> if (unlikely(ret))
> break;
> - spu = ctx->spu;
> if (unlikely(test_and_clear_bit(SPU_SCHED_NOTIFY_ACTIVE,
> &ctx->sched_flags))) {
> - if (!(status & SPU_STATUS_STOPPED_BY_STOP)) {
> - spu_switch_notify(spu, ctx);
> + if (!(status & SPU_STATUS_STOPPED_BY_STOP) &&
> + (ctx->state == SPU_STATE_RUNNABLE)) {
> + spu_switch_notify(ctx->spu, ctx);
> continue;
> }
> }
> _______________________________________________
> cbe-oss-dev mailing list
> cbe-oss-dev at ozlabs.org
> https://ozlabs.org/mailman/listinfo/cbe-oss-dev
>
More information about the cbe-oss-dev
mailing list