[Cbe-oss-dev] another context switch question

Wed Sep 12 06:22:51 EST 2007

On Tuesday 11 September 2007, Luke Browning wrote:
> long spufs_run_spu(struct file *file, struct spu_context *ctx,
>                    u32 *npc, u32 *event)
> {
> 
> 	<...>
> 
>         do {
>                 ret = spufs_wait(ctx->stop_wq, spu_stopped(ctx, &status));
>                 spu = ctx->spu;
> 
> spu will be NULL after a context switch.  I suspect that this is not expected
> and it should be initialized before the spufs_wait().  I believe the intent is
> to identify the spu where the context last ran.  

Right, this is a bug, though it's very hard to hit in practice. I think the
patch below is the right solution for this.

Bob, can you try out that patch and make sure that oprofile still behaves
the right way when it's applied?

>                 if (unlikely(ret))
>                         break;
> 
>                 spuctx_switch_state(ctx, SPUCTX_UTIL_SYSTEM);
> 
>                 if (unlikely(test_bit(SPU_SCHED_NOTIFY_ACTIVE,
>                                       &ctx->sched_flags))) {
>                         clear_bit(SPU_SCHED_NOTIFY_ACTIVE, &ctx->sched_flags);
>                         if (!(status & SPU_STATUS_STOPPED_BY_STOP)) {
>                                 spu_switch_notify(spu, ctx);
>                                 continue;
>                         }
>                 }
> 
> More importantly, the context lock is held over spu_switch_notify() meaning
> that the context cannot be saved or restored.  I don't think the scheduler 
> should be dependent on another module.  We should be able to context switch
> out the context during these calls. 
> 
> Note that the callout occurs for every stop and context switch and that the 
> callout occurs before spe exceptions are handled.  I assume that the handler
> is not allowed to read / write the spe register state.  The context may or 
> may not be runnable as you can have an event and a context switch occur at 
> the same time, so the handler doesn't know how to access the data.  And what 
> would it do with the data anyway?  So, I think we should change the code so
> that the callout is only made for context switches.  In that case, I would 
> move it after processing of the class 0 and 1 interrupts.

Note the test for SPU_SCHED_NOTIFY_ACTIVE in there. The only time we get into
this path is when a new consumer registers for schedule notifier events,
e.g. when you load the oprofile kernel module. This is such a rare condition
that I feel we should not optimize for that, but instead try to keep this
code path as readable as possible.

	Arnd <><
---
Subject: spufs: fix potential NULL pointer dereference in spufs_run_spu

We must never use the ctx->spu member before checking for
ctx->state == SPU_STATE_RUNNABLE, because it will be NULL
for saved states. In the rare case that a context has been
saved at the exact same time that we call the
spu_switch_event_register() function, we should also not
continue the loop in spufs_run_spu.

Signed-off-by: Arnd Bergmann <arnd.bergmann at de.ibm.com>

--- a/arch/powerpc/platforms/cell/spufs/run.c
+++ b/arch/powerpc/platforms/cell/spufs/run.c
@@ -296,7 +296,6 @@ static inline int spu_process_events(struct spu_context *ctx)
 long spufs_run_spu(struct spu_context *ctx, u32 *npc, u32 *event)
 {
 	int ret;
-	struct spu *spu;
 	u32 status;
 
 	if (mutex_lock_interruptible(&ctx->run_mutex))
@@ -336,11 +335,11 @@ long spufs_run_spu(struct spu_context *ctx, u32 *npc, u32 *event)
 		ret = spufs_wait(ctx->stop_wq, spu_stopped(ctx, &status));
 		if (unlikely(ret))
 			break;
-		spu = ctx->spu;
 		if (unlikely(test_and_clear_bit(SPU_SCHED_NOTIFY_ACTIVE,
 						&ctx->sched_flags))) {
-			if (!(status & SPU_STATUS_STOPPED_BY_STOP)) {
-				spu_switch_notify(spu, ctx);
+			if (!(status & SPU_STATUS_STOPPED_BY_STOP) &&
+			    (ctx->state == SPU_STATE_RUNNABLE)) {
+				spu_switch_notify(ctx->spu, ctx);
 				continue;
 			}
 		}