[Cbe-oss-dev] [PATCH 2/3] spufs context switch - make SPU_CONTEXT_SWITCH_PENDING atomic

Thu Apr 24 02:56:24 EST 2008

On Wed, 2008-04-23 at 14:05 +1000, Benjamin Herrenschmidt wrote:
> > I meant for the larx to be used with the stcx. They always go
> > together.
> 
> Alright, I might have misparsed what you were trying to do. Anyway, see
> below, I believe your changes aren't necessary.
> 
>   .../...
> 
> > synchronize_irq() is not atomic either! It suffers from the same
> > problem
> > that I was discussing above. It uses a simple load. Here's an excerpt
> > from that routine.
> 
> synchronize_irq will work to make sure no interrupt handler for that
> interrupt that was started before synchronize_irq is still running after
> it returns. You need to look more closely as it -is- atomic in that
> respect :-)

Oops.  You're right.

> 
> > /*
> > * Wait until we're out of the critical section. This might
> > * give the wrong answer due to the lack of memory barriers.
> > */
> > while (desc->status & IRQ_INPROGRESS)
> > cpu_relax();
> 
> That's just a relax path for when IN_PROGRESS -is- set. You'll notice
> that the actual exit condition for the enclosing loop is taking the desc
> lock, which guarantees that we are synchronized against changes to
> desc->status.
> 
> > Is synchronize_irq() appropriate? It just tests that the interrupt is
> > not currently being processed. The interrupt could be pending in the
> > hardware. The processor might be disabled, right. 
> 
> Right, it doesn't help vs. interrupts that have not reached the
> processor yet as I wrote in my email. But it allows to set state -and-
> be sure that any subsequent interrupt handler will read the modified
> state without using a lock. In our case, the interrupt handler tests for
> the mask bits, so should be safe. However, we do have other concerns as
> I expressed in my previous mail, as we may have an interrupt condition
> for a fault still pending all the way until we re-enable them for the
> sake of the context switch code itself.
> 
> > I assume you are talking about the context switch pending flag here,
> > which I think can miss an update as noted above. 
> 
> No. First, if we set the context switch pending flag prior to
> synchronize_irq, then we cannot miss an update, as I said above, as it
> guarantees ordering here.

But the context switch pending flag was not set prior to the invocation
of synchronize_irq.  I made that change in my patch.  Without my
changes, the update (step 7) occurs after the synchronize_irq (step 3).
Isn't this a bug?

I see your point now.  synchronize_irq() is being used to globally
promote a memory update wrt a specific interrupt handler by guaranteeing
that interrupt handler is not currently running.  Basically, it provides
an alternative serialization technique to traditional locking.  That
helps explain the oddity of this routine.  A comment in the code would
have helped.

I suppose the clearing of the context switch pending bit is also
globally promoted in a similar way.  First, the bit is cleared and then
enable_interrupts() is invoked which happens to take and release the
spu->spu_register lock.  The fact that this same lock is taken by the
interrupt handler (spu_irq_class_0 and spu_irq_class-1) prior to testing
the CS Pending flag guarantees that the clearing of this flag will be
properly seen.  Of course, it also helps that the hardware doesn't
generate an interrupt until the enable_interrupts() routine is invoked. 

Do you agree that there is still an error?  The flag needs to be set
before, not after the synchronize routine.

> Then, you'll notice that the interrupt handler tests the interrupt
> -mask- register and thus will not handle interrupts that have been
> masked, even if that mask happened prior to the interrupt message being
> sent to the IIC.
> 
> > No, I didn't solve this problem. There is a step in the algorithm
> > where
> > it is proposed that the O/S handle pending interrupts, but we didn't
> > implement this step.
> 
> Yup, I think we need to clear DSISR in there, though Jeremy and I are
> thinking about doing it somewhere else.
> 
> > The flag is just intended to protect the saving of the mfc queue. When
> > this is completed, the flag is cleared so that normal dma processing
> > can
> > be performed as required by the algorithm. 
> 
> But that's where the flag is not needed :-) let me recapitulate:
> 
> - A simple store flag / test flag is enough in the first place, because
> the flag would be stored before synchronize_irq is called. That
> guarantees that any interrupt handler still running that might have seen
> the old value of the flag has completed, and new occurence will see the
> new value of the flag.
> 
> - The flag is unnecessary alltogether I believe because we mask all the
> interrupts on the SPU before calling synchronize_irq and our interrupt
> handler checks for the mask before processing anything.

What if the exception occurs before the context switch and is handed off
to the controlling thread.  The controlling thread uses the values in
the csa, not the hardware registers, so I think you still need the flag
and the explicit locking in my patch as the controlling thread runs
completely independently.  

The CS pending flag is set and reset multiple times within the context
switch algorithm.  I extended the use of this flag to cover any place
where the CS algorithm is dealing with the MFC queue and I didn't want
the controlling thread to restart the dma.  

> > I didn't see clearing of the MFC DSISR in Jeremy's patch.
> 
> I think he hasn't posted that part yet. I want to do a little bit more
> experimentation there and try to make sure that is indeed enough to
> clear the condition in HW.
> 
> > Hmm.  I am not proposing bit ops.  I am just extending the existing
> > serialization strategy to protect an additional critical section - the
> > restore part of the algorithm where we load the mfc queue. More
> > importantly, I am proposing that we use a lock to protect the bit so
> > that it can be read reliably. That is what the patch is all about.
> 
> I see, my confusion here. In any way, I still believe my above points
> stand and that nothing is needed :-) 
> 
> > Here is what I meant. An interrupt is pending in the hardware and
> > spu_save is invoked and completes before the interrupt is delivered.
> > Then, spu_restore is invoked for the same context and the pending
> > interrupt is finally delivered. This could happen if the interrupt was
> > held pending in the hardware for ~100 usecs. I am sure there are a lot
> > of disabled critical sections that last long.
> 
> spu_save is actually made of several things. First spu_quiesce which
> will mask all interrupts and synchronize_irq. Then, the actual saving of
> the MFC takes place. Later, interrupts are re-enabled for the internal
> use of the save code for saving the local store etc...
> 
> I fail to understand your scenario about that "pending" interrupt. What
> can happen is:
> 
>  - The interrupt is seen and processed before we quiesce. All good.
> 
>  - The interrupt is still pending when we quiesce. That's the problem I
> was talking in my other email. This interrupt will come up as a
> spurrious later when we manipulate the MFC for other reasons and that
> can be deadly. That's why we need to solve that probably by clearing
> DSISR but I want to dbl check it.
> 
> I think that's it. I don't see what you mean about "interrupt held
> pending in the HW for 100usec" and how it relates to any of this.
> 
> Ben.
> 
>