[Cbe-oss-dev] 2.6.26-rc6 kernel bug on PS3?

Arnd Bergmann arnd at arndb.de
Thu Jun 26 18:11:47 EST 2008


On Thursday 26 June 2008, Jeremy Kerr wrote:
> > Application hangs after a while defunct and dmesg output looks like
> > d.txt. (See attached file: d.txt)
> 
> Looks like this is caused by doing the mmput() while the state mutex is 
> held in spu_forget.
> 
> In your case, the mm is the final reference to the context, so this is 
> resulting in a destroy_spu_context, which requires the state mutex :)
> 
> I'm going to take a look at the sequencing of spu_forget, but would like 
> to write a test case. Could you tell us about the exit path of your 
> program?

Looking at the log, I first had the same impression, but
I concluded that it has to be more complicated than that:

mmput() in spu_forget should not result in destroying the context
that we're forgetting, because we still hold a reference on the
inode that will be released later in the final fput.
What this means is that doing the close in this context will
result in destroying another (already forgotten) context that
is still referenced by our mm.
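
A rough userspace sketch of that reference sequence, for illustration
only. Every name here (struct ctx, struct mm, ctx_put, mm_put,
ctx_forget, forget_scenario) is a hypothetical stand-in and not the
spufs code; pthread mutexes stand in for the state mutex, and plain
integers stand in for the reference counts.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct mm;

/* Hypothetical stand-in for a context; not the spufs structure. */
struct ctx {
	pthread_mutex_t state_mutex;
	int refs;          /* references held on this context */
	struct mm *owner;  /* mm pinned by this context, NULL once forgotten */
	int destroyed;
};

/* Hypothetical stand-in for an mm that still references one context. */
struct mm {
	int users;
	struct ctx *mapped;
};

/* Drop one reference; the final put takes the context's own mutex. */
static void ctx_put(struct ctx *c)
{
	assert(c->refs > 0);
	if (--c->refs > 0)
		return;
	pthread_mutex_lock(&c->state_mutex);
	c->destroyed = 1;
	pthread_mutex_unlock(&c->state_mutex);
}

/* Drop one mm user; the final put releases the context the mm references. */
static void mm_put(struct mm *mm)
{
	if (--mm->users > 0)
		return;
	if (mm->mapped)
		ctx_put(mm->mapped);
}

/* Forget the owner mm while holding this context's state mutex. */
static void ctx_forget(struct ctx *c)
{
	pthread_mutex_lock(&c->state_mutex);
	struct mm *mm = c->owner;
	c->owner = NULL;
	if (mm)
		mm_put(mm);  /* may take another context's mutex */
	pthread_mutex_unlock(&c->state_mutex);
}

/* Context a is kept alive by its inode; b only by the mm. */
void forget_scenario(int *a_gone_after_forget, int *b_gone_after_forget,
		     int *a_gone_after_fput)
{
	struct ctx a = { .refs = 1 };  /* reference held via the inode */
	struct ctx b = { .refs = 1 };  /* already forgotten, held via mm */
	struct mm mm = { .users = 1, .mapped = &b };

	pthread_mutex_init(&a.state_mutex, NULL);
	pthread_mutex_init(&b.state_mutex, NULL);
	a.owner = &mm;

	ctx_forget(&a);
	*a_gone_after_forget = a.destroyed;
	*b_gone_after_forget = b.destroyed;

	ctx_put(&a);  /* models the later final fput */
	*a_gone_after_fput = a.destroyed;

	pthread_mutex_destroy(&a.state_mutex);
	pthread_mutex_destroy(&b.state_mutex);
}
```

In this model the forget step destroys only the other context, taking
that context's mutex while the first one is still held. The context
being forgotten goes away later, outside its own lock.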

Lockdep warns about this because it can't guarantee that you
don't have two instances of contexts that do this to each other.
I'm fairly sure it can't happen here, so the lockdep warning
is a red herring; more likely another thread is already holding
the mutex.
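
To illustrate why a checker that reasons about lock classes rather
than lock instances complains about this nesting, here is a toy model.
It is not lockdep, which keeps a full dependency graph between
classes; the names held_locks, toy_acquire, toy_release and
nested_state_mutex_warns are made up for this sketch, and the class
string is only a label.

```c
#include <stddef.h>
#include <string.h>

#define MAX_HELD 8

/* Toy record of the lock classes the current thread holds. */
struct held_locks {
	const char *class_name[MAX_HELD];
	size_t depth;
};

/*
 * Record an acquisition. Returns 1 if a lock of the same class is
 * already held, which is all this toy can tell: it cannot know
 * whether the two locks are really different instances.
 */
static int toy_acquire(struct held_locks *h, const char *class_name)
{
	int same_class_held = 0;

	for (size_t i = 0; i < h->depth; i++)
		if (strcmp(h->class_name[i], class_name) == 0)
			same_class_held = 1;

	if (h->depth < MAX_HELD)
		h->class_name[h->depth++] = class_name;
	return same_class_held;
}

/* Forget the most recently recorded acquisition. */
static void toy_release(struct held_locks *h)
{
	if (h->depth > 0)
		h->depth--;
}

/* Returns 1 if only the nested acquisition is flagged. */
int nested_state_mutex_warns(void)
{
	struct held_locks h = { 0 };

	/* state mutex of the context being forgotten */
	int outer = toy_acquire(&h, "ctx->state_mutex");
	/* state mutex of the other context, same class */
	int inner = toy_acquire(&h, "ctx->state_mutex");

	toy_release(&h);
	toy_release(&h);
	return outer == 0 && inner == 1;
}
```

The second acquisition is flagged even though it belongs to a
different context, which is the sense in which the warning can be
a false positive.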

	Arnd <><


