[Cbe-oss-dev] [PATCH] fix recursive spu_acquire() in SPU pagefault handling

Arnd Bergmann arnd at arndb.de
Thu Apr 19 19:30:51 EST 2007


On Thursday 19 April 2007, Akinobu Mita wrote:
> I got a deadlock caused by recursive spu_acquire():
> 
> spufs_run_spu()
> spu_process_events()
> spu_irq_class_1_bottom()
> spu_handle_mm_fault()
> handle_mm_fault()
> spufs_mem_mmap_nopfn()
> 
> This patch resolves the problem by releasing spu_context while
> calling spu_irq_class_1_bottom(), because it may acquire spu_context again.
> 
> Signed-off-by: Akinobu Mita <mita at fixstars.com>

No, this is broken, because spu_irq_class_1_bottom accesses registers of the
spu itself, and once you give up the spu context mutex with spu_release,
the context may get scheduled away at any time.
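
To make the two failure modes concrete, here is a small userspace toy model.
It is only a sketch and not spufs code: every toy_* name is hypothetical, the
"lock" is a plain flag that reports -EDEADLK where a real mutex would hang,
and the "scheduler" is an ordinary function called by hand. It shows that
holding the lock across the fault handler recurses, while dropping it opens
a window in which the context can change underneath the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a context guarded by a non-recursive lock (hypothetical). */
struct toy_ctx {
	bool locked;     /* true while the lock is held */
	int generation;  /* bumped each time the context is "scheduled away" */
};

/* Take the lock; report -EDEADLK where a real mutex would simply hang. */
static int toy_acquire(struct toy_ctx *ctx)
{
	if (ctx->locked)
		return -EDEADLK;
	ctx->locked = true;
	return 0;
}

static void toy_release(struct toy_ctx *ctx)
{
	assert(ctx->locked); /* releasing an unheld lock is a bug */
	ctx->locked = false;
}

/* Stand-in for the scheduler: it can only move a context that is unlocked. */
static bool toy_schedule_away(struct toy_ctx *ctx)
{
	if (ctx->locked)
		return false;
	ctx->generation++;
	return true;
}

/* Stand-in for a fault path that takes the lock itself. */
static int toy_fault_handler(struct toy_ctx *ctx)
{
	int ret = toy_acquire(ctx);

	if (ret)
		return ret;
	toy_release(ctx);
	return 0;
}

/* Case 1: the caller keeps the lock, so the handler's acquire recurses. */
int toy_run_holding_lock(struct toy_ctx *ctx)
{
	int ret;

	toy_acquire(ctx);
	ret = toy_fault_handler(ctx);
	toy_release(ctx);
	return ret;
}

/* Case 2: the caller drops the lock; returns true if the context moved. */
bool toy_run_dropping_lock(struct toy_ctx *ctx)
{
	int before;
	bool stale;

	toy_acquire(ctx);
	before = ctx->generation;
	toy_release(ctx);
	toy_fault_handler(ctx);
	toy_schedule_away(ctx); /* nothing prevents this while unlocked */
	toy_acquire(ctx);
	stale = (ctx->generation != before);
	toy_release(ctx);
	return stale;
}
```

With a zero-initialized struct toy_ctx, toy_run_holding_lock() returns
-EDEADLK and toy_run_dropping_lock() returns true, i.e. whatever the caller
read before the release can no longer be trusted after the re-acquire.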

I fixed this bug some time ago; the patch is in my tree as
http://www.kernel.org/pub/linux/kernel/people/arnd/patches/2.6.21-rc4-arnd1/broken-out/spufs-pagefault-rework.diff

> ---
>  arch/powerpc/platforms/cell/spufs/run.c |    5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> Index: 2.6-git-ps3/arch/powerpc/platforms/cell/spufs/run.c
> ===================================================================
> --- 2.6-git-ps3.orig/arch/powerpc/platforms/cell/spufs/run.c
> +++ 2.6-git-ps3/arch/powerpc/platforms/cell/spufs/run.c
> @@ -295,8 +295,11 @@ static inline int spu_process_events(str
>  	u64 pte_fault = MFC_DSISR_PTE_NOT_FOUND | MFC_DSISR_ACCESS_DENIED;
>  	int ret = 0;
>  
> -	if (spu->dsisr & pte_fault)
> +	if (spu->dsisr & pte_fault) {
> +		spu_release(ctx);
>  		ret = spu_irq_class_1_bottom(spu);
> +		spu_acquire(ctx);
> +	}
>  	if (spu->class_0_pending)
>  		ret = spu_irq_class_0_bottom(spu);
>  	if (!ret && signal_pending(current))


