[Cbe-oss-dev] [PATCH] powerpc/cell/axon-msi: retry on missing interrupt

Michael Ellerman michael at ellerman.id.au
Fri Nov 21 11:50:36 EST 2008


On Mon, 2008-11-17 at 17:10 +0100, Arnd Bergmann wrote:
> The MSI capture logic for the axon bridge can sometimes
> lose interrupts in case of high DMA and interrupt load,
> when it signals an MSI interrupt to the MPIC interrupt
> controller while we are already handling another MSI.
> 
8< 8< 8<

> Index: linux-2.6/arch/powerpc/platforms/cell/axon_msi.c
> ===================================================================
> --- linux-2.6.orig/arch/powerpc/platforms/cell/axon_msi.c	2008-11-17 10:29:05.000000000 -0500
> +++ linux-2.6/arch/powerpc/platforms/cell/axon_msi.c	2008-11-17 10:29:08.000000000 -0500
> @@ -95,6 +95,7 @@
>  	struct axon_msic *msic = get_irq_data(irq);
>  	u32 write_offset, msi;
>  	int idx;
> +	int retry = 0;
>  
>  	write_offset = dcr_read(msic->dcr_host, MSIC_WRITE_OFFSET_REG);
>  	pr_debug("axon_msi: original write_offset 0x%x\n", write_offset);
> @@ -102,7 +103,7 @@
>  	/* write_offset doesn't wrap properly, so we have to mask it */
>  	write_offset &= MSIC_FIFO_SIZE_MASK;
>  
> -	while (msic->read_offset != write_offset) {
> +	while (msic->read_offset != write_offset && retry < 100) {
>  		idx  = msic->read_offset / sizeof(__le32);
>  		msi  = le32_to_cpu(msic->fifo_virt[idx]);
>  		msi &= 0xFFFF;
> @@ -110,13 +111,37 @@
>  		pr_debug("axon_msi: woff %x roff %x msi %x\n",
>  			  write_offset, msic->read_offset, msi);
>  
> +		if (msi < NR_IRQS && irq_map[msi].host == msic->irq_host) {
> +			generic_handle_irq(msi);
> +			msic->fifo_virt[idx] = cpu_to_le32(0xffffffff);
> +		} else {
> +			/*
> +			 * Reading the MSIC_WRITE_OFFSET_REG does not
> +			 * reliably flush the outstanding DMA to the
> +			 * FIFO buffer. Here we were reading stale
> +			 * data, so we need to retry.
> +			 */
> +			udelay(1);
> +			retry++;
> +			pr_debug("axon_msi: invalid irq 0x%x!\n", msi);
> +			continue;
> +		}
> +
> +		if (retry) {
> +			pr_debug("axon_msi: late irq 0x%x, retry %d\n",
> +				 msi, retry);
> +			retry = 0;
> +		}
> +
>  		msic->read_offset += MSIC_FIFO_ENTRY_SIZE;
>  		msic->read_offset &= MSIC_FIFO_SIZE_MASK;
> +	}
>  
> -		if (msi < NR_IRQS && irq_map[msi].host == msic->irq_host)
> -			generic_handle_irq(msi);
> -		else
> -			pr_debug("axon_msi: invalid irq 0x%x!\n", msi);
> +	if (retry) {
> +		printk(KERN_WARNING "axon_msi: irq timed out\n");
> +
> +		msic->read_offset += MSIC_FIFO_ENTRY_SIZE;
> +		msic->read_offset &= MSIC_FIFO_SIZE_MASK;
>  	}

By incrementing the offset we're dropping the irq. Would it be better to
just return, and hope that the next time we come in the MSI will have
landed in the fifo so we can deliver it then? It might be late, really
late I guess, but that might be better than dropping it altogether.

We'd still need an ultimate fallback case, where we drop it altogether.
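
Something like the following is roughly what I have in mind. It is only
an untested userspace sketch of the control flow, not a patch against
axon_msi.c: struct fake_msic, drain_fifo(), MAX_STALE_PASSES and the
FIFO size are all made-up names and values for illustration, and it
assumes the handler really does get called again after we bail out.

```c
#include <stdint.h>

#define FIFO_ENTRIES      8            /* hypothetical FIFO depth */
#define ENTRY_INVALID     0xffffffffu  /* poison value the patch writes back */
#define NR_FAKE_IRQS      512          /* stands in for NR_IRQS */
#define MAX_STALE_PASSES  3            /* hypothetical limit before we give up */

/* Hypothetical stand-in for struct axon_msic, reduced to what the loop needs. */
struct fake_msic {
	uint32_t fifo[FIFO_ENTRIES];
	unsigned int read_idx;      /* index form of read_offset */
	unsigned int stale_passes;  /* consecutive calls that saw stale data */
	unsigned int delivered;     /* counts what generic_handle_irq() would get */
	unsigned int dropped;       /* counts entries abandoned by the fallback */
};

/*
 * Drain entries up to write_idx. On a stale entry, return without
 * advancing read_idx so the next call retries the same slot. Only
 * after MAX_STALE_PASSES consecutive stale calls do we skip the slot.
 */
static void drain_fifo(struct fake_msic *m, unsigned int write_idx)
{
	while (m->read_idx != write_idx) {
		uint32_t msi = m->fifo[m->read_idx] & 0xFFFF;

		if (msi < NR_FAKE_IRQS) {
			m->delivered++;  /* generic_handle_irq(msi) in the real code */
			m->fifo[m->read_idx] = ENTRY_INVALID;
			m->stale_passes = 0;
		} else if (++m->stale_passes < MAX_STALE_PASSES) {
			/* DMA has not landed yet: keep read_idx, try again later */
			return;
		} else {
			/* ultimate fallback: drop the entry and move on */
			m->dropped++;
			m->stale_passes = 0;
		}

		m->read_idx = (m->read_idx + 1) % FIFO_ENTRIES;
	}
}
```

The counter has to live in the per-MSIC state rather than on the stack,
otherwise it resets on every entry and the fallback never triggers.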

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person