PPC405EX based irq flooding with USB-OTG and usbserial device

Hunter Cobbs hunter.cobbs at gmail.com
Sat May 23 23:44:55 EST 2009


OK, here are the patches... one for ppc4xx_dma.h and one for the Makefile


----------- ppc4xx_dma.h snip --------
--- linux-2.6-denx/drivers/usb/gadget/dwc_otg/ppc4xx_dma.h    2008-05-07 09:13:33.000000000 -0500
+++ linux-2.6-denx_patched/drivers/usb/gadget/dwc_otg/ppc4xx_dma.h    2009-05-23 08:33:26.172500000 -0500
@@ -84,8 +84,13 @@
  * DMA Channel Control Registers
  */

-#if defined(CONFIG_44x) || defined(CONFIG_405EX) || defined(CONFIG_405EXr)
+/* The PPC44x series has a 64Bit DMA */
+#if defined(CONFIG_44x)
 #define    PPC4xx_DMA_64BIT
+#endif
+
+/* The PPC44x and PPC405EX/r have a reserved bit in DMA control register*/
+#if defined(CONFIG_44x) || defined(CONFIG_405EX) || defined(CONFIG_405EXr)
 #define DMA_CR_OFFSET 1
 #else
 #define DMA_CR_OFFSET 0
------------- end snip -----------
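
For anyone wondering why the old grouping produced the shift warning described in the quoted message below, here is a minimal sketch of the idea. It is not code from the dwc_otg driver: the helper names (addr64_high, addr64_low, addr32_high) are hypothetical and only show why a 32-bit shift is fine on a 64-bit address but out of bounds on a 32-bit one.

```c
#include <stdint.h>

/* Hypothetical illustration; these helpers are not taken from the
 * dwc_otg driver or from ppc4xx_dma.h. */

/* Splitting a 64-bit DMA address into two 32-bit register words is
 * well defined: the shift count (32) is smaller than the width of
 * the operand (64). */
static inline uint32_t addr64_high(uint64_t addr)
{
	return (uint32_t)(addr >> 32);
}

static inline uint32_t addr64_low(uint64_t addr)
{
	return (uint32_t)(addr & 0xFFFFFFFFu);
}

/* With a 32-bit address type, the same ">> 32" would shift by the
 * full width of the type, which is undefined behavior in C and the
 * kind of construct the compiler warns about.  A 32-bit controller
 * simply has no high word, so this helper returns 0 instead of
 * shifting. */
static inline uint32_t addr32_high(uint32_t addr)
{
	(void)addr;
	return 0;
}
```

Only the 64-bit helpers belong behind a 64-bit define; a 32-bit part such as the 405EX should never compile the shifting path at all, which is what separating PPC4xx_DMA_64BIT from DMA_CR_OFFSET achieves.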



--------- Makefile snip ----------
--- linux-2.6-denx/drivers/usb/gadget/dwc_otg/Makefile    2008-05-07 09:13:33.000000000 -0500
+++ linux-2.6-denx_patched/drivers/usb/gadget/dwc_otg/Makefile    2009-05-23 08:38:04.182500000 -0500
@@ -22,7 +22,7 @@
 #KBUILD_CPPFLAGS    += -DOTG_EXT_CHG_PUMP

 # PLB DMA mode
-KBUILD_CPPFLAGS    += -Dlinux -DDWC_SLAVE -DOTG_PLB_DMA -DOTG_PLB_DMA_TASKLET  #-DDWC_DEVICE_ONLY # -DDWC_HS_ELECT_TST  -DDWC_SLAVE -DDWC_HOST_ONLY
+KBUILD_CPPFLAGS    += -Dlinux -DOTG_PLB_DMA -DOTG_PLB_DMA_TASKLET
 endif

 ifeq ($(CONFIG_460EX),y)
------------- end snip -----------
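
As a rough sketch of why dropping -DDWC_SLAVE matters: a preprocessor define passed on the compiler command line can pin a driver to one transfer mode at build time, so no runtime parameter can change it. The function names below (otg_dma_allowed, otg_transfer_mode) are hypothetical and are not taken from the dwc_otg sources; only the DWC_SLAVE define itself comes from the Makefile.

```c
#include <stdbool.h>

/* Hypothetical sketch; these functions do not exist in the dwc_otg
 * driver.  Only the DWC_SLAVE macro name comes from the Makefile. */

/* Report whether DMA may be used.  When the file is compiled with
 * -DDWC_SLAVE the answer is fixed at build time and the driver is
 * limited to interrupt-driven (slave) transfers. */
static inline bool otg_dma_allowed(void)
{
#ifdef DWC_SLAVE
	return false;
#else
	return true;
#endif
}

/* Human-readable name of the mode selected above. */
static inline const char *otg_transfer_mode(void)
{
	return otg_dma_allowed() ? "dma" : "slave";
}
```

Built without the flag, otg_transfer_mode() returns "dma"; built with -DDWC_SLAVE it returns "slave".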


On Sat, May 23, 2009 at 7:44 AM, Hunter Cobbs <hunter.cobbs at gmail.com> wrote:

> Egads!  Forgot to respond to the list!
>
> My git checkout failed last night, so I'm downloading the resource cd, but
> I can tell you what I did before I get the actual patch done, and you can
> tell me if my logic is sound.
>
> First thing I thought when I saw this is WHY use IRQ based methods to
> access a USB controller with internal DMA transfers?  I tried in vain to
> enable DMA through the driver's module parameters (I dug up how to specify
> module parameters for built-in drivers from an old 2.2-series kernel
> discussion).  So, then I put on my boots and started slogging through the
> driver.
>
> Getting frustrated with that line of attack, I turned up the verbosity
> on the kernel compile and noticed a warning in the dwc_otg compilation:
> a left and a right shift go out of bounds of the variables used.  The
> only place this occurs is in a section of code that is wrapped with
> PPC4xx_DMA_64BIT, which made absolutely no sense because the DMA
> controller on the 405EX is only 32 bits wide.  On tracking this define
> down, I found that someone had assumed that the 44x and the 405EX/r all
> have the same DMA controller.  That is incorrect: they do share the same
> control register definitions (the offset of 1 due to the MSBit being
> reserved and the register being in Big Endian mode); however, the 44x is
> 64 bits and the 405 is 32 bits.  So, I split the DMA configuration into
> two areas: data width and control register offsets.
>
> When this still didn't fix the problem, I found yet another section that
> forces the driver to operate in slave (irq) mode only, wrapped in yet
> another define.  When I searched for that define (DWC_SLAVE I believe), I
> found it in the dwc_otg Makefile.
>
> Correcting both of these has enabled full DMA access to the USB, and I'm
> doing much better with my Sierra Wireless dev kit.
>
> On Sat, May 23, 2009 at 7:11 AM, Chuck Meade <chuckmeade at mindspring.com> wrote:
>
>> Hunter Cobbs wrote:
>> > Hello everyone,
>> >
>> > This is my first post to the PPC dev list as my company has just started
>> > developing a new project based on Linux.  The good news is, this post is
>> > not debug-related as much as it is an introduction and query while I
>> > download the latest DENX kernel (the only place I know of that has the
>> > DWC_OTG driver).
>> >
>> > I've been working with a Kilauea dev board and have had lots of trouble
>> > when I plug in a Sierra Wireless modem dev kit on the USB.  It goes fine
>> > until I actually try to communicate (pppd or minicom) with the little
>> > bugger and then my IRQs go through the roof.  And they only calm back
>> > down after I shut down my communication channel.
>> >
>> > I've solved this issue with our board, and was wondering if it has since
>> > been fixed (I'm running 2.6.25-DENX).  I don't want to waste the board's
>> > time with a patch that is no longer necessary.
>> >
>> > --
>> > Hunter Cobbs
>>
>> Hello Hunter,
>>
>> It would absolutely *not* be a waste of anyone's time.  I for one would
>> like to see how you solved this.  I am dealing with the same problem,
>> with the same setup.
>>
>> The underlying cause for this problem is the PPC405EX CPU's erratum
>> USBO_9.  The USB 2.0 PING protocol is supposed to handle a PING
>> transaction in the hardware -- note that in USB 2.0, a PING is the
>> method used by the sender to determine if it can send.  If I remember
>> correctly, erratum USBO_9 is caused when a NAK response from the PING
>> transaction is handled not in hardware, but instead as an interrupt in
>> software, and that NAK leads to a lot of processing.  In the 2.6.25 Denx
>> Linux tree that I used, that processing ends up trying to restart the
>> channel, restart the send, which leads to yet another PING/NAK sequence,
>> yet another interrupt...
>>
>> The end result is that you get over 100,000 interrupts (with significant
>> interrupt handling logic) per second, and the target can't do anything
>> else.  I was able to get this interrupt count by looking at
>> /proc/interrupts, then causing this problem for 20 seconds, then pulling
>> out the USB modem physically (mine is on an Express card) to stop the
>> interrupt storm, then checking /proc/interrupts again.  It averaged over
>> 100,000 ints/sec.
>>
>> In contact with AMCC, they told us they are not respinning the CPU (at
>> least not at this time) to fix this erratum.
>>
>> I have tried to solve the problem as suggested by the erratum, by not
>> allowing the NAK interrupt handling to *directly* cause a retry of the
>> send, but rather to wait until the next SOF interrupt (start of
>> microframe, which happens 8,000 times per sec) to restart it.  "Breaking
>> the chain" like this does allow the board to proceed, but I think it is
>> suboptimal, or at least unfortunate.
>>
>> One painful side effect of this workaround is that you cannot disable
>> the 8,000 SOF interrupts/second, or at least some of them, since they
>> are being used now for another purpose -- recovery from the erratum.
>>
>> The 8,000 SOF ints being handled per second do cause a measurable drain
>> on the CPU.  In some cursory testing we see a 10% slowdown of certain
>> transactions in lmbench.
>>
>> So please send me your patch for the dwc_otg driver.  I am very
>> interested in what you did, and if it perhaps is a better solution for
>> the problem we both are seeing than what I implemented.
>>
>> Thanks in advance,
>> Chuck
>>
>>
>
>
> --
> Hunter Cobbs
>



-- 
Hunter Cobbs