[techfield] GPIO causing bus error

Sat Dec 22 03:27:26 EST 2007

Hi Chris,

I'm going to look at this problem from the FPGA hardware level because I
used to work for one of the FPGA companies.

I'm not familiar with your PPC440GX board, so some of my suggestions may
be difficult to implement or totally unreasonable, especially if they
require soldering to an FPGA in a ball grid array package or one with
extremely fine-pitch pins.

(1) You should capture the configuration sequence on the FPGA's JTAG
pins using a logic analyzer in functional mode.

In functional mode, you can capture an extremely long sequence of
configuration events.  In the past, I've used this mode and found that
when the FPGA doesn't configure, there are usually too few or too many
clocks on the TCK line.
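
If the analyzer can export the captured TCK samples, the clock count can
be checked mechanically instead of by eye.  The following is only a
minimal sketch in C: the sample array format (one byte per sample, 0 or
1) and the function names are my own assumptions, not the output format
of any particular analyzer.

```c
#include <stddef.h>
#include <stdint.h>

/* Count rising edges (0 -> 1 transitions) in a sampled TCK trace.
 * samples[i] is the TCK level (0 or nonzero) at sample i.
 * Assumes the trace was sampled fast enough to see every edge. */
size_t count_tck_rising_edges(const uint8_t *samples, size_t n)
{
    size_t edges = 0;

    for (size_t i = 1; i < n; i++) {
        if (samples[i - 1] == 0 && samples[i] != 0)
            edges++;
    }
    return edges;
}

/* Observed minus expected clock count:
 * negative = too few clocks, positive = too many, 0 = match. */
long tck_clock_delta(const uint8_t *samples, size_t n, size_t expected)
{
    return (long)count_tck_rising_edges(samples, n) - (long)expected;
}
```

The expected count has to come from your configuration tool or from a
capture taken on a board that configures correctly.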

(2) Occasionally, the FPGA design itself can cause a boot-up problem.

Instead of using the real design, send a 'blank' design with no logic
implemented at all.  If this works, then it's the FPGA design itself
that is causing the boot problem.

(3) When the boot process happens, what is the power sequence of the
FPGA?

Most FPGAs out there like a nice smooth power profile that ramps up
quickly.  Check whether the profile is quick and smooth or spiky and
erratic.  Also, sometimes configuration data gets sent before the FPGA
is ready to receive it.  Try delaying the sending of configuration data
by a millisecond or so.
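
One way to try the delay without touching the rest of the loader is to
put a small gate in front of it.  This is a rough sketch in C, not your
actual driver: struct board_ops, fpga_wait_before_config() and the
"ready" hook are hypothetical names, and whether your FPGA exposes any
ready indication at all is an assumption.  If it doesn't, point
fpga_ready at a function that always returns true and rely on the fixed
settle delay.  The fake board at the bottom is only there so the logic
can be exercised off-target.

```c
#include <stdbool.h>

/* Hypothetical board hooks: supply real implementations for your
 * hardware (e.g. a GPIO read and a millisecond delay routine). */
struct board_ops {
    bool (*fpga_ready)(void *ctx);            /* true once FPGA accepts data */
    void (*delay_ms)(void *ctx, unsigned ms); /* busy-wait or sleep */
    void *ctx;
};

/* Wait settle_ms unconditionally, then poll the ready hook about once
 * per millisecond for up to timeout_ms.  Returns true if the FPGA
 * reported ready, false on timeout (the caller should then skip
 * configuration and report the failure). */
bool fpga_wait_before_config(const struct board_ops *ops,
                             unsigned settle_ms, unsigned timeout_ms)
{
    ops->delay_ms(ops->ctx, settle_ms);

    for (unsigned waited = 0; ; waited++) {
        if (ops->fpga_ready(ops->ctx))
            return true;
        if (waited >= timeout_ms)
            return false;
        ops->delay_ms(ops->ctx, 1);
    }
}

/* Fake board for off-target testing: becomes ready at ready_at_ms. */
struct fake_board {
    unsigned now_ms;
    unsigned ready_at_ms;
};

bool fake_ready(void *ctx)
{
    const struct fake_board *b = ctx;
    return b->now_ms >= b->ready_at_ms;
}

void fake_delay(void *ctx, unsigned ms)
{
    struct fake_board *b = ctx;
    b->now_ms += ms;
}
```

Calling fpga_wait_before_config(&ops, 1, 10) immediately before the
first JTAG command gives the millisecond of settle time suggested above,
plus a bounded wait if a ready signal is available.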

(4) Manually delay the configuration of the FPGA.

In other words, let the system boot, but modify the code to allow the
FPGA to configure only after a button is pushed.  In theory, if the
power-up has properly initialized the FPGA, you could keep the system
this way forever until a 'button' is pushed to configure the FPGA.  If
this works, it tends to imply that there is a timing issue.  If it
doesn't work, it's possible that the FPGA's JTAG TAP is in a state that
won't allow configuration to complete, such as a state other than
Shift-DR or Shift-IR when data is being shifted in.
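
To rule out a stray TAP state, it helps to force a known state before
configuring.  In the standard IEEE 1149.1 TAP controller, holding TMS
high for five TCK cycles reaches Test-Logic-Reset from any state.  The
sketch below models the 16-state TAP in C so that property can be
checked on a desk; the enum and function names are mine, and on real
hardware each tap_step() would correspond to setting TMS and pulsing TCK
through your GPIO routine.

```c
/* The 16 states of the IEEE 1149.1 TAP controller. */
enum tap_state {
    TLR, RTI,
    SELECT_DR, CAPTURE_DR, SHIFT_DR, EXIT1_DR, PAUSE_DR, EXIT2_DR, UPDATE_DR,
    SELECT_IR, CAPTURE_IR, SHIFT_IR, EXIT1_IR, PAUSE_IR, EXIT2_IR, UPDATE_IR,
    TAP_NUM_STATES
};

/* Next state indexed by [current state][TMS level at rising TCK]. */
static const enum tap_state tap_next[TAP_NUM_STATES][2] = {
    [TLR]        = { RTI,        TLR       },
    [RTI]        = { RTI,        SELECT_DR },
    [SELECT_DR]  = { CAPTURE_DR, SELECT_IR },
    [CAPTURE_DR] = { SHIFT_DR,   EXIT1_DR  },
    [SHIFT_DR]   = { SHIFT_DR,   EXIT1_DR  },
    [EXIT1_DR]   = { PAUSE_DR,   UPDATE_DR },
    [PAUSE_DR]   = { PAUSE_DR,   EXIT2_DR  },
    [EXIT2_DR]   = { SHIFT_DR,   UPDATE_DR },
    [UPDATE_DR]  = { RTI,        SELECT_DR },
    [SELECT_IR]  = { CAPTURE_IR, TLR       },
    [CAPTURE_IR] = { SHIFT_IR,   EXIT1_IR  },
    [SHIFT_IR]   = { SHIFT_IR,   EXIT1_IR  },
    [EXIT1_IR]   = { PAUSE_IR,   UPDATE_IR },
    [PAUSE_IR]   = { PAUSE_IR,   EXIT2_IR  },
    [EXIT2_IR]   = { SHIFT_IR,   UPDATE_IR },
    [UPDATE_IR]  = { RTI,        SELECT_DR },
};

/* Advance the model by one TCK rising edge with the given TMS level. */
enum tap_state tap_step(enum tap_state s, int tms)
{
    return tap_next[s][tms ? 1 : 0];
}

/* Five clocks with TMS high: ends in Test-Logic-Reset from any state. */
enum tap_state tap_reset(enum tap_state s)
{
    for (int i = 0; i < 5; i++)
        s = tap_step(s, 1);
    return s;
}
```

From Test-Logic-Reset, the TMS sequence 0, 1, 0, 0 walks through
Run-Test/Idle, Select-DR-Scan and Capture-DR into Shift-DR, which gives
the loader a known starting point.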

(5) If your FPGA is using one of the SVF-based software configuration
methods via JTAG, make sure you are using the latest SVF player and
latest software for generating the FPGA bitstream.  The configuration
method may have changed.  The FPGA silicon you are using may be newer
than the configuration algorithm that has been implemented.

I hope this helps!

Regards,
Bernie Elayda
the ex-X guy

________________________________

From: owner-techfield at windriver.com
[mailto:owner-techfield at windriver.com] On Behalf Of Wyse, Chris
Sent: Friday, December 21, 2007 7:55 AM
To: linuxppc-dev at ozlabs.org; linuxppc-embedded at ozlabs.org; +techfield;
+linux-embedded; +linux-eng; linux-kernel; Wessel, Jason;
support at amcc.com
Cc: Touron, Emmanuel; Read, Tricia; Ayer, Charles; Slimm, Rob
Subject: [techfield] GPIO causing bus error

Hi,

I'm having trouble with an unusual problem.  I'm working on relatively
new hardware, so it's possible that there could be a hardware issue
involved. 

I have an FPGA on my PPC440GX board that gets loaded via JTAG during the
kernel boot process (Linux 2.6.10).  It uses the 440GX GPIO lines to
send the necessary JTAG commands to the FPGA to perform the initial
load.  This process is USUALLY functional, but on some of the boards
(which we produce), the GPIO write fails with a bus error.  On the
boards that fail, it only occurs after a cold boot, and only if the
board has been powered off for a few minutes.  A quick hard reboot will
not generate the problem.  When I issue the failing write to the GPIO
lines, some of the SDRAM gets corrupted.  I don't appear to be taking
any interrupts that might have corrupted the RAM.

I've checked the TLB entries, and it maps correctly to the PPC register
area.  Additionally, I can read and write to other registers using the
same TLB mapping WITHOUT any error.  I can also READ the GPIO lines
without an error - the error is only on the write.   I've checked the
SDR0_PFC0 bits to make sure everything is set properly (it is).  The bus
error indicates "PLB Timeout Error Status Master 2, Master 2 slave error
occurred" (Master 2 is the write-only data cache unit (DCU)) and "Write
Error Interrupt Master 2, Write error detected - master 2 interrupt
request is active".  I'm not sure why there would be any error in the
DCU, since the region I'm writing to is cache inhibited and guarded.

If I issue a soft reset of the GPIO subsystem, I can read and write to
the GPIO lines again.

The error does not occur on the first write to the GPIO.  I go through
the failing routine several times before it fails.  However, when it
fails, it consistently fails at the same spot, after the same number of
passes through the code.

I'm using RGMII ethernet on EMAC2 (Group 4), but the GPIO lines that I'm
using are not the Trace/GPIO lines (26-31) so I believe that they should
work fine (and they usually do).  Also, the errata mentions that
SDR0_PFC0[G11E] has no effect - but I'm not using GPIO 11 anyway.

Here are some relevant register values after the error:

SDR0_PFC0 :     0x083FFE00
POB0_BESR0:     0x00008400
POB0_BEARH:     0x00000001
POB0_BEARL:     0x40000701
GPIO0_OR  :     0x000400C0
GPIO0_TCR :     0x00278AE0
GPIO0_ODR :     0x00000000
GPIO0_IR  :     0x00000000

I've attached two log files that contain most of the 440 registers, one
from before the error and one from after.  In the log files, the bus
error has been cleared, so use the values shown above.
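
For anyone cross-checking the GPIO values above against the schematic,
here is a small C sketch that lists which bits are set in a 32-bit
register value using IBM bit numbering (bit 0 is the most significant
bit).  It assumes that bit n of GPIO0_OR and GPIO0_TCR corresponds to
GPIO n and that a set TCR bit enables the output driver; please verify
both against the 440GX user's manual.  The function name is
illustrative only.

```c
#include <stdint.h>

/* Store the IBM-numbered indices (0 = MSB, 31 = LSB) of all set bits
 * of value into out[], in ascending order.  out must have room for 32
 * entries.  Returns the number of set bits. */
int ibm_bits_set(uint32_t value, int out[32])
{
    int count = 0;

    for (int bit = 0; bit < 32; bit++) {
        if (value & (UINT32_C(0x80000000) >> bit))
            out[count++] = bit;
    }
    return count;
}
```

Under those assumptions, GPIO0_TCR = 0x00278AE0 has bits 10, 13, 14, 15,
16, 20, 22, 24, 25 and 26 set, and GPIO0_OR = 0x000400C0 has bits 13, 24
and 25 set.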

I'm looking for some suggestions on what to try to debug/resolve this
issue.  I'm open to both hardware and software based suggestions.  Any
help would be greatly appreciated.

Chris Wyse
Senior Member of Technical Staff
Embedded Technologies
860-978-0849 cell/office
413-778-9101 fax
http://www.windriver.com
