Parsing a bus fault message?

Scott Wood scottwood at freescale.com
Wed Sep 29 04:45:54 EST 2010


On Tue, 28 Sep 2010 08:31:54 -0700
"Ira W. Snyder" <iws at ovro.caltech.edu> wrote:

> On Tue, Sep 28, 2010 at 09:26:51AM -0500, david.hagood at gmail.com wrote:
> > Alternatively, can somebody see a hint in the message that I don't know
> > enough to pick out? At this point, my code is trying to memcpy() from the
> > PCIe bus (mapped via the outbound ATMU) to local memory, so the fault is
> > either a) the ATMU is not accessible, b) the ATMU is accessible but not
> > mapped (which I would have thought the ioremap call I made would have
> > handled), or c) the chip is not able to bus master on the PCI bus.

Check the LAWs, the outbound ATMU, and the PCI device's BAR.  Make sure
the address goes where you're expecting at each level.
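
One way to do that check on paper is to treat each level as an address
window and follow the CPU physical address through them in order. The
sketch below is only an illustration: struct window, window_contains(),
trace_outbound() and trans_base are hypothetical names, not real register
layouts or kernel APIs, and the actual base, size and translation values
have to be read from the LAW, outbound ATMU and BAR registers of your part.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one address window (a LAW, an outbound
 * ATMU window, or a device BAR). Not a real register layout. */
struct window {
    uint64_t base;   /* first address the window claims */
    uint64_t size;   /* window length in bytes */
};

/* True when addr falls inside w; written to avoid base + size overflow. */
static bool window_contains(const struct window *w, uint64_t addr)
{
    return addr >= w->base && (addr - w->base) < w->size;
}

/*
 * Follow a CPU physical address through the three levels.
 * trans_base (assumption) is the PCI address that the start of the
 * outbound window is translated to.
 * *pci_addr is written once the LAW and ATMU levels both match; the
 * return value is true only if the BAR also claims the translated address.
 */
static bool trace_outbound(uint64_t cpu_addr,
                           const struct window *law,
                           const struct window *atmu, uint64_t trans_base,
                           const struct window *bar,
                           uint64_t *pci_addr)
{
    if (!window_contains(law, cpu_addr))
        return false;   /* no LAW routes this address to the PCIe block */
    if (!window_contains(atmu, cpu_addr))
        return false;   /* no outbound window covers this address */

    *pci_addr = trans_base + (cpu_addr - atmu->base);
    return window_contains(bar, *pci_addr);   /* does the device decode it? */
}
```

If any level returns false for the address passed to memcpy(), that is
the level to look at first.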

> > Machine check in kernel mode.
> > Caused by (from SRR1=149030): Transfer error ack signal
> 
> ^^^ this is the line that contains some critical info
> 
> In the 86xx CPU manual, you should be able to find information about the
> SRR1 register. Decoding the hex SRR1=0x149030 may help.
> 
> The kernel is telling you this is a TEA (transfer error acknowledge)
> error. I've only seen this when I get an unhandled timeout on the local
> bus. For example, an FPGA that has died in the middle of a request.

I've seen it when you access a physical address that has no device
backing it up.
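
For reference, the SRR1 value quoted above can be decoded by hand. The
sketch below is modeled on the kernel's generic machine check decoder for
classic cores as I remember it; the mask and the case values are
assumptions that should be checked against your kernel tree and the core
manual, and decode_mcheck_srr1() is a made-up name.

```c
#include <stdint.h>

/* Assumed mask selecting the machine check cause bits in SRR1. */
#define SRR1_MCHECK_CAUSE_MASK 0x601F0000u

/*
 * Map the cause bits of a saved SRR1 to a message string.
 * Values are assumptions modeled on the kernel's generic handler.
 */
static const char *decode_mcheck_srr1(uint32_t srr1)
{
    switch (srr1 & SRR1_MCHECK_CAUSE_MASK) {
    case 0x00080000:
        return "Machine check signal";
    case 0x00000000:    /* 601 */
    case 0x00040000:
    case 0x00140000:    /* 7450-style MSS error and TEA */
        return "Transfer error ack signal";
    case 0x00020000:
        return "Data parity error signal";
    case 0x00010000:
        return "Address parity error signal";
    case 0x20000000:
        return "L1 Data Cache error";
    case 0x40000000:
        return "L1 Instruction Cache error";
    case 0x00100000:
        return "L2 data cache parity error";
    default:
        return "Unknown values in msr";
    }
}
```

With these values, 0x149030 masks down to 0x140000, which is the TEA case
and matches the message the kernel printed. The low half (0x9030) looks
like ordinary saved MSR bits rather than cause bits.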

> On the PCI bus, I haven't seen this error. The 83xx PCI controller is
> smart enough to return 0xffffffff when reading a non-existent device.

I believe that behavior is configurable.

-Scott


