linas at austin.ibm.com
Wed Mar 30 09:39:07 EST 2005
On Fri, Mar 25, 2005 at 05:44:03PM -0500, Nathan Glasser was heard to remark:
> I'm using ppc64 (p630), kernel 2.4.x (RH 3.0 patch x).
> I'm working on a proprietary driver for a proprietary device.
> The device needs to access some host memory in order to perform
> a DMA transfer. It can only access 32-bits.
> I'm allocating memory using pci_alloc_consistent. I'm passing
> the "dma handle" to the device in the place where the bus address
> would usually go (I formerly used virt_to_bus for x86).
> It seems that after the device performs the DMA, any further
> access to MMIO board registers results in a system crash (such accesses
> work fine prior to the device DMA). Here is the panic message on the
> serial console.
> RTAS: 2 --------- RTAS event begin
> RTAS 0: 00000000 00000000
> RTAS: 2 --------- RTAS event end
> Kernel panic: EEH: MMIO failure (2) on device:pci12e4,1000 /pci at 400000000111/pci at 2,6/pci12e4,1000 at 1
> It was suggested to me that the DMA was to a bad address, and that this
> caused the device to be isolated. I didn't know the system could do that,
> but it makes sense to me.
The EEH MMIO failure can be triggered by a wide variety of PCI errors:
-- parity errors on data/address
-- DMA to a bad address
-- various PCI-X spec errors, including timed-out split completions
-- low voltage on the PCI bus, poor electrical contacts
Judging from your description, you are probably looking at a bad DMA;
but you can try reseating the PCI card anyway, just for good luck.
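One cheap sanity check before handing the handle to the card is to confirm
that the whole buffer is reachable with 32 address bits. What follows is only
a minimal sketch in plain C, assuming the handle can be treated as a 64-bit
integer; dma_range_fits_32bit is a made-up helper name for illustration, not
a kernel API, and it says nothing about whether the address is actually
mapped for the device.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a kernel API: returns true when the buffer
 * [handle, handle + len) lies entirely below 4 GB, i.e. is reachable by
 * a device that can only drive 32 address bits. */
static bool dma_range_fits_32bit(uint64_t handle, size_t len)
{
    const uint64_t limit = (uint64_t)1 << 32;  /* first address out of reach */

    if (len == 0)
        return false;   /* an empty transfer is treated as an error here */
    if (handle >= limit)
        return false;   /* start is already at or above 4 GB */

    /* Compare against the remaining room so the sum cannot overflow. */
    return (uint64_t)len <= limit - handle;
}
```

In a driver, a check like this would sit right after the allocation, with the
value returned through the dma handle argument of pci_alloc_consistent cast
to a 64-bit integer; a failure there would point at the addressing limit
rather than at the card.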
The RTAS message is supposed to be a good bit longer; among other things
it will sometimes contain a raw dump of the pci controller state. If I
had that, I *might* be able to decode the details of what the pci
controller didn't like (including the faulting address, if that's what
it was).
I presume the truncated RTAS blob is due to some RH 3.0 bug; is there a
chance you can try with a newer RH 3, or RH4 or kernel.org, so as to get
the detailed report?
p.s. Nathan F, did you ever get an error decoder working for the pci chipsets?