MPC8315 PCI express lockup

David Laight David.Laight at ACULAB.COM
Thu May 31 01:10:37 EST 2012


(I apologise for this not having much to do with linux...)

We have a system with an MPC8315 ppc running linux 2.6.32
that uses the PCI express interface in RC mode to interface
to an Altera FPGA.
This uses both PIO and the PEX DMA interfaces (locally
written dma driver).
Under normal circumstances this all works fine.

However, under some circumstances (e.g. DMA reads from
addresses that don't have actual slaves on the fpga [1])
the dma transfer requests don't complete.
There are no obvious error bits set in the hisr or csmisr
registers and the csb_status shows 'dma in progress'.
The dma transfer itself can be cancelled (by setting the
SUS bit in the dma_ctrl register), and the relevant
status bits are set to show the transfer has been aborted.
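
For illustration, the wait-then-cancel sequence described above
could be sketched roughly as below. This is a userspace mock, not
the actual driver: the register struct, the bit positions, the
"tick" callbacks that stand in for the hardware, and all function
names are assumptions made up for the example. Only the idea of
setting SUS in dma_ctrl and then checking for an aborted status
comes from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, NOT taken from the MPC8315 manual. */
#define DMA_CTRL_SUS    (1u << 1)   /* assumed: suspend/cancel request */
#define DMA_STAT_DONE   (1u << 0)   /* assumed: transfer completed */
#define DMA_STAT_ABORT  (1u << 1)   /* assumed: transfer aborted */

/* Stand-in for the memory-mapped PEX DMA registers. */
struct fake_dma_regs {
    volatile uint32_t dma_ctrl;
    volatile uint32_t dma_stat;
};

enum dma_result { DMA_OK = 0, DMA_ABORTED = 1, DMA_STUCK = 2 };

/* Callback that advances the simulated hardware by one poll interval. */
typedef void (*hw_tick_fn)(struct fake_dma_regs *);

/*
 * Poll for completion up to max_polls times. On timeout, request a
 * cancel by setting SUS and poll again for the abort indication.
 */
static enum dma_result dma_wait_or_cancel(struct fake_dma_regs *r,
                                          hw_tick_fn tick,
                                          unsigned max_polls)
{
    unsigned i;

    assert(r != NULL && tick != NULL);

    for (i = 0; i < max_polls; i++) {
        tick(r);
        if (r->dma_stat & DMA_STAT_DONE)
            return DMA_OK;
    }

    /* Timed out: ask the engine to cancel the transfer. */
    r->dma_ctrl |= DMA_CTRL_SUS;

    for (i = 0; i < max_polls; i++) {
        tick(r);
        if (r->dma_stat & DMA_STAT_ABORT)
            return DMA_ABORTED;
    }
    return DMA_STUCK;   /* not even the cancel was acknowledged */
}

/* Simulated hardware: the transfer completes at once. */
static void tick_completes(struct fake_dma_regs *r)
{
    r->dma_stat |= DMA_STAT_DONE;
}

/* Simulated hardware: never completes, but honours a cancel. */
static void tick_stalls(struct fake_dma_regs *r)
{
    if (r->dma_ctrl & DMA_CTRL_SUS)
        r->dma_stat |= DMA_STAT_ABORT;
}

/* Simulated hardware: ignores everything. */
static void tick_wedged(struct fake_dma_regs *r)
{
    (void)r;
}
```

Calling dma_wait_or_cancel() with tick_stalls models the case seen
here: the transfer never completes, the cancel is acknowledged, and
the caller gets DMA_ABORTED. It does not model the later failure of
all subsequent PCI express transfers.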

Once in this state all further PCI express transfers fail.
DMA requests time out (driver gives up waiting for completion)
and PIO requests fault (Oops: Machine check, sig 7 [#1])
locking the kernel solid.

This looks very much like the MPC8315's erratum PEX7
except that I don't see the CSMISR[RST] bit set.
I'm not at all sure the recovery procedure for that erratum
can actually be implemented! I'm certainly not going to write it
just to find out if it would help.
In any case it is quite likely that the driver's ISR will
try to do a PIO read while a dma transfer is timing out.

We can look at the fpga side and possibly find out
what it is doing, but it would be useful to know more
about the status on the ppc side.

I presume there is a way of doing a 'probe' type memory
cycle that won't panic on a fault?
Although that may not help me keep linux running as the
ISR needs to do a PCIe write to clear the level-sensitive
interrupt.

Any thoughts on ways to progress?

	David

[1] PIO reads are ok and just return 0xffffffff.



