PCIe Access - achieve bursts without DMA
dwh at ovro.caltech.edu
Sat Feb 1 10:18:30 EST 2014
>> I'm currently trying to benchmark access speeds to our PCIe-connected IP-cores
>> located inside our FPGA. On x86-based systems I was able to achieve bursts for
>> both read and write access. On PPC32, using an e500v2, I had no success at all
>> so far.
Whenever I want to benchmark PCI/PCIe performance I do the following:
1. Peripheral board DMA (board-to-board)
Use two of your FPGA boards in a chassis and DMA between them.
In a PCI system, you can put the cards on the same bus segment and
then on either side of a bridge and see how that affects things. In your case,
the PCIe traffic will all be via the root-complex/switch, so
you should get the same performance regardless of which PCIe slot
the boards are in.
This is likely the "best you can do" as far as bursts go.
2. Peripheral board DMA to host memory.
In this case I typically insmod a simple driver on the host that
gives me a page of memory, and then DMA into and out of that
memory, using the DMA controller on the peripheral.
3. Host (root complex) DMA.
If your host has a DMA controller, then program it per (2).
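For any of the three cases above, the measurement itself can be as simple as timing a number of fixed-size transfers and dividing. A minimal user-space sketch in C follows. The names xfer_fn, memcpy_xfer, mbytes_per_sec and bench_xfer are made up for illustration, and memcpy is only a stand-in for the real "start DMA and wait for completion" step, which depends on your driver and hardware.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical transfer hook: in a real test this would kick off a DMA
 * and block until it completes. Here it is any function moving len bytes. */
typedef void (*xfer_fn)(void *dst, const void *src, size_t len);

/* Stand-in transfer so the harness can run without any hardware. */
void memcpy_xfer(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
}

/* Convert bytes moved and elapsed nanoseconds into MB/s (10^6 bytes/s). */
double mbytes_per_sec(uint64_t bytes, uint64_t elapsed_ns)
{
    assert(elapsed_ns > 0);
    return ((double)bytes * 1e3) / (double)elapsed_ns;
}

/* Monotonic timestamp in nanoseconds. */
uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Run `iterations` transfers of `len` bytes and return the average MB/s. */
double bench_xfer(xfer_fn xfer, void *dst, const void *src,
                  size_t len, unsigned iterations)
{
    uint64_t start = now_ns();
    for (unsigned i = 0; i < iterations; i++)
        xfer(dst, src, len);
    uint64_t elapsed = now_ns() - start;
    if (elapsed == 0)
        elapsed = 1; /* guard against timer resolution on tiny runs */
    return mbytes_per_sec((uint64_t)len * iterations, elapsed);
}
```

Calling bench_xfer(memcpy_xfer, dst, src, 4096, 1000) only exercises the harness. To get meaningful PCIe numbers, replace memcpy_xfer with a function that drives the DMA controller from (1), (2) or (3), and sweep len to see where the throughput levels off.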
As far as "verification" of your custom peripheral board FPGA IP is
concerned, if I were a customer, and you had data for (1) and (2),
I'd be pretty happy (and couldn't care less about (3), since it's so ...).
Since it's an FPGA-based IP, I'd also expect to see a PCIe simulation
with Bus Functional Models showing what the optimal performance of
your IP was, and then how it nicely matches with the measurements
in (1). If you do not have a PCIe logic analyzer, Xilinx ChipScope and
Altera SignalTap are embedded logic analyzers that can be used
for tracing traffic at the TLP layer inside the FPGA.
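To put a number on "optimal performance" before simulating, a back-of-the-envelope estimate of TLP efficiency is often enough. The sketch below is my own illustration, not output from any vendor tool. It assumes a Gen1/Gen2 link (8b/10b encoding), memory write TLPs without ECRC, and ignores DLLP traffic (ACKs, flow-control updates), so treat the result as an upper bound. The function names are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Per-TLP overhead on a Gen1/Gen2 link, in bytes:
 * 2 framing (STP + END), 2 sequence number, 4 LCRC, plus the header.
 * The header is 12 bytes (3DW, 32-bit address) or 16 bytes (4DW, 64-bit).
 * ECRC is assumed to be disabled. */
enum { TLP_FRAMING = 2, TLP_SEQ = 2, TLP_LCRC = 4 };

/* Fraction of link bytes that carry payload for one TLP. */
double tlp_efficiency(size_t payload_bytes, size_t header_bytes)
{
    assert(header_bytes == 12 || header_bytes == 16);
    size_t total = payload_bytes + header_bytes
                 + TLP_FRAMING + TLP_SEQ + TLP_LCRC;
    return (double)payload_bytes / (double)total;
}

/* Upper bound on payload throughput in MB/s. raw_mbps_per_lane is the
 * post-encoding rate per lane, e.g. 250.0 for Gen1 (2.5 GT/s, 8b/10b). */
double link_payload_mbps(unsigned lanes, double raw_mbps_per_lane,
                         size_t payload_bytes, size_t header_bytes)
{
    return lanes * raw_mbps_per_lane
         * tlp_efficiency(payload_bytes, header_bytes);
}
```

With a 3DW header, a single 4-byte write is 4/24 of the bytes on the link (about 17%), while a 128-byte burst is 128/148 (about 86%). That gap is why getting bursts matters so much, and it gives a ceiling to compare the simulation and the measurements in (1) against.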
Just some thoughts ...