PPC440GX ethernet oddities

Jeff Mock jeff at mock.com
Mon Nov 27 15:05:49 EST 2006


I'm seeing some slightly strange behavior with PPC440GX ethernet, and
I'm looking for a little advice on where to poke around to see what's
going on.

I have a custom 440GX board that uses the two RGMII gigabit interfaces,
connected to two Vitesse PHYs.  This works nicely.

The board has a large FPGA signal processor that DMAs data into main
memory, and the PPC sends that data from main memory out the ethernet
interfaces.  This all works well.  For testing purposes I'm DMA'ing a
pseudo-random sequence at 80MB/s, sending it over ethernet on a TCP
socket to a server machine, and checking the sequence at the receiving
end.  So far so good.  It runs for days on several prototype machines.

As part of the DMA diagnostic program I keep track of the maximum
occupancy of the main memory ring buffer that holds data from my FPGA
device driver.  This lets me see how close I get to a buffer overflow,
since I'm running the gigabit ethernet port close to the edge at
80MB/s.  The ring buffer will typically reach a maximum level of 512kB.
This is how far the network connection gets behind the realtime DMA
from the FPGAs.
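
A minimal sketch of this kind of high-water-mark bookkeeping, in Python
for brevity.  The real driver and diagnostic code are not shown in this
post, so the class and method names here are made up for illustration,
and the counters are assumed to be simple running byte totals:

```python
class RingStats:
    """Track fill level and high-water mark of a byte ring buffer.

    Hypothetical helper: 'produce' stands in for the FPGA DMA side,
    'consume' for the network sender draining the buffer.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.produced = 0    # total bytes written by the DMA side
        self.consumed = 0    # total bytes drained by the network side
        self.high_water = 0  # largest fill level seen so far

    def level(self):
        # Current occupancy is the gap between producer and consumer.
        return self.produced - self.consumed

    def produce(self, nbytes):
        self.produced += nbytes
        level = self.level()
        if level > self.capacity:
            raise OverflowError("ring buffer overflow")
        self.high_water = max(self.high_water, level)

    def consume(self, nbytes):
        # Never drain more than is actually in the buffer.
        self.consumed += min(nbytes, self.level())
```

The high-water mark only ever moves up, so a single long stall on the
network side is enough to push it to its largest value for the run.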

Here's the weird part.  On one of the four prototype boxes, if I plug
the second ethernet port into a gigabit switch and get a link light (the
2nd interface is not enabled under Linux), the DMA behavior changes and
I can see the ring buffer get as large as 25MB (up from 512kB!).

Only one of my four boxes shows this strange behavior, and only when the
second ethernet port is connected to an ethernet switch.  Everything
still works properly: my 80MB/s pseudo-random sequence is still
generated by the FPGAs and checked by a server on the other end of the
network connection.  I allow the ring buffer to get as large as 64MB
before failing, but the large ring buffer level says that the network
connection sometimes gets as much as 25MB behind the FPGA DMA, or
25/80 = 0.3125 seconds, which seems kind of crazy.
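
The backlog-to-time arithmetic above can be written out as a small
Python sketch.  The function name is made up for illustration, and the
units are an assumption: MB is taken as 10^6 bytes and kB as 1024 bytes,
which the post does not actually specify.

```python
def lag_seconds(backlog_bytes, rate_bytes_per_s):
    """How long the sender would need to drain the backlog at this rate."""
    return backlog_bytes / rate_bytes_per_s


MB = 1_000_000          # assumed: decimal megabytes
KB = 1024               # assumed: binary kilobytes
RATE = 80 * MB          # DMA rate from the FPGAs, bytes per second

worst_case = lag_seconds(25 * MB, RATE)   # backlog on the odd board
typical = lag_seconds(512 * KB, RATE)     # backlog on the other boards
headroom = lag_seconds(64 * MB, RATE)     # stall needed to overflow

print(worst_case)   # 0.3125
```

Under the same assumptions the typical 512kB level is only a few
milliseconds of data, so the odd board is falling behind by roughly
fifty times as long as the others.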

I look at "ifconfig" (busybox ifconfig) and I see no errors on the
ethernet interface.  I'm guessing there might be some design problem, or
maybe just a problem with this one particular board, that is causing
errors that occasionally slow down the TCP connection.  Perhaps it's
crosstalk between the two RGMII interfaces, or maybe some interaction
between the magnetics on the two ports.  But I can't figure out where
to look to measure errors on the physical ethernet interface.
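
One place where stalls like this might show up even when ifconfig is
clean is the TCP retransmit counter in /proc/net/snmp.  This is only a
guess at a useful measurement, not a diagnosis.  A rough Python sketch
follows; the function names are made up for illustration, and the same
comparison could be done by hand with "cat /proc/net/snmp" before and
after a test run:

```python
def parse_tcp_counters(snmp_text):
    """Return the Tcp counters from /proc/net/snmp text as a dict.

    The file holds pairs of lines per protocol: a header line of
    counter names followed by a line of values, both prefixed "Tcp:".
    """
    rows = [line.split() for line in snmp_text.splitlines()
            if line.startswith("Tcp:")]
    if len(rows) < 2:
        raise ValueError("no Tcp header/value pair found")
    names, values = rows[0][1:], rows[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


def retrans_delta(before_text, after_text):
    """Return (retransmitted segments, segments sent) between snapshots."""
    before = parse_tcp_counters(before_text)
    after = parse_tcp_counters(after_text)
    return (after["RetransSegs"] - before["RetransSegs"],
            after["OutSegs"] - before["OutSegs"])


def read_snmp(path="/proc/net/snmp"):
    # Take one snapshot of the kernel's SNMP counters.
    with open(path) as f:
        return f.read()
```

Taking one snapshot with the second port unplugged and one with it
plugged in, over runs of the same length, would show whether the
retransmit count moves with the link on the second port.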

Can someone give me a hint about where to look for this problem? This is
a 2.6.15 kernel.

Thanks for reading, I went on a bit long...
jeff




More information about the Linuxppc-embedded mailing list