mpc8260 fcc enet transmit time out

hubert.loewenguth hubert.loewenguth at thales-bm.com
Fri Jan 27 19:40:36 EST 2006


Hello David and the community,

I'm so happy to see that I'm not alone with this problem :)

>I've not been able to work on the problem for some time (development
>schedules and all that jazz)...

Same situation :), but I will try your solution next week and let you know whether it fixes the problem.

Hubert Loewenguth


Hunter, David wrote:

>One day, hubert.loewenguth at thales-bm.com wrote:
>
>>Everything works fine, but if I do successive plugs/unplugs during a
>>heavy data transfer, the driver enters an infinite loop:
>>...
>>Has anybody encountered the same problem?
>>Has anybody tested numerous plugs/unplugs during a heavy data
>>transfer with a half-duplex connection on an mpc8260?
>>Does anybody have an idea that could help me?
>
>I have seen many symptoms involving the "NETDEV WATCHDOG: eth0: transmit
>timed out" message, but so far I do not have a code fix for any of them.
>:(
>
>We (my employer) use an MPC8270 (mask 2K49M) and an LXT971A PHY, with
>Linux 2.4.18.  In our case we do have an MII PHY interrupt.  Like you,
>when I get the transmit timeout, it repeats forever.  But I do not see
>the problem when doing successive plugs/unplugs of the Ethernet cable.
>Instead, I get timeouts during normal board operation, without human
>interaction.
>
>At one customer site where our MPC8270 board is used, the customer uses
>100 Mb half duplex Ethernet.  Over many weeks of normal operation, the
>board experienced transmit timeouts several times.  On one occasion,
>this was the output:
>
><-------- DUMP STARTS HERE ---------->
>NETDEV WATCHDOG: eth0: transmit timed out
>eth0: transmit timed out.
> Ring data dump: cur_tx c01aa380 (full) cur_rx c01aa220.
> Tx @base c01aa308 :
>9c00 0051 070f79a2
>1c00 0056 070f7da2
>1c00 0056 070f7ea2
>1c00 0051 070f7ba2
>1c80 003f 070f51c2
>9c00 0056 070f50c2
>9c00 0051 070f52c2
>9c00 0056 070f53c2
>9c00 0056 070f55c2
>9c00 0051 070f54c2
>dc00 0038 070f56c2
>9c00 0056 070f57c2
>9c00 0051 070f58c2
>9c00 0056 070f59c2
>9c00 0056 070f5ac2
>bc00 0056 070f7ca2
> Rx @base c01aa208 :
>9c00 0040 0046f000
><--- snip: BD status are all 9c00 -->
>9c00 0040 00461000
>9c00 0040 00461800
>9c00 0040 00460000
>bc00 0040 00460800
><---------- DUMP ENDS HERE ---------->
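
To read the dump above, it helps to turn the pointers into ring indices. This is only a minimal sketch: it assumes each buffer descriptor is 8 bytes (16-bit status, 16-bit length, 32-bit buffer address, matching the three columns printed per line), and `bd_index` is a hypothetical helper of mine, not something that exists in fcc_enet.c.

```c
#include <stdint.h>

/* Assumed descriptor size: 2-byte status + 2-byte length + 4-byte address. */
#define BD_SIZE 8u

/* Return the ring index of the descriptor at address bdp, given the ring base. */
static unsigned bd_index(uint32_t base, uint32_t bdp)
{
    return (bdp - base) / BD_SIZE;
}
```

With the values from the dump, bd_index(0xc01aa308, 0xc01aa380) gives 15, so cur_tx sits on the last of the 16 TxBDs listed (the bc00 entry, the only Tx status with the 0x2000 bit set, which I take to be the wrap bit). Likewise cur_rx at c01aa220 against base c01aa208 is index 3.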
>
>Note that one TxBD has the status 0x1c80, indicating late collision
>(BD_ENET_TX_LC).  This is an unusual condition in Ethernet, but recovery
>should still be possible.  Like you, I suspect errata CPM 119, but I
>have not tried the patch yet.  (Development schedules and all that
>jazz.)
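
For picking error bits out of a TxBD status word by eye, a small decoder can help. This is a sketch under assumptions: BD_ENET_TX_LC and BD_ENET_TX_UN are the bits named in this thread, while the HB, RL and CSL values are what I recall from the kernel's CPM header and should be checked against your own tree. `txbd_errors` is a hypothetical helper name.

```c
#include <stdint.h>
#include <stdio.h>

/* Error bits named in this thread. */
#define BD_ENET_TX_LC   0x0080u  /* late collision */
#define BD_ENET_TX_UN   0x0002u  /* underrun */
/* Assumed values, recalled from the kernel CPM header; verify locally. */
#define BD_ENET_TX_HB   0x0100u  /* no heartbeat */
#define BD_ENET_TX_RL   0x0040u  /* retransmission limit */
#define BD_ENET_TX_CSL  0x0001u  /* carrier lost */

/* Write a space-separated list of the error flags set in sc into buf. */
static const char *txbd_errors(uint16_t sc, char *buf, size_t len)
{
    static const struct { uint16_t bit; const char *name; } flags[] = {
        { BD_ENET_TX_HB, "HB" }, { BD_ENET_TX_LC, "LC" },
        { BD_ENET_TX_RL, "RL" }, { BD_ENET_TX_UN, "UN" },
        { BD_ENET_TX_CSL, "CSL" },
    };
    size_t used = 0;

    if (len == 0)
        return buf;
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof flags / sizeof flags[0]; i++) {
        if (!(sc & flags[i].bit))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? " " : "", flags[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break;  /* buffer full: stop, output may be truncated */
        used += (size_t)n;
    }
    return buf;
}
```

Fed the statuses from the Tx ring dump, txbd_errors(0x1c80, ...) yields "LC", and the other values in that dump (9c00, 1c00, dc00, bc00) yield an empty string, since none of them has a low-order error bit set.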
>
>As a workaround, we placed a 10/100 Mb hub between the board and the
>customer's network, which negotiated the PHY up to 100 Mb full duplex.
>The transmit timeout problem has not been seen since (to the best of my
>knowledge.)
>
>Back in the lab I have been able to reproduce the transmit timeout on a
>100 Mb full duplex network.  Like you, I added printk output where
>fcc_enet_interrupt tests each BD_ENET_TX_* flag.  In one case, I saw
>this:
>
><-------- DUMP STARTS HERE ---------->
>eth0: BDP=c01aa370: Carrier lost
>eth0: BDP=c01aa370: Carrier lost
>eth0: BDP=c01aa330: Carrier lost
>eth0: BDP=c01aa360: Carrier lost
>eth0: BDP=c01aa348: Carrier lost
>eth0: BDP=c01aa310: Carrier lost
>eth0: BDP=c01aa318: Carrier lost
><---- Carrier lost repeats 61 more times, random BDP ---->
>eth0: BDP=c01aa348: Underrun
>eth0: Restarting transmitter!!!
>
>NETDEV WATCHDOG: eth0: transmit timed out
>eth0: transmit timed out.
><-------- DUMP ENDS HERE ---------->
>
>The Underrun message means TxBD status bit BD_ENET_TX_UN (0x0002) was
>set.  The last Tx ring data dump in your post shows the same thing.
>That scares me, mainly because I don't know what it means.  Does it mean
>the SDMA transfer didn't end on time?  I dunno.  And what the heck is
>carrier lost during TX in full duplex mode?  It makes sense for half
>duplex mode like your situation, but I can't make sense of it for full
>duplex.  Further, the underrun case has only happened once; in most
>other cases, I get a transmit timeout with absolutely no TxBD error bits
>whatsoever, and no indication that a TX restart was even attempted.
>That's even scarier.  I also did try repeated plug/unplug of Ethernet
>during peak normal operation (probably 5-10 Mb traffic) on the 100 Mb
>full duplex network, but after 11 successive plugs I did not see any
>timeouts.
>
>I'm starting to wonder if I have a cache coherency problem.  The buffer
>descriptors are in main RAM and the data cache is turned on...  It's just
>a thought I picked up reading some prior posts that I can't rightly
>recall.
>
>I noted that the MPC8280 manual (online from Freescale) does now detail
>the transmitter recovery procedure (section 30.10.1 FCC Transmit
>Errors), and it's not nearly as simple as what fcc_enet.c implements in
>any kernel version.  Despite CPM37, they don't toggle GFMR[ENT] in
>combination with the RESTART_TX command.  Also, in 30.12.1 FCC
>Transmitter Full Sequence, a command (either RESTART_TX or INIT_TRX)
>must be issued after GFMR[ENT] is cleared but _before_ it is set.  You
>might try changing fcc_enet_interrupt to do this:
>
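>	    if (must_restart) {
>		volatile cpm8260_t *cp;
>
>		/* Stop the transmitter before issuing the command. */
>		cep->fccp->fcc_gfmr &= ~FCC_GFMR_ENT;
>
>		/* Issue RESTART_TX while GFMR[ENT] is still clear. */
>		cp = cpmp;
>		cp->cp_cpcr =
>		    mk_cr_cmd(cep->fip->fc_cpmpage, cep->fip->fc_cpmblock,
>				0x0c, CPM_CR_RESTART_TX) | CPM_CR_FLG;
>		while (cp->cp_cpcr & CPM_CR_FLG)
>			;	/* wait for the CPM to accept the command */
>
>		/* Re-enable the transmitter only after the command is done. */
>		cep->fccp->fcc_gfmr |= FCC_GFMR_ENT;
>	    }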
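
One caution about the busy-wait on CPM_CR_FLG: inside an interrupt handler it can spin forever if the CPM never clears the flag. A bounded poll might be safer. The sketch below is an assumption on my side, not tested on hardware: `cpcr_wait_idle` is a hypothetical helper, the spin limit is arbitrary, and the CPM_CR_FLG value is what I recall from the 8260 CPM header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value of the command-register busy flag; verify in your tree. */
#define CPM_CR_FLG 0x00010000u

/*
 * Poll the command register until the busy flag clears.
 * Returns true if it cleared within max_spins reads, false otherwise.
 */
static bool cpcr_wait_idle(const volatile uint32_t *cpcr, unsigned max_spins)
{
    while (max_spins--) {
        if ((*cpcr & CPM_CR_FLG) == 0)
            return true;
    }
    return false;
}
```

In the handler, the wait loop would then become a call such as cpcr_wait_idle(&cp->cp_cpcr, 10000), with a printk and an early exit when it returns false, so that a stuck CPM shows up in the log instead of hanging the board.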
>
>I've not been able to work on the problem for some time (development
>schedules and all that jazz), but I'll post my solution if I find one.
>
>-Dave
>



