NCSI eth0 (ftgmac100): transmit queue 0 timed out error

Samuel Mendoza-Jonas sam at mendozajonas.com
Thu Dec 20 15:10:11 AEDT 2018


On Thu, 2018-12-20 at 11:43 +0800, xiuzhi wrote:
> I applied 0001-DO-NOT-MERGE-NCSI-state-machine-debugging.patch from https://gerrit.openbmc-project.xyz/#/c/openbmc/openbmc/+/6545/ to debug the NCSI network.
> The motherboard without the NCSI switch worked. The log:

Ha, nice find.

> [   23.923250] ftgmac100 1e660000.ethernet eth0: NCSI: 'bad' packet ignored for type 0x95
> [   23.938842] ftgmac100 1e660000.ethernet eth0: ncsi: starting config machine   
> [   23.968028] ftgmac100 1e660000.ethernet eth0: ncsi: config complete - starting monitor
> [   23.976100] ftgmac100 1e660000.ethernet eth0: ncsi: report link (false) from next channel
> The motherboard with the switch failed. The log:
> root@haiguang1:~# dmesg | grep -i net
> [    0.039194] NET: Registered protocol family 16
> [    0.211611] NET: Registered protocol family 2
> [    0.213769] NET: Registered protocol family 1
> [    2.094889] NET: Registered protocol family 38
> [    3.617315] ftgmac100 1e660000.ethernet: Generated random MAC address 9e:24:06:20:1c:bf
> [    3.625469] ftgmac100 1e660000.ethernet: Using NCSI interface
> [    3.632401] ftgmac100 1e660000.ethernet eth0: irq 19, mapped at 39aaaa23
> [    3.639822] ftgmac100 1e680000.ethernet: Generated random MAC address 12:5a:54:09:a6:a6
> [    3.754214] Broadcom BCM54612E 1e680000.ethernet--1:00: attached PHY driver [Broadcom BCM54612E] (mii_bus:phy_addr=1e680000.ethernet--1:00)
> [    3.768773] ftgmac100 1e680000.ethernet eth1: irq 20, mapped at 7ea18756
> [    4.019980] Driver for 1-wire Dallas network protocol.
> [    4.283070] NET: Registered protocol family 10
> [    4.306534] NET: Registered protocol family 17
> [    4.326455] console [netcon0] enabled
> [    4.330138] netconsole: network logging started
> [   19.153517] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
> [   20.562851] ftgmac100 1e660000.ethernet eth0: NCSI: No channel found with link

That extra logging doesn't show anything unusual, so I don't think it's
a driver issue at the moment. I'm not familiar with this 'switch' setup,
but we can probably check whether the driver is seeing anything at all.
OpenBMC should include the 'ncsi-netlink' utility. If you run an 'info'
command with that and check the logs, you should be able to see whether
it at least pulls out the NC version info.
Alternatively, I have a version for debugging available here which you can
compile and copy to the system: https://github.com/sammj/ncsi-netlink
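
When checking the logs, it can help to pull out only the NCSI and
TX-watchdog lines from dmesg. The following is a minimal sketch of that
filtering, not part of ncsi-netlink or the kernel. The names ncsi_events
and summarize are hypothetical, and the matched strings are simply the
messages quoted in this thread, so other kernel versions may word them
differently.

```python
import re

# Kernel log line shape seen in this thread: "[   20.562851] message"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")

# Sample taken from the failing board's log quoted in this thread.
SAMPLE_LOG = """\
[    3.625469] ftgmac100 1e660000.ethernet: Using NCSI interface
[   19.153517] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
[   20.562851] ftgmac100 1e660000.ethernet eth0: NCSI: No channel found with link
[   84.097560] NETDEV WATCHDOG: eth0 (ftgmac100): transmit queue 0 timed out
"""


def ncsi_events(dmesg_text, iface="eth0"):
    """Return (timestamp, message) pairs for NCSI and TX-watchdog lines.

    Hypothetical helper: keeps any line mentioning "ncsi" (case-insensitive)
    plus NETDEV WATCHDOG lines that name the given interface.
    """
    events = []
    for raw in dmesg_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # not a timestamped kernel log line
        timestamp = float(match.group(1))
        message = match.group(2).rstrip()
        lower = message.lower()
        if "ncsi" in lower or ("netdev watchdog" in lower and iface in message):
            events.append((timestamp, message))
    return events


def summarize(events):
    """Reduce the filtered events to a few yes/no flags (assumed wording)."""
    messages = [message for _, message in events]
    return {
        "config_complete": any("config complete" in m for m in messages),
        "no_channel": any("No channel found with link" in m for m in messages),
        "tx_timeout": any(
            "transmit queue" in m and "timed out" in m for m in messages
        ),
    }


if __name__ == "__main__":
    for timestamp, message in ncsi_events(SAMPLE_LOG):
        print(f"{timestamp:10.6f}  {message}")
    print(summarize(ncsi_events(SAMPLE_LOG)))
```

On the target the same functions could be fed the real output of dmesg
instead of SAMPLE_LOG. The flags only summarize what the kernel already
printed, so they don't say why the Network Controller isn't answering.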


> [   84.087503] WARNING: CPU: 0 PID: 1174 at /usr/src/kernel/net/sched/sch_generic.c:461 dev_watchdog+0x230/0x24c
> [   84.097560] NETDEV WATCHDOG: eth0 (ftgmac100): transmit queue 0 timed out
> -------
> [   19.609459] aspeed-video 1e700000.video: Failed to start video engine
> [   20.562851] ftgmac100 1e660000.ethernet eth0: NCSI: No channel found with link
> [   84.082800] ------------[ cut here ]------------
> [   84.087503] WARNING: CPU: 0 PID: 1174 at /usr/src/kernel/net/sched/sch_generic.c:461 dev_watchdog+0x230/0x24c
> [   84.097560] NETDEV WATCHDOG: eth0 (ftgmac100): transmit queue 0 timed out
> [   84.104474] CPU: 0 PID: 1174 Comm: dbus-daemon Tainted: G        W         4.18.16-d822bbc00e9259c72112f0dab412625a30aaabb0 #1
> [   84.115997] Hardware name: Generic DT based system
> [   84.120874] [<80109bec>] (unwind_backtrace) from [<801075fc>] (show_stack+0x20/0x24)
> [   84.128780] [<801075fc>] (show_stack) from [<806474a8>] (dump_stack+0x20/0x28)
> [   84.136163] [<806474a8>] (dump_stack) from [<80116fac>] (__warn+0xdc/0x104)
> [   84.143267] [<80116fac>] (__warn) from [<80117028>] (warn_slowpath_fmt+0x54/0x74)
> [   84.150824] [<80117028>] (warn_slowpath_fmt) from [<8055f7c8>] (dev_watchdog+0x230/0x24c)
> [   84.159172] [<8055f7c8>] (dev_watchdog) from [<80157b38>] (call_timer_fn+0x3c/0x120)
> [   84.167087] [<80157b38>] (call_timer_fn) from [<80157ccc>] (expire_timers+0xb0/0xbc)
> [   84.174976] [<80157ccc>] (expire_timers) from [<80157dc8>] (run_timer_softirq+0xa4/0x198)
> [   84.183308] [<80157dc8>] (run_timer_softirq) from [<8010223c>] (__do_softirq+0xcc/0x2f0)
> [   84.191475] [<8010223c>] (__do_softirq) from [<8011b278>] (irq_exit+0xfc/0x110)
> [   84.198950] [<8011b278>] (irq_exit) from [<8014ba14>] (__handle_domain_irq+0x60/0xb8)
> [   84.206917] [<8014ba14>] (__handle_domain_irq) from [<80102164>] (avic_handle_irq+0x68/0x70)
> [   84.215501] [<80102164>] (avic_handle_irq) from [<80101db4>] (__irq_usr+0x54/0x80)
> [   84.223189] Exception stack(0x974e1fb0 to 0x974e1ff8)
> [   84.228291] 1fa0:                                     ffff0fff 00000020 00000019 45e210f0
> [   84.236587] 1fc0: 00000000 00000000 020789d0 45e204d0 00000020 020789c8 00000018 7e90c800
> [   84.244871] 1fe0: 45ffd210 7e90c438 45d52090 ffff0fe0 60000010 ffffffff
> [   84.251581] ---[ end trace 935ea877fd2be47d ]---
> 
> Hi Sam,
> The eth0 works with the same kernel code and config on the motherboard without the NCSI on-off switch that connects to the host's physical network card.
> This new version of the motherboard only adds the switch. Could this switch increase the delay of the electrical signals?
> The NCSI eth0 has never connected successfully on the new motherboard with OpenBMC, while it works with the AMI BMC. I guess there is something wrong with the driver.
> Best,
> xiuzhi
> On Wed, 2018-12-19 at 15:49 +0800, xiuzhi wrote:
> > 
> > Hi all,
> > I set the eth0/MAC#1 of the AST2500 to NCSI mode.
> > There is an on-off switch between AST2500 MAC#1 and the host network.
> > The hardware electrical signal is good, and it can ping other machines with the AMI BMC.
> > .dts:
> >  &mac0 {
> >         status = "okay";
> >   
> >         pinctrl-names = "default";
> >         pinctrl-0 = <&pinctrl_rgii_default>;
> >         use-ncsi;
> > };
> > 
> > dmesg:
> >  [   18.204116] 8021q: adding VLAN 0 to HW filter on device eth0                                     
> > [   18.884647] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready                                  
> > [   19.391004] aspeed-video 1e700000.video: timed out on 1st mode detect                           
> > [   19.397549] aspeed-video 1e700000.video: Failed to start video engine                           
> > [   20.481030] ftgmac100 1e660000.ethernet eth0: NCSI: No channel found with link
> 
> This likely means the Network Controller didn't respond to the NCSI
> driver's probe process. If that happens even after a power cycle then
> hopefully one of the BMC people can weigh in on how to make sure it's set
> up correctly. If it only happens intermittently then it may be related to
> the NCSI microcode.
> 
> Cheers,
> Sam
> 
> >                   
> > [   20.961742] ftgmac100 1e680000.ethernet eth1: Link is Up - 10Mbps/Half - flow control off       
> > [   20.969995] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready                             
> > [  103.120946] ------------[ cut here ]------------                                                
> > [  103.125630] WARNING: CPU: 0 PID: 1329 at /usr/src/kernel/net/sched/sch_generic.c:461 dev_watchdc
> > [  103.135678] NETDEV WATCHDOG: eth0 (ftgmac100): transmit queue 0 timed out                       
> > [  103.142580] CPU: 0 PID: 1329 Comm: ipmid Tainted: G        W         4.18.16-71414b7d863e8ee6131
> > [  103.153562] Hardware name: Generic DT based system                                              
> > [  103.158432] [<80109bec>] (unwind_backtrace) from [<801075fc>] (show_stack+0x20/0x24)            
> > [  103.166310] [<801075fc>] (show_stack) from [<80646f48>] (dump_stack+0x20/0x28)                  
> > [  103.173665] [<80646f48>] (dump_stack) from [<80116fac>] (__warn+0xdc/0x104)                     
> > [  103.180662] [<80116fac>] (__warn) from [<80117028>] (warn_slowpath_fmt+0x54/0x74)               
> > [  103.188331] [<80117028>] (warn_slowpath_fmt) from [<8055f3c0>] (dev_watchdog+0x230/0x24c)       
> > [  103.196663] [<8055f3c0>] (dev_watchdog) from [<80157b38>] (call_timer_fn+0x3c/0x120)            
> > [  103.204547] [<80157b38>] (call_timer_fn) from [<80157ccc>] (expire_timers+0xb0/0xbc)            
> > [  103.212427] [<80157ccc>] (expire_timers) from [<80157dc8>] (run_timer_softirq+0xa4/0x198)       
> > [  103.220644] [<80157dc8>] (run_timer_softirq) from [<8010223c>] (__do_softirq+0xcc/0x2f0)        
> > [  103.228863] [<8010223c>] (__do_softirq) from [<8011b278>] (irq_exit+0xfc/0x110)                 
> > [  103.236334] [<8011b278>] (irq_exit) from [<8014ba14>] (__handle_domain_irq+0x60/0xb8)           
> > [  103.244311] [<8014ba14>] (__handle_domain_irq) from [<80102164>] (avic_handle_irq+0x68/0x70)    
> > [  103.252876] [<80102164>] (avic_handle_irq) from [<801019ec>] (__irq_svc+0x6c/0x90)              
> > [  103.260461] Exception stack(0x97ddf870 to 0x97ddf8b8)                                           
> > [  103.265630] f860:                                     000001b0 0063e8dc 969e0ee4 00000401       
> > [  103.273907] f880: 969e0000 00000000 00000000 00000600 fbd51971 0000008b 969e06dc 97ddf8ec       
> > [  103.282174] f8a0: 000001b0 97ddf8c0 048ecf6b 8036faac 20000013 ffffffff                         
> > [  103.288843] [<801019ec>] (__irq_svc) from [<8036faac>] (lzma_main+0x6ac/0x8e8)                  
> > [  103.296186] [<8036faac>] (lzma_main) from [<803702e0>] (xz_dec_lzma2_run+0x5f8/0x840)           
> > [  103.304139] [<803702e0>] (xz_dec_lzma2_run) from [<8036e9b8>] (xz_dec_run+0x378/0xa9c)          
> > [  103.312215] [<8036e9b8>] (xz_dec_run) from [<802a0fec>] (squashfs_xz_uncompress+0x84/0x228)     
> > [  103.320606] [<802a0fec>] (squashfs_xz_uncompress) from [<802a0f14>] (squashfs_decompress+0x68/0)
> > [  103.329774] [<802a0f14>] (squashfs_decompress) from [<8029cc00>] (squashfs_read_data+0x3d4/0x6f)
> > [  103.338754] [<8029cc00>] (squashfs_read_data) from [<8029d220>] (squashfs_cache_get+0x170/0x348)
> > [  103.347660] [<8029d220>] (squashfs_cache_get) from [<8029d7b0>] (squashfs_read_metadata+0xa4/0x)
> > [  103.356832] [<8029d7b0>] (squashfs_read_metadata) from [<8029f128>] (squashfs_read_inode+0x98/0)
> > [  103.366083] [<8029f128>] (squashfs_read_inode) from [<8029fa14>] (squashfs_iget+0x6c/0x9c)      
> > [  103.374473] [<8029fa14>] (squashfs_iget) from [<8029fd74>] (squashfs_lookup+0x330/0x484)        
> > [  103.382691] [<8029fd74>] (squashfs_lookup) from [<802362e0>] (__lookup_slow+0x94/0x150)         
> > [  103.390736] [<802362e0>] (__lookup_slow) from [<802363dc>] (lookup_slow+0x40/0x54)              
> > [  103.398425] [<802363dc>] (lookup_slow) from [<8023940c>] (lookup_one_len_unlocked+0x78/0x84)    
> > [  103.406986] [<8023940c>] (lookup_one_len_unlocked) from [<802f3eb8>] (ovl_lookup_single+0x34/0x)
> > [  103.416153] [<802f3eb8>] (ovl_lookup_single) from [<802f42a4>] (ovl_lookup_layer+0x134/0x184)   
> > [  103.424790] [<802f42a4>] (ovl_lookup_layer) from [<802f52f8>] (ovl_lookup+0x3c8/0x7bc)          
> > [  103.432823] [<802f52f8>] (ovl_lookup) from [<8023a6e4>] (path_openat+0xbac/0x10c8)              
> > [  103.440422] [<8023a6e4>] (path_openat) from [<8023ac80>] (do_filp_open+0x80/0xf0)               
> > [  103.448027] [<8023ac80>] (do_filp_open) from [<8022973c>] (do_sys_open+0x178/0x21c)             
> > [  103.455816] [<8022973c>] (do_sys_open) from [<80229828>] (sys_openat+0x1c/0x20)                 
> > [  103.463247] [<80229828>] (sys_openat) from [<80101000>] (ret_fast_syscall+0x0/0x54)             
> > [  103.471002] Exception stack(0x97ddffa8 to 0x97ddfff0)                                           
> > [  103.476136] ffa0:                   00d9b3e0 7ebaf574 ffffff9c 00dbb7c8 00080000 00000000       
> > [  103.484409] ffc0: 00d9b3e0 7ebaf574 4e451908 00000142 00000000 00000002 00000000 7ebaf52c       
> > [  103.492671] ffe0: 4e450908 7ebaf4c0 4e426e10 4e439b80                                           
> > [  103.497743] ---[ end trace 935ea877fd2be47d ]---         
> > 
> > Best,
> > Xiuzhi        
> 
>  



