Wedge400 (AST2520) OpenBMC stuck at reboot

Tao Ren rentao.bupt at
Thu Sep 22 08:08:42 AEST 2022

Hi there,

Recently I noticed a few Wedge400 (AST2520A2) units getting stuck after
the "reboot" command. The problem is hard to reproduce (it affects roughly
1 in 1,000 units), but once it happens, I have to power cycle the chassis
to recover OpenBMC.

I checked aspeed_wdt.c and manually experimented with the watchdog
registers, but everything looks normal to me. Has anyone hit a similar
error before? Any suggestions on which area I should look into?
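
For reference, here is a minimal sketch of how a raw WDT_CTRL value can be
decoded when inspecting the watchdog registers by hand. The bit positions
follow the WDT_CTRL_* definitions in aspeed_wdt.c as I understand them, so
please double-check them against the datasheet. The helper name
decode_wdt_ctrl and the sample value are hypothetical; the value was not
read from an affected unit.

```python
# Bit positions of WDT_CTRL (register offset 0x0C), taken from the
# WDT_CTRL_* macros in drivers/watchdog/aspeed_wdt.c (assumed, not verified
# against the datasheet here).
WDT_CTRL_ENABLE = 1 << 0          # watchdog counter running
WDT_CTRL_RESET_SYSTEM = 1 << 1    # reset the system on timeout
WDT_CTRL_WDT_INTR = 1 << 2        # raise an interrupt on timeout
WDT_CTRL_WDT_EXT = 1 << 3         # drive the external reset signal
WDT_CTRL_1MHZ_CLK = 1 << 4        # count with the 1 MHz clock
WDT_CTRL_RESET_MODE_SHIFT = 5     # 2-bit reset mode field, bits [6:5]
WDT_CTRL_BOOT_SECONDARY = 1 << 7  # boot from the secondary flash on timeout


def decode_wdt_ctrl(value):
    """Split a raw 32-bit WDT_CTRL value into named fields."""
    return {
        "enable": bool(value & WDT_CTRL_ENABLE),
        "reset_system": bool(value & WDT_CTRL_RESET_SYSTEM),
        "interrupt": bool(value & WDT_CTRL_WDT_INTR),
        "external_signal": bool(value & WDT_CTRL_WDT_EXT),
        "clk_1mhz": bool(value & WDT_CTRL_1MHZ_CLK),
        # Returned as a raw 2-bit number; the meaning of each value should
        # be looked up in the datasheet.
        "reset_mode": (value >> WDT_CTRL_RESET_MODE_SHIFT) & 0x3,
        "boot_secondary": bool(value & WDT_CTRL_BOOT_SECONDARY),
    }


if __name__ == "__main__":
    # Hypothetical sample value, for illustration only.
    for name, field in decode_wdt_ctrl(0x13).items():
        print(f"{name}: {field}")
```

On a unit that is expected to reset, "enable" and "reset_system" should
both be set; if either is clear just before the reboot, that would be the
first thing to look at.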

Below are the last few lines of the console log before OpenBMC hangs:

bmc-oob login:
INIT: Sending processes configured via /etc/inittab the TERM signal
Stopping OpenBSD Secure Shell server: sshdstopped /usr/sbin/sshd (pid 7397 1189)
Stopping ntpd: done
stopping rsyslogd ... done
Stopping random number generator daemon.
Deconfiguring network interfaces... done.
Sending all processes the TERM signal...
rackmond[1747]: Got request exit[  528.383133] watchdog: watchdog0: watchdog did not stop!
Sending all processes the KILL signal...
Unmounting remote filesystems...
Deactivating swap...
Unmounting local filesystems...
Rebooting... [  529.725009] reboot: Restarting system



