fsi/sbefifo problems on bmc

Ivan Mikhaylov i.mikhaylov at yadro.com
Thu Jun 27 02:03:15 AEST 2019


Hello Chris, Eddie, we're in the process of bringing up a P9 machine with
OpenBMC, and we have a problem that appears to be related to fsi and sbefifo.

Here is some debug output from journalctl showing what's happening with sbefifo
and fsi.


Jun 25 09:49:08 nicole phosphor-host-state-manager[1147]: Host State transaction request
Jun 25 09:49:10 nicole kernel: sbefifo 00:00:00:06: DOWN FIFO Timeout ! status=00100000
Jun 25 09:49:10 nicole systemd[1]: Starting Soft power off of the host...
Jun 25 09:49:10 nicole systemd[1]: Created slice system-phosphor\x2dreboot\x2dhost.slice.
Jun 25 09:49:10 nicole systemd[1]: Stopped target Host instance 0 crashed.
Jun 25 09:49:10 nicole systemd[1]: Stopped target Quiesce Target.
Jun 25 09:49:10 nicole systemd[1]: Reached target Stop Host0 (Pre).
Jun 25 09:49:10 nicole ipmid[1131]: Command in process, no attention
Jun 25 09:49:23 nicole phosphor-host-state-manager[1147]: Host State transaction request
Jun 25 09:49:24 nicole systemd[1695]: systemd-hostnamed.service: PrivateNetwork=yes is configured, but the kernel does not support network namespaces, ignoring.
Jun 25 09:49:24 nicole systemd[1]: Started Hostname Service.
Jun 25 09:49:27 nicole ipmid[1131]: Host control timeout hit!
Jun 25 09:49:27 nicole ipmid[1131]: Failed to deliver host command
Jun 25 09:49:27 nicole ipmid[1131]: Failed to deliver host command
Jun 25 09:49:27 nicole phosphor-softpoweroff[1655]: Timeout on host attention, continue with power down
Jun 25 09:49:27 nicole systemd[1]: xyz.openbmc_project.Ipmi.Internal.SoftPowerOff.service: Succeeded.
Jun 25 09:49:27 nicole systemd[1]: Started Soft power off of the host.
Jun 25 09:49:27 nicole systemd[1]: Reached target Host0 (Stopping).
Jun 25 09:49:27 nicole systemd[1]: Reached target Host0 (Stopped).
Jun 25 09:49:27 nicole systemd[1]: Reached target Power0 Off (Pre).
Jun 25 09:49:27 nicole systemd[1]: Starting Wait for Power0 to turn off...
Jun 25 09:49:27 nicole systemd[1]: Started Stop Power0.
Jun 25 09:49:28 nicole power_control.exe[1051]: PowerControl: setting power up SOFTWARE_PGOOD to 0
Jun 25 09:49:28 nicole kernel: sbefifo 00:00:00:06: Failed to read UP fifo status during reset , rc=-19
Jun 25 09:49:28 nicole kernel: occ-hwmon occ-hwmon.1: failed to get OCC poll response: -110
Jun 25 09:49:28 nicole kernel: occ-hwmon: probe of occ-hwmon.1 failed with error -110
Jun 25 09:49:28 nicole kernel:  slave@00:00: error reading slave registers
Jun 25 09:49:28 nicole power_control.exe[1051]: PowerControl: setting power up BMC_POWER_UP to 0
Jun 25 09:49:28 nicole systemd[1]: fsi-scan@0.service: Main process exited, code=killed, status=15/TERM
Jun 25 09:49:28 nicole systemd[1]: fsi-scan@0.service: Failed with result 'signal'.
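
For reference, the negative numbers in the kernel lines above look like ordinary
Linux errno values. Below is a minimal sketch for pulling them out of the log and
naming them, assuming the standard Linux numbering (19 = ENODEV, 110 = ETIMEDOUT).
The helper name, the lookup table and the regular expression are made up for
illustration and only cover the spellings seen in this log.

```python
import re

# Linux errno numbers (asm-generic values), limited to the codes seen above.
LINUX_ERRNO = {
    19: "ENODEV",      # No such device
    110: "ETIMEDOUT",  # Connection timed out
}

# Hypothetical pattern covering the three spellings in this log:
# "rc=-19", "response: -110" and "failed with error -110".
ERR_RE = re.compile(r"(?:rc=|response:\s*|error\s+)-(\d+)")


def decode_kernel_errors(log_text):
    """Return (code, name) pairs for each negative return code found."""
    # Collapse whitespace first so that codes pushed onto the next line
    # by mail wrapping are still matched.
    flat = " ".join(log_text.split())
    return [(int(code), LINUX_ERRNO.get(int(code), "UNKNOWN"))
            for code in ERR_RE.findall(flat)]
```

Run over the log above, this reports ENODEV for the sbefifo reset and ETIMEDOUT
for the two occ-hwmon lines, which would be consistent with the device behind the
fsi slave no longer answering.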

On the first run we have no problems with fsi or sbefifo: no fifo errors and no
problems switching from SOFTWARE_PGOOD to BMC_POWER_UP. On subsequent reboots
the host is unresponsive and only a manual power cycle helps.

From my point of view, it seems the fsi slave became unresponsive, which caused
the fifo problem and the other failures.
We're looking for guidance on how to debug this. Could a hard fsi reset via
devmem help?
Would some debug output from fsi also help to understand what's going on?
I saw that there is 'trace_enabled' in the coldfire code, but there is no option
to enable it. Is there another, proper way to turn it on?

Thanks,

Ivan.


