How to deal with fatal errors caused by the host in OpenBMC
Bills, Jason M
jason.m.bills at linux.intel.com
Fri Mar 13 02:32:22 AEDT 2020
On 3/11/2020 11:40 PM, zhang_cy1989 wrote:
> Dear All
> There are some fatal errors on the host side.
> For example:
> Uncorrectable ECC / other uncorrectable memory errors
> Unrecoverable hard-disk device failure...
> PCIe AER, and so on.
> How does the BMC get the reasons for those fatal errors?
> Does the BIOS give that information to the BMC over IPMI?
For Intel platforms, most of those errors (ECC, PCIe, etc.) are handled
and reported by BIOS over IPMI.
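
As an illustration of what such a report looks like on the BMC side, here is a
minimal Python sketch that decodes a 16-byte IPMI SEL system event record. The
field layout and the sensor type codes follow the IPMI 2.0 specification; the
function name, the small lookup table, and the sample record bytes are made up
for this example and are not taken from any OpenBMC code.

```python
import struct

# Subset of the IPMI 2.0 sensor type table: sensor type -> (name, offsets).
SENSOR_EVENTS = {
    0x0C: ("Memory", {0x00: "Correctable ECC", 0x01: "Uncorrectable ECC"}),
    0x0D: ("Drive Slot", {0x01: "Drive Fault"}),
    0x13: ("Critical Interrupt", {0x04: "PCI PERR", 0x05: "PCI SERR",
                                  0x08: "Bus Uncorrectable Error",
                                  0x0A: "Bus Fatal Error"}),
}


def decode_sel_record(record: bytes) -> dict:
    """Decode a 16-byte SEL system event record (record type 0x02)."""
    if len(record) != 16:
        raise ValueError("a SEL record is exactly 16 bytes")
    # Little-endian: record ID, record type, timestamp, generator ID,
    # EvM revision, sensor type, sensor number, event dir/type.
    (record_id, record_type, timestamp, generator_id, _evm_rev,
     sensor_type, sensor_num, dir_type) = struct.unpack_from("<HBIHBBBB", record)
    if record_type != 0x02:
        raise ValueError("not a system event record")
    event_data = record[13:16]
    offset = event_data[0] & 0x0F          # low nibble of Event Data 1
    name, offsets = SENSOR_EVENTS.get(sensor_type, ("Unknown", {}))
    return {
        "record_id": record_id,
        "timestamp": timestamp,
        "generator_id": generator_id,
        "sensor": name,
        "sensor_number": sensor_num,
        "asserted": (dir_type & 0x80) == 0,  # bit 7 set means deassertion
        "event": offsets.get(offset, "offset 0x%02x" % offset),
    }


# Hypothetical record: memory sensor, sensor-specific event type (0x6F),
# offset 0x01 (uncorrectable ECC).
sample = bytes.fromhex("0100 02 00006a5e 0100 04 0c 02 6f a1 00 10")
event = decode_sel_record(sample)
print(event["sensor"], "-", event["event"])
```

The OEM-specific meaning of Event Data 2 and 3 (for example which DIMM or PCIe
device is affected) depends on the platform BIOS and is not decoded here.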
> Or through something like PECI on Intel platforms?
For errors that hang the host (IERR, ERR[2] timeout, etc.), the BMC
detects them via GPIO and uses PECI to get additional information about
the error.
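
To make the GPIO side more concrete, here is a rough Python sketch of one way
to tell a sustained error signal from a short pulse before going on to read
details over PECI. It only works on a recorded list of level changes; the
function name, the hold time, and the sample timings are assumptions for
illustration and do not describe how host-error-monitor is actually written.

```python
def hang_detected_at(edges, hold_ms, end_ms):
    """Return the time (ms) at which the signal has stayed asserted for
    hold_ms, or None if that never happens before end_ms.

    edges: (time_ms, asserted) level changes in time order.
    """
    asserted_since = None
    # Append a closing sample so an assertion still active at end_ms is checked.
    for t, asserted in list(edges) + [(end_ms, False)]:
        if asserted_since is not None and t - asserted_since >= hold_ms:
            return asserted_since + hold_ms
        if asserted:
            if asserted_since is None:
                asserted_since = t   # start of a new assertion
        else:
            asserted_since = None    # signal released, reset the timer
    return None


# Hypothetical trace: a 1 ms pulse at t=100, then a lasting assertion at t=500.
trace = [(100, True), (101, False), (500, True)]
print(hang_detected_at(trace, hold_ms=2000, end_ms=3000))
print(hang_detected_at(trace, hold_ms=2000, end_ms=1000))
```

On real hardware the edges would come from a GPIO event source rather than a
list, and the point where this function returns a time is where the PECI reads
for additional error information would start.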
>
> What recipes can I refer to in OpenBMC?
You can see the current Intel host-error-monitor application here:
https://github.com/Intel-BMC/host-error-monitor
> Waiting for your help!
> Thanks.
> Felix
>
>