<div dir="ltr">Hi team, <div><br></div><div>I've got a question regarding correctable-error logging in OpenBMC on the P8 platform. When a machine crashes completely due to a checkstop, an eSEL is generated and we have some data for post-mortem analysis, but do we have any support for correctable errors such as ECC errors or correctable MCs? </div><div><br></div><div>Also, is there any way to inject a correctable error manually (the way we can inject a DIMM UE with putscom)? This could be really valuable for RAS compliance verification. </div><div><br></div><div>Actually, I have a real use case right now. One of the SPEC2006 tests (456.hmmer) causes the following error on a P8 test server (OpenBMC), and it does not generate any eSEL events: <br></div><div><tt><br>
</tt><tt>[12323.883272] Harmless Hypervisor Maintenance interrupt
[Recovered]</tt><tt><br>
</tt><tt>[12323.883323] Error detail: Processor Recovery done</tt><tt><br>
</tt><tt>[12323.883361] HMER: 2040400000000000</tt><tt><br>
</tt><tt>[12323.883392] Harmless Hypervisor Maintenance interrupt
[Recovered]</tt><tt><br>
</tt><tt>[12323.883442] Error detail: Processor Recovery done</tt><tt><br>
</tt><tt>[12323.883482] HMER: 2040400000000000</tt><tt><br>
</tt><tt>[15281.455845] hmmer_base.Linu[78208]: unhandled signal 11
at 0000000000000004 nip 00000000100304a4 lr 000000001003d9ac code
30001</tt><br></div><div><tt><br></tt></div><div>So I really do not have any data to root-cause this. Could it be a software error? </div><div>In any case, this is detected by OPAL, so I am wondering why no eSEL is generated. </div><div><br></div><div>Thanks in advance, </div><div><br></div><div>regards,</div><div>Sergey </div></div>