[RFC] powerpc/powernv/mce: Don't silently restart the machine
Stewart Smith
stewart at linux.vnet.ibm.com
Wed Feb 21 15:54:20 AEDT 2018
Balbir Singh <bsingharora at gmail.com> writes:
> On MCE the current code will restart the machine with
> ppc_md.restart(). This case was extremely unlikely since
> prior to that a skiboot call is made and that resulted in
> a checkstop for analysis.
>
> With newer skiboots, on P9 we don't checkstop the box by
> default, instead we return back to the kernel to extract
> useful information at the time of the MCE. While we still
> get this information, this patch converts the restart to
> a panic(), so that if configured a dump can be taken and
> we can track and probably debug the potential issue causing
> the MCE.
I agree with the patch, although I'd be nervous stating that skiboot is
going to keep this behaviour. In *theory* we should only ever get a
platform error when there's actually something that isn't the kernel's
fault.
Like any firmware promise though, it's slightly less reliable than one
from a politician.
I'd say that in this case deferring to policy on what to do in the event
of a panic() is the right thing.
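To make the "deferring to policy" point concrete, here is a rough
user-space sketch of the difference between an unconditional restart and
a panic(). It is not kernel or skiboot code, and every name in it
(panic_policy, outcome_with_restart, outcome_with_panic) is made up for
illustration. It assumes the usual defaults: a loaded crash kernel gets
control first, and otherwise the panic timeout decides (zero means stay
down, negative means reboot immediately, positive means reboot after that
many seconds).

```c
/* User-space model only: none of these names exist in the kernel or skiboot. */
#include <stdbool.h>

enum mce_outcome {
    MCE_TAKE_DUMP,      /* crash kernel loaded: a dump can be captured */
    MCE_REBOOT_NOW,     /* reboot immediately, no state preserved */
    MCE_REBOOT_DELAYED, /* reboot after the configured number of seconds */
    MCE_HALT            /* stay down so the machine can be inspected */
};

/* Hypothetical summary of what the administrator has configured. */
struct panic_policy {
    bool crash_kernel_loaded; /* assumed: kdump/crash kernel is set up */
    int panic_timeout;        /* assumed: same meaning as the panic= setting */
};

/* Old behaviour: the restart ignores whatever policy was configured. */
enum mce_outcome outcome_with_restart(const struct panic_policy *p)
{
    (void)p; /* policy is never consulted */
    return MCE_REBOOT_NOW;
}

/* Proposed behaviour: panic() lets the configured policy decide. */
enum mce_outcome outcome_with_panic(const struct panic_policy *p)
{
    if (p->crash_kernel_loaded)
        return MCE_TAKE_DUMP;
    if (p->panic_timeout < 0)
        return MCE_REBOOT_NOW;
    if (p->panic_timeout > 0)
        return MCE_REBOOT_DELAYED;
    return MCE_HALT;
}
```

With a crash kernel loaded, outcome_with_panic() reports that a dump can
be taken, while outcome_with_restart() always reboots regardless of what
was configured, which is the behaviour the patch is trying to get rid of.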
--
Stewart Smith
OPAL Architect, IBM.