powerpc/powernv/mce: Don't silently restart the machine
Michael Ellerman
mpe at ellerman.id.au
Wed Feb 28 20:49:32 AEDT 2018
Balbir Singh <bsingharora at gmail.com> writes:
> On MCE the current code will restart the machine with
> ppc_md.restart(). This case was extremely unlikely since
> prior to that a skiboot call is made and that resulted in
> a checkstop for analysis.
>
> With newer skiboots, on P9 we don't checkstop the box by
> default, instead we return back to the kernel to extract
> useful information at the time of the MCE. While we still
> get this information, this patch converts the restart to
> a panic(), so that if configured a dump can be taken and
> we can track and probably debug the potential issue causing
> the MCE.
>
> Signed-off-by: Balbir Singh <bsingharora at gmail.com>
> Reviewed-by: Nicholas Piggin <npiggin at gmail.com>
> ---
> arch/powerpc/platforms/powernv/opal.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c
> index 69b5263fc9e3..b510a6f41b00 100644
> --- a/arch/powerpc/platforms/powernv/opal.c
> +++ b/arch/powerpc/platforms/powernv/opal.c
> @@ -500,9 +500,12 @@ void pnv_platform_error_reboot(struct pt_regs *regs, const char *msg)
^^^^^^^^^^^^^^^
Why don't we use the msg ...
> * opal to trigger checkstop explicitly for error analysis.
> * The FSP PRD component would have already got notified
> * about this error through other channels.
> + * 4. We are running on a newer skiboot that by default does
> + * not cause a checkstop, drops us back to the kernel to
> + * extract context and state at the time of the error.
> */
>
> - ppc_md.restart(NULL);
> + panic("PowerNV Unrecovered Machine Check");
^
Here.
Because we can get here from an HMI, it's confusing to print "Machine
Check" in that case, and we already have the msg.
So just:
> + panic(msg);
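
To illustrate the point, here is a minimal user-space sketch, not kernel code. fake_panic(), both error_reboot_* helpers, demo() and the HMI message string are hypothetical stand-ins; only the hardcoded string comes from the patch above. It shows that a hardcoded text mislabels an HMI caller, while passing msg through keeps the caller's own description:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's panic(): it only records the
 * text so the result can be inspected instead of halting anything. */
static char last_panic[128];

static void fake_panic(const char *text)
{
	snprintf(last_panic, sizeof(last_panic), "%s", text);
}

/* Variant as posted in the patch: the text is hardcoded, msg is ignored. */
static void error_reboot_hardcoded(const char *msg)
{
	(void)msg;
	fake_panic("PowerNV Unrecovered Machine Check");
}

/* Variant suggested in the review: reuse the caller's msg. */
static void error_reboot_with_msg(const char *msg)
{
	fake_panic(msg);
}

/* Assumed example of a non-MCE caller; the string is illustrative only. */
static void demo(void)
{
	const char *hmi_msg = "Unrecoverable HMI exception";

	error_reboot_hardcoded(hmi_msg);
	/* Misleading: an HMI caller is reported as a machine check. */
	assert(strstr(last_panic, "Machine Check") != NULL);

	error_reboot_with_msg(hmi_msg);
	/* The caller's own description survives. */
	assert(strcmp(last_panic, hmi_msg) == 0);
}
```

Calling demo() from a test program runs both variants with the same caller-supplied message, and only the second one reports the HMI as an HMI.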
cheers