[PATCH v2 3/3] powerpc: machine check interrupt is a non-maskable interrupt

Christophe LEROY christophe.leroy at c-s.fr
Tue Oct 9 15:46:30 AEDT 2018



On 09/10/2018 at 06:32, Nicholas Piggin wrote:
> On Mon, 8 Oct 2018 17:39:11 +0200
> Christophe LEROY <christophe.leroy at c-s.fr> wrote:
> 
>> Hi Nick,
>>
>> On 19/07/2017 at 08:59, Nicholas Piggin wrote:
>>> Use nmi_enter similarly to system reset interrupts. This uses the
>>> printk NMI buffers and turns off various debugging facilities, which
>>> helps avoid tripping on ourselves or other CPUs.
>>>
>>> Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
>>> ---
>>>    arch/powerpc/kernel/traps.c | 9 ++++++---
>>>    1 file changed, 6 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
>>> index 2849c4f50324..6d31f9d7c333 100644
>>> --- a/arch/powerpc/kernel/traps.c
>>> +++ b/arch/powerpc/kernel/traps.c
>>> @@ -789,8 +789,10 @@ int machine_check_generic(struct pt_regs *regs)
>>>    
>>>    void machine_check_exception(struct pt_regs *regs)
>>>    {
>>> -	enum ctx_state prev_state = exception_enter();
>>>    	int recover = 0;
>>> +	bool nested = in_nmi();
>>> +	if (!nested)
>>> +		nmi_enter();
>>
>> This alters preempt_count, so when die() is called, in_interrupt()
>> returns true although the trap didn't happen in interrupt context.
>> As a result, oops_end() panics for "fatal exception in interrupt"
>> instead of gently sending SIGBUS to the faulting app.
> 
> Thanks for tracking that down.
> 
>> Any idea on how to fix this?
> 
> I would say we have to deliver the sigbus by hand.
> 
>      if (user_mode(regs))
>          _exception(SIGBUS, regs, BUS_MCEERR_AR, regs->nip);
>      else
>          die("Machine check", regs, SIGBUS);
> 

And what about all the other things done by die()?

And what if it is a kernel thread?

On one of my boards, I have a kernel thread that regularly checks the HW.
If it gets a machine check, I expect it to stop gently and the die
notification to be delivered to all registered notifiers.

Before this patch, this worked well.
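
To illustrate the preempt_count problem, here is a small user-space model
(not kernel code). The model_* names are hypothetical. The bit layout and
the amount added by nmi_enter() follow my reading of include/linux/preempt.h
and include/linux/hardirq.h, so please treat them as assumptions to be
checked against the tree. The model only covers the in_interrupt() test
that oops_end() performs, nothing else that die() does.

```c
#include <stdbool.h>

/* Assumed preempt_count bit layout (modelled on include/linux/preempt.h). */
#define PREEMPT_BITS	8
#define SOFTIRQ_BITS	8
#define HARDIRQ_BITS	4
#define NMI_BITS	1

#define SOFTIRQ_SHIFT	PREEMPT_BITS
#define HARDIRQ_SHIFT	(SOFTIRQ_SHIFT + SOFTIRQ_BITS)
#define NMI_SHIFT	(HARDIRQ_SHIFT + HARDIRQ_BITS)

#define SOFTIRQ_MASK	(((1UL << SOFTIRQ_BITS) - 1) << SOFTIRQ_SHIFT)
#define HARDIRQ_MASK	(((1UL << HARDIRQ_BITS) - 1) << HARDIRQ_SHIFT)
#define NMI_MASK	(((1UL << NMI_BITS) - 1) << NMI_SHIFT)

#define HARDIRQ_OFFSET	(1UL << HARDIRQ_SHIFT)
#define NMI_OFFSET	(1UL << NMI_SHIFT)

/* Hypothetical stand-in for the per-task preempt count; 0 = plain task context. */
static unsigned long model_preempt_count;

enum model_outcome {
	MODEL_DO_EXIT,	/* oops_end() goes on to kill only the faulting task */
	MODEL_PANIC,	/* oops_end() panics: "fatal exception in interrupt" */
};

static bool model_in_nmi(void)
{
	return (model_preempt_count & NMI_MASK) != 0;
}

static bool model_in_interrupt(void)
{
	return (model_preempt_count &
		(HARDIRQ_MASK | SOFTIRQ_MASK | NMI_MASK)) != 0;
}

/* Assumption: nmi_enter() bumps both the NMI and the hardirq fields. */
static void model_nmi_enter(void)
{
	model_preempt_count += NMI_OFFSET + HARDIRQ_OFFSET;
}

static void model_nmi_exit(void)
{
	model_preempt_count -= NMI_OFFSET + HARDIRQ_OFFSET;
}

/* Only the in_interrupt() decision of oops_end() is modelled here. */
static enum model_outcome model_oops_end(void)
{
	return model_in_interrupt() ? MODEL_PANIC : MODEL_DO_EXIT;
}

/*
 * Entry/exit sequence of machine_check_exception() as in the patch,
 * with the unrecoverable path going straight to the oops_end() check.
 */
static enum model_outcome model_machine_check(bool use_nmi_enter)
{
	bool nested = model_in_nmi();
	enum model_outcome outcome;

	if (use_nmi_enter && !nested)
		model_nmi_enter();

	outcome = model_oops_end();

	if (use_nmi_enter && !nested)
		model_nmi_exit();

	return outcome;
}
```

Starting from task context, model_machine_check(false) gives MODEL_DO_EXIT
while model_machine_check(true) gives MODEL_PANIC, which matches what I see
on the board: the information "was the task in interrupt before the machine
check" is lost once nmi_enter() has run.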

Christophe

