[PATCH v7 39/42] powerpc: move NMI entry/exit code into wrapper

Nicholas Piggin npiggin at gmail.com
Sat Feb 6 13:46:01 AEDT 2021


Excerpts from Michael Ellerman's message of February 6, 2021 9:38 am:
> Nicholas Piggin <npiggin at gmail.com> writes:
>> Excerpts from Michael Ellerman's message of February 4, 2021 8:15 pm:
>>> Nicholas Piggin <npiggin at gmail.com> writes:
>>>> This moves the common NMI entry and exit code into the interrupt handler
>>>> wrappers.
>>>>
>>>> This changes the behaviour of soft-NMI (watchdog) and HMI interrupts, and
>>>> also MCE interrupts on 64e, by adding missing parts of the NMI entry to
>>>> them.
>>>>
>>>> Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
>>>> ---
>>>>  arch/powerpc/include/asm/interrupt.h | 28 ++++++++++++++++++++++
>>>>  arch/powerpc/kernel/mce.c            | 11 ---------
>>>>  arch/powerpc/kernel/traps.c          | 35 +++++-----------------------
>>>>  arch/powerpc/kernel/watchdog.c       | 10 ++++----
>>>>  4 files changed, 38 insertions(+), 46 deletions(-)
>>> 
>>> This is unhappy when injecting SLB multi-hits:
>>> 
>>>   root at p86-2:~# echo PPC_SLB_MULTIHIT > /sys/kernel/debug/provoke-crash/DIRECT
>>>   [  312.496026][ T1344] kernel BUG at arch/powerpc/include/asm/interrupt.h:152!
>>>   [  312.496037][ T1344] Oops: Exception in kernel mode, sig: 5 [#1]
>>>   [  312.496045][ T1344] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
>>
>> pseries hash. Blast!
> 
> The worst kind.
> 
>>> 147 static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
>>> 148 {
>>> 149 	if (!IS_ENABLED(CONFIG_PPC_BOOK3S_64) ||
>>> 150 			!firmware_has_feature(FW_FEATURE_LPAR) ||
>>> 151 			radix_enabled() || (mfmsr() & MSR_DR))
>>> 152 		nmi_exit();
>>> 
>>> 
>>> So presumably it's:
>>> 
>>> #define __nmi_exit()						\
>>> 	do {							\
>>> 		BUG_ON(!in_nmi());				\
>>
>> Yes that would be it, pseries machine check enables MMU half way through 
>> so only one side of this triggers.
>>
>> The MSR_DR check is supposed to catch the other NMIs that run with MMU 
>> on (perf, watchdog, etc). Suppose it could test TRAP(regs) explicitly
>> although I wonder if we should also do this to keep things balanced
> 
> Yeah I think I like that. I'll give it a test.

The msr restore? Looking closer, pseries_machine_check_realmode may have
expected mce_handle_error to enable the MMU, because irq_work_queue uses
some per-cpu variables I think.

So the code might have to be rearranged a bit more than the patch below.

Thanks,
Nick

> 
> cheers
> 
> 
>> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
>> index 149cec2212e6..f57ca0c570be 100644
>> --- a/arch/powerpc/platforms/pseries/ras.c
>> +++ b/arch/powerpc/platforms/pseries/ras.c
>> @@ -719,6 +719,7 @@ static int mce_handle_err_virtmode(struct pt_regs *regs,
>>  
>>  static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp)
>>  {
>> +       unsigned long msr;
>>         struct pseries_errorlog *pseries_log;
>>         struct pseries_mc_errorlog *mce_log = NULL;
>>         int disposition = rtas_error_disposition(errp);
>> @@ -747,9 +748,12 @@ static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp)
>>          *       SLB multihit is done by now.
>>          */
>>  out:
>> -       mtmsr(mfmsr() | MSR_IR | MSR_DR);
>> +       msr = mfmsr();
>> +       mtmsr(msr | MSR_IR | MSR_DR);
>>         disposition = mce_handle_err_virtmode(regs, errp, mce_log,
>>                                               disposition);
>> +       mtmsr(msr);
>> +
>>         return disposition;
>>  }
>>  
> 


More information about the Linuxppc-dev mailing list