[PATCH v7 39/42] powerpc: move NMI entry/exit code into wrapper
Nicholas Piggin
npiggin at gmail.com
Sat Feb 6 13:46:01 AEDT 2021
Excerpts from Michael Ellerman's message of February 6, 2021 9:38 am:
> Nicholas Piggin <npiggin at gmail.com> writes:
>> Excerpts from Michael Ellerman's message of February 4, 2021 8:15 pm:
>>> Nicholas Piggin <npiggin at gmail.com> writes:
>>>> This moves the common NMI entry and exit code into the interrupt handler
>>>> wrappers.
>>>>
>>>> This changes the behaviour of soft-NMI (watchdog) and HMI interrupts, and
>>>> also MCE interrupts on 64e, by adding missing parts of the NMI entry to
>>>> them.
>>>>
>>>> Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
>>>> ---
>>>> arch/powerpc/include/asm/interrupt.h | 28 ++++++++++++++++++++++
>>>> arch/powerpc/kernel/mce.c | 11 ---------
>>>> arch/powerpc/kernel/traps.c | 35 +++++-----------------------
>>>> arch/powerpc/kernel/watchdog.c | 10 ++++----
>>>> 4 files changed, 38 insertions(+), 46 deletions(-)
>>>
>>> This is unhappy when injecting SLB multi-hits:
>>>
>>> root at p86-2:~# echo PPC_SLB_MULTIHIT > /sys/kernel/debug/provoke-crash/DIRECT
>>> [ 312.496026][ T1344] kernel BUG at arch/powerpc/include/asm/interrupt.h:152!
>>> [ 312.496037][ T1344] Oops: Exception in kernel mode, sig: 5 [#1]
>>> [ 312.496045][ T1344] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
>>
>> pseries hash. Blast!
>
> The worst kind.
>
>>> 147 static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
>>> 148 {
>>> 149 if (!IS_ENABLED(CONFIG_PPC_BOOK3S_64) ||
>>> 150 !firmware_has_feature(FW_FEATURE_LPAR) ||
>>> 151 radix_enabled() || (mfmsr() & MSR_DR))
>>> 152 nmi_exit();
>>>
>>>
>>> So presumably it's:
>>>
>>> #define __nmi_exit() \
>>> do { \
>>> BUG_ON(!in_nmi()); \
>>
>> Yes that would be it, pseries machine check enables MMU half way through
>> so only one side of this triggers.
>>
>> The MSR_DR check is supposed to catch the other NMIs that run with MMU
>> on (perf, watchdog, etc). Suppose it could test TRAP(regs) explicitly
>> although I wonder if we should also do this to keep things balanced
>
> Yeah I think I like that. I'll give it a test.
The msr restore? Looking closer, pseries_machine_check_realmode may have
expected mce_handle_error to enable the MMU, because irq_work_queue uses
some per-cpu variables I think.
So the code might have to be rearranged a bit more than the patch below.
Thanks,
Nick
>
> cheers
>
>
>> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
>> index 149cec2212e6..f57ca0c570be 100644
>> --- a/arch/powerpc/platforms/pseries/ras.c
>> +++ b/arch/powerpc/platforms/pseries/ras.c
>> @@ -719,6 +719,7 @@ static int mce_handle_err_virtmode(struct pt_regs *regs,
>>
>> static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp)
>> {
>> + unsigned long msr;
>> struct pseries_errorlog *pseries_log;
>> struct pseries_mc_errorlog *mce_log = NULL;
>> int disposition = rtas_error_disposition(errp);
>> @@ -747,9 +748,12 @@ static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp)
>> * SLB multihit is done by now.
>> */
>> out:
>> - mtmsr(mfmsr() | MSR_IR | MSR_DR);
>> + msr = mfmsr();
>> + mtmsr(msr | MSR_IR | MSR_DR);
>> disposition = mce_handle_err_virtmode(regs, errp, mce_log,
>> disposition);
>> + mtmsr(msr);
>> +
>> return disposition;
>> }
>>
>
More information about the Linuxppc-dev
mailing list