[PATCH v2] powerpc/mce: Fix access error in mce handler
Daniel Axtens
dja at axtens.net
Fri Sep 17 16:39:45 AEST 2021
Hi Ganesh,
> We queue an irq work for deferred processing of mce event
> in realmode mce handler, where translation is disabled.
> Queuing of the work may result in accessing memory outside
> RMO region, such access needs the translation to be enabled
> for an LPAR running with hash mmu else the kernel crashes.
>
> After enabling translation in mce_handle_error() we used to
> leave it enabled to avoid crashing here, but now with the
> commit 74c3354bc1d89 ("powerpc/pseries/mce: restore msr before
> returning from handler") we are restoring the MSR to disable
> translation.
>
> Hence to fix this enable the translation before queuing the work.
[snip]
> Fixes: 74c3354bc1d89 ("powerpc/pseries/mce: restore msr before returning from handler")
That patch changes arch/powerpc/platforms/pseries/ras.c just
below this comment:
	/*
	 * Enable translation as we will be accessing per-cpu variables
	 * in save_mce_event() which may fall outside RMO region, also
	 * leave it enabled because subsequently we will be queuing work
	 * to workqueues where again per-cpu variables accessed, besides
	 * fwnmi_release_errinfo() crashes when called in realmode on
	 * pseries.
	 * Note: All the realmode handling like flushing SLB entries for
	 * SLB multihit is done by now.
	 */
That suggests that any access to per-cpu variables needs translation
enabled. In your patch, you enable translation only around
irq_work_queue:
> +	/* Queue irq work to process this event later. Before
> +	 * queuing the work enable translation for non radix LPAR,
> +	 * as irq_work_queue may try to access memory outside RMO
> +	 * region.
> +	 */
> +	if (!radix_enabled() && firmware_has_feature(FW_FEATURE_LPAR)) {
> +		msr = mfmsr();
> +		mtmsr(msr | MSR_IR | MSR_DR);
> +		irq_work_queue(&mce_event_process_work);
> +		mtmsr(msr);
> +	} else {
> +		irq_work_queue(&mce_event_process_work);
> +	}
However, just before that in the function, there are a few things that
access per-cpu variables via the local_paca, e.g.:
	memcpy(&local_paca->mce_info->mce_event_queue[index],
	       &evt, sizeof(evt));
Do we need to widen the window where translations are enabled in order
to protect accesses to local_paca?
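To make the question concrete, here is a rough userspace mock of the two window shapes. It is not kernel code. Every name in it (mock_msr, touch_outside_rmo, queue_event_narrow, queue_event_widened and so on) is made up for illustration, and it simply assumes that the local_paca access might land outside the RMO, which is exactly the thing I am unsure about.

```c
#include <stdbool.h>
#include <string.h>

/* Mock MSR translation bits; stand-ins for MSR_IR and MSR_DR. */
#define MOCK_MSR_IR 0x20UL
#define MOCK_MSR_DR 0x10UL

/* Stands in for the real MSR read by mfmsr() and written by mtmsr(). */
static unsigned long mock_msr;

/*
 * Stands in for:
 *   !radix_enabled() && firmware_has_feature(FW_FEATURE_LPAR)
 */
static bool mock_hash_lpar = true;

struct mock_event {
	int error_type;
};

/* Stands in for local_paca->mce_info->mce_event_queue. */
static struct mock_event mock_queue[4];
static int mock_queued;	/* number of calls to the mock irq_work_queue */
static int mock_faults;	/* accesses made with translation off */

static bool translation_on(void)
{
	unsigned long bits = MOCK_MSR_IR | MOCK_MSR_DR;

	return (mock_msr & bits) == bits;
}

/*
 * Model an access that may fall outside the RMO region. On a hash
 * LPAR with translation off this would crash; here we only count it.
 */
static void touch_outside_rmo(void)
{
	if (mock_hash_lpar && !translation_on())
		mock_faults++;
}

static void mock_irq_work_queue(void)
{
	touch_outside_rmo();
	mock_queued++;
}

/* Window as in the patch: only irq_work_queue is covered. */
static void queue_event_narrow(const struct mock_event *evt, int index)
{
	unsigned long msr;

	touch_outside_rmo();	/* the paca access, translation still off */
	memcpy(&mock_queue[index], evt, sizeof(*evt));

	if (mock_hash_lpar) {
		msr = mock_msr;
		mock_msr = msr | MOCK_MSR_IR | MOCK_MSR_DR;
		mock_irq_work_queue();
		mock_msr = msr;
	} else {
		mock_irq_work_queue();
	}
}

/* Widened window: the paca access is covered as well. */
static void queue_event_widened(const struct mock_event *evt, int index)
{
	unsigned long msr = mock_msr;

	if (mock_hash_lpar)
		mock_msr = msr | MOCK_MSR_IR | MOCK_MSR_DR;

	touch_outside_rmo();	/* the paca access, translation now on */
	memcpy(&mock_queue[index], evt, sizeof(*evt));
	mock_irq_work_queue();

	if (mock_hash_lpar)
		mock_msr = msr;	/* restore, as 74c3354bc1d89 expects */
}
```

Starting from mock_msr == 0, the narrow variant counts one access with translation off, while the widened variant counts none and still leaves the MSR as it found it. Whether the real paca access can fault in this way is for you to confirm.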
Kind regards,
Daniel