[PATCH v2] powerpc/mce: Fix access error in mce handler

Ganesh ganeshgr at linux.ibm.com
Fri Sep 17 18:40:03 AEST 2021


On 9/17/21 12:09 PM, Daniel Axtens wrote:

> Hi Ganesh,
>
>> We queue an irq work for deferred processing of mce event
>> in realmode mce handler, where translation is disabled.
>> Queuing of the work may result in accessing memory outside
>> RMO region, such access needs the translation to be enabled
>> for an LPAR running with hash mmu else the kernel crashes.
>>
>> After enabling translation in mce_handle_error() we used to
>> leave it enabled to avoid crashing here, but now with the
>> commit 74c3354bc1d89 ("powerpc/pseries/mce: restore msr before
>> returning from handler") we are restoring the MSR to disable
>> translation.
>>
>> Hence to fix this enable the translation before queuing the work.
> [snip]
>
>> Fixes: 74c3354bc1d89 ("powerpc/pseries/mce: restore msr before returning from handler")
> That patch changes arch/powerpc/powerpc/platforms/pseries/ras.c just
> below this comment:
>
>      /*
>       * Enable translation as we will be accessing per-cpu variables
>       * in save_mce_event() which may fall outside RMO region, also
>       * leave it enabled because subsequently we will be queuing work
>       * to workqueues where again per-cpu variables accessed, besides
>       * fwnmi_release_errinfo() crashes when called in realmode on
>       * pseries.
>       * Note: All the realmode handling like flushing SLB entries for
>       *       SLB multihit is done by now.
>       */
>
> That suggests per-cpu variables need protection. In your patch, you
> enable translations just around irq_work_queue:

The comment is bit old, most of it doesn't make any sense now, yes per-cpu
variables cannot be accessed in realmode, but with commit 923b3cf00b3f
("powerpc/mce: Remove per cpu variables from MCE handlers") we moved all of
them to paca.

>> +	/* Queue irq work to process this event later. Before
>> +	 * queuing the work enable translation for non radix LPAR,
>> +	 * as irq_work_queue may try to access memory outside RMO
>> +	 * region.
>> +	 */
>> +	if (!radix_enabled() && firmware_has_feature(FW_FEATURE_LPAR)) {
>> +		msr = mfmsr();
>> +		mtmsr(msr | MSR_IR | MSR_DR);
>> +		irq_work_queue(&mce_event_process_work);
>> +		mtmsr(msr);
>> +	} else {
>> +		irq_work_queue(&mce_event_process_work);
>> +	}
> However, just before that in the function, there are a few things that
> access per-cpu variables via the local_paca, e.g.:
>
> memcpy(&local_paca->mce_info->mce_event_queue[index],
>         &evt, sizeof(evt));
>
> Do we need to widen the window where translations are enabled in order
> to protect accesses to local_paca?

paca will be within Real Mode Area, so it can be accessed with translate off.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20210917/bc69a6e7/attachment.htm>


More information about the Linuxppc-dev mailing list