[PATCH] powerpc/pseries: Fix MCE handling on pseries

Ganesh ganeshgr at linux.ibm.com
Wed Mar 18 01:35:39 AEDT 2020



On 3/17/20 3:31 PM, Nicholas Piggin wrote:
> Ganesh's on March 16, 2020 9:47 pm:
>>
>> On 3/14/20 9:18 AM, Nicholas Piggin wrote:
>>> Ganesh Goudar's on March 14, 2020 12:04 am:
>>>> MCE handling on pSeries platform fails as recent rework to use common
>>>> code for pSeries and PowerNV in machine check error handling tries to
>>>> access per-cpu variables in realmode. The per-cpu variables may be
>>>> outside the RMO region on pSeries platform and need translation to be
>>>> enabled for access. Just moving these per-cpu variables into the RMO region
>>>> didn't help because we queue some work to workqueues in real mode, which
>>>> again tries to touch per-cpu variables.
>>> Which queues are these? We should not be using Linux workqueues, but the
>>> powerpc mce code which uses irq_work.
>> Yes, the irq_work queue accesses memory outside the RMO.
>> irq_work_queue()->__irq_work_queue_local()->[this_cpu_ptr(&lazy_list) | this_cpu_ptr(&raised_list)]
> Hmm, okay.
>
>>>> Also fwnmi_release_errinfo()
>>>> cannot be called when translation is not enabled.
>>> Why not?
>> It crashes when we try to get the RTAS token for the "ibm,nmi-interlock" device
>> tree node. But yes, we can avoid that by storing the rtas_token somewhere; I haven't
>> tried it. Here is the backtrace I got when fwnmi_release_errinfo() was called from
>> the realmode handler.
> Okay, I actually had problems with that messing up soft-irq state too
> and so I sent a patch to get rid of it, but that's the least of your
> problems really.
>
>>>> This patch fixes this by enabling translation in the exception handler
>>>> when all required real mode handling is done. This change only affects
>>>> the pSeries platform.
>>> Not supposed to do this, because we might not be in a state
>>> where the MMU is ready to be turned on at this point.
>>>
>>> I'd like to understand better which accesses are a problem, and whether
>>> we can fix them all to be in the RMO.
>> I faced three such access problems,
>>    * accessing per-cpu data (like mce_event, mce_event_queue and mce_ue_event_queue),
>>      we can move this inside RMO.
>>    * calling fwnmi_release_errinfo().
>>    * And queuing work to irq_work_queue, not sure how to fix this.
> Yeah. The irq_work_queue one is the biggest problem.
>
> This code "worked" prior to the series unifying pseries and powernv
> machine check handlers, 9ca766f9891d ("powerpc/64s/pseries: machine
> check convert to use common event code") and friends. But it did so in
> basically the same way as your fix (i.e., it ran this early handler
> in virtual mode), and that's not really the right fix.
>
> Consider: you get a SLB multi hit on a kernel address due to hardware or
> software error. That access causes a MCE, but before the error can be
> decoded to save and flush the SLB, you turn on relocation and that
> causes another SLB multi hit...

We turn on relocation only after all the realmode handling/recovery, such as
the SLB flush and reload, is done. All we do after turning relocation on is
save the mce event to the array and queue the work to the irq_work queue.
So we are good to turn it on here.
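
To make the ordering argument concrete, here is a small user-space model of
it. This is only a sketch and not kernel code: struct mce_model and every
function name below are hypothetical and exist only for illustration. It
counts any per-cpu access made while translation is off, and it refuses to
turn translation on before the real-mode recovery step has run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the early handler state; not kernel code. */
struct mce_model {
    bool translation_on;  /* models relocation being enabled */
    bool slb_recovered;   /* real-mode SLB flush and reload done */
    int  events_saved;    /* events written to the per-cpu array */
    int  work_queued;     /* work items handed to the irq_work queue */
    int  bad_accesses;    /* per-cpu accesses made with translation off */
};

/* Every per-cpu access goes through here so bad ones can be counted. */
static void touch_percpu(struct mce_model *m)
{
    if (!m->translation_on)
        m->bad_accesses++;
}

/* Step 1: recovery that must happen in real mode. */
static void realmode_recover(struct mce_model *m)
{
    m->slb_recovered = true;
}

/* Step 2: only allowed once recovery has completed. */
static bool enable_translation(struct mce_model *m)
{
    if (!m->slb_recovered)
        return false;
    m->translation_on = true;
    return true;
}

/* Step 3: the only work left, all of which touches per-cpu data. */
static void save_event_and_queue_work(struct mce_model *m)
{
    touch_percpu(m);
    m->events_saved++;
    touch_percpu(m);
    m->work_queued++;
}

/* Runs the three steps in the order described above; 0 on success. */
int model_mce_early(struct mce_model *m)
{
    assert(m != NULL);
    realmode_recover(m);
    if (!enable_translation(m))
        return -1;
    save_event_and_queue_work(m);
    return 0;
}
```

Running model_mce_early() on a zeroed struct mce_model leaves bad_accesses at
0. The model only covers the ordering on the success path; it says nothing
about whether the MMU is actually in a usable state after recovery.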

> I think the irq_work subsystem will have to be changed to use an array
> unfortunately.
>
> Thanks,
> Nick
>
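
For reference, one way to read the array suggestion is sketched below in
plain user-space C. It is not the irq_work API and not a proposed patch:
struct pending_array, PENDING_MAX and the function names are all
hypothetical. The assumption is that a fixed-size array in statically
allocated storage can be placed where real mode can reach it, so adding an
entry touches only that storage and follows no list pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PENDING_MAX 8  /* hypothetical capacity */

typedef void (*work_fn)(void *arg);

struct pending_slot {
    work_fn fn;
    void *arg;
};

/* Fixed-size queue: no allocation and no linked-list pointers. */
struct pending_array {
    struct pending_slot slot[PENDING_MAX];
    size_t count;
};

/* Record a callback; returns false when the array is full. */
bool pending_add(struct pending_array *q, work_fn fn, void *arg)
{
    assert(q != NULL && fn != NULL);
    if (q->count >= PENDING_MAX)
        return false;
    q->slot[q->count].fn = fn;
    q->slot[q->count].arg = arg;
    q->count++;
    return true;
}

/* Run and clear all recorded callbacks; returns how many ran. */
size_t pending_run(struct pending_array *q)
{
    size_t ran = 0;

    assert(q != NULL);
    for (size_t i = 0; i < q->count; i++) {
        q->slot[i].fn(q->slot[i].arg);
        ran++;
    }
    q->count = 0;
    return ran;
}

/* Example callback: increments the int that arg points to. */
void count_call(void *arg)
{
    (*(int *)arg)++;
}
```

The sketch ignores concurrency and interrupt masking entirely, and a full
array simply drops the request, so a real version would need a policy for
overflow.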
