<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p><br>
</p>
<div class="moz-cite-prefix">
<pre>On 9/6/21 6:03 PM, Michael Ellerman wrote:</pre>
</div>
<blockquote type="cite" cite="mid:87y289natb.fsf@mpe.ellerman.id.au">
<pre class="moz-quote-pre" wrap="">Ganesh Goudar <a class="moz-txt-link-rfc2396E" href="mailto:ganeshgr@linux.ibm.com"><ganeshgr@linux.ibm.com></a> writes:
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">We queue an irq work for deferred processing of mce event
in realmode mce handler, where translation is disabled.
Queuing of the work may result in accessing memory outside
RMO region, such access needs the translation to be enabled
for an LPAR running with hash mmu else the kernel crashes.
So enable the translation before queuing the work.
Without this change following trace is seen on injecting machine
check error in an LPAR running with hash mmu.
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
What type of error are you injecting?</pre>
</blockquote>
<pre>SLB multihit in kernel mode.</pre>
<blockquote type="cite" cite="mid:87y289natb.fsf@mpe.ellerman.id.au">
<pre class="moz-quote-pre" wrap="">
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
CPU: 5 PID: 1883 Comm: insmod Tainted: G OE 5.14.0-mce+ #137
NIP: c000000000735d60 LR: c000000000318640 CTR: 0000000000000000
REGS: c00000001ebff9a0 TRAP: 0300 Tainted: G OE (5.14.0-mce+)
MSR: 8000000000001003 <SF,ME,RI,LE> CR: 28008228 XER: 00000001
CFAR: c00000000031863c DAR: c00000027fa8fe08 DSISR: 40000000 IRQMASK: 0
GPR00: c0000000003186d0 c00000001ebffc40 c000000001b0df00 c0000000016337e8
GPR04: c0000000016337e8 c00000027fa8fe08 0000000000000023 c0000000016337f0
GPR08: 0000000000000023 c0000000012ffe08 0000000000000000 c008000001460240
GPR12: 0000000000000000 c00000001ec9a900 c00000002ac4bd00 0000000000000000
GPR16: 00000000000005a0 c0080000006b0000 c0080000006b05a0 c000000000ff3068
GPR20: c00000002ac4bbc0 0000000000000001 c00000002ac4bbc0 c008000001490298
GPR24: c008000001490108 c000000001636198 c008000001470090 c008000001470058
GPR28: 0000000000000510 c008000001000000 c008000008000019 0000000000000019
NIP [c000000000735d60] llist_add_batch+0x0/0x40
LR [c000000000318640] __irq_work_queue_local+0x70/0xc0
Call Trace:
[c00000001ebffc40] [c00000001ebffc0c] 0xc00000001ebffc0c (unreliable)
[c00000001ebffc60] [c0000000003186d0] irq_work_queue+0x40/0x70
[c00000001ebffc80] [c00000000004425c] machine_check_queue_event+0xbc/0xd0
[c00000001ebffcf0] [c00000000000838c] machine_check_early_common+0x16c/0x1f4
Fixes: 74c3354bc1d89 ("powerpc/pseries/mce: restore msr before returning from handler")
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
Please explain in more detail why that commit caused this breakage.</pre>
</blockquote>
<pre>After enabling translation in mce_handle_error() we used to leave it enabled to avoid
crashing here, but now with this commit we are restoring the MSR to disable translation.
Missed to mention it in commit log, I will add it.</pre>
<blockquote type="cite" cite="mid:87y289natb.fsf@mpe.ellerman.id.au">
<pre class="moz-quote-pre" wrap="">
</pre>
<blockquote type="cite">
<pre class="moz-quote-pre" wrap="">diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 47a683cd00d2..9d1e39d42e3e 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -249,6 +249,7 @@ void machine_check_queue_event(void)
{
int index;
struct machine_check_event evt;
+ unsigned long msr;
if (!get_mce_event(&evt, MCE_EVENT_RELEASE))
return;
@@ -262,8 +263,19 @@ void machine_check_queue_event(void)
memcpy(&local_paca->mce_info->mce_event_queue[index],
&evt, sizeof(evt));
- /* Queue irq work to process this event later. */
- irq_work_queue(&mce_event_process_work);
+ /* Queue irq work to process this event later. Before
+ * queuing the work enable translation for non radix LPAR,
+ * as irq_work_queue may try to access memory outside RMO
+ * region.
+ */
+ if (!radix_enabled() && firmware_has_feature(FW_FEATURE_LPAR)) {
+ msr = mfmsr();
+ mtmsr(msr | MSR_IR | MSR_DR);
+ irq_work_queue(&mce_event_process_work);
+ mtmsr(msr);
+ } else {
+ irq_work_queue(&mce_event_process_work);
+ }
}
</pre>
</blockquote>
<pre class="moz-quote-pre" wrap="">
We already went to virtual mode and queued (different) irq work in
arch/powerpc/platforms/pseries/ras.c:mce_handle_error()
We also called save_mce_event() which also might have queued irq work,
via machine_check_ue_event().
So it really feels like something about the design is wrong if we have
to go to virtual mode again and queue more irq work here.
I guess we can probably merge this as a backportable fix, doing anything
else would be a bigger change.</pre>
</blockquote>
<pre>I agree.
</pre>
<blockquote type="cite" cite="mid:87y289natb.fsf@mpe.ellerman.id.au">
<pre class="moz-quote-pre" wrap="">
Looking at ras.c there's the comment:
* Enable translation as we will be accessing per-cpu variables
* in save_mce_event() which may fall outside RMO region, also
But AFAICS it's only irq_work_queue() that touches anything percpu?</pre>
</blockquote>
<pre>Yeah, we left the comment unchanged after doing some modifications around it,
It needs to be updated, ill send a separate patch for it.
</pre>
<blockquote type="cite" cite="mid:87y289natb.fsf@mpe.ellerman.id.au">
<pre class="moz-quote-pre" wrap="">
So maybe we should just not be using irq_work_queue(). It's a pretty
thin wrapper around set_dec(1), perhaps we just need to hand-roll some
real-mode friendly way of doing that.</pre>
</blockquote>
<pre>You mean, have separate queue and run the work from timer handler?
</pre>
<blockquote type="cite" cite="mid:87y289natb.fsf@mpe.ellerman.id.au">
<pre class="moz-quote-pre" wrap="">
cheers
</pre>
</blockquote>
</body>
</html>