[PATCH 4/4] powerpc/book3s: Fix guest MC delivery mechanism to avoid soft lockups in guest.
Mahesh J Salgaonkar
mahesh at linux.vnet.ibm.com
Tue Jun 17 18:44:41 EST 2014
On 2014-06-17 16:23:58 Tue, Paul Mackerras wrote:
> On Wed, Jun 11, 2014 at 02:18:21PM +0530, Mahesh J Salgaonkar wrote:
> > From: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>
> >
> > Currently we forward MCEs to guest which have been recovered by guest.
> > And for unhandled errors we do not deliver the MCE to guest. It looks like
> > with no support of FWNMI in qemu, guest just panics whenever we deliver the
> > recovered MCEs to guest. Also, the existing code used to return to host for
> > unhandled errors, which was causing guest to hang with soft lockups inside
> > guest and making it difficult to recover guest instance.
> >
> > This patch now forwards all fatal MCEs to guest causing guest to crash/panic.
> > And, for recovered errors we just go back to normal functioning of guest
> > instead of returning to host.
>
> ... having corrupted possibly live values that the guest had in SRR0/1.
>
> Ideally the guest should have cleared MSR[RI] before putting values in
> SRR0/1, so perhaps you could check that and return to the guest
> without giving it a machine check if MSR[RI] is set. But if MSR[RI]
> is clear, the guest is unfixably corrupted because the machine check
> overwrote SRR0/1, and the only thing we can do, in the absence of
> FWNMI support, is give the guest a machine check interrupt and let it
> crash.
Yes, agreed. I have a patch (below) ready for this; I will test and verify it
and send it out soon.
Thanks,
-Mahesh.
-------------
Deliver machine check with MSR(RI=0) to guest as MCE
From: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>
---
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 868347e..c9c56ee 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -2257,7 +2257,6 @@ machine_check_realmode:
mr r3, r9 /* get vcpu pointer */
bl kvmppc_realmode_machine_check
nop
- cmpdi r3, 0 /* Did we handle MCE ? */
ld r9, HSTATE_KVM_VCPU(r13)
li r12, BOOK3S_INTERRUPT_MACHINE_CHECK
/*
@@ -2270,13 +2269,18 @@ machine_check_realmode:
* The old code used to return to host for unhandled errors which
* was causing guest to hang with soft lockups inside guest and
* makes it difficult to recover guest instance.
+ *
+ * if we receive machine check with MSR(RI=0) then deliver it to
+ * guest as machine check causing guest to crash.
*/
- ld r10, VCPU_PC(r9)
ld r11, VCPU_MSR(r9)
+ andi. r10, r11, MSR_RI /* check for unrecoverable exception */
+ beq 1f /* Deliver a machine check to guest */
+ ld r10, VCPU_PC(r9)
+ cmpdi r3, 0 /* Did we handle MCE ? */
bne 2f /* Continue guest execution. */
/* If not, deliver a machine check. SRR0/1 are already set */
- li r10, BOOK3S_INTERRUPT_MACHINE_CHECK
- ld r11, VCPU_MSR(r9)
+1: li r10, BOOK3S_INTERRUPT_MACHINE_CHECK
bl kvmppc_msr_interrupt
2: b fast_interrupt_c_return
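
For readers who do not follow the assembly, the branch logic of the patched
hunk can be modelled in C roughly as below. This is only an illustrative
sketch, not kernel code: the enum, the function name mc_decide() and its
parameters are hypothetical names made up for this example. MSR_RI is the
Recoverable Interrupt bit (0x2), guest_msr stands for the value loaded from
VCPU_MSR(r9), and handled stands for the value that
kvmppc_realmode_machine_check leaves in r3.

```c
#include <stdint.h>

/* MSR[RI] (Recoverable Interrupt) bit, as in the kernel's MSR_RI mask. */
#define MSR_RI 0x2ULL

/* Hypothetical names for the two outcomes of the patched code path. */
enum mc_action {
    MC_RESUME_GUEST,      /* label 2: continue guest execution         */
    MC_DELIVER_TO_GUEST   /* label 1: inject a machine check interrupt */
};

/*
 * Model of the decision made after kvmppc_realmode_machine_check returns.
 *   guest_msr: guest MSR at the time of the machine check (VCPU_MSR)
 *   handled:   non-zero if the host handled (recovered) the error (r3)
 */
static enum mc_action mc_decide(uint64_t guest_msr, long handled)
{
    /* andi. r10, r11, MSR_RI ; beq 1f
     * RI clear: SRR0/1 may have held live guest values that the machine
     * check overwrote, so the guest cannot be resumed safely. */
    if ((guest_msr & MSR_RI) == 0)
        return MC_DELIVER_TO_GUEST;

    /* cmpdi r3, 0 ; bne 2f
     * RI set and the error was handled: go back to the guest. */
    if (handled != 0)
        return MC_RESUME_GUEST;

    /* RI set but the error was not handled: deliver the machine check. */
    return MC_DELIVER_TO_GUEST;
}
```

In this model the only case that resumes the guest is MSR[RI] set together
with a handled error; every other combination ends in a machine check being
delivered to the guest.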