[RFC PATCH 2/2] KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9

Suraj Jitindar Singh sjitindarsingh at gmail.com
Wed Jan 3 10:15:21 AEDT 2018


On Fri, 2017-12-08 at 17:11 +1100, Paul Mackerras wrote:
> POWER9 has hardware bugs relating to transactional memory and thread
> reconfiguration (changes to hardware SMT mode).  Specifically, the core
> does not have enough storage to store a complete checkpoint of all the
> architected state for all four threads.  The DD2.2 version of POWER9
> includes hardware modifications designed to allow hypervisor software
> to implement workarounds for these problems.  This patch implements
> those workarounds in KVM code so that KVM guests see a full, working
> transactional memory implementation.
> 
> The problems center around the use of TM suspended state, where the
> CPU has a checkpointed state but execution is not transactional.  The
> workaround is to implement a "fake suspend" state, which looks to the
> guest like suspended state but the CPU does not store a checkpoint.
> In this state, any instruction that would cause a transition to
> transactional state (rfid, rfebb, mtmsrd, tresume) or would use the
> checkpointed state (treclaim) causes a "soft patch" interrupt (vector
> 0x1500) to the hypervisor so that it can be emulated.  The trechkpt
> instruction also causes a soft patch interrupt.
> 
> On POWER9 DD2.2, we avoid returning to the guest in any state which
> would require a checkpoint to be present.  The trechkpt in the guest
> entry path which would normally create that checkpoint is replaced by
> either a transition to fake suspend state, if the guest is in suspend
> state, or a rollback to the pre-transactional state if the guest is in
> transactional state.  Fake suspend state is indicated by a flag in the
> PACA plus a new bit in the PSSCR.  The new PSSCR bit is write-only and
> reads back as 0.
> 
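To make the entry-path decision above concrete, here is a rough C
sketch of it. This is not the code in the patch (the real logic is
assembly in book3s_hv_rmhandlers.S); tm_entry_action() and the enum are
made-up names for illustration, and the bit positions follow the
MSR_TS_S/MSR_TS_T definitions in arch/powerpc/include/asm/reg.h as I
understand them.

```c
#include <stdint.h>

/* MSR[TS] encodings, assumed to match asm/reg.h on powerpc. */
#define MSR_TS_S	(1ULL << 33)	/* TM state: suspended */
#define MSR_TS_T	(1ULL << 34)	/* TM state: transactional */
#define MSR_TS_MASK	(MSR_TS_S | MSR_TS_T)

/* Hypothetical names, used only for this illustration. */
enum entry_action {
	ENTER_NORMAL,		/* no checkpoint needed */
	ENTER_FAKE_SUSPEND,	/* guest was suspended: enter fake suspend */
	ENTER_ROLLBACK,		/* guest was transactional: roll back */
};

/*
 * Decide what replaces trechkpt on guest entry on DD2.2.
 * The reserved encoding (both TS bits set) is not treated specially
 * in this sketch and falls through to ENTER_NORMAL.
 */
static enum entry_action tm_entry_action(uint64_t guest_msr)
{
	switch (guest_msr & MSR_TS_MASK) {
	case MSR_TS_S:
		return ENTER_FAKE_SUSPEND;
	case MSR_TS_T:
		return ENTER_ROLLBACK;
	default:
		return ENTER_NORMAL;
	}
}
```

In either non-trivial case the guest is never entered with a real
checkpoint present, which is the property the paragraph describes.
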
> On exit from the guest, if the guest is in fake suspend state, we still
> do the treclaim instruction as we would in real suspend state, in order
> to get into non-transactional state, but we do not save the resulting
> register state since there was no checkpoint.
> 
> Emulation of the instructions that cause a soft patch interrupt is
> handled in two paths.  If the guest is in real suspend mode, we call
> kvmhv_p9_tm_emulation_early() to handle the cases where the guest is
> transitioning to transactional state.  This is called before we do
> the treclaim in the guest exit path; because we haven't done treclaim,
> we can get back to the guest with the transaction still active.
> If the instruction is a case that kvmhv_p9_tm_emulation_early() doesn't
> handle, or if the guest is in fake suspend state, then we proceed to
> do the complete guest exit path and subsequently call
> kvmhv_p9_tm_emulation() in host context with the MMU on.  This
> handles all the cases including the cases that generate program
> interrupts (illegal instruction or TM Bad Thing) and facility
> unavailable interrupts.
> 
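The two-path dispatch described above can be summarised as a small
decision function. This is only a sketch under my reading of the
description: tm_softpatch_path(), its parameters and the enum are
hypothetical names, not anything in the patch, and early_handled stands
for "kvmhv_p9_tm_emulation_early() dealt with this instruction".

```c
#include <stdbool.h>

/* Hypothetical names, used only for this illustration. */
enum tm_emul_path {
	TM_EMUL_EARLY_RESUME,	/* emulated before treclaim, back to guest */
	TM_EMUL_FULL_EXIT,	/* full guest exit, emulate in host context */
};

/*
 * fake_suspend:  guest is in fake suspend state (no real checkpoint).
 * early_handled: the early handler covered this instruction; only
 *                meaningful when the guest is in real suspend state.
 */
static enum tm_emul_path tm_softpatch_path(bool fake_suspend,
					   bool early_handled)
{
	/* The early path is only tried in real suspend state. */
	if (!fake_suspend && early_handled)
		return TM_EMUL_EARLY_RESUME;

	/* Everything else takes the complete exit path. */
	return TM_EMUL_FULL_EXIT;
}
```

The point of the early path is that the transaction is still live when
it runs, so it has to happen before the treclaim in the exit path.
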
> The emulation is reasonably straightforward and is mostly concerned
> with checking for exception conditions and updating the state of
> registers such as MSR and CR0.  The treclaim emulation takes care to
> ensure that the TEXASR register gets updated as if it were the guest
> treclaim instruction that had done failure recording, not the treclaim
> done in hypervisor state in the guest exit path.
> 
> Signed-off-by: Paul Mackerras <paulus at ozlabs.org>
> 

With the following patch applied on top of the TM emulation code I was
able to get at least a basic test to run on the guest on real hardware.

[snip]

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index c7fe377ff6bc..adf2da6b2211 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -3049,6 +3049,7 @@ BEGIN_FTR_SECTION
        li      r0, PSSCR_FAKE_SUSPEND
        andc    r3, r3, r0
        mtspr   SPRN_PSSCR, r3
+       ld      r9, HSTATE_KVM_VCPU(r13)
        b       1f
 2:
 END_FTR_SECTION_IFSET(CPU_FTR_P9_TM_EMUL)
@@ -3273,8 +3274,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_P9_TM_EMUL)
        b       9b              /* and return */
 10:    stdu    r1, -PPC_MIN_STKFRM(r1)
        /* guest is in transactional state, so simulate rollback */
+       mr      r3, r4
        bl      kvmhv_emulate_tm_rollback
        nop
+       ld      r4, HSTATE_KVM_VCPU(r13) /* our vcpu pointer has been trashed */
        addi    r1, r1, PPC_MIN_STKFRM
        b       9b
 #endif

