[RFC PATCH 2/2] KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9
Suraj Jitindar Singh
sjitindarsingh at gmail.com
Wed Jan 3 10:15:21 AEDT 2018
On Fri, 2017-12-08 at 17:11 +1100, Paul Mackerras wrote:
> POWER9 has hardware bugs relating to transactional memory and thread
> reconfiguration (changes to hardware SMT mode). Specifically, the core
> does not have enough storage to store a complete checkpoint of all the
> architected state for all four threads. The DD2.2 version of POWER9
> includes hardware modifications designed to allow hypervisor software
> to implement workarounds for these problems. This patch implements
> those workarounds in KVM code so that KVM guests see a full, working
> transactional memory implementation.
>
> The problems center around the use of TM suspended state, where the
> CPU has a checkpointed state but execution is not transactional. The
> workaround is to implement a "fake suspend" state, which looks to the
> guest like suspended state but the CPU does not store a checkpoint.
> In this state, any instruction that would cause a transition to
> transactional state (rfid, rfebb, mtmsrd, tresume) or would use the
> checkpointed state (treclaim) causes a "soft patch" interrupt (vector
> 0x1500) to the hypervisor so that it can be emulated. The trechkpt
> instruction also causes a soft patch interrupt.
>
> On POWER9 DD2.2, we avoid returning to the guest in any state which
> would require a checkpoint to be present. The trechkpt in the guest
> entry path which would normally create that checkpoint is replaced by
> either a transition to fake suspend state, if the guest is in suspend
> state, or a rollback to the pre-transactional state if the guest is in
> transactional state. Fake suspend state is indicated by a flag in the
> PACA plus a new bit in the PSSCR. The new PSSCR bit is write-only and
> reads back as 0.
>
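To check my reading of the entry path described above, here is a minimal C
sketch of the decision. The MSR[TS] bit positions are assumed to match
asm/reg.h; dd22_entry_action() and the enum are hypothetical names for
illustration only and are not part of the patch.

```c
#include <stdint.h>

/* Assumed to mirror the MSR[TS] encodings in the kernel's asm/reg.h. */
#define MSR_TS_S    (1ULL << 33)            /* suspended */
#define MSR_TS_T    (1ULL << 34)            /* transactional */
#define MSR_TS_MASK (MSR_TS_S | MSR_TS_T)

/* Hypothetical names: what replaces trechkpt on guest entry on DD2.2. */
enum entry_action {
	ENTRY_NO_TM,         /* guest not in a transaction, nothing to do */
	ENTRY_FAKE_SUSPEND,  /* guest suspended: enter fake suspend state */
	ENTRY_ROLLBACK       /* guest transactional: simulate a rollback */
};

static enum entry_action dd22_entry_action(uint64_t guest_msr)
{
	switch (guest_msr & MSR_TS_MASK) {
	case MSR_TS_S:
		return ENTRY_FAKE_SUSPEND;
	case MSR_TS_T:
		return ENTRY_ROLLBACK;
	default:
		/* The reserved encoding is lumped in here for simplicity. */
		return ENTRY_NO_TM;
	}
}
```

In this model a guest MSR with only the suspended bit set maps to fake
suspend, and only the transactional bit maps to rollback, so no checkpoint
is ever needed on return to the guest.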
> On exit from the guest, if the guest is in fake suspend state, we still
> do the treclaim instruction as we would in real suspend state, in order
> to get into non-transactional state, but we do not save the resulting
> register state since there was no checkpoint.
>
> Emulation of the instructions that cause a soft patch interrupt is handled
> in two paths. If the guest is in real suspend mode, we call
> kvmhv_p9_tm_emulation_early() to handle the cases where the guest is
> transitioning to transactional state. This is called before we do
> the treclaim in the guest exit path; because we haven't done treclaim,
> we can get back to the guest with the transaction still active.
> If the instruction is a case that kvmhv_p9_tm_emulation_early() doesn't
> handle, or if the guest is in fake suspend state, then we proceed to
> do the complete guest exit path and subsequently call
> kvmhv_p9_tm_emulation() in host context with the MMU on. This
> handles all the cases including the cases that generate program
> interrupts (illegal instruction or TM Bad Thing) and facility
> unavailable interrupts.
>
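The two-path split above can be modelled roughly as below. This is only a
sketch of the control flow as described in the commit message: pick_path()
and its boolean inputs are hypothetical, and the real code tries
kvmhv_p9_tm_emulation_early() and falls back rather than deciding up front.

```c
#include <stdbool.h>

/* Hypothetical labels for the two emulation paths. */
enum tm_emul_path {
	TM_EMUL_EARLY,  /* kvmhv_p9_tm_emulation_early(), before treclaim */
	TM_EMUL_FULL    /* full guest exit, then kvmhv_p9_tm_emulation() */
};

/*
 * fake_suspend:       guest is in fake suspend (no real checkpoint)
 * early_handles_insn: the early handler covers this instruction
 */
static enum tm_emul_path pick_path(bool fake_suspend, bool early_handles_insn)
{
	/* The early path only applies in real suspend state. */
	if (!fake_suspend && early_handles_insn)
		return TM_EMUL_EARLY;
	return TM_EMUL_FULL;
}
```

The point of the early path is that treclaim has not run yet, so the guest
can be resumed with its transaction still active.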
> The emulation is reasonably straightforward and is mostly concerned
> with checking for exception conditions and updating the state of
> registers such as MSR and CR0. The treclaim emulation takes care to
> ensure that the TEXASR register gets updated as if it were the guest
> treclaim instruction that had done failure recording, not the treclaim
> done in hypervisor state in the guest exit path.
>
> Signed-off-by: Paul Mackerras <paulus at ozlabs.org>
>
With the following patch applied on top of the TM emulation code I was
able to get at least a basic test to run on the guest on real hardware.
[snip]
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index c7fe377ff6bc..adf2da6b2211 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -3049,6 +3049,7 @@ BEGIN_FTR_SECTION
li r0, PSSCR_FAKE_SUSPEND
andc r3, r3, r0
mtspr SPRN_PSSCR, r3
+ ld r9, HSTATE_KVM_VCPU(r13)
b 1f
2:
END_FTR_SECTION_IFSET(CPU_FTR_P9_TM_EMUL)
@@ -3273,8 +3274,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_P9_TM_EMUL)
b 9b /* and return */
10: stdu r1, -PPC_MIN_STKFRM(r1)
/* guest is in transactional state, so simulate rollback */
+ mr r3, r4
bl kvmhv_emulate_tm_rollback
nop
+ ld r4, HSTATE_KVM_VCPU(r13) /* our vcpu pointer has been trashed */
addi r1, r1, PPC_MIN_STKFRM
b 9b
#endif