[PATCH] KVM: PPC: Book3S HV nestedv2: Cancel pending HDEC exception
Nicholas Piggin
npiggin at gmail.com
Thu Mar 21 00:43:28 AEDT 2024
On Wed Mar 13, 2024 at 5:26 PM AEST, Vaibhav Jain wrote:
> This reverts commit 180c6b072bf360b686e53d893d8dcf7dbbaec6bb ("KVM: PPC:
> Book3S HV nestedv2: Do not cancel pending decrementer exception") which
> prevented cancelling a pending HDEC exception for nestedv2 KVM guests. It
> was done to avoid overhead of a H_GUEST_GET_STATE hcall to read the 'HDEC
> expiry TB' register which was higher compared to handling extra decrementer
> exceptions.
>
> This overhead of reading 'HDEC expiry TB' register has been mitigated
> recently by the L0 hypervisor(PowerVM) by putting the value of this
> register in L2 guest-state output buffer on trap to L1. From there the
> value of this register is cached, made available in kvmhv_run_single_vcpu()
> to compare it against host(L1) timebase and cancel the pending hypervisor
> decrementer exception if needed.
Ah, I figured out the problem here. Guest entry never clears the
queued dec, because it's level triggered on the DEC MSB so it
doesn't go away when it's delivered. So upstream code is indeed
buggy and I think I take the blame for suggesting this nestedv2
workaround.
I actually don't think that is necessary though, we could treat it
like other interrupts. I think that would solve the problem without
having to test dec here.
I am wondering though, what workload slows down that this patch
was needed in the first place. We'd only get here after a cede
returns, then we'd dequeue the dec and stop having to GET_STATE
it here.
Thanks,
Nick
>
> Fixes: 180c6b072bf3 ("KVM: PPC: Book3S HV nestedv2: Do not cancel pending decrementer exception")
> Signed-off-by: Vaibhav Jain <vaibhav at linux.ibm.com>
> ---
> arch/powerpc/kvm/book3s_hv.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 0b921704da45..e47b954ce266 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -4856,7 +4856,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
> * entering a nested guest in which case the decrementer is now owned
> * by L2 and the L1 decrementer is provided in hdec_expires
> */
> - if (!kvmhv_is_nestedv2() && kvmppc_core_pending_dec(vcpu) &&
> + if (kvmppc_core_pending_dec(vcpu) &&
> ((tb < kvmppc_dec_expires_host_tb(vcpu)) ||
> (trap == BOOK3S_INTERRUPT_SYSCALL &&
> kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED)))
More information about the Linuxppc-dev
mailing list