[PATCH 3/3] KVM: PPC: Book3S HV: Clear pending decr exceptions on nested guest entry

Laurent Vivier lvivier at redhat.com
Thu Jun 20 17:57:05 AEST 2019


On 20/06/2019 03:46, Suraj Jitindar Singh wrote:
> If we enter an L1 guest with a pending decrementer exception then this
> is cleared on guest exit if the guest has writtien a positive value into
> the decrementer (indicating that it handled the decrementer exception)
> since there is no other way to detect that the guest has handled the
> pending exception and that it should be dequeued. In the event that the
> L1 guest tries to run a nested (L2) guest immediately after this and the
> L2 guest decrementer is negative (which is loaded by L1 before making
> the H_ENTER_NESTED hcall), then the pending decrementer exception
> isn't cleared and the L2 entry is blocked since L1 has a pending
> exception, even though L1 may have already handled the exception and
> written a positive value for it's decrementer. This results in a loop of
> L1 trying to enter the L2 guest and L0 blocking the entry since L1 has
> an interrupt pending with the outcome being that L2 never gets to run
> and hangs.
> 
> Fix this by clearing any pending decrementer exceptions when L1 makes
> the H_ENTER_NESTED hcall since it won't do this if it's decrementer has
> gone negative, and anyway it's decrementer has been communicated to L0
> in the hdec_expires field and L0 will return control to L1 when this
> goes negative by delivering an H_DECREMENTER exception.
> 
> Fixes: 95a6432ce903 "KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests"
> 
> Signed-off-by: Suraj Jitindar Singh <sjitindarsingh at gmail.com>
> ---
>  arch/powerpc/kvm/book3s_hv.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 719fd2529eec..4a5eb29b952f 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -4128,8 +4128,15 @@ int kvmhv_run_single_vcpu(struct kvm_run *kvm_run,
>  
>  	preempt_enable();
>  
> -	/* cancel pending decrementer exception if DEC is now positive */
> -	if (get_tb() < vcpu->arch.dec_expires && kvmppc_core_pending_dec(vcpu))
> +	/*
> +	 * cancel pending decrementer exception if DEC is now positive, or if
> +	 * entering a nested guest in which case the decrementer is now owned
> +	 * by L2 and the L1 decrementer is provided in hdec_expires
> +	 */
> +	if (kvmppc_core_pending_dec(vcpu) &&
> +			((get_tb() < vcpu->arch.dec_expires) ||
> +			 (trap == BOOK3S_INTERRUPT_SYSCALL &&
> +			  kvmppc_get_gpr(vcpu, 3) == H_ENTER_NESTED)))
>  		kvmppc_core_dequeue_dec(vcpu);
>  
>  	trace_kvm_guest_exit(vcpu);
> 

Patches 2 and 3: tested I can boot and run an L2 nested guest with qemu
v4.0.0 and caps-large-decr=on in the case we have had a hang previously.

Tested-by: Laurent Vivier <lvivier at redhat.com>



More information about the Linuxppc-dev mailing list