[PATCH v3] KVM: PPC: Tick accounting should defer vtime accounting 'til after IRQ handling

Laurent Vivier lvivier at redhat.com
Thu Oct 28 23:48:35 AEDT 2021


On 27/10/2021 16:21, Nicholas Piggin wrote:
> From: Laurent Vivier <lvivier at redhat.com>
> 
> Commit 112665286d08 ("KVM: PPC: Book3S HV: Context tracking exit guest
> context before enabling irqs") moved guest_exit() into the interrupt
> protected area to avoid wrong context warning (or worse). The problem is
> that tick-based time accounting has not yet been updated at this point
> (because it depends on the timer interrupt firing), so the guest time
> gets incorrectly accounted to system time.
> 
> To fix the problem, follow the x86 fix in commit 160457140187 ("Defer
> vtime accounting 'til after IRQ handling"), and allow host IRQs to run
> before accounting the guest exit time.
> 
> In the case vtime accounting is enabled, this is not required because TB
> is used directly for accounting.
> 
> Before this patch, with CONFIG_TICK_CPU_ACCOUNTING=y in the host and a
> guest running a kernel compile, the 'guest' fields of /proc/stat are
> stuck at zero. With the patch they can be observed increasing roughly as
> expected.
> 
> Fixes: e233d54d4d97 ("KVM: booke: use __kvm_guest_exit")
> Fixes: 112665286d08 ("KVM: PPC: Book3S HV: Context tracking exit guest context before enabling irqs")
> Cc: <stable at vger.kernel.org> # 5.12
> Signed-off-by: Laurent Vivier <lvivier at redhat.com>
> [np: only required for tick accounting, add Book3E fix, tweak changelog]
> Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
> ---
> Since v2:
> - I took over the patch with Laurent's blessing.
> - Changed to avoid processing IRQs if we do have vtime accounting
>    enabled.
> - Changed so in either case the accounting is called with irqs disabled.
> - Added similar Book3E fix.
> - Rebased on upstream, tested, observed bug and confirmed fix.
> 
>   arch/powerpc/kvm/book3s_hv.c | 30 ++++++++++++++++++++++++++++--
>   arch/powerpc/kvm/booke.c     | 16 +++++++++++++++-
>   2 files changed, 43 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 2acb1c96cfaf..7b74fc0a986b 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -3726,7 +3726,20 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc)
>   
>   	kvmppc_set_host_core(pcpu);
>   
> -	guest_exit_irqoff();
> +	context_tracking_guest_exit();
> +	if (!vtime_accounting_enabled_this_cpu()) {
> +		local_irq_enable();
> +		/*
> +		 * Service IRQs here before vtime_account_guest_exit() so any
> +		 * ticks that occurred while running the guest are accounted to
> +		 * the guest. If vtime accounting is enabled, accounting uses
> +		 * TB rather than ticks, so it can be done without enabling
> +		 * interrupts here, which has the problem that it accounts
> +		 * interrupt processing overhead to the host.
> +		 */
> +		local_irq_disable();
> +	}
> +	vtime_account_guest_exit();
>   
>   	local_irq_enable();
>   
> @@ -4510,7 +4523,20 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
>   
>   	kvmppc_set_host_core(pcpu);
>   
> -	guest_exit_irqoff();
> +	context_tracking_guest_exit();
> +	if (!vtime_accounting_enabled_this_cpu()) {
> +		local_irq_enable();
> +		/*
> +		 * Service IRQs here before vtime_account_guest_exit() so any
> +		 * ticks that occurred while running the guest are accounted to
> +		 * the guest. If vtime accounting is enabled, accounting uses
> +		 * TB rather than ticks, so it can be done without enabling
> +		 * interrupts here, which has the problem that it accounts
> +		 * interrupt processing overhead to the host.
> +		 */
> +		local_irq_disable();
> +	}
> +	vtime_account_guest_exit();
>   
>   	local_irq_enable();
>   
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 977801c83aff..8c15c90dd3a9 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -1042,7 +1042,21 @@ int kvmppc_handle_exit(struct kvm_vcpu *vcpu, unsigned int exit_nr)
>   	}
>   
>   	trace_kvm_exit(exit_nr, vcpu);
> -	guest_exit_irqoff();
> +
> +	context_tracking_guest_exit();
> +	if (!vtime_accounting_enabled_this_cpu()) {
> +		local_irq_enable();
> +		/*
> +		 * Service IRQs here before vtime_account_guest_exit() so any
> +		 * ticks that occurred while running the guest are accounted to
> +		 * the guest. If vtime accounting is enabled, accounting uses
> +		 * TB rather than ticks, so it can be done without enabling
> +		 * interrupts here, which has the problem that it accounts
> +		 * interrupt processing overhead to the host.
> +		 */
> +		local_irq_disable();
> +	}
> +	vtime_account_guest_exit();
>   
>   	local_irq_enable();
>   
> 

I'm wondering if we should put the context_tracking_guest_exit() just after the 
"srcu_read_unlock(&vc->kvm->srcu, srcu_idx);" as it was before 61bd0f66ff92 ("KVM: PPC: 
Book3S HV: Fix guest time accounting with VIRT_CPU_ACCOUNTING_GEN")?

Thanks,
Laurent



More information about the Linuxppc-dev mailing list