[PATCH v2 0/4] implement vcpu preempted check

Paolo Bonzini pbonzini at redhat.com
Wed Jul 6 22:28:06 AEST 2016



On 06/07/2016 14:08, Wanpeng Li wrote:
> 2016-07-06 18:44 GMT+08:00 Paolo Bonzini <pbonzini at redhat.com>:
>>
>>
>> On 06/07/2016 08:52, Peter Zijlstra wrote:
>>> On Tue, Jun 28, 2016 at 10:43:07AM -0400, Pan Xinhui wrote:
>>>> changes from v1:
>>>>      a simpler definition of default vcpu_is_preempted
>>>>      skip machine type check on ppc, and add config. remove dedicated macro.
>>>>      add one patch to drop overload of rwsem_spin_on_owner and mutex_spin_on_owner.
>>>>      add more comments
>>>>      thanks to Boqun and Peter for their suggestions.
>>>>
>>>> This patch set aims to fix lock holder preemption issues.
>>>>
>>>> test-case:
>>>> perf record -a perf bench sched messaging -g 400 -p && perf report
>>>>
>>>> 18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
>>>> 12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
>>>>  5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
>>>>  3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
>>>>  3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
>>>>  3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
>>>>  2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call
>>>>
>>>> We introduce the interface bool vcpu_is_preempted(int cpu) and use it in some spin
>>>> loops of osq_lock, rwsem_spin_on_owner and mutex_spin_on_owner.
>>>> These spin_on_owner variants also caused RCU stalls before we applied this patch set.
>>>
>>> Paolo, could you help out with an (x86) KVM interface for this?
>>
>> If it's just for spin loops, you can check if the version field in the
>> steal time structure has changed.
> 
> Steal time is not updated until just before the next vmentry, except on a
> wrmsr to MSR_KVM_STEAL_TIME. So it can't show that the VCPU is preempted
> right now, right?

Hmm, you're right.  We can use bit 0 of struct kvm_steal_time's flags to
indicate that pad[0] is a "VCPU preempted" field; if pad[0] is 1, the
VCPU has been scheduled out since the last time the guest reset the bit.
The guest can use an xchg to test-and-clear it.  The bit can be accessed
at any time, independent of the version field.
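
To make the proposal concrete, here is a minimal userspace sketch of the
test-and-clear idea, not kernel code. The struct only mirrors the layout of
struct kvm_steal_time; the flag name and every sketch_* function are
hypothetical names invented for illustration, and C11 atomic_exchange stands
in for the xchg.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag name: bit 0 of flags says pad[0] is a "VCPU preempted" field. */
#define SKETCH_ST_FLAG_PREEMPTED (1u << 0)

/* Userspace stand-in that mirrors the layout of struct kvm_steal_time. */
struct sketch_steal_time {
	uint64_t steal;
	uint32_t version;
	uint32_t flags;
	_Atomic uint32_t pad[12];	/* pad[0] doubles as the preempted field */
};

/* Host side (simulated): record that the VCPU was scheduled out. */
static void sketch_host_sched_out(struct sketch_steal_time *st)
{
	assert(st != NULL);
	atomic_store(&st->pad[0], 1);
}

/* Guest side: peek without clearing, e.g. from inside a spin loop. */
static bool sketch_vcpu_is_preempted(struct sketch_steal_time *st)
{
	assert(st != NULL);
	if (!(st->flags & SKETCH_ST_FLAG_PREEMPTED))
		return false;	/* host does not provide the field */
	return atomic_load(&st->pad[0]) != 0;
}

/* Guest side: test-and-clear with an atomic exchange (the "xchg"). */
static bool sketch_test_and_clear_preempted(struct sketch_steal_time *st)
{
	assert(st != NULL);
	if (!(st->flags & SKETCH_ST_FLAG_PREEMPTED))
		return false;
	return atomic_exchange(&st->pad[0], 0) != 0;
}
```

Neither guest-side helper looks at the version field, which matches the point
that the field can be read at any time. A spinner would presumably use the
peek variant and stop spinning once it returns true.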

Paolo

