[PATCH] powerpc/time: use get_tb instead of get_vtb in running_clock

hejianet hejianet at gmail.com
Thu Jul 13 16:55:34 AEST 2017


Hi Ben
I add some printk logs in watchdog_timer_fn in the guest
[   16.025222] get_vtb=8236291881, get_tb=13756711357, get_timestamp=4
[   20.025624] get_vtb=9745285807, get_tb=15804711283, get_timestamp=7
[   24.025042] get_vtb=11518119641, get_tb=17852711085, get_timestamp=10
[   28.024074] get_vtb=13192704319, get_tb=19900711071, get_timestamp=13
[   32.024086] get_vtb=14856516982, get_tb=21948711066, get_timestamp=16
[   36.024075] get_vtb=16569127618, get_tb=23996711078, get_timestamp=20
[   40.024138] get_vtb=17008865823, get_tb=26044718418, get_timestamp=20
[   44.023993] get_vtb=17020637241, get_tb=28092716383, get_timestamp=20
[   48.023996] get_vtb=17022857170, get_tb=30140718472, get_timestamp=20
[   52.023996] get_vtb=17024268541, get_tb=32188718432, get_timestamp=20
[   56.023996] get_vtb=17036577783, get_tb=34236718077, get_timestamp=20
[   60.023996] get_vtb=17037829743, get_tb=36284718437, get_timestamp=20
[   64.023992] get_vtb=17039846747, get_tb=38332716609, get_timestamp=20
[   68.023991] get_vtb=17041448345, get_tb=40380715903, get_timestamp=20

The get_timestamp(use get_vtb(),unit is second) is slower down compared
with printk time. You also can obviously watch the get_vtb increment is
slowly less than get_tb increment.

Without this patch, I thought there might be some softlockup warnings
missed in guest.

-Jia

On 13/07/2017 6:45 AM, Benjamin Herrenschmidt wrote:
> On Wed, 2017-07-12 at 23:01 +0800, Jia He wrote:
>> Virtual time base(vtb) is a register which increases only in guest.
>> Any exit from guest to host will stop the vtb(saved and restored by kvm).
>> But if there is an IO causes guest exits to host, the guest's watchdog
>> (watchdog_timer_fn -> is_softlockup -> get_timestamp -> running_clock)
>> needs to also include the time elapsed in host. get_vtb is not correct in
>> this case.
>>
>> Also, the TB_OFFSET is well saved and restored by qemu after commit [1].
>> So we can use get_tb here.
> 
> That completely defeats the purpose here... This was done specifically
> to exploit the VTB which doesn't count in hypervisor mode.
> 
>>
>> [1] http://git.qemu.org/?p=qemu.git;a=commit;h=42043e4f1
>>
>> Signed-off-by: Jia He <hejianet at gmail.com>
>> ---
>>   arch/powerpc/kernel/time.c | 7 +++----
>>   1 file changed, 3 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
>> index fe6f3a2..c542dd3 100644
>> --- a/arch/powerpc/kernel/time.c
>> +++ b/arch/powerpc/kernel/time.c
>> @@ -695,16 +695,15 @@ notrace unsigned long long sched_clock(void)
>>   unsigned long long running_clock(void)
>>   {
>>   	/*
>> -	 * Don't read the VTB as a host since KVM does not switch in host
>> -	 * timebase into the VTB when it takes a guest off the CPU, reading the
>> -	 * VTB would result in reading 'last switched out' guest VTB.
>> +	 * Use get_tb instead of get_vtb for guest since the TB_OFFSET has been
>> +	 * well saved/restored when qemu does suspend/resume.
>>   	 *
>>   	 * Host kernels are often compiled with CONFIG_PPC_PSERIES checked, it
>>   	 * would be unsafe to rely only on the #ifdef above.
>>   	 */
>>   	if (firmware_has_feature(FW_FEATURE_LPAR) &&
>>   	    cpu_has_feature(CPU_FTR_ARCH_207S))
>> -		return mulhdu(get_vtb() - boot_tb, tb_to_ns_scale) << tb_to_ns_shift;
>> +		return mulhdu(get_tb() - boot_tb, tb_to_ns_scale) << tb_to_ns_shift;
>>   
>>   	/*
>>   	 * This is a next best approximation without a VTB.
> 


More information about the Linuxppc-dev mailing list