PROBLEM: monotonic clock going backwards on ppc64

Michael Ellerman mpe at ellerman.id.au
Sat Mar 2 00:24:57 AEDT 2019


Hi Jakub,

[Cc += Timekeeping maintainers]

"Jakub Drnec" <jaydee at email.cz> writes:
> Hi all,
>
> I think I observed a potential problem, is this the correct place to report it? (CC me, not on list)
>
> [1.] One line summary: monotonic clock can be made to decrease on ppc64
> [2.] Full description:
> Setting the realtime clock can sometimes make the monotonic clock go back by over a hundred years.
> Decreasing the realtime clock across the y2k38 threshold is one reliable way to reproduce.
> Allegedly this can also happen just by running ntpd, I have not managed to reproduce that other
> than booting with rtc at >2038 and then running ntp.
> When this happens, anything with timers (e.g. openjdk) breaks rather badly.

Thanks for the report.

> The problem seems to be in vDSO code in arch/powerpc/kernel/vdso64/gettimeofday.S.

You're right, the wall-to-monotonic offset (wtom_clock_sec) is a signed
32-bit value, so it can't hold an offset below -2^31 seconds, which is
what we need once the wall clock is set past early 2038.

If I do `date -s 2037-1-1` I see:

[   26.024061] update_vsyscall: tk->wall_to_monotonic.tv_sec -2114341175
[   26.042633] update_vsyscall: vdso_data->wtom_clock_sec    -2114341175

Which looks sane.

But then 2040-1-1 shows:

[   32.617020] update_vsyscall: tk->wall_to_monotonic.tv_sec -2208949168
[   32.632642] update_vsyscall: vdso_data->wtom_clock_sec     2086018128

i.e. the larger negative offset no longer fits in 32 bits, so it has
wrapped around and become positive.

But then when we go back to 2037 we get a negative offset again and
monotonic time appears to go backward and things are unhappy.

I don't know this code well, but the patch below *appears* to work. I'll
have a closer look on Monday.

cheers


diff --git a/arch/powerpc/include/asm/vdso_datapage.h b/arch/powerpc/include/asm/vdso_datapage.h
index 1afe90ade595..139133ec21d5 100644
--- a/arch/powerpc/include/asm/vdso_datapage.h
+++ b/arch/powerpc/include/asm/vdso_datapage.h
@@ -82,7 +82,7 @@ struct vdso_data {
 	__u32 icache_block_size;		/* L1 i-cache block size     */
 	__u32 dcache_log_block_size;		/* L1 d-cache log block size */
 	__u32 icache_log_block_size;		/* L1 i-cache log block size */
-	__s32 wtom_clock_sec;			/* Wall to monotonic clock */
+	__s64 wtom_clock_sec;			/* Wall to monotonic clock */
 	__s32 wtom_clock_nsec;
 	struct timespec stamp_xtime;	/* xtime as at tb_orig_stamp */
 	__u32 stamp_sec_fraction;	/* fractional seconds of stamp_xtime */
diff --git a/arch/powerpc/kernel/vdso64/gettimeofday.S b/arch/powerpc/kernel/vdso64/gettimeofday.S
index a4ed9edfd5f0..1f324c28705b 100644
--- a/arch/powerpc/kernel/vdso64/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso64/gettimeofday.S
@@ -92,7 +92,7 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime)
 	 * At this point, r4,r5 contain our sec/nsec values.
 	 */
 
-	lwa	r6,WTOM_CLOCK_SEC(r3)
+	ld	r6,WTOM_CLOCK_SEC(r3)
 	lwa	r9,WTOM_CLOCK_NSEC(r3)
 
 	/* We now have our result in r6,r9. We create a fake dependency
@@ -125,7 +125,7 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime)
 	bne     cr6,75f
 
 	/* CLOCK_MONOTONIC_COARSE */
-	lwa     r6,WTOM_CLOCK_SEC(r3)
+	ld	r6,WTOM_CLOCK_SEC(r3)
 	lwa     r9,WTOM_CLOCK_NSEC(r3)
 
 	/* check if counter has updated */


> [3.] Keywords: gettimeofday, ppc64, vdso
> [4.] Kernel information
> [4.1.] Kernel version: any (tested on 4.19)
> [4.2.] Kernel .config file: any
> [5.] Most recent kernel version which did not have the bug: not a regression
> [6.] Output of Oops..: not applicable
> [7.] Example program which triggers the problem
> --- testcase.c
> #include <stdio.h>
> #include <time.h>
> #include <stdlib.h>
> #include <unistd.h>
>
> long get_time() {
>   struct timespec tp;
>   if (clock_gettime(CLOCK_MONOTONIC, &tp) != 0) {
>     perror("clock_gettime failed");
>     exit(1);
>   }
>   /* tv_nsec is always below 1e9, so whole seconds are just tv_sec */
>   long result = tp.tv_sec;
>   return result;
> }
>
> int main() {
>   printf("monitoring monotonic clock...\n");
>   long last = get_time();
>   while(1) {
>     long now = get_time();
>     if (now < last) {
>       printf("clock went backwards by %ld seconds!\n",
>         last - now);
>     }
>     last = now;
>     sleep(1);
>   }
>   return 0;
> }
> ---
> when running
> # date -s 2040-1-1
> # date -s 2037-1-1
> program outputs: clock went backwards by 4294967295 seconds!
>
> [8.] Environment: any ppc64, currently reproducing on qemu-system-ppc64le running debian unstable
> [X.] Other notes, patches, fixes, workarounds:
> The problem seems to be in vDSO code in arch/powerpc/kernel/vdso64/gettimeofday.S.
> (possibly because some values used in the calculation are only 32 bit?)
> Slightly silly workaround: 
> nuke the "cmpwi cr1,r3,CLOCK_MONOTONIC" in __kernel_clock_gettime
> Now it always goes through the syscall fallback which does not have the same problem.
>
> Regards,
> Jakub Drnec

