[PATCH] powerpc/vdso: Avoid link stack corruption in __get_datapage()
Michael Ellerman
michael at ellerman.id.au
Thu Sep 24 08:23:35 AEST 2015
On 23 September 2015 16:05:02 GMT+10:00, Michael Neuling <mikey at neuling.org> wrote:
>The 32 and 64 bit variants of __get_datapage() use a "bcl; mflr" to
>determine the loaded address of the VDSO. The current versions of these
>attempt to use the special bcl variant which avoids pushing to the
>link stack.
>
>Unfortunately it uses bcl+8 rather than the required bcl+4. Hence the
>current code results in link stack corruption and the resulting
>performance degradation (due to branch mis-prediction).
>
>This patch moves us to bcl+4 by moving __kernel_datapage_offset
>out of __get_datapage().
>
>With this patch, running the below benchmark we get a bump in
>performance on POWER8 for gettimeofday() (which uses
>__get_datapage()).
>
>64bit gets ~4% improvement:
> Without patch:
> # ./tb
> time = 0.180321
> With patch:
> # ./tb
> time = 0.172408
>
>32bit gets ~9% improvement:
> Without patch:
> # ./tb
> time = 0.276551
> With patch:
> # ./tb
> time = 0.252767
>
>Testcase tb.c (stolen from Anton)
> /* gcc -O2 tb.c -o tb */
> #include <sys/time.h>
> #include <stdio.h>
>
> int main()
> {
>     int i;
>
>     struct timeval tv_start, tv_end;
>
>     gettimeofday(&tv_start, NULL);
>
>     for (i = 0; i < 10000000; i++) {
>         gettimeofday(&tv_end, NULL);
>     }
>
>     printf("time = %.6f\n", tv_end.tv_sec - tv_start.tv_sec +
>            (tv_end.tv_usec - tv_start.tv_usec) * 1e-6);
>
>     return 0;
> }
You know where test cases are supposed to go.
I know it's not a pass/fail test, but it's still useful. If it's in the tree it will get run as part of automated test runs and we will have a record of the result over time.
cheers