[PATCH v3 00/20] Speculative page faults
Laurent Dufour
ldufour at linux.vnet.ibm.com
Sat Sep 30 01:27:09 AEST 2017
Hi Andrew,
On 28/09/2017 22:38, Andrew Morton wrote:
> On Thu, 28 Sep 2017 14:29:02 +0200 Laurent Dufour <ldufour at linux.vnet.ibm.com> wrote:
>
>>> Laurent's [0/n] provides some nice-looking performance benefits for
>>> workloads which are chosen to show performance benefits(!) but, alas,
>>> no quantitative testing results for workloads which we may suspect will
>>> be harmed by the changes(?). Even things as simple as impact upon
>>> single-threaded pagefault-intensive workloads and its effect upon
>>> CONFIG_SMP=n .text size?
>>
>> I forgot to mention in my previous email the impact on the .text section.
>>
>> Here are the metrics I got :
>>
>>   .text size     UP         SMP        Delta
>>   4.13-mmotm     8444201    8964137    6.16%
>>   '' +spf        8452041    8971929    6.15%
>>   Delta          0.09%      0.09%
>>
>> No major impact as you could see.
>
> 8k text increase seems rather a lot actually. That's a lot more
> userspace cachelines that get evicted during a fault...
>
> Is the feature actually beneficial on uniprocessor?
This is useless on uniprocessor, and I will disable it on x86 when !SMP
by not defining __HAVE_ARCH_CALL_SPF.
So the speculative page fault handler will not be built, but the vm
sequence counter and the SRCU stuff will still be there. I may also
disable those through a macro when __HAVE_ARCH_CALL_SPF is not defined,
but this may obfuscate the code a bit...
On ppc64, as this feature requires book3s, it can't be built without SMP
support.
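
To illustrate the gating described above, here is a minimal userspace sketch, not the actual kernel code. The function names spf_try_fault() and demo_handle_fault() are hypothetical, and the CONFIG_* defines stand in for what Kconfig would normally provide. When the macro is not defined, the stub lets the compiler drop the speculative path from the build.

```c
#include <stdio.h>

/*
 * Stand-ins for Kconfig symbols (assumption: in a real kernel build these
 * come from the generated config, not from the source file). Remove the
 * CONFIG_SMP line to model a uniprocessor build.
 */
#define CONFIG_X86_64 1
#define CONFIG_SMP 1

/* Same guard shape as the x86 header: advertise SPF only on 64-bit SMP. */
#if defined(CONFIG_X86_64) && defined(CONFIG_SMP)
#define __HAVE_ARCH_CALL_SPF
#endif

#ifdef __HAVE_ARCH_CALL_SPF
/* Hypothetical speculative handler; returns 1 when it handled the fault. */
static int spf_try_fault(unsigned long addr)
{
	printf("speculative path for %#lx\n", addr);
	return 1;
}
#else
/* Stub: always declines, so the caller falls through to the classic path. */
static inline int spf_try_fault(unsigned long addr)
{
	(void)addr;
	return 0;
}
#endif

/* Hypothetical fault entry point: try speculation first, then fall back. */
int demo_handle_fault(unsigned long addr)
{
	if (spf_try_fault(addr))
		return 1;	/* handled speculatively */
	printf("classic path for %#lx\n", addr);
	return 0;		/* handled by the classic path */
}
```

With both stand-in symbols defined, demo_handle_fault() takes the speculative branch; with CONFIG_SMP removed it always falls back to the classic path.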
I rebuilt the code on my x86 guest with the following patch applied:
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -260,7 +260,7 @@ enum page_cache_mode {
 /*
  * Advertise that we call the Speculative Page Fault handler.
  */
-#ifdef CONFIG_X86_64
+#if defined(CONFIG_X86_64) && defined(CONFIG_SMP)
 #define __HAVE_ARCH_CALL_SPF
 #endif
And this time I got the following size on UP:
                 UP
4.13-mmotm       8444201
'' +spf          8447945 (previously 8452041)
Delta            +3744
If I also disable all the vm_sequence operations and the SRCU stuff, the
increase would drop to 0.
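
For reference, the deltas quoted in this thread can be rechecked with a few lines of C. This is only an arithmetic sketch over the byte counts reported above; the helper names text_delta() and text_delta_pct() are made up for the example.

```c
#include <stdio.h>

/* Growth of the .text section in bytes. */
long text_delta(long base, long patched)
{
	return patched - base;
}

/* Growth of the .text section as a percentage of the baseline. */
double text_delta_pct(long base, long patched)
{
	return 100.0 * (double)(patched - base) / (double)base;
}

/* Print one report line for a given configuration. */
void print_text_delta(const char *label, long base, long patched)
{
	printf("%-22s %+ld bytes (%.2f%%)\n", label,
	       text_delta(base, patched), text_delta_pct(base, patched));
}
```

Calling print_text_delta("UP, spf built", 8444201, 8452041) reports +7840 bytes, which is the roughly 8k increase discussed above, and print_text_delta("UP, spf not built", 8444201, 8447945) reports +3744 bytes, so not building the handler on UP saves 4096 bytes of text.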
Thanks,
Laurent.