[PATCH 00/16] Remove hash page table slot tracking from linux PTE
Aneesh Kumar K.V
aneesh.kumar at linux.vnet.ibm.com
Tue Oct 31 00:14:47 AEDT 2017
"Aneesh Kumar K.V" <aneesh.kumar at linux.vnet.ibm.com> writes:
> I looked at the perf data: with this test we take a larger number of
> hash faults and then around 10k flush_hash_range calls. Could the
> small improvement in the numbers be due to the fact that we no longer
> store the slot number when doing an insert? Also, the flush path no
> longer uses real_pte_t.
>
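To make that concrete, here is a minimal user-space model of the slot
tracking this series removes; struct and helper names here are
hypothetical, not the kernel's real_pte_t API. On a 64K-page kernel
backed by 4K hash pages, each linux PTE carried a packed array of 4-bit
hash slot indices (hidx), one per 4K subpage (64K/4K = 16 subpages, so
16 x 4 = 64 bits), recorded at insert time so the flush path could go
straight to the HPTE:

#include <stdio.h>

/* Model only: 16 subpages x 4-bit hidx packed into one word. */
struct real_pte_model {
        unsigned long pte;      /* the linux PTE itself */
        unsigned long hidx;     /* 4 bits per 4K subpage of a 64K page */
};

/* Extract the hash slot recorded for one 4K subpage. */
static unsigned int rpte_to_hidx(struct real_pte_model rpte, int index)
{
        return (rpte.hidx >> (index * 4)) & 0xf;
}

int main(void)
{
        struct real_pte_model rpte = { .pte = 0, .hidx = 0 };

        /* pretend subpage 2 was inserted into hash slot 5 */
        rpte.hidx |= 0x5UL << (2 * 4);
        printf("subpage 2 hidx = %u\n", rpte_to_hidx(rpte, 2));
        return 0;
}

With the series applied these bits are gone from the PTE, so the flush
path can no longer index the HPTE group directly and has to search it
instead.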
With THP disabled I see the following.
Without patch
    35.62%  a.out  [kernel.vmlinux]  [k] clear_user_page
     8.54%  a.out  [kernel.vmlinux]  [k] __lock_acquire
     3.86%  a.out  [kernel.vmlinux]  [k] native_flush_hash_range
     3.38%  a.out  [kernel.vmlinux]  [k] save_context_stack
     2.98%  a.out  a.out             [.] main
     2.59%  a.out  [kernel.vmlinux]  [k] lock_acquire
     2.29%  a.out  [kernel.vmlinux]  [k] mark_lock
     2.23%  a.out  [kernel.vmlinux]  [k] native_hpte_insert
     1.87%  a.out  [kernel.vmlinux]  [k] get_mem_cgroup_from_mm
     1.71%  a.out  [kernel.vmlinux]  [k] rcu_lockdep_current_cpu_online
     1.68%  a.out  [kernel.vmlinux]  [k] lock_release
     1.47%  a.out  [kernel.vmlinux]  [k] __handle_mm_fault
     1.41%  a.out  [kernel.vmlinux]  [k] validate_sp
With patch
    35.40%  a.out  [kernel.vmlinux]  [k] clear_user_page
     8.82%  a.out  [kernel.vmlinux]  [k] __lock_acquire
     3.66%  a.out  a.out             [.] main
     3.49%  a.out  [kernel.vmlinux]  [k] save_context_stack
     2.77%  a.out  [kernel.vmlinux]  [k] lock_acquire
     2.45%  a.out  [kernel.vmlinux]  [k] mark_lock
     1.80%  a.out  [kernel.vmlinux]  [k] get_mem_cgroup_from_mm
     1.80%  a.out  [kernel.vmlinux]  [k] native_hpte_insert
     1.79%  a.out  [kernel.vmlinux]  [k] rcu_lockdep_current_cpu_online
     1.78%  a.out  [kernel.vmlinux]  [k] lock_release
     1.73%  a.out  [kernel.vmlinux]  [k] native_flush_hash_range
     1.53%  a.out  [kernel.vmlinux]  [k] __handle_mm_fault
That is, we now spend less time in native_flush_hash_range: 1.73% of
samples with the patch series versus 3.86% without.
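As a rough sketch of the trade-off involved (a standalone model, not
the kernel's native_flush_hash_range; hpte_group, invalidate_tracked
and invalidate_by_search are made-up names, while HPTES_PER_GROUP is
the kernel's 8-entry hash group size): without a recorded slot,
invalidation becomes a bounded search of the HPTE group instead of a
direct index, and the profile suggests that search is cheap enough for
dropping the real_pte_t bookkeeping to be a net win here.

#include <stdbool.h>
#include <stdio.h>

#define HPTES_PER_GROUP 8
#define HPTE_V_VALID    0x1UL

/* Model only: one HPTE group, entries keyed by virtual page number. */
struct hpte_model {
        unsigned long v;        /* valid bit */
        unsigned long vpn;      /* virtual page number */
};

static struct hpte_model hpte_group[HPTES_PER_GROUP];

/* Old scheme: the slot recorded in the PTE gives a direct index. */
static void invalidate_tracked(unsigned int slot)
{
        hpte_group[slot].v &= ~HPTE_V_VALID;
}

/* New scheme: search the (at most 8-entry) group for a matching VPN. */
static bool invalidate_by_search(unsigned long vpn)
{
        for (int i = 0; i < HPTES_PER_GROUP; i++) {
                if ((hpte_group[i].v & HPTE_V_VALID) &&
                    hpte_group[i].vpn == vpn) {
                        hpte_group[i].v &= ~HPTE_V_VALID;
                        return true;
                }
        }
        return false;
}

int main(void)
{
        hpte_group[3].v = HPTE_V_VALID;
        hpte_group[3].vpn = 0xabcd;
        invalidate_by_search(0xabcd);
        printf("slot 3 valid after search-invalidate: %lu\n",
               hpte_group[3].v & HPTE_V_VALID);

        hpte_group[5].v = HPTE_V_VALID;
        invalidate_tracked(5);
        printf("slot 5 valid after tracked-invalidate: %lu\n",
               hpte_group[5].v & HPTE_V_VALID);
        return 0;
}

In the real hardware layout the entry can also live in the secondary
hash group, so the search may cover up to two 8-entry groups, but it
remains bounded.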
-aneesh