[tip:locking/core] [futex] cb8c4312af: will-it-scale.per_process_ops -3.2% regression

kernel test robot oliver.sang at intel.com
Sun Oct 8 18:08:09 AEDT 2023



Hello,

kernel test robot noticed a -3.2% regression of will-it-scale.per_process_ops on:


commit: cb8c4312afca1b2dc64107e7e7cea81911055612 ("futex: Add sys_futex_wait()")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git locking/core

testcase: will-it-scale
test machine: 224 threads 4 sockets Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz (Cooper Lake) with 192G memory
parameters:

	nr_task: 16
	mode: process
	test: futex4
	cpufreq_governor: performance




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202310081429.a30c99f2-oliver.sang@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231008/202310081429.a30c99f2-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/process/16/debian-11.1-x86_64-20220510.cgz/lkp-cpl-4sp2/futex4/will-it-scale

commit: 
  43adf84495 ("futex: FLAGS_STRICT")
  cb8c4312af ("futex: Add sys_futex_wait()")

43adf844951084c2 cb8c4312afca1b2dc64107e7e7c 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 1.339e+08            -3.2%  1.296e+08        will-it-scale.16.processes
   8367312            -3.2%    8102637        will-it-scale.per_process_ops
 1.339e+08            -3.2%  1.296e+08        will-it-scale.workload
      0.61            -0.0        0.59        perf-stat.i.branch-miss-rate%
  72599095            -2.7%   70647352        perf-stat.i.branch-misses
      0.80            -1.8%       0.79        perf-stat.i.cpi
 2.073e+10            +3.8%  2.152e+10        perf-stat.i.dTLB-loads
  1.72e+10            +2.2%  1.757e+10        perf-stat.i.dTLB-stores
  66739031            -5.4%   63102078        perf-stat.i.iTLB-load-misses
   2080892            +2.4%    2131032        perf-stat.i.iTLB-loads
 8.203e+10            +1.6%  8.337e+10        perf-stat.i.instructions
      1231            +7.3%       1321        perf-stat.i.instructions-per-iTLB-miss
      1.24            +1.8%       1.27        perf-stat.i.ipc
    222.58            +2.4%     227.82        perf-stat.i.metric.M/sec
      0.61            -0.0        0.59        perf-stat.overall.branch-miss-rate%
      0.80            -1.8%       0.79        perf-stat.overall.cpi
      1229            +7.5%       1321        perf-stat.overall.instructions-per-iTLB-miss
      1.24            +1.8%       1.27        perf-stat.overall.ipc
    184025            +4.9%     193123        perf-stat.overall.path-length
  72373935            -2.7%   70427711        perf-stat.ps.branch-misses
 2.066e+10            +3.8%  2.144e+10        perf-stat.ps.dTLB-loads
 1.714e+10            +2.2%  1.751e+10        perf-stat.ps.dTLB-stores
  66517376            -5.5%   62888454        perf-stat.ps.iTLB-load-misses
   2073911            +2.4%    2123876        perf-stat.ps.iTLB-loads
 8.175e+10            +1.6%  8.309e+10        perf-stat.ps.instructions
 2.464e+13            +1.6%  2.504e+13        perf-stat.total.instructions
     29.29 ±  2%     -29.3        0.00        perf-profile.calltrace.cycles-pp.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
     12.17 ±  2%     -12.2        0.00        perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex
      9.21 ±  2%      -9.2        0.00        perf-profile.calltrace.cycles-pp.futex_get_value_locked.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex
      6.61 ±  2%      -6.6        0.00        perf-profile.calltrace.cycles-pp.__get_user_nocheck_4.futex_get_value_locked.futex_wait_setup.futex_wait.do_futex
      2.03 ±  2%      -0.1        1.88 ±  3%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      0.00            +2.0        1.98 ±  4%  perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.__futex_wait.futex_wait.do_futex
      0.00            +4.0        3.96 ±  3%  perf-profile.calltrace.cycles-pp.futex_q_unlock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      0.00            +4.1        4.09 ±  3%  perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait
      0.00            +4.4        4.35 ±  3%  perf-profile.calltrace.cycles-pp.futex_hash.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait
      0.00            +6.1        6.14 ±  3%  perf-profile.calltrace.cycles-pp.__get_user_nocheck_4.futex_get_value_locked.futex_wait_setup.__futex_wait.futex_wait
      0.00            +8.5        8.52 ±  3%  perf-profile.calltrace.cycles-pp.futex_get_value_locked.futex_wait_setup.__futex_wait.futex_wait.do_futex
      0.00           +11.3       11.27 ±  3%  perf-profile.calltrace.cycles-pp.futex_q_lock.futex_wait_setup.__futex_wait.futex_wait.do_futex
      0.00           +27.4       27.44 ±  3%  perf-profile.calltrace.cycles-pp.futex_wait_setup.__futex_wait.futex_wait.do_futex.__x64_sys_futex
      0.00           +31.3       31.33 ±  3%  perf-profile.calltrace.cycles-pp.__futex_wait.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
     29.80 ±  2%      -1.9       27.91 ±  3%  perf-profile.children.cycles-pp.futex_wait_setup
     12.68 ±  2%      -0.9       11.74 ±  3%  perf-profile.children.cycles-pp.futex_q_lock
      7.49 ±  2%      -0.6        6.93 ±  3%  perf-profile.children.cycles-pp.__get_user_nocheck_4
      4.38 ±  2%      -0.4        3.96 ±  3%  perf-profile.children.cycles-pp.futex_q_unlock
      4.74 ±  2%      -0.4        4.35 ±  3%  perf-profile.children.cycles-pp.futex_hash
      4.62 ±  2%      -0.3        4.33 ±  3%  perf-profile.children.cycles-pp._raw_spin_lock
      0.48 ±  3%      -0.2        0.32 ±  5%  perf-profile.children.cycles-pp.futex_setup_timer
      1.71 ±  2%      -0.1        1.57 ±  4%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      1.24 ±  3%      -0.1        1.14 ±  4%  perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      0.52 ±  5%      -0.0        0.47 ±  4%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      0.35 ±  3%      -0.0        0.31 ±  3%  perf-profile.children.cycles-pp.syscall@plt
      0.00           +31.5       31.46 ±  3%  perf-profile.children.cycles-pp.__futex_wait
      7.88 ±  2%      -2.4        5.48 ±  2%  perf-profile.self.cycles-pp.futex_wait
     10.37 ±  3%      -0.9        9.46 ±  3%  perf-profile.self.cycles-pp.syscall
      7.46 ±  2%      -0.6        6.91 ±  3%  perf-profile.self.cycles-pp.__get_user_nocheck_4
      4.20 ±  2%      -0.4        3.78 ±  3%  perf-profile.self.cycles-pp.futex_q_unlock
      4.56 ±  2%      -0.4        4.19 ±  3%  perf-profile.self.cycles-pp.futex_hash
      4.44 ±  2%      -0.3        4.16 ±  3%  perf-profile.self.cycles-pp._raw_spin_lock
      3.54 ±  2%      -0.2        3.29 ±  3%  perf-profile.self.cycles-pp.futex_q_lock
      1.71 ±  2%      -0.1        1.57 ±  4%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      0.40 ±  3%      -0.1        0.32 ±  5%  perf-profile.self.cycles-pp.futex_setup_timer
      1.18            -0.1        1.10 ±  3%  perf-profile.self.cycles-pp.do_syscall_64
      1.00            -0.1        0.94 ±  4%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      2.14 ±  3%      +0.2        2.31 ±  3%  perf-profile.self.cycles-pp.__x64_sys_futex
      0.00            +3.5        3.50 ±  3%  perf-profile.self.cycles-pp.__futex_wait
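
For readers checking the numbers, the sketch below recomputes a few entries of the %change column from the two value columns. It is an illustration only, not part of the lkp tooling; the helper names pct_change and point_delta are made up for this example. It assumes that rows shown with a "%" suffix are relative changes, and that perf-profile rows (no "%" suffix) are absolute differences in percentage points. The last line assumes perf-stat.overall.path-length is total instructions divided by workload, which matches the table to within rounding of the displayed values.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from the base commit to the new commit, in percent."""
    return (new - base) / base * 100.0


def point_delta(base_pct: float, new_pct: float) -> float:
    """Absolute difference between two cycle shares, in percentage points."""
    return new_pct - base_pct


# will-it-scale.per_process_ops: 8367312 -> 8102637
ops_change = pct_change(8367312, 8102637)

# perf-stat.i.iTLB-load-misses: 66739031 -> 63102078
itlb_change = pct_change(66739031, 63102078)

# perf-profile.children.cycles-pp.futex_wait_setup: 29.80 -> 27.91
setup_delta = point_delta(29.80, 27.91)

# Assumption: path-length = total instructions / workload (base commit).
path_length_base = 2.464e13 / 1.339e8

print(f"per_process_ops:   {ops_change:+.1f}%")
print(f"iTLB-load-misses:  {itlb_change:+.1f}%")
print(f"futex_wait_setup:  {setup_delta:+.1f} points")
print(f"path-length (base): {path_length_base:.0f}")
```

Read this way, the children.cycles-pp rows show futex_wait_setup and its callees taking a slightly smaller share of cycles after the commit, while the new __futex_wait frame accounts for the large calltrace shifts between the futex_wait and __futex_wait call chains.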




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


