Hello,

kernel test robot noticed a 3.9% improvement of will-it-scale.per_process_ops on:


commit: 5730609ffd7e558e1e3305d0c6839044e8f6591b ("select: do_pollfd: add unlikely branch hint return path")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:

        nr_task: 100%
        mode: process
        test: poll2
        cpufreq_governor: performance


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250612/202506121540.6eafcec4-lkp@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/process/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/poll2/will-it-scale

commit:
  f1745496d3 ("netfs: Update main API document")
  5730609ffd ("select: do_pollfd: add unlikely branch hint return path")

f1745496d3fba34a 5730609ffd7e558e1e3305d0c68
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
      0.08 ± 31%     -35.8%       0.05 ± 31%  perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      2800 ± 91%    +420.5%      14576 ±121%  proc-vmstat.numa_hint_faults
      1013 ± 37%    +226.4%       3308 ±101%  proc-vmstat.numa_hint_faults_local
  25518332            +3.9%   26506874        will-it-scale.104.processes
    245368            +3.9%     254873        will-it-scale.per_process_ops
  25518332            +3.9%   26506874        will-it-scale.workload
 4.802e+10            +3.8%  4.983e+10        perf-stat.i.branch-instructions
 1.475e+08            +3.4%  1.525e+08        perf-stat.i.branch-misses
      1.04            -3.6%       1.00        perf-stat.i.cpi
 2.702e+11            +3.9%  2.808e+11        perf-stat.i.instructions
      0.97            +3.7%       1.00        perf-stat.i.ipc
      1.03            -3.6%       1.00        perf-stat.overall.cpi
      0.97            +3.7%       1.00        perf-stat.overall.ipc
 4.786e+10            +3.8%  4.966e+10        perf-stat.ps.branch-instructions
  1.47e+08            +3.4%   1.52e+08        perf-stat.ps.branch-misses
 2.693e+11            +3.9%  2.799e+11        perf-stat.ps.instructions
  8.15e+13            +3.9%  8.468e+13        perf-stat.total.instructions
     42.32             -4.1      38.22        perf-profile.calltrace.cycles-pp.fdget.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
     69.75             -2.1      67.63        perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     70.28             -2.1      68.17        perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     72.20             -2.0      70.23        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     76.46             -1.8      74.67        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__poll
     58.10             -1.5      56.63        perf-profile.calltrace.cycles-pp.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     94.49             -0.5      93.97        perf-profile.calltrace.cycles-pp.__poll
      0.70             +0.0       0.72        perf-profile.calltrace.cycles-pp.__virt_addr_valid.check_heap_object.__check_object_size.do_sys_poll.__x64_sys_poll
      0.92             +0.0       0.95        perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64
      0.54             +0.0       0.58 ±  2%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
      1.94             +0.1       2.00        perf-profile.calltrace.cycles-pp.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.68             +0.1       1.74        perf-profile.calltrace.cycles-pp.rep_movs_alternative._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
      2.34             +0.1       2.40        perf-profile.calltrace.cycles-pp._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.89             +0.1       0.96        perf-profile.calltrace.cycles-pp.kfree.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.85             +0.1       2.98        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.__poll
      5.86             +0.3       6.16        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.__poll
      7.26             +0.5       7.72        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__poll
      5.00 ±  2%       +0.5       5.51 ±  2%  perf-profile.calltrace.cycles-pp.testcase
      2.27 ±  3%       +0.6       2.84 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.__poll
     42.25             -2.8      39.42        perf-profile.children.cycles-pp.fdget
     58.21             -2.7      55.52        perf-profile.children.cycles-pp.do_poll
     70.35             -2.1      68.24        perf-profile.children.cycles-pp.__x64_sys_poll
     69.87             -2.1      67.77        perf-profile.children.cycles-pp.do_sys_poll
     72.26             -2.0      70.29        perf-profile.children.cycles-pp.do_syscall_64
     76.58             -1.8      74.80        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     95.09             -0.5      94.58        perf-profile.children.cycles-pp.__poll
      0.71             +0.0       0.73        perf-profile.children.cycles-pp.__virt_addr_valid
      0.99             +0.0       1.02        perf-profile.children.cycles-pp.check_heap_object
      0.19 ±  2%       +0.0       0.22 ±  3%  perf-profile.children.cycles-pp.poll_freewait
      0.54             +0.0       0.59 ±  2%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      1.70             +0.1       1.76        perf-profile.children.cycles-pp.rep_movs_alternative
      2.09             +0.1       2.14        perf-profile.children.cycles-pp.__check_object_size
      0.89             +0.1       0.96        perf-profile.children.cycles-pp.kfree
      2.56             +0.1       2.64        perf-profile.children.cycles-pp._copy_from_user
      6.28             +0.3       6.60        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      1.30 ±  2%       +0.3       1.61 ±  2%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      3.84             +0.4       4.22        perf-profile.children.cycles-pp.entry_SYSCALL_64
      7.33             +0.5       7.79        perf-profile.children.cycles-pp.syscall_return_via_sysret
      5.01 ±  2%       +0.5       5.52 ±  2%  perf-profile.children.cycles-pp.testcase
     40.83             -1.6      39.20        perf-profile.self.cycles-pp.fdget
     17.11             -1.0      16.07        perf-profile.self.cycles-pp.do_poll
      0.16 ±  3%       +0.0       0.18 ±  3%  perf-profile.self.cycles-pp.poll_freewait
      0.65             +0.0       0.68        perf-profile.self.cycles-pp.__virt_addr_valid
      0.98             +0.0       1.01        perf-profile.self.cycles-pp._copy_from_user
      0.42 ±  2%       +0.0       0.46 ±  2%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      1.54             +0.0       1.58        perf-profile.self.cycles-pp.rep_movs_alternative
      1.08             +0.1       1.14 ±  2%  perf-profile.self.cycles-pp.__poll
      0.88             +0.1       0.95        perf-profile.self.cycles-pp.kfree
      4.39 ±  2%       +0.2       4.59 ±  2%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      6.22             +0.3       6.53        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      3.41             +0.4       3.78        perf-profile.self.cycles-pp.entry_SYSCALL_64
      7.32             +0.5       7.78        perf-profile.self.cycles-pp.syscall_return_via_sysret
      4.82 ±  2%       +0.5       5.32 ±  2%  perf-profile.self.cycles-pp.testcase


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
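
The change column in the report mixes two conventions: the will-it-scale and perf-stat rows give a relative change in percent, while the perf-profile rows (which carry no % sign) give a plain difference in percentage points of cycles. The following is a minimal sketch for recomputing both from the values in the report. The helper names pct_change and point_delta are made up for illustration and are not part of the lkp-tests tooling, and the inputs are the rounded figures printed in the report, so recomputed results may differ slightly in the last digit for some rows.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent (hypothetical helper)."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference in percentage points (hypothetical helper)."""
    return new - base


# Values copied from the report: parent commit f1745496d3 vs. 5730609ffd.
per_process_ops = pct_change(245368, 254873)    # will-it-scale.per_process_ops
workload = pct_change(25518332, 26506874)       # will-it-scale.workload
fdget_calltrace = point_delta(42.32, 38.22)     # perf-profile.calltrace ... fdget.do_poll

print(f"will-it-scale.per_process_ops: {per_process_ops:+.1f}%")
print(f"will-it-scale.workload:        {workload:+.1f}%")
print(f"calltrace fdget.do_poll:       {fdget_calltrace:+.1f} points")
```

Rows that carry a large ± stddev, such as the proc-vmstat.numa_hint_faults entries, are probably better read as noise than as an effect of the commit, so this kind of check is most useful on the low-variance throughput rows.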