Hello,

kernel test robot noticed a 16.6% improvement of filebench.sum_operations/s on:


commit: 26a80762153ba0dc98258b5e6d2e9741178c5114 ("NFSD: Add a Kconfig setting to enable delegated timestamps")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: filebench
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
parameters:

	disk: 1HDD
	fs: ext4
	fs2: nfsv4
	test: webproxy.f
	cpufreq_governor: performance


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250529/202505291023.d4c802b1-lkp@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/disk/fs2/fs/kconfig/rootfs/tbox_group/test/testcase:
  gcc-12/performance/1HDD/nfsv4/ext4/x86_64-rhel-9.4/debian-12-x86_64-20240206.cgz/lkp-icl-2sp6/webproxy.f/filebench

commit:
  87480a8ce5 ("sysctl: Fixes nsm_local_state bounds")
  26a8076215 ("NFSD: Add a Kconfig setting to enable delegated timestamps")

87480a8ce567340a 26a80762153ba0dc98258b5e6d2
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
   5114423         -11.9%     4505604        cpuidle..usage
    212.82 ± 7%    -12.6%      186.03 ± 3%   sched_debug.cpu.curr->pid.avg
      2922          -7.5%        2703        vmstat.system.in
    184.48 ± 9%    -17.4%      152.45 ± 5%   perf-sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
    184.41 ± 9%    -17.4%      152.39 ± 5%   perf-sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.60         +16.7%        0.70        filebench.sum_bytes_mb/s
      7676         +16.6%        8950        filebench.sum_operations
    127.92         +16.6%      149.15        filebench.sum_operations/s
     33.00         +18.7%       39.17        filebench.sum_reads/s
    745.09         -15.3%      630.77        filebench.sum_time_ms/op
      7.00         +14.3%        8.00        filebench.sum_writes/s
      1465          -6.5%        1370        filebench.time.elapsed_time
      1465          -6.5%        1370        filebench.time.elapsed_time.max
     41322          +2.3%       42260        filebench.time.voluntary_context_switches
     73915          -1.2%       73000        proc-vmstat.nr_dirtied
     68893          -4.8%       65555 ± 2%   proc-vmstat.nr_inactive_file
     21211          -3.3%       20518        proc-vmstat.nr_shmem
     73782          -1.5%       72646        proc-vmstat.nr_written
     68893          -4.8%       65555 ± 2%   proc-vmstat.nr_zone_inactive_file
   3148302          -5.2%     2985206        proc-vmstat.numa_hit
   3015702          -5.4%     2852716        proc-vmstat.numa_local
   3697834          -4.5%     3531265        proc-vmstat.pgalloc_normal
   3722673          -6.0%     3499153        proc-vmstat.pgfault
   3588865          -4.4%     3429561        proc-vmstat.pgfree
    940490          -4.3%      899874        proc-vmstat.pgpgout
    162192          -5.9%      152665        proc-vmstat.pgreuse
      5.37           -0.1        5.25        perf-stat.i.branch-miss-rate%
   1681345          -1.1%     1662216        perf-stat.i.branch-misses
  11085725          -5.0%    10535577        perf-stat.i.cache-references
      2.25          -1.6%        2.21        perf-stat.i.cpi
 2.976e+08          -1.8%   2.922e+08        perf-stat.i.cpu-cycles
      0.46          +1.6%        0.47        perf-stat.i.ipc
      5.18           -0.1        5.08        perf-stat.overall.branch-miss-rate%
      1.87          -2.6%        1.82        perf-stat.overall.cpi
      0.54          +2.6%        0.55        perf-stat.overall.ipc
   1678730          -1.1%     1659448        perf-stat.ps.branch-misses
  11076863          -5.0%    10526572        perf-stat.ps.cache-references
 2.972e+08          -1.8%   2.919e+08        perf-stat.ps.cpu-cycles
 2.334e+11          -5.8%   2.198e+11        perf-stat.total.instructions
      6.43 ± 5%      -0.4        6.01 ± 3%   perf-profile.children.cycles-pp.__schedule
      1.11 ± 11%     -0.2        0.87 ± 13%  perf-profile.children.cycles-pp.try_to_block_task
      0.94 ± 7%      -0.2        0.76 ± 9%   perf-profile.children.cycles-pp.copy_mc_enhanced_fast_string
      0.18 ± 14%     +0.1        0.27 ± 26%  perf-profile.children.cycles-pp.set_pte_range
      0.04 ±100%     +0.1        0.14 ± 12%  perf-profile.children.cycles-pp.xprt_sock_sendmsg
      0.13 ± 28%     +0.1        0.25 ± 25%  perf-profile.children.cycles-pp.devkmsg_read
      0.38 ± 14%     +0.1        0.50 ± 9%   perf-profile.children.cycles-pp.arch_scale_freq_tick
      0.03 ±145%     +0.1        0.15 ± 16%  perf-profile.children.cycles-pp.scnprintf
      0.12 ± 27%     +0.1        0.24 ± 26%  perf-profile.children.cycles-pp.printk_get_next_message
      1.45 ± 8%      +0.2        1.63 ± 5%   perf-profile.children.cycles-pp.copy_process
      0.93 ± 6%      -0.2        0.74 ± 10%  perf-profile.self.cycles-pp.copy_mc_enhanced_fast_string
      0.21 ± 27%     -0.1        0.11 ± 32%  perf-profile.self.cycles-pp.hrtimer_update_next_event
      0.38 ± 14%     +0.1        0.50 ± 9%   perf-profile.self.cycles-pp.arch_scale_freq_tick
      0.21 ± 20%     +0.1        0.34 ± 19%  perf-profile.self.cycles-pp.tsc_verify_tsc_adjust




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki