Hello,

kernel test robot noticed a 26.0% improvement of stress-ng.pipeherd.ops_per_sec on:


commit: ee5eda8ea59546af2e8f192c060fbf29862d7cbd ("pipe: change pipe_write() to never add a zero-sized buffer")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: pipeherd
	cpufreq_governor: performance


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250604/202506042255.d1d90443-lkp@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pipeherd/stress-ng/60s

commit:
  f2ffc48de2 ("Merge patch series "pipe: don't update {a,c,m}time for anonymous pipes"")
  ee5eda8ea5 ("pipe: change pipe_write() to never add a zero-sized buffer")

f2ffc48de2017c69 ee5eda8ea59546af2e8f192c060
---------------- ---------------------------
         %stddev    %change          %stddev
             \          |                \
    138055 ± 18%     +33.4%     184128 ± 14%  cpuidle..usage
      0.19 ±160%  +61380.1%     115.68 ±185%  perf-sched.wait_time.avg.ms.__cond_resched.__mutex_lock.constprop.0.anon_pipe_read
      0.20 ±153%  +2.1e+05%     418.07 ±203%  perf-sched.wait_time.max.ms.__cond_resched.__mutex_lock.constprop.0.anon_pipe_read
      3930            +7.8%       4235 ±  3%  vmstat.procs.r
   7317310 ±  4%     -14.5%    6253797 ± 10%  vmstat.system.cs
      1025 ± 38%    +135.0%       2409 ± 19%  sched_debug.cfs_rq:/.util_est.avg
   3586492 ±  4%     -15.2%    3040317 ± 10%  sched_debug.cpu.nr_switches.avg
    474729 ± 17%     +49.6%     710035 ± 22%  sched_debug.cpu.nr_switches.stddev
      3.32 ± 46%       +1.5       4.82 ± 13%  perf-profile.calltrace.cycles-pp.perf_mmap__push.record__mmap_read_evlist.__cmd_record.cmd_record.run_builtin
      3.32 ± 46%       +1.5       4.82 ± 13%  perf-profile.calltrace.cycles-pp.record__mmap_read_evlist.__cmd_record.cmd_record.run_builtin.handle_internal_command
      3.32 ± 46%       +1.5       4.82 ± 13%  perf-profile.children.cycles-pp.perf_mmap__push
      3.32 ± 46%       +1.5       4.82 ± 13%  perf-profile.children.cycles-pp.record__mmap_read_evlist
      0.18 ± 12%     -41.7%       0.10 ± 31%  stress-ng.pipeherd.context_switches_per_bogo_op
    107177 ±  9%     -28.5%      76592 ± 21%  stress-ng.pipeherd.context_switches_per_sec
 2.331e+09 ±  3%     +26.0%  2.937e+09 ± 10%  stress-ng.pipeherd.ops
  38728105 ±  3%     +26.0%   48797851 ± 10%  stress-ng.pipeherd.ops_per_sec
 4.578e+08 ±  4%     -15.2%  3.882e+08 ± 10%  stress-ng.time.voluntary_context_switches
 3.182e+10            +8.2%  3.443e+10 ±  2%  perf-stat.i.branch-instructions
      0.54 ±  3%       -0.1       0.41 ± 12%  perf-stat.i.branch-miss-rate%
 1.723e+08 ±  3%     -18.5%  1.404e+08 ± 10%  perf-stat.i.branch-misses
      4.74 ± 10%       +0.6       5.31 ±  4%  perf-stat.i.cache-miss-rate%
 2.402e+08           -22.2%  1.869e+08 ±  4%  perf-stat.i.cache-references
   7536250 ±  4%     -15.0%    6402680 ± 10%  perf-stat.i.context-switches
      1.39            -5.3%       1.31 ±  2%  perf-stat.i.cpi
   3691320 ±  6%     -26.5%    2711535 ± 17%  perf-stat.i.cpu-migrations
 1.427e+11            +6.2%  1.515e+11 ±  2%  perf-stat.i.instructions
      0.73            +5.9%       0.77 ±  2%  perf-stat.i.ipc
    175.44 ±  5%     -18.8%     142.50 ± 12%  perf-stat.i.metric.K/sec
      0.54 ±  4%       -0.1       0.41 ± 13%  perf-stat.overall.branch-miss-rate%
      4.43 ± 11%       +0.7       5.18 ±  3%  perf-stat.overall.cache-miss-rate%
      1.37            -5.7%       1.29 ±  2%  perf-stat.overall.cpi
      0.73            +6.1%       0.78 ±  2%  perf-stat.overall.ipc
 3.128e+10            +8.3%  3.386e+10 ±  2%  perf-stat.ps.branch-instructions
 1.692e+08 ±  3%     -18.5%  1.379e+08 ± 10%  perf-stat.ps.branch-misses
 2.362e+08           -22.2%  1.839e+08 ±  4%  perf-stat.ps.cache-references
   7395716 ±  4%     -15.0%    6283076 ± 10%  perf-stat.ps.context-switches
   3621055 ±  6%     -26.5%    2660111 ± 17%  perf-stat.ps.cpu-migrations
 1.403e+11            +6.3%   1.49e+11 ±  2%  perf-stat.ps.instructions
 8.691e+12            +5.3%  9.152e+12 ±  2%  perf-stat.total.instructions


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki