[linus:master] [pipe] ee5eda8ea5: stress-ng.pipeherd.ops_per_sec 26.0% improvement

kernel test robot <oliver.sang@xxxxxxxxx> · Wed, 4 Jun 2025 22:47:38 +0800

Hello,

kernel test robot noticed a 26.0% improvement in stress-ng.pipeherd.ops_per_sec on:

commit: ee5eda8ea59546af2e8f192c060fbf29862d7cbd ("pipe: change pipe_write() to never add a zero-sized buffer")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: pipeherd
	cpufreq_governor: performance

Details are as below:
-------------------------------------------------------------------------------------------------->

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250604/202506042255.d1d90443-lkp@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp7/pipeherd/stress-ng/60s

commit: 
  f2ffc48de2 ("Merge patch series "pipe: don't update {a,c,m}time for anonymous pipes"")
  ee5eda8ea5 ("pipe: change pipe_write() to never add a zero-sized buffer")

f2ffc48de2017c69 ee5eda8ea59546af2e8f192c060 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    138055 ± 18%     +33.4%     184128 ± 14%  cpuidle..usage
      0.19 ±160%  +61380.1%     115.68 ±185%  perf-sched.wait_time.avg.ms.__cond_resched.__mutex_lock.constprop.0.anon_pipe_read
      0.20 ±153%  +2.1e+05%     418.07 ±203%  perf-sched.wait_time.max.ms.__cond_resched.__mutex_lock.constprop.0.anon_pipe_read
      3930            +7.8%       4235 ±  3%  vmstat.procs.r
   7317310 ±  4%     -14.5%    6253797 ± 10%  vmstat.system.cs
      1025 ± 38%    +135.0%       2409 ± 19%  sched_debug.cfs_rq:/.util_est.avg
   3586492 ±  4%     -15.2%    3040317 ± 10%  sched_debug.cpu.nr_switches.avg
    474729 ± 17%     +49.6%     710035 ± 22%  sched_debug.cpu.nr_switches.stddev
      3.32 ± 46%      +1.5        4.82 ± 13%  perf-profile.calltrace.cycles-pp.perf_mmap__push.record__mmap_read_evlist.__cmd_record.cmd_record.run_builtin
      3.32 ± 46%      +1.5        4.82 ± 13%  perf-profile.calltrace.cycles-pp.record__mmap_read_evlist.__cmd_record.cmd_record.run_builtin.handle_internal_command
      3.32 ± 46%      +1.5        4.82 ± 13%  perf-profile.children.cycles-pp.perf_mmap__push
      3.32 ± 46%      +1.5        4.82 ± 13%  perf-profile.children.cycles-pp.record__mmap_read_evlist
      0.18 ± 12%     -41.7%       0.10 ± 31%  stress-ng.pipeherd.context_switches_per_bogo_op
    107177 ±  9%     -28.5%      76592 ± 21%  stress-ng.pipeherd.context_switches_per_sec
 2.331e+09 ±  3%     +26.0%  2.937e+09 ± 10%  stress-ng.pipeherd.ops
  38728105 ±  3%     +26.0%   48797851 ± 10%  stress-ng.pipeherd.ops_per_sec
 4.578e+08 ±  4%     -15.2%  3.882e+08 ± 10%  stress-ng.time.voluntary_context_switches
 3.182e+10            +8.2%  3.443e+10 ±  2%  perf-stat.i.branch-instructions
      0.54 ±  3%      -0.1        0.41 ± 12%  perf-stat.i.branch-miss-rate%
 1.723e+08 ±  3%     -18.5%  1.404e+08 ± 10%  perf-stat.i.branch-misses
      4.74 ± 10%      +0.6        5.31 ±  4%  perf-stat.i.cache-miss-rate%
 2.402e+08           -22.2%  1.869e+08 ±  4%  perf-stat.i.cache-references
   7536250 ±  4%     -15.0%    6402680 ± 10%  perf-stat.i.context-switches
      1.39            -5.3%       1.31 ±  2%  perf-stat.i.cpi
   3691320 ±  6%     -26.5%    2711535 ± 17%  perf-stat.i.cpu-migrations
 1.427e+11            +6.2%  1.515e+11 ±  2%  perf-stat.i.instructions
      0.73            +5.9%       0.77 ±  2%  perf-stat.i.ipc
    175.44 ±  5%     -18.8%     142.50 ± 12%  perf-stat.i.metric.K/sec
      0.54 ±  4%      -0.1        0.41 ± 13%  perf-stat.overall.branch-miss-rate%
      4.43 ± 11%      +0.7        5.18 ±  3%  perf-stat.overall.cache-miss-rate%
      1.37            -5.7%       1.29 ±  2%  perf-stat.overall.cpi
      0.73            +6.1%       0.78 ±  2%  perf-stat.overall.ipc
 3.128e+10            +8.3%  3.386e+10 ±  2%  perf-stat.ps.branch-instructions
 1.692e+08 ±  3%     -18.5%  1.379e+08 ± 10%  perf-stat.ps.branch-misses
 2.362e+08           -22.2%  1.839e+08 ±  4%  perf-stat.ps.cache-references
   7395716 ±  4%     -15.0%    6283076 ± 10%  perf-stat.ps.context-switches
   3621055 ±  6%     -26.5%    2660111 ± 17%  perf-stat.ps.cpu-migrations
 1.403e+11            +6.3%   1.49e+11 ±  2%  perf-stat.ps.instructions
 8.691e+12            +5.3%  9.152e+12 ±  2%  perf-stat.total.instructions
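
For readers checking the table, here is a minimal sketch of how the %change
column relates to the two mean columns, assuming it is the plain relative
change of the second commit's mean against the first. The helper name
pct_change is hypothetical and not part of lkp-tests. Rows whose metric is
itself a percentage (for example perf-stat.i.branch-miss-rate%) appear to be
reported as an absolute difference instead, and recomputing from the rounded
values shown here may differ slightly from the printed figure.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


# stress-ng.pipeherd.ops_per_sec: f2ffc48de2 (base) vs ee5eda8ea5 (new)
ops_base = 38728105
ops_new = 48797851
print(f"{pct_change(ops_base, ops_new):+.1f}%")  # prints +26.0%

# vmstat.system.cs from the same table
cs_base = 7317310
cs_new = 6253797
print(f"{pct_change(cs_base, cs_new):+.1f}%")  # prints -14.5%
```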

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki