Hello,

kernel test robot noticed a 41.2% regression of stress-ng.msg.ops_per_sec on:


commit: f76e96bdd6405866e2c9c846baee0d9a0f0ae6b7 ("[PATCH v3 2/3] nfs: Add timecreate to nfs inode")
url: https://github.com/intel-lab-lkp/linux/commits/Benjamin-Coddington/Expand-the-type-of-nfs_fattr-valid/20250529-184909
base: git://git.linux-nfs.org/projects/trondmy/linux-nfs.git linux-next
patch link: https://lore.kernel.org/all/1e3677b0655fa2bbaba0817b41d111d94a06e5ee.1748515333.git.bcodding@xxxxxxxxxx/
patch subject: [PATCH v3 2/3] nfs: Add timecreate to nfs inode

testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: msg
	cpufreq_governor: performance


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202506121525.2eac47db-lkp@xxxxxxxxx


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250612/202506121525.2eac47db-lkp@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-srf-2sp2/msg/stress-ng/60s

commit:
  252685ecbe ("Expand the type of nfs_fattr->valid")
  f76e96bdd6 ("nfs: Add timecreate to nfs inode")

252685ecbe596954 f76e96bdd6405866e2c9c846bae
---------------- ---------------------------
        %stddev    %change         %stddev
           \          |               \
  5.08e+09          +47.6%  7.498e+09        cpuidle..time
  18432946 ± 2%     -12.4%   16148461        cpuidle..usage
   1806088 ± 2%     -12.0%    1589397        meminfo.Active
   1806088 ± 2%     -12.0%    1589397        meminfo.Active(anon)
    495901 ± 3%     -17.3%     409985        meminfo.Mapped
   1081547 ± 4%     -19.3%     872340        meminfo.Shmem
    649437 ±10%     -23.0%     500076 ± 9%   numa-numastat.node0.local_node
    754484 ± 5%     -17.9%     619525 ± 4%   numa-numastat.node0.numa_hit
    748532 ± 9%     -39.7%     451340 ±10%   numa-numastat.node1.local_node
    841379 ± 5%     -37.0%     529826 ± 4%   numa-numastat.node1.numa_hit
    754539 ± 5%     -17.8%     620582 ± 4%   numa-vmstat.node0.numa_hit
    649492 ±10%     -22.8%     501133 ± 9%   numa-vmstat.node0.numa_local
    841753 ± 5%     -37.0%     529938 ± 4%   numa-vmstat.node1.numa_hit
    748906 ± 9%     -39.7%     451451 ±10%   numa-vmstat.node1.numa_local
     44.37          +43.6%      63.73        vmstat.cpu.id
    105.22 ± 4%     -37.4%      65.88 ± 3%   vmstat.procs.r
    451750 ± 2%      -5.4%     427132        vmstat.system.cs
    569382          -19.3%     459463        vmstat.system.in
     42.98           +20.1      63.12        mpstat.cpu.all.idle%
      0.08 ± 2%       +0.0       0.13        mpstat.cpu.all.soft%
     53.19           -19.2      33.98        mpstat.cpu.all.sys%
      2.89            -1.0       1.92        mpstat.cpu.all.usr%
      5.83 ±37%    +342.9%      25.83 ±21%   mpstat.max_utilization.seconds
     62.63          -36.7%      39.63        mpstat.max_utilization_pct
 1.418e+09 ± 2%     -41.2%  8.345e+08        stress-ng.msg.ops
  23637039 ± 2%     -41.2%   13908394        stress-ng.msg.ops_per_sec
     24711          -46.6%      13195        stress-ng.time.involuntary_context_switches
     10844          -35.8%       6960        stress-ng.time.percent_of_cpu_this_job_got
      6233          -35.8%       4000        stress-ng.time.system_time
    304.86 ± 2%     -35.4%     196.99        stress-ng.time.user_time
  14384284 ± 2%      -7.0%   13373296        stress-ng.time.voluntary_context_switches
    451179 ± 2%     -11.9%     397468        proc-vmstat.nr_active_anon
   1157519           -4.5%    1105647        proc-vmstat.nr_file_pages
    123991 ± 3%     -17.0%     102925 ± 2%   proc-vmstat.nr_mapped
    270071 ± 4%     -19.2%     218199        proc-vmstat.nr_shmem
    451179 ± 2%     -11.9%     397468        proc-vmstat.nr_zone_active_anon
   1599152          -28.0%    1151855        proc-vmstat.numa_hit
   1401258          -31.9%     953920        proc-vmstat.numa_local
   1653599          -27.8%    1194489        proc-vmstat.pgalloc_normal
   1263495 ± 2%     -32.5%     853108        proc-vmstat.pgfree
      0.72 ± 2%     +28.6%       0.93        perf-stat.i.MPKI
 1.646e+10          -44.6%  9.125e+09        perf-stat.i.branch-instructions
  80522106          -45.3%   44038855        perf-stat.i.branch-misses
     10.65 ± 2%       +3.5      14.20        perf-stat.i.cache-miss-rate%
  59505253 ± 4%     -27.2%   43333874        perf-stat.i.cache-misses
 5.874e+08 ± 2%     -46.3%  3.154e+08        perf-stat.i.cache-references
    470043 ± 2%      -6.2%     440783        perf-stat.i.context-switches
      4.22 ± 2%     +15.2%       4.87        perf-stat.i.cpi
 3.605e+11          -35.1%  2.339e+11        perf-stat.i.cpu-cycles
    121109          -58.4%      50415        perf-stat.i.cpu-migrations
      6348 ± 4%     -14.1%       5456        perf-stat.i.cycles-between-cache-misses
 8.496e+10          -43.7%  4.781e+10        perf-stat.i.instructions
      0.24 ± 2%     -14.0%       0.21        perf-stat.i.ipc
      2.44 ± 2%      -6.2%       2.29        perf-stat.i.metric.K/sec
      0.70 ± 3%     +29.6%       0.91        perf-stat.overall.MPKI
     10.11 ± 2%       +3.6      13.74        perf-stat.overall.cache-miss-rate%
      4.25 ± 2%     +15.2%       4.89        perf-stat.overall.cpi
      6084 ± 5%     -11.3%       5397        perf-stat.overall.cycles-between-cache-misses
      0.24 ± 2%     -13.2%       0.20        perf-stat.overall.ipc
 1.619e+10          -44.6%  8.975e+09        perf-stat.ps.branch-instructions
  79190529          -45.3%   43288658        perf-stat.ps.branch-misses
  58447379 ± 4%     -27.1%   42615000        perf-stat.ps.cache-misses
 5.778e+08 ± 2%     -46.3%  3.102e+08        perf-stat.ps.cache-references
    462305 ± 2%      -6.2%     433581        perf-stat.ps.context-switches
 3.547e+11          -35.1%    2.3e+11        perf-stat.ps.cpu-cycles
    119173          -58.4%      49593        perf-stat.ps.cpu-migrations
 8.356e+10          -43.7%  4.702e+10        perf-stat.ps.instructions
 5.125e+12          -43.9%  2.878e+12        perf-stat.total.instructions
   2152139          -59.4%     874031        sched_debug.cfs_rq:/.avg_vruntime.avg
   2600983 ± 3%     -50.6%    1284774 ± 4%   sched_debug.cfs_rq:/.avg_vruntime.max
   2015605          -68.3%     638226        sched_debug.cfs_rq:/.avg_vruntime.min
     67727 ±10%     +54.9%     104927 ± 3%   sched_debug.cfs_rq:/.avg_vruntime.stddev
      0.34 ± 9%     -33.9%       0.22 ± 5%   sched_debug.cfs_rq:/.h_nr_queued.avg
      0.33 ± 8%     -33.5%       0.22 ± 5%   sched_debug.cfs_rq:/.h_nr_runnable.avg
     18436 ±47%     -90.7%       1706 ±140%  sched_debug.cfs_rq:/.left_deadline.avg
   2128677 ± 4%     -84.6%     327700 ±140%  sched_debug.cfs_rq:/.left_deadline.max
    191419 ±24%     -87.7%      23588 ±140%  sched_debug.cfs_rq:/.left_deadline.stddev
     18435 ±47%     -90.7%       1706 ±140%  sched_debug.cfs_rq:/.left_vruntime.avg
   2128515 ± 4%     -84.6%     327651 ±140%  sched_debug.cfs_rq:/.left_vruntime.max
    191406 ±24%     -87.7%      23584 ±140%  sched_debug.cfs_rq:/.left_vruntime.stddev
   2152139          -59.4%     874031        sched_debug.cfs_rq:/.min_vruntime.avg
   2600983 ± 3%     -50.6%    1284774 ± 4%   sched_debug.cfs_rq:/.min_vruntime.max
   2015605          -68.3%     638226        sched_debug.cfs_rq:/.min_vruntime.min
     67727 ±10%     +54.9%     104927 ± 3%   sched_debug.cfs_rq:/.min_vruntime.stddev
      0.33 ± 9%     -33.5%       0.22 ± 4%   sched_debug.cfs_rq:/.nr_queued.avg
     18435 ±47%     -90.7%       1706 ±140%  sched_debug.cfs_rq:/.right_vruntime.avg
   2128515 ± 4%     -84.6%     327651 ±140%  sched_debug.cfs_rq:/.right_vruntime.max
    191406 ±24%     -87.7%      23584 ±140%  sched_debug.cfs_rq:/.right_vruntime.stddev
    428.21 ± 4%     -34.6%     280.10 ± 3%   sched_debug.cfs_rq:/.runnable_avg.avg
      1451 ± 6%     -30.8%       1004 ±11%   sched_debug.cfs_rq:/.runnable_avg.max
    257.16          -24.4%     194.45 ± 3%   sched_debug.cfs_rq:/.runnable_avg.stddev
    428.09 ± 4%     -34.6%     279.93 ± 3%   sched_debug.cfs_rq:/.util_avg.avg
      1449 ± 6%     -30.7%       1004 ±11%   sched_debug.cfs_rq:/.util_avg.max
    256.69          -24.3%     194.37 ± 3%   sched_debug.cfs_rq:/.util_avg.stddev
    131.86 ±11%     -64.9%      46.30 ± 4%   sched_debug.cfs_rq:/.util_est.avg
    155.10 ± 6%     -40.9%      91.59 ±11%   sched_debug.cfs_rq:/.util_est.stddev
    627218 ± 4%     +14.9%     720808 ± 2%   sched_debug.cpu.avg_idle.avg
    150048 ± 4%     +32.6%     198975 ± 4%   sched_debug.cpu.avg_idle.stddev
    491.93          +29.8%     638.73        sched_debug.cpu.clock_task.stddev
      1640 ±10%     -34.5%       1074 ± 5%   sched_debug.cpu.curr->pid.avg
      0.00 ± 5%     -22.3%       0.00 ± 5%   sched_debug.cpu.next_balance.stddev
      0.33 ± 9%     -34.3%       0.22 ± 5%   sched_debug.cpu.nr_running.avg
      0.28 ±13%     +38.2%       0.39 ± 2%   sched_debug.cpu.nr_uninterruptible.avg
    227.17 ±13%     -54.3%     103.92 ±62%   sched_debug.cpu.nr_uninterruptible.max
   -121.42          +85.0%    -224.67        sched_debug.cpu.nr_uninterruptible.min
     49.87 ± 2%     -48.0%      25.95 ±24%   sched_debug.cpu.nr_uninterruptible.stddev
      0.01 ±17%     +37.5%       0.01 ± 5%   perf-sched.sch_delay.avg.ms.__cond_resched.__kmalloc_node_noprof.load_msg.do_msgsnd.do_syscall_64
      0.01 ± 8%     +34.0%       0.01 ± 9%   perf-sched.sch_delay.avg.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
      0.00 ±223%   +961.1%       0.03 ±20%   perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown].[unknown]
      0.00 ±223%   +573.3%       0.02 ±30%   perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
      0.00 ±223%  +6233.3%       0.03 ±63%   perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
      0.02 ±48%    +143.6%       0.04 ±53%   perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      0.01 ±11%     +28.2%       0.01 ± 5%   perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      0.01 ± 9%    +288.9%       0.02 ± 8%   perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.03 ±82%    +166.3%       0.07 ±61%   perf-sched.sch_delay.max.ms.__cond_resched.__kmalloc_node_noprof.load_msg.do_msgsnd.do_syscall_64
      0.02 ± 5%     +36.3%       0.02 ±18%   perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
      0.00 ±113%   +388.5%       0.02 ±41%   perf-sched.sch_delay.max.ms.__cond_resched.task_work_run.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
     49.32 ±37%     -59.3%      20.08 ±10%   perf-sched.sch_delay.max.ms.do_msgrcv.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.00 ±223%  +1690.0%       0.06 ±29%   perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown].[unknown]
      0.00 ±223%  +1086.7%       0.03 ±48%   perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
      0.00 ±223%  +8666.7%       0.04 ±80%   perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
      0.02 ±46%    +788.0%       0.16 ±104%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      0.03 ±25%    +113.7%       0.06 ±21%   perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
     37.94 ±13%     -46.7%      20.21 ±12%   perf-sched.sch_delay.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.msgctl_info.constprop
     25.94 ±29%     -80.4%       5.08 ±95%   perf-sched.sch_delay.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.sysvipc_proc_start
     36.84 ±24%     -43.7%      20.74 ±15%   perf-sched.sch_delay.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.msgctl_down
     54.03 ±27%     -60.0%      21.60 ±10%   perf-sched.total_sch_delay.max.ms
      3.02 ± 3%     +36.3%       4.12        perf-sched.total_wait_and_delay.average.ms
   1146750 ± 6%     -24.8%     862074        perf-sched.total_wait_and_delay.count.ms
      3.01 ± 3%     +36.5%       4.11        perf-sched.total_wait_time.average.ms
      3.40 ± 4%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      1.49 ± 4%     +29.3%       1.92 ± 2%   perf-sched.wait_and_delay.avg.ms.do_msgrcv.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      1.21 ± 5%     +51.5%       1.84 ± 3%   perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.msgctl_info.constprop
      1.52 ± 5%     +51.3%       2.29 ± 3%   perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.sysvipc_proc_start
      1.34 ± 5%     +44.2%       1.94 ± 4%   perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.msgctl_down
      1024 ± 8%    -100.0%       0.00        perf-sched.wait_and_delay.count.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      9.17 ±34%     -58.2%       3.83 ±41%   perf-sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    555396 ± 5%     -21.4%     436361        perf-sched.wait_and_delay.count.do_msgrcv.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
    359319 ± 8%     -27.1%     261944 ± 3%   perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.msgctl_info.constprop
     50490 ± 5%     -42.7%      28940 ± 4%   perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.sysvipc_proc_start
    173580 ± 6%     -27.8%     125292 ± 4%   perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.msgctl_down
      1000         -100.0%       0.00        perf-sched.wait_and_delay.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      1312 ± 8%     +31.8%       1730 ± 2%   perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
     61.94 ±18%     -53.6%      28.77 ±25%   perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.msgctl_info.constprop
     38.16 ±25%     -80.4%       7.48 ±41%   perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.sysvipc_proc_start
     48.28 ±19%     -49.4%      24.43 ±24%   perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.msgctl_down
      1.43 ±12%     +43.7%       2.06 ± 5%   perf-sched.wait_time.avg.ms.__cond_resched.__kmalloc_node_noprof.load_msg.do_msgsnd.do_syscall_64
      3.33 ± 4%      +7.2%       3.57        perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      1.24 ± 7%     +57.8%       1.95 ± 7%   perf-sched.wait_time.avg.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
      0.49 ±139%   +517.5%       3.04 ±77%   perf-sched.wait_time.avg.ms.__cond_resched.task_work_run.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.48 ± 4%     +29.5%       1.92 ± 2%   perf-sched.wait_time.avg.ms.do_msgrcv.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.00 ±223%   +944.4%       0.03 ±19%   perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown].[unknown]
      1.38 ±14%     +90.7%       2.64 ±52%   perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      1.32 ± 9%     +58.5%       2.09 ± 8%   perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      0.09 ±223%  +3002.3%       2.67 ±31%   perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
      0.21 ±223%  +1106.0%       2.49 ±28%   perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
      1.13 ±60%   +1476.2%      17.79 ±191%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      1.20 ± 5%     +51.9%       1.82 ± 3%   perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.msgctl_info.constprop
      1.50 ± 5%     +51.8%       2.28 ± 3%   perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.sysvipc_proc_start
      1.33 ± 5%     +44.7%       1.92 ± 4%   perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.msgctl_down
      1.31 ±10%     +80.6%       2.37 ±20%   perf-sched.wait_time.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      3.37 ±17%     +27.7%       4.30 ± 7%   perf-sched.wait_time.max.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
      0.62 ±154%  +2705.3%      17.45 ±173%  perf-sched.wait_time.max.ms.__cond_resched.task_work_run.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
     72.86 ± 8%     +37.9%     100.47 ± 4%   perf-sched.wait_time.max.ms.do_msgrcv.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.00 ±223%  +1690.0%       0.06 ±29%   perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown].[unknown]
      3.34 ±19%     +53.2%       5.12 ±22%   perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      0.09 ±223%  +4518.2%       3.98 ±32%   perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
      0.21 ±223%  +1573.8%       3.46 ±15%   perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
      1.78 ±59%   +9489.7%     170.25 ±217%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      1312 ± 8%     +31.8%       1730 ± 2%   perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_poll
     33.63 ±17%     -50.2%      16.75 ±25%   perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.msgctl_info.constprop
     21.64 ±31%     -78.1%       4.74 ± 3%   perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.sysvipc_proc_start
     27.46 ±19%     -52.8%      12.95 ±35%   perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write.msgctl_down


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki