Hello,

kernel test robot noticed a 1.2% regression of aim7.jobs-per-min on:


commit: f4818881c47fd91fcb6d62373c57c7844e3de1c0 ("x86/its: Enable Indirect Target Selection mitigation")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[still regression on linus/master      fee3e843b309444f48157e2188efa6818bae85cf]
[still regression on linux-next/master 484803582c77061b470ac64a634f25f89715be3f]

testcase: aim7
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
parameters:

        disk: 4BRD_12G
        md: RAID1
        fs: xfs
        test: disk_src
        load: 3000
        cpufreq_governor: performance


In addition to that, the commit also has significant impact on the following tests:

+------------------+--------------------------------------------------------------------------------------------+
| testcase: change | netperf: netperf.Throughput_Mbps 1.8% regression                                           |
| test machine     | 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory |
| test parameters  | cluster=cs-localhost                                                                       |
|                  | cpufreq_governor=performance                                                               |
|                  | ip=ipv4                                                                                    |
|                  | nr_threads=200%                                                                            |
|                  | runtime=300s                                                                               |
|                  | test=UDP_STREAM                                                                            |
+------------------+--------------------------------------------------------------------------------------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202505191021.9e9f0ba2-lkp@xxxxxxxxx


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250519/202505191021.9e9f0ba2-lkp@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/md/rootfs/tbox_group/test/testcase:
  gcc-12/performance/4BRD_12G/xfs/x86_64-rhel-9.4/3000/RAID1/debian-12-x86_64-20240206.cgz/lkp-csl-2sp3/disk_src/aim7

commit:
  a75bf27fe4 ("x86/its: Add support for ITS-safe return thunk")
  f4818881c4 ("x86/its: Enable Indirect Target Selection mitigation")

a75bf27fe41abe65 f4818881c47fd91fcb6d62373c5
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
     75730            -1.2%      74795        aim7.jobs-per-min
    494.88            +2.2%     505.76        aim7.time.system_time
    170491            -1.5%     167881        proc-vmstat.nr_shmem
   1436502            +1.8%    1463032        proc-vmstat.pgfree
      0.01 ± 26%     -54.5%       0.01 ± 38%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.getname_flags.part.0
      3221 ± 10%     +24.6%       4014 ± 13%  perf-sched.wait_and_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      9.35 ± 54%     -52.2%       4.47 ± 61%  perf-sched.wait_time.avg.ms.__cond_resched.__kmalloc_noprof.security_inode_init_security.xfs_generic_create.lookup_open
      8.92 ± 36%     -48.6%       4.58 ± 54%  perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.xlog_cil_commit
      3221 ± 10%     +24.6%       4014 ± 13%  perf-sched.wait_time.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
 1.542e+09            +4.1%  1.606e+09        perf-stat.i.branch-instructions
      1.87             -0.1       1.77        perf-stat.i.branch-miss-rate%
     31.54             -0.4      31.16        perf-stat.i.cache-miss-rate%
      1.63            +1.3%       1.65        perf-stat.i.cpi
      0.63            -1.4%       0.62        perf-stat.i.ipc
      1.94             -0.1       1.82        perf-stat.overall.branch-miss-rate%
     31.70             -0.3      31.45        perf-stat.overall.cache-miss-rate%
      1.59            +1.1%       1.61        perf-stat.overall.cpi
      0.63            -1.1%       0.62        perf-stat.overall.ipc
 1.536e+09            +4.1%  1.599e+09        perf-stat.ps.branch-instructions
    987.24            -1.0%     977.35        perf-stat.ps.cpu-migrations
      8224 ± 10%     +28.5%      10566        sched_debug.cfs_rq:/.avg_vruntime.min
      8224 ± 10%     +28.5%      10566        sched_debug.cfs_rq:/.min_vruntime.min
    144919 ±  7%     +17.0%     169577        sched_debug.cpu.clock.avg
    144939 ±  7%     +17.0%     169596        sched_debug.cpu.clock.max
    144898 ±  7%     +17.0%     169556        sched_debug.cpu.clock.min
    144346 ±  7%     +17.0%     168892        sched_debug.cpu.clock_task.avg
    144557 ±  7%     +17.0%     169113        sched_debug.cpu.clock_task.max
    136847 ±  7%     +17.8%     161242        sched_debug.cpu.clock_task.min
     13899 ±  6%     +13.9%      15831 ±  5%  sched_debug.cpu.nr_switches.stddev
    144899 ±  7%     +17.0%     169556        sched_debug.cpu_clk
    144339 ±  7%     +17.1%     168996        sched_debug.ktime
    145461 ±  7%     +17.0%     170148        sched_debug.sched_clk
     56.39             -0.8      55.60        perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
     46.66             -0.7      45.97        perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      5.72             -0.3       5.46 ±  2%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.intel_idle_irq.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
      0.64 ±  3%      -0.0       0.60 ±  2%  perf-profile.calltrace.cycles-pp.xfs_buf_item_release.xlog_cil_commit.__xfs_trans_commit.xfs_trans_commit.xfs_create
      1.10 ±  2%      +0.1       1.17        perf-profile.calltrace.cycles-pp.enqueue_task_fair.enqueue_task.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue
      1.22 ±  3%      +0.1       1.29        perf-profile.calltrace.cycles-pp.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue.flush_smp_call_function_queue.do_idle
      1.18 ±  2%      +0.1       1.26        perf-profile.calltrace.cycles-pp.enqueue_task.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue.flush_smp_call_function_queue
      4.92             -0.2       4.71 ±  2%  perf-profile.children.cycles-pp.intel_idle_irq
     10.59             -0.2      10.41        perf-profile.children.cycles-pp.__xfs_trans_commit
     10.65             -0.2      10.47        perf-profile.children.cycles-pp.xfs_trans_commit
      9.36             -0.1       9.23        perf-profile.children.cycles-pp.xlog_cil_commit
      0.22 ±  7%      -0.0       0.18 ±  6%  perf-profile.children.cycles-pp.xlog_ticket_alloc
      0.15 ±  3%      -0.0       0.13 ±  2%  perf-profile.children.cycles-pp.xfs_buf_rele_cached
      0.24 ±  5%      +0.0       0.28 ±  9%  perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
      1.13             +0.0       1.18 ±  2%  perf-profile.children.cycles-pp.try_to_block_task
      1.40 ±  3%      +0.1       1.48        perf-profile.children.cycles-pp.enqueue_task_fair
      1.66 ±  2%      +0.1       1.75        perf-profile.children.cycles-pp.sched_ttwu_pending
      0.00             +0.7       0.70 ±  2%  perf-profile.children.cycles-pp.its_return_thunk
      4.37             -0.2       4.13 ±  2%  perf-profile.self.cycles-pp.intel_idle_irq
      0.14 ±  5%      -0.0       0.12 ±  6%  perf-profile.self.cycles-pp.xfs_trans_precommit_sort
      0.06 ±  7%      +0.0       0.08 ±  6%  perf-profile.self.cycles-pp.__update_blocked_fair
      0.09 ±  4%      +0.0       0.11 ±  8%  perf-profile.self.cycles-pp.enqueue_task_fair
      0.70 ±  4%      +0.0       0.74 ±  2%  perf-profile.self.cycles-pp.xlog_cil_alloc_shadow_bufs
      0.00             +0.6       0.56 ±  2%  perf-profile.self.cycles-pp.its_return_thunk


***************************************************************************************************
lkp-icl-2sp2: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
  cs-localhost/gcc-12/performance/ipv4/x86_64-rhel-9.4/200%/debian-12-x86_64-20240206.cgz/300s/lkp-icl-2sp2/UDP_STREAM/netperf

commit:
  a75bf27fe4 ("x86/its: Add support for ITS-safe return thunk")
  f4818881c4 ("x86/its: Enable Indirect Target Selection mitigation")

a75bf27fe41abe65 f4818881c47fd91fcb6d62373c5
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
      2058 ±  4%     +11.7%       2298 ±  6%  perf-c2c.HITM.local
   7436735            -3.0%    7213042        vmstat.system.cs
      5.72 ± 49%   +3259.9%     192.19 ±185%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      5.72 ± 49%   +3259.9%     192.19 ±185%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
 3.223e+09            -1.7%   3.17e+09        proc-vmstat.numa_hit
 3.222e+09            -1.6%  3.169e+09        proc-vmstat.numa_local
 2.574e+10            -1.6%  2.532e+10        proc-vmstat.pgalloc_normal
 2.574e+10            -1.6%  2.532e+10        proc-vmstat.pgfree
     29701            -2.1%      29079        netperf.ThroughputBoth_Mbps
   7574004            -2.0%    7425040        netperf.ThroughputBoth_total_Mbps
      8150            -2.9%       7916        netperf.ThroughputRecv_Mbps
     21551            -1.8%      21162        netperf.Throughput_Mbps
   5495563            -1.7%    5403562        netperf.Throughput_total_Mbps
 1.142e+09            -3.1%  1.107e+09        netperf.time.involuntary_context_switches
 4.336e+09            -2.0%  4.251e+09        netperf.workload
  2.52e+10            +3.4%  2.605e+10        perf-stat.i.branch-instructions
      0.88             -0.0       0.83        perf-stat.i.branch-miss-rate%
 2.196e+08            -2.2%  2.148e+08        perf-stat.i.branch-misses
   7497258            -3.1%    7265561        perf-stat.i.context-switches
     58.57            -3.1%      56.76        perf-stat.i.metric.K/sec
      0.87             -0.0       0.82        perf-stat.overall.branch-miss-rate%
      2.19            +1.1%       2.21        perf-stat.overall.cpi
 2.511e+10            +3.4%  2.596e+10        perf-stat.ps.branch-instructions
 2.189e+08            -2.2%  2.141e+08        perf-stat.ps.branch-misses
   7471654            -3.1%    7240223        perf-stat.ps.context-switches



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki