On 09/09/2025 14:34, John Ogness wrote: > On 2025-09-09, Marc Gonzalez wrote: > >> I captured a full trace of this occurrence with >> echo *:* > $TRACING/set_event >> cat $TRACING/per_cpu/cpu$CPU/trace > trace.$CPU > > ... > >> But there is absolutely NOTHING traced between 4606.628019 & 4606.811832. > > I recommend using the function tracer: > > # echo function > $TRACING/current_tracer > > It should not affect your timings for pure-userspace code. This should > catch any unintentional kernel activity. OK. This time, I used: echo function > $TRACING/current_tracer cat $TRACING/trace > trace.all (which produced a 12.65 MB file) I dumped a problematic run where T=2722 instead of T=2500 i.e. total run-time = 178.389 ms instead of 163.84 ms Looking for a.out-23683 until the log goes quiet... a.out-23683 [003] d.... 14939.982368: user_enter: a.out-23683 [003] d.... 14939.982368: vtime_user_enter <-__ct_user_enter a.out-23683 [003] d.... 14939.982368: get_vtime_delta <-vtime_user_enter NO OCCURRENCE of a.out or [003] between these 2 points a.out-23683 [003] d.... 14940.160764: vtime_user_exit <-__ct_user_exit a.out-23683 [003] d.... 14940.160765: get_vtime_delta <-vtime_user_exit a.out-23683 [003] d.... 14940.160765: account_user_time <-vtime_user_exit 14940.160764 - 14939.982368 = 178.396 ms (which corresponds to the total run-time) Are the vtime_user_enter/exit tied to clock_gettime? (Probably not, since they're not balanced) It's starting to look like the problem is on the CPU itself, not the kernel interrupting the CPU. Might be time to break out perf? My hunch is frequency dropped to ~2.7 GHz for the duration of the benchmark. Maybe I should lower the frequency to 2 GHz. But again, why would my code force throttling and not the toy code? One possible reason is my code reaches 3.5 IPC, while the toy code remains at 1.5 IPC (with micro-op fusion). It's a stretch, but easy to test. I will measure CPU cycles, to see if the increased run-time corresponds with a change in CPU cycles. Regards