RE: [EXT] Re: Large(ish) variance induced by SCHED_FIFO

Hi Marc,

Could you just replace your last setup script line with the following?

echo -1 > /proc/sys/kernel/sched_rt_runtime_us
You are running a realtime task for a long while, but you have RT throttling enabled. At some point it will trigger, and you have no control over when. I believe you are seeing regular tasks being scheduled in the middle of the benchmark measurement.

Disabling throttling may block your entire system while the benchmark is running, so perhaps lower the priority a bit and use affinity to land the benchmark process on a core that is not otherwise used?
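
Something like this, just as a rough sketch; the FIFO priority 50 is an arbitrary example, and it assumes CPU 2 has been kept free of other work (e.g. booted with isolcpus=2):

# run at a lower FIFO priority and pin the benchmark
# to a core that nothing else is using
for I in $(seq 1 1000); do
        chrt -f 50 taskset -c 2 ./a.out < "$1"
done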

Br
Rui

> -----Original Message-----
> From: Marc Gonzalez <marc.w.gonzalez@xxxxxxx>
> Sent: Monday, September 8, 2025 17:43
> To: John Ogness <john.ogness@xxxxxxxxxxxxx>; Leon Woestenberg
> <leon@xxxxxxxxxxxxxx>; Daniel Wagner <dwagner@xxxxxxx>
> Cc: linux-rt-users@xxxxxxxxxxxxxxx; Steven Rostedt <rostedt@xxxxxxxxxxx>;
> Thomas Gleixner <tglx@xxxxxxxxxxxxx>; Sebastian Andrzej Siewior
> <bigeasy@xxxxxxxxxxxxx>; Daniel Wagner <daniel.wagner@xxxxxxxx>; Clark
> Williams <williams@xxxxxxxxxx>; Pavel Machek <pavel@xxxxxxx>; Luis
> Goncalves <lgoncalv@xxxxxxxxxx>; John McCalpin <mccalpin@xxxxxxxxxxxxxxx>
> Subject: [EXT] Re: Large(ish) variance induced by SCHED_FIFO
> 
> On 08/09/2025 11:36, John Ogness wrote:
> 
> > There are still reasons why CLOCK_MONOTONIC_RAW might be
> > interesting. For example, if you want a very stable time source to
> > compare intervals, but do not care so much about the real world time
> > lengths of those intervals (i.e. where is the greatest latency vs. what
> > is the value of the greatest latency). Although even here, I doubt
> > CLOCK_MONOTONIC_RAW has a practical advantage over CLOCK_MONOTONIC.
> 
> In fact, I'm just trying to compare the run-time of 2 minor
> variations of the same program (testing micro-optimizations).
> 
> Absolute run-time is not really important; what I really want
> to know is: does v2 run faster or slower than v1?
> 
> This is the framework I'm using at this point:
> 
> #include <stdio.h>
> #include <time.h>
> extern void my_code(void);
> 
> static long loop(int log2)
> {
>         int n = 1 << log2;
>         struct timespec t0, t1;
>         clock_gettime(CLOCK_MONOTONIC, &t0);
>         for (int i = 0; i < n; ++i) my_code();
>         clock_gettime(CLOCK_MONOTONIC, &t1);
>         long d = (t1.tv_sec - t0.tv_sec)*1000000000L + (t1.tv_nsec - t0.tv_nsec);
>         long t = d >> log2;
>         return t;
> }
> 
> int main(void)
> {
>         long t, min = loop(4);
>         for (int i = 0; i < 20; ++i)
>                 if ((t = loop(8)) < min) min = t;
>         printf("MIN=%ld\n", min);
>         return 0;
> }
> 
> Basically:
> - warm up the caches
> - run my_code() 256 times && compute average run-time
> - repeat 20 times to find MINIMUM average run-time
> 
> When my_code() is a trivial computational loop such as:
> 
>         mov $(1<<12), %eax
> 1:      dec %ecx
>         dec %ecx
>         dec %eax
>         jnz 1b
>         ret
> 
> Then running the benchmark 1000 times returns the same value 1000 times:
> MIN=2737
> 
> 
> Obviously, the program I'm working on is a bit more complex, but barely:
> - no system calls, no library calls
> - just simple bit twiddling
> - tiny code, tiny data structures, everything fits in L1
> $ size a.out
>    text    data     bss     dec     hex filename
>    8549     632    1072   10253    280d a.out
> 
> When I run the benchmark 1000 times, there are some large outliers:
> MIN_MIN=2502
> MAX_MIN=2774
> 
> NOTE: 95% of the results are below 2536.
> But the top 1% (worst 10) are really bad (2646-2774)
> 
> How to get repeatable results?
> 
> Random 10% outliers break the ability to measure the impact
> of micro-optimizations expected to provide 0-3% improvements :(
> 
> For reference, the script launching the benchmark does:
> 
> echo     -1 > /proc/sys/kernel/sched_rt_runtime_us
> for I in 0 1 2 3; do echo userspace > /sys/devices/system/cpu/cpu$I/cpufreq/scaling_governor; done
> sleep 0.25
> for I in 0 1 2 3; do echo 3000000 > /sys/devices/system/cpu/cpu$I/cpufreq/scaling_setspeed; done
> sleep 0.25
> 
> for I in $(seq 1 1000); do
> chrt -f 99 taskset -c 2 ./a.out < $1
> done
> 
> for I in 0 1 2 3; do
> echo schedutil > /sys/devices/system/cpu/cpu$I/cpufreq/scaling_governor
> done
> echo 950000 > /proc/sys/kernel/sched_rt_runtime_us
> 
> 
> I've run out of ideas to identify other sources of variance.
> (I ran everything in single user mode with nothing else running.)
> Perhaps with perf I could identify the source of stalls or bubbles?
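
Regarding the perf idea: as a first pass, something along these lines might show whether the slow runs coincide with context switches or migrations rather than with the code itself. The event list is only a guess at what matters here, and the input redirection is the same $1 as in your script:

perf stat -e cycles,instructions,cache-misses,context-switches,cpu-migrations \
        -r 20 taskset -c 2 ./a.out < "$1"
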
> 
> Hoping someone can point me in the right direction.
> 
> Regards
> 
> 




