On Sat, Aug 09, 2025 at 11:42:47AM +0200, Frederic Weisbecker wrote:
> +- :ref:`Documentation/admin-guide/cgroup-v2.rst <Cpuset v2 "isolated"
> +  partitions>`
> +  are recommended because they are tunable at runtime.

Anchor link target and name are mistakenly swapped so I have to correct them
back to their appropriate places:

---- >8 ----
diff --git a/Documentation/admin-guide/cpu-isolation.rst b/Documentation/admin-guide/cpu-isolation.rst
index 250027acf7b26f..aef0b53b0ad5e6 100644
--- a/Documentation/admin-guide/cpu-isolation.rst
+++ b/Documentation/admin-guide/cpu-isolation.rst
@@ -107,9 +107,8 @@ are extracted from the global load balancing.
 Interface
 ~~~~~~~~~
 
-- :ref:`Documentation/admin-guide/cgroup-v2.rst <Cpuset v2 "isolated"
-  partitions>`
-  are recommended because they are tunable at runtime.
+- :doc:`cgroup cpuset isolated partitions <cgroup-v2>` are recommended because
+  they are tunable at runtime.
 
 - The 'isolcpus=' kernel boot parameter with the 'domain' flag is a
   less flexible alternative that doesn't allow for runtime
@@ -124,7 +123,8 @@ target CPUs.
 Interface
 ~~~~~~~~~
 
-- /proc/irq/*/smp_affinity as explained :ref:`Documentation/core-api/irq/irq-affinity.rst <here>` in detail.
+- /proc/irq/\*/smp_affinity as explained in
+  Documentation/core-api/irq/irq-affinity.rst.
 
 - The "irqaffinity=" kernel boot parameter for a default setting.
 
@@ -330,9 +330,8 @@ retained when that happens.
 Some tools may also be useful for higher level analysis:
 
-- :ref:`Documentation/tools/rtla/rtla-osnoise.rst <rtla-osnoise>` runs a kernel
-  tracer that analyzes and output a
-  summary of the noises.
+- :doc:`rtla-osnoise </tools/rtla/rtla-osnoise>` runs a kernel tracer that
+  analyzes and output a summary of the noises.
 
 - dynticks-testing does something similar but in userspace.
   It is available at
   git://git.kernel.org/pub/scm/linux/kernel/git/frederic/dynticks-testing.git

> +The full command line is then:
> +
> +    nohz_full=7 irqaffinity=0-6 isolcpus=managed_irq,7 nosmt
> +
> +CPUSET configuration (cgroup v2)
> +--------------------------------
> +
> +Assuming cgroup v2 is mounted to /sys/fs/cgroup, the following script
> +isolates CPU 7 from scheduler domains.
> +
> +    cd /sys/fs/cgroup
> +    # Activate the cpuset subsystem
> +    echo +cpuset > cgroup.subtree_control
> +    # Create partition to be isolated
> +    mkdir test
> +    cd test
> +    echo +cpuset > cgroup.subtree_control
> +    # Isolate CPU 7
> +    echo 7 > cpuset.cpus
> +    echo "isolated" > cpuset.cpus.partition
> +
> +The userspace workload
> +----------------------
> +
> +Fake a pure userspace workload, the below program runs a dummy
> +userspace loop on the isolated CPU 7.
> +
> +    #include <stdio.h>
> +    #include <fcntl.h>
> +    #include <unistd.h>
> +    #include <errno.h>
> +    int main(void)
> +    {
> +        // Move the current task to the isolated cpuset (bind to CPU 7)
> +        int fd = open("/sys/fs/cgroup/test/cgroup.procs", O_WRONLY);
> +        if (fd < 0) {
> +            perror("Can't open cpuset file...\n");
> +            return 0;
> +        }
> +
> +        write(fd, "0\n", 2);
> +        close(fd);
> +
> +        // Run an endless dummy loop until the launcher kills us
> +        while (1)
> +            ;
> +
> +        return 0;
> +    }
> +
> +Build it and save for later step:
> +
> +    # gcc user_loop.c -o user_loop
> +
> +The launcher
> +------------
> +
> +The below launcher runs the above program for 10 seconds and traces
> +the noise resulting from preempting tasks and IRQs.
> +
> +    TRACING=/sys/kernel/tracing/
> +    # Make sure tracing is off for now
> +    echo 0 > $TRACING/tracing_on
> +    # Flush previous traces
> +    echo > $TRACING/trace
> +    # Record disturbance from other tasks
> +    echo 1 > $TRACING/events/sched/sched_switch/enable
> +    # Record disturbance from interrupts
> +    echo 1 > $TRACING/events/irq_vectors/enable
> +    # Now we can start tracing
> +    echo 1 > $TRACING/tracing_on
> +    # Run the dummy user_loop for 10 seconds on CPU 7
> +    ./user_loop &
> +    USER_LOOP_PID=$!
> +    sleep 10
> +    kill $USER_LOOP_PID
> +    # Disable tracing and save traces from CPU 7 in a file
> +    echo 0 > $TRACING/tracing_on
> +    cat $TRACING/per_cpu/cpu7/trace > trace.7
> +
> +If no specific problem arose, the output of trace.7 should look like
> +the following:
> +
> +    <idle>-0 [007] d..2. 1980.976624: sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=user_loop next_pid=1553 next_prio=120
> +    user_loop-1553 [007] d.h.. 1990.946593: reschedule_entry: vector=253
> +    user_loop-1553 [007] d.h.. 1990.946593: reschedule_exit: vector=253

Wrap these snippets above in literal code blocks.

Thanks.

-- 
An old man doll... just what I always wanted! - Clara
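
For illustration, here is a minimal sketch of the literal-block wrapping
requested above, applied to the first two quoted snippets. It assumes the plain
reST ``::`` marker at the end of each introducing paragraph; the four-space
indentation is also an assumption, not necessarily what the final patch should
use.

```rst
The full command line is then::

    nohz_full=7 irqaffinity=0-6 isolcpus=managed_irq,7 nosmt

CPUSET configuration (cgroup v2)
--------------------------------

Assuming cgroup v2 is mounted to /sys/fs/cgroup, the following script
isolates CPU 7 from scheduler domains::

    cd /sys/fs/cgroup
    # Activate the cpuset subsystem
    echo +cpuset > cgroup.subtree_control
    # Create partition to be isolated
    mkdir test
    cd test
    echo +cpuset > cgroup.subtree_control
    # Isolate CPU 7
    echo 7 > cpuset.cpus
    echo "isolated" > cpuset.cpus.partition
```

The same pattern would apply to the C program, the launcher script and the
sample trace output: end the introducing sentence with ``::`` and keep the
snippet indented under it.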