On Mon, Aug 11, 2025 at 06:35:26PM +0200, Valentin Schneider wrote:
> On 09/08/25 11:42, Frederic Weisbecker wrote:
> > nohz_full was introduced in v3.10 in 2013, which means this
> > documentation is overdue for 12 years.
> >
>
> 12 years is not that bad, it's not old enough to drink (legally) yet! ;-)
>
> > The shoemaker's children always go barefoot. And working on timers
> > hasn't made me arriving on time either.
> >
> > Fortunately Paul wrote a part of the needed documentation a while ago,
> > especially concerning nohz_full in Documentation/timers/no_hz.rst and
> > also about per-CPU kthreads in
> > Documentation/admin-guide/kernel-per-CPU-kthreads.rst
> >
> > Introduce a new page that gives an overview of CPU isolation in general.
> >
> > Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > ---
> >  Documentation/admin-guide/cpu-isolation.rst | 338 ++++++++++++++++++++
> >  Documentation/admin-guide/index.rst | 1 +
> >  2 files changed, 339 insertions(+)
> >  create mode 100644 Documentation/admin-guide/cpu-isolation.rst
> >
> > diff --git a/Documentation/admin-guide/cpu-isolation.rst b/Documentation/admin-guide/cpu-isolation.rst
> > new file mode 100644
> > index 000000000000..250027acf7b2
> > --- /dev/null
> > +++ b/Documentation/admin-guide/cpu-isolation.rst
> > @@ -0,0 +1,338 @@
> > +=============
> > +CPU Isolation
> > +=============
> > +
> > +Introduction
> > +============
> > +
> > +"CPU Isolation" means leaving a CPU exclusive to a given userspace
>                                                             ^^^^^^^^^
> Eh I'm being nitpicky, but this doesn't have to be userspace stuff right?
> "someone" could e.g. affine some IRQ to an isolated CPU to have the
> irqthread run undisturbed there, or somesuch.

Good point!

> > +
> > +Scheduler domain isolation
> > +--------------------------
> > +
> > +This feature isolates a CPU from the scheduler topology. As a result,
> > +the target isn't part of the load balancing. Tasks won't migrate
> > +neither from nor to it unless affine explicitly.
>                                  ^^^^^^
> s/affine/affined/

Right.

>
> > +As a side effect the CPU is also isolated from unbound workqueues and
> > +unbound kthreads.
>
> > +Checklist
> > +=========
> > +
> > +You have set up each of the above isolation features but you still
> > +observe jitters that trash your workload? Make sure to check a few
> > +elements before proceeding.
> > +
> > +Some of these checklist items are similar to those of real time
> > +workloads:
> > +
> > +- Use mlock() to prevent your pages from being swapped away. Page
> > +  faults are usually not compatible with jitter sensitive workloads.
> > +
> > +- Avoid SMT to prevent your hardware thread from being "preempted"
> > +  by another one.
> > +
> > +- CPU frequency changes may induce subtle sorts of jitter in a
> > +  workload. Cpufreq should be used and tuned with caution.
> > +
> > +- Deep C-states may result in latency issues upon wake-up. If this
> > +  happens to be a problem, C-states can be limited via kernel boot
> > +  parameters such as processor.max_cstate or intel_idle.max_cstate.
> > +
>
> Nitpickery again, I know it's not an exhaustive listing, but I'd rather
> point to the sysfs cpuidle interface (or just mention it too), since that
> means deep C-states can be left enabled for HK CPUs.

Yes!

>
>
> Should we also mention BIOS/firmware fuckery like SMIs?
>
> """
> - Your system may be subject to firmware-originating interrupts - x86 has
>   System Management Interrupts (SMIs) for example. Check your system BIOS
>   to disable such interference, and with some luck your vendor will have
>   a BIOS tuning guidance for low-latency operations.
> """

Definitely!

>
> > +Debugging
> > +=========
> > +
> > +Of course things are never so easy, especially on this matter.
> > +Chances are that actual noise will be observed in the aforementioned
> > +trace.7 file.
> > +
> > +The best way to investigate further is to enable finer grained
> > +tracepoints such as those of subsystems producing asynchronous
> > +events: workqueue, timer, irq_vector, etc... It also can be
> > +interesting to enable the tick_stop event to diagnose why the tick is
> > +retained when that happens.
> > +
>
> I'd also list the 'ipi_send*' family, although that's emitted from the HK
> CPU, not the disturbed isolated CPU.

Yeah I can do that.

>
> > +Some tools may also be useful for higher level analysis:
> > +
> > +- :ref:`Documentation/tools/rtla/rtla-osnoise.rst <rtla-osnoise>` runs a kernel
> > +  tracer that analyzes and output a
> > +  summary of the noises.
> > +
>
> I'd want to point to hwnoise and timerlat as well, so maybe point to
> rtla.rst?

Good point.

Thanks!

>
> > +- dynticks-testing does something similar but in userspace. It is available
> > +  at git://git.kernel.org/pub/scm/linux/kernel/git/frederic/dynticks-testing.git
> > diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
> > index 259d79fbeb94..b5f1fc7d5290 100644
> > --- a/Documentation/admin-guide/index.rst
> > +++ b/Documentation/admin-guide/index.rst
> > @@ -94,6 +94,7 @@ likely to be of interest on almost any system.
> >
> >     cgroup-v2
> >     cgroup-v1/index
> > +   cpu-isolation
> >     cpu-load
> >     mm/index
> >     module-signing
> > --
> > 2.50.1
>

-- 
Frederic Weisbecker
SUSE Labs