On Wed, Jun 25, 2025 at 06:23:37PM +0800, Chen Yu <yu.c.chen@xxxxxxxxx> wrote:
> [Problem Statement]
> Currently, NUMA balancing is configured system-wide.
> However, in some production environments, different
> cgroups may have varying requirements for NUMA balancing.
> Some cgroups are CPU-intensive, while others are
> memory-intensive. Some do not benefit from NUMA balancing
> due to the overhead associated with VMA scanning, while
> others prefer NUMA balancing as it helps improve memory
> locality. In this case, system-wide NUMA balancing is
> usually disabled to avoid causing regressions.
>
> [Proposal]
> Introduce a per-cgroup interface to enable NUMA balancing
> for specific cgroups.

The balancing already works at task granularity, and this new attribute
is not much of a resource to control.
Have you considered a per-task attribute (sched_setattr(), prctl() or
similar)? It could be inherited, and the respective cgroups would be
seeded with a process carrying the intended values. cpuset could then be
used, as is traditional, to restrict the scope of balancing for such
tasks.

WDYT?

> This interface is associated with the CPU subsystem, which
> does not support threaded subtrees, and close to CPU bandwidth
> control.

(??) It does support them.

> The system administrator needs to set the NUMA balancing mode to
> NUMA_BALANCING_CGROUP=4 to enable this feature. When the system is in
> NUMA_BALANCING_CGROUP mode, NUMA balancing for all cgroups is disabled
> by default. After the administrator enables this feature for a
> specific cgroup, NUMA balancing for that cgroup is enabled.

How dynamic do you expect such changes to be, relative to the lifecycle
of a given cgroup/process?

Thanks,
Michal