Hello,

On Sat, Mar 08, 2025 at 07:48:42AM +0100, Andrea Righi wrote:
> > > With this concept the idle CPU selection policy becomes the following:
> > > - always prioritize CPUs from fully idle SMT cores (if SMT is enabled),
> > > - select the same CPU if it's idle and in the allowed domain,
> > > - select an idle CPU within the same LLC domain, if the LLC domain is
> > >   a subset of the allowed domain,
> >
> > Why not select from the intersection of the same LLC domain and the
> > cpumask?
>
> We could do that, but to guarantee the intersection we need to introduce
> other temporary cpumasks (one for the LLC intersection and another for
> the NUMA one), which is not a big problem, but it can introduce overhead.
> And most of the time the LLC group is either a subset of the allowed CPUs
> or vice versa, so in those cases the current logic already works.
>
> The extra cpumask work is needed only when the allowed cpumask spans
> multiple partial LLCs, which should be rare. So maybe in such cases we
> could tolerate the additional overhead of updating an additional
> temporary cpumask to ensure proper hierarchical semantics (maintaining
> consistency with the topology hierarchy). WDYT?

Would just using a pre-allocated cpumask to do a pre-AND on @cpus_allowed
work? This won't be used only for topology support (e.g. soft partitioning
in scx_layered and scx_mitosis may want to use subsets spanning multiple
topology units), and I'm not sure assuming and optimizing for the topology
case is a good idea for a generic API. We can do something simple now.
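For illustration, a minimal sketch of that direction; pick_idle_cpu_in()
and the scx_and_mask name are hypothetical stand-ins, not existing kernel
symbols, and with CONFIG_CPUMASK_OFFSTACK the per-CPU mask would need
alloc_cpumask_var() for each CPU at init time:

/* pre-allocated scratch mask, one per CPU so no locking is needed */
static DEFINE_PER_CPU(cpumask_var_t, scx_and_mask);

static s32 pick_idle_cpu_llc(const struct cpumask *cpus_allowed,
                             const struct cpumask *llc_cpus)
{
        /* caller is expected to run with preemption disabled */
        struct cpumask *scratch = this_cpu_cpumask_var_ptr(scx_and_mask);

        /* take the proper intersection, no subset assumption needed */
        if (!cpumask_and(scratch, cpus_allowed, llc_cpus))
                return -EBUSY;  /* empty, fall back to a wider scope */

        return pick_idle_cpu_in(scratch);       /* hypothetical idle scan */
}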
Note that if we want to optimize it, we can introduce
cpumask_any_and_and_distribute(). There already is cpumask_first_and_and(),
so the pattern isn't new, and the only extra bitop we need to add is
find_next_and_and_bit_wrap(). There's already find_first_and_and_bit(), so
I don't think it will be all that difficult to add.
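Roughly, mirroring the existing cpumask_any_and_distribute() and
find_next_and_bit_wrap(); this sketch assumes a find_next_and_and_bit()
primitive gets added next to the existing find_first_and_and_bit(), and
reuses the distribute_cpu_mask_prev per-CPU variable from lib/cpumask.c:

static __always_inline
unsigned long find_next_and_and_bit_wrap(const unsigned long *addr1,
                                         const unsigned long *addr2,
                                         const unsigned long *addr3,
                                         unsigned long size,
                                         unsigned long offset)
{
        /* scan forward from @offset across the 3-mask intersection */
        unsigned long bit = find_next_and_and_bit(addr1, addr2, addr3,
                                                  size, offset);

        if (bit < size || offset == 0)
                return bit;

        /* wrap around and scan the [0, offset) remainder */
        bit = find_first_and_and_bit(addr1, addr2, addr3, offset);
        return bit < offset ? bit : size;
}

unsigned int cpumask_any_and_and_distribute(const struct cpumask *src1p,
                                            const struct cpumask *src2p,
                                            const struct cpumask *src3p)
{
        unsigned int next, prev;

        /* NOTE: our first selection will skip 0 */
        prev = __this_cpu_read(distribute_cpu_mask_prev);

        next = find_next_and_and_bit_wrap(cpumask_bits(src1p),
                                          cpumask_bits(src2p),
                                          cpumask_bits(src3p),
                                          nr_cpumask_bits, prev + 1);
        if (next < nr_cpu_ids)
                __this_cpu_write(distribute_cpu_mask_prev, next);

        return next;
}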
Thanks.

--
tejun