On 01.04.25 14:36, Fernand Sieber wrote:
> If a task yields, the scheduler may decide to pick it again. The task
> in turn may decide to yield immediately or shortly after, leading to
> a tight loop of yields.
>
> If there's another runnable task at this point, the deadline will be
> increased by the slice at each loop. This can cause the deadline to
> run away pretty quickly, and cause elevated run delays later on, as
> the task doesn't get picked again. The reason the scheduler can pick
> the same task again and again despite its deadline increasing is that
> it may be the only eligible task at that point.
>
> Fix this by updating the deadline only to one slice ahead.
>
> Note, we might want to consider iterating on the implementation of
> yield as a follow-up:
>
> * the yielding task could forfeit its remaining slice by incrementing
>   its vruntime correspondingly
> * in the case of yield_to, the yielding task could donate its
>   remaining slice to the target task
>
> Signed-off-by: Fernand Sieber <sieberf@xxxxxxxxxx>
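For readers who want to map this to the source, the change amounts to
something like the following in yield_task_fair() (a sketch paraphrased
from kernel/sched/fair.c; the exact upstream diff may differ):

static void yield_task_fair(struct rq *rq)
{
	struct task_struct *curr = rq->curr;
	struct cfs_rq *cfs_rq = task_cfs_rq(curr);
	struct sched_entity *se = &curr->se;

	/* Nothing to do if we are the only runnable task. */
	if (unlikely(rq->nr_running == 1))
		return;

	clear_buddies(cfs_rq, se);

	update_rq_clock(rq);
	/* Update run-time statistics of the 'current' task. */
	update_curr(cfs_rq);
	/* Avoid a duplicate clock update in the upcoming schedule(). */
	rq_clock_skip_update(rq);

	/*
	 * Previously each yield pushed the deadline a full slice further
	 * out:
	 *
	 *	se->deadline += calc_delta_fair(se->slice, se);
	 *
	 * so a tight yield loop could grow it without bound. Re-anchoring
	 * it to the current vruntime keeps it at most one slice ahead no
	 * matter how often the task yields:
	 */
	se->deadline = se->vruntime + calc_delta_fair(se->slice, se);
}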
IMHO it's worth noting that this is not a theoretical issue; we have seen it in real life: a KVM virtual machine's vCPU that runs into a busy guest spinlock calls kvm_vcpu_yield_to(), which eventually ends up in yield_task_fair(). We have seen such spinlocks caused by guest-side contention rather than host overcommit, which means we go into a loop of vCPU execution and spin-loop exit, and that loop results in an undesirable increase in the vCPU thread's deadline.
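To make the effect concrete, here is a tiny user-space simulation of the
deadline arithmetic described above (illustrative only; the 3 ms slice
and nanosecond units are my assumptions, not taken from the patch):

#include <stdio.h>

int main(void)
{
	/* Illustrative values: vruntime in ns, a 3 ms slice. */
	unsigned long long vruntime = 0;
	const unsigned long long slice = 3000000ULL;
	unsigned long long deadline = vruntime + slice;

	/* Old behaviour: every yield in the tight loop adds a slice. */
	for (int yields = 0; yields < 1000; yields++)
		deadline += slice;
	printf("old: deadline is %llu ns ahead after 1000 yields\n",
	       deadline - vruntime);

	/* Fixed behaviour: the deadline is re-anchored on each yield,
	 * so it never drifts more than one slice ahead. */
	deadline = vruntime + slice;
	printf("fixed: deadline is %llu ns ahead regardless of yields\n",
	       deadline - vruntime);
	return 0;
}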
Given that this impacts real workloads and the bug has been present since the introduction of EEVDF, I would say it warrants a
Fixes: 147f3efaa24182 ("sched/fair: Implement an EEVDF-like scheduling policy")
tag.

Alex