On Thu, Mar 27, 2025 at 04:41:42PM -0700, Sean Christopherson wrote:
> On Thu, Mar 27, 2025, Yan Zhao wrote:
> > On Fri, Mar 21, 2025 at 12:49:42PM +0100, Paolo Bonzini wrote:
> > > On Wed, Mar 19, 2025 at 5:17 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > > > Yan posted a patch to fudge around the issue[*], I strongly objected (and still
> > > > object) to making a functional and confusing code change to fudge around a lockdep
> > > > false positive.
> > >
> > > In that thread I had made another suggestion, which Yan also tried,
> > > which was to use subclasses:
> > >
> > > - in the sched_out path, which cannot race with the others:
> > >   raw_spin_lock_nested(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu), 1);
> > >
> > > - in the irq and sched_in paths, which can race with each other:
> > >   raw_spin_lock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));
> > Hi Paolo, Sean, Maxim,
> >
> > The sched_out path may still race with the sched_in path, e.g.:
> >
> >        CPU 0                     CPU 1
> >   -----------------         -----------------
> >   vCPU 0 sched_out
> >   vCPU 1 sched_in
> >   vCPU 1 sched_out          vCPU 0 sched_in
> >
> > vCPU 0 sched_in may race with vCPU 1 sched_out on CPU 0's wakeup list.
> >
> > So, the situation is:
> >
> >   sched_in,  sched_out: race
> >   sched_in,  irq:       race
> >   sched_out, irq:       mutually exclusive, do not race
> >
> > Hence, do you think the subclass assignments below are reasonable?
> >
> >   irq:       subclass 0
> >   sched_out: subclass 1
> >   sched_in:  subclasses 0 and 1
> >
> > Inspired by Sean's solution, I made the patch below to inform lockdep that the
> > sched_in path involves both subclasses 0 and 1 by adding a line
> > "spin_acquire(&spinlock->dep_map, 1, 0, _RET_IP_)".
> >
> > I like it because it accurately conveys the situation to lockdep :)
>
> Me too :-)

Great!

> Can you give your SoB? I wrote comments and a changelog to explain to myself

Sure. Thanks for helping with the comments and changelog :)

Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>

> (yet again), what the problem is, and why it's a false positive. I also want
> to change the local_irq_{save,restore}() into a lockdep assertion in a prep patch,
> because this and the self-IPI trick rely on IRQs being disabled until the task
> is fully scheduled out and the scheduler locks are dropped.

Fair enough.
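
To make the subclass assignments concrete, here is a rough sketch of the three
paths. It is illustrative only: the helper names (pi_irq_wakeup(),
pi_sched_out(), pi_sched_in()) and their signatures are made up for the
example, while the per-CPU lock/list names, raw_spin_lock_nested() usage, and
the spin_acquire()/spin_release() annotation follow the discussion above; the
actual patch may differ in detail.

#include <linux/spinlock.h>
#include <linux/percpu.h>
#include <linux/list.h>

static DEFINE_PER_CPU(struct list_head, wakeup_vcpus_on_cpu);
static DEFINE_PER_CPU(raw_spinlock_t, wakeup_vcpus_on_cpu_lock);

/* irq path: subclass 0, can race with sched_in on this CPU. */
static void pi_irq_wakeup(int cpu)
{
	raw_spin_lock(&per_cpu(wakeup_vcpus_on_cpu_lock, cpu));
	/* ... walk the wakeup list and kick blocked vCPUs ... */
	raw_spin_unlock(&per_cpu(wakeup_vcpus_on_cpu_lock, cpu));
}

/* sched_out path: subclass 1, mutually exclusive with irq on this CPU. */
static void pi_sched_out(struct list_head *entry, int cpu)
{
	raw_spin_lock_nested(&per_cpu(wakeup_vcpus_on_cpu_lock, cpu), 1);
	list_add_tail(entry, &per_cpu(wakeup_vcpus_on_cpu, cpu));
	raw_spin_unlock(&per_cpu(wakeup_vcpus_on_cpu_lock, cpu));
}

/*
 * sched_in path: takes the lock as subclass 0 (it races with irq), and
 * additionally tells lockdep that it excludes subclass 1, since it also
 * races with sched_out for a different vCPU on the old CPU.  The bare
 * spin_acquire() compiles away when lockdep is disabled.
 */
static void pi_sched_in(struct list_head *entry, int cpu)
{
	raw_spinlock_t *spinlock = &per_cpu(wakeup_vcpus_on_cpu_lock, cpu);

	raw_spin_lock(spinlock);
	spin_acquire(&spinlock->dep_map, 1, 0, _RET_IP_);
	list_del(entry);
	spin_release(&spinlock->dep_map, _RET_IP_);
	raw_spin_unlock(spinlock);
}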