On Fri, Mar 21, 2025 at 12:49:42PM +0100, Paolo Bonzini wrote:
> On Wed, Mar 19, 2025 at 5:17 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > Yan posted a patch to fudge around the issue[*], I strongly objected (and still
> > object) to making a functional and confusing code change to fudge around a lockdep
> > false positive.
>
> In that thread I had made another suggestion, which Yan also tried,
> which was to use subclasses:
>
> - in the sched_out path, which cannot race with the others:
>   raw_spin_lock_nested(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu), 1);
>
> - in the irq and sched_in paths, which can race with each other:
>   raw_spin_lock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));

Hi Paolo, Sean, Maxim,

The sched_out path may still race with the sched_in path, e.g.

  CPU 0                    CPU 1
  -----------------        ---------------
  vCPU 0 sched_out
  vCPU 1 sched_in
  vCPU 1 sched_out
                           vCPU 0 sched_in

vCPU 0 sched_in may race with vCPU 1 sched_out on CPU 0's wakeup list.

So, the situation is

  sched_in,  sched_out: race
  sched_in,  irq:       race
  sched_out, irq:       mutually exclusive, do not race

Hence, do you think the subclass assignments below are reasonable?

  irq:       subclass 0
  sched_out: subclass 1
  sched_in:  subclasses 0 and 1

Inspired by Sean's solution, I made the patch below to inform lockdep that
the sched_in path involves both subclasses 0 and 1 by adding a line
"spin_acquire(&spinlock->dep_map, 1, 0, _RET_IP_)".

I like it because it accurately conveys the situation to lockdep :)

What are your thoughts?

Thanks
Yan

diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
index ec08fa3caf43..c5684225255a 100644
--- a/arch/x86/kvm/vmx/posted_intr.c
+++ b/arch/x86/kvm/vmx/posted_intr.c
@@ -89,9 +89,12 @@ void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
 	 * current pCPU if the task was migrated.
 	 */
 	if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR) {
-		raw_spin_lock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));
+		raw_spinlock_t *spinlock = &per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu);
+
+		raw_spin_lock(spinlock);
+		spin_acquire(&spinlock->dep_map, 1, 0, _RET_IP_);
 		list_del(&vmx->pi_wakeup_list);
-		raw_spin_unlock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));
+		spin_release(&spinlock->dep_map, _RET_IP_);
+		raw_spin_unlock(spinlock);
 	}
 
 	dest = cpu_physical_id(cpu);
@@ -152,7 +155,7 @@ static void pi_enable_wakeup_handler(struct kvm_vcpu *vcpu)
 
 	local_irq_save(flags);
 
-	raw_spin_lock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));
+	raw_spin_lock_nested(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu), 1);
 	list_add_tail(&vmx->pi_wakeup_list,
 		      &per_cpu(wakeup_vcpus_on_cpu, vcpu->cpu));
 	raw_spin_unlock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu));
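
P.S. In case it helps to see the proposed annotation scheme in isolation, here is
a minimal, self-contained sketch of the same pattern outside of KVM. The demo_*
names are made up for illustration and are not the actual KVM code; the
spin_acquire()/spin_release() annotations compile away when lockdep is disabled.
The irq path takes the default subclass 0, the sched_out path takes subclass 1
via raw_spin_lock_nested(), and the sched_in path takes subclass 0 and then
tells lockdep it also holds subclass 1.

#include <linux/list.h>
#include <linux/percpu.h>
#include <linux/spinlock.h>

/*
 * One wakeup-style list per CPU, protected by a per-CPU raw spinlock.
 * Both must be initialized at boot with raw_spin_lock_init() and
 * INIT_LIST_HEAD() (omitted here for brevity).
 */
static DEFINE_PER_CPU(raw_spinlock_t, demo_lock);
static DEFINE_PER_CPU(struct list_head, demo_list);

/* irq path: plain acquisition, i.e. subclass 0. */
static void demo_irq_path(int cpu, struct list_head *entry)
{
	raw_spin_lock(&per_cpu(demo_lock, cpu));
	list_del_init(entry);
	raw_spin_unlock(&per_cpu(demo_lock, cpu));
}

/* sched_out path: subclass 1, since it cannot race with the irq path. */
static void demo_sched_out_path(int cpu, struct list_head *entry)
{
	raw_spin_lock_nested(&per_cpu(demo_lock, cpu), 1);
	list_add_tail(entry, &per_cpu(demo_list, cpu));
	raw_spin_unlock(&per_cpu(demo_lock, cpu));
}

/*
 * sched_in path: acquire as subclass 0, then record an additional
 * subclass-1 acquisition for lockdep, because it can race with both
 * the irq path and a remote CPU's sched_out path.
 */
static void demo_sched_in_path(int cpu, struct list_head *entry)
{
	raw_spinlock_t *lock = &per_cpu(demo_lock, cpu);

	raw_spin_lock(lock);
	spin_acquire(&lock->dep_map, 1, 0, _RET_IP_);
	list_del_init(entry);
	spin_release(&lock->dep_map, _RET_IP_);
	raw_spin_unlock(lock);
}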