On 9/9/2025 10:18 PM, Sean Christopherson wrote:
On Tue, Sep 09, 2025, Binbin Wu wrote:
On 8/22/2025 3:05 PM, Yan Zhao wrote:
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 6784aaaced87..de2c4bb36069 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1992,6 +1992,11 @@ static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
* blocked by TDs, false positives are inevitable i.e., KVM may re-enter
* the guest even if the IRQ/NMI can't be delivered.
*
+ * Breaking out of the local retries if a retry is caused by faulting
+ * in an invalid memslot (indicating the slot is under removal), so that
+ * the slot removal will not be blocked due to waiting for releasing
+ * SRCU lock in the VMExit handler.
+ *
* Note: even without breaking out of local retries, zero-step
* mitigation may still occur due to
* - invoking of TDH.VP.ENTER after KVM_EXIT_MEMORY_FAULT,
@@ -2002,6 +2007,8 @@ static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
* handle retries locally in their EPT violation handlers.
*/
while (1) {
+ struct kvm_memory_slot *slot;
+
ret = __vmx_handle_ept_violation(vcpu, gpa, exit_qual);
if (ret != RET_PF_RETRY || !local_retry)
@@ -2015,6 +2022,10 @@ static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
break;
}
+ slot = kvm_vcpu_gfn_to_memslot(vcpu, gpa_to_gfn(gpa));
+ if (slot && slot->flags & KVM_MEMSLOT_INVALID)
The slot couldn't be NULL here, right?
Uh, hmm. It could be NULL. If the memslot deletion starts concurrently with the
S-EPT violation, then the memslot could be transitioned to INVALID (prepared for
deletion) prior to the vCPU acquiring SRCU after the VM-Exit. Memslot deletion
could then update kvm->memslots so that the memslot is NULL, i.e. a later lookup
of the gfn finds no slot.
        vCPU                              DELETE
  S-EPT Violation
                                    Set KVM_MEMSLOT_INVALID
                                    synchronize_srcu_expedited()
  Acquire SRCU
  __vmx_handle_ept_violation()
  RET_PF_RETRY due to INVALID
                                    Set memslot NULL
  kvm_vcpu_gfn_to_memslot()
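
To make the ordering above concrete, here is a minimal userspace sketch of why
the check needs the NULL test. It is not kernel code: struct mock_memslot,
mock_gfn_to_memslot(), should_break_local_retry(), walk_timeline() and the value
of MOCK_MEMSLOT_INVALID are all hypothetical stand-ins invented for
illustration, and completed deletion is modeled simply by shrinking the array
that the lookup walks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for KVM_MEMSLOT_INVALID; the value is arbitrary here. */
#define MOCK_MEMSLOT_INVALID (1u << 16)

/* Hypothetical, heavily reduced stand-in for struct kvm_memory_slot. */
struct mock_memslot {
	unsigned long base_gfn;
	unsigned long npages;
	unsigned int flags;
};

/*
 * Hypothetical lookup: returns the slot covering gfn, or NULL if none does.
 * The NULL case models a lookup that runs after deletion has completed.
 */
static struct mock_memslot *mock_gfn_to_memslot(struct mock_memslot *slots,
						size_t nr, unsigned long gfn)
{
	for (size_t i = 0; i < nr; i++) {
		if (gfn >= slots[i].base_gfn &&
		    gfn < slots[i].base_gfn + slots[i].npages)
			return &slots[i];
	}
	return NULL;
}

/* Same shape as the check in the patch: test for NULL before reading flags. */
static bool should_break_local_retry(const struct mock_memslot *slot)
{
	return slot != NULL && (slot->flags & MOCK_MEMSLOT_INVALID) != 0;
}

/* Walks the three states a lookup can observe during the timeline above. */
static void walk_timeline(void)
{
	struct mock_memslot slots[] = {
		{ .base_gfn = 0x100, .npages = 0x10, .flags = 0 },
	};
	size_t nr = 1;
	unsigned long gfn = 0x105;

	/* Before deletion starts: slot found and valid, no reason to break. */
	assert(!should_break_local_retry(mock_gfn_to_memslot(slots, nr, gfn)));

	/* DELETE has set the invalid flag: slot still found, so break out. */
	slots[0].flags |= MOCK_MEMSLOT_INVALID;
	assert(should_break_local_retry(mock_gfn_to_memslot(slots, nr, gfn)));

	/* DELETE has completed: the lookup yields NULL and must not be dereferenced. */
	nr = 0;
	assert(mock_gfn_to_memslot(slots, nr, gfn) == NULL);
	assert(!should_break_local_retry(mock_gfn_to_memslot(slots, nr, gfn)));
}
```

Calling walk_timeline() from a test harness exercises all three states. In the
last one the predicate returns false without touching the slot, which is the
case the "slot &&" part of the check exists for.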
Got it, thanks!