Re: [RFC PATCH v2 05/18] KVM: TDX: Drop superfluous page pinning in S-EPT management

Yan Zhao <yan.y.zhao@xxxxxxxxx> · Mon, 1 Sep 2025 09:25:43 +0800

On Sat, Aug 30, 2025 at 03:53:24AM +0800, Edgecombe, Rick P wrote:
> On Thu, 2025-08-28 at 17:06 -0700, Sean Christopherson wrote:
> > From: Yan Zhao <yan.y.zhao@xxxxxxxxx>
> > When S-EPT zapping errors occur, KVM_BUG_ON() is invoked to kick off all
> > vCPUs and mark the VM as dead. Although there is a potential window that a
> > private page mapped in the S-EPT could be reallocated and used outside the
> > VM, the loud warning from KVM_BUG_ON() should provide sufficient debug
> > information.
... 
> Yan, can you clarify what you mean by "there could be a small window"? I'm
> thinking this is a hypothetical window around vm_dead races? Or more concrete? I
> *don't* want to re-open the debate on whether to go with this approach, but I
> think this is a good teaching edge case to settle on how we want to treat
> similar issues. So I just want to make sure we have the justification right.
I think this window isn't hypothetical.

1. SEAMCALL failure in tdx_sept_remove_private_spte().
   KVM_BUG_ON() sets vm_dead and kicks off all vCPUs.
2. guest_memfd invalidation completes. Memory is freed.
3. VM gets killed.

After 2, the page is still mapped in the S-EPT, but it could potentially be
reallocated and used outside the VM.
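
To make the ordering concrete, here is a toy userspace model of the three steps
above. It is only a sketch: the toy_* types and helpers are hypothetical and are
not KVM or TDX module interfaces, and the assumption that VM teardown is what
finally drops the stale S-EPT mapping is mine, for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in KVM or the TDX module. */
struct toy_page {
	bool mapped_in_sept;	/* the S-EPT entry still points at this page */
	bool owned_by_host;	/* the page has been returned to the host allocator */
};

struct toy_vm {
	bool vm_dead;		/* set by the KVM_BUG_ON() analogue */
	bool destroyed;		/* VM teardown has finished */
};

/* Step 1: on SEAMCALL failure the mapping survives and the VM is marked dead. */
void toy_remove_private_spte(struct toy_vm *vm, struct toy_page *pg,
			     bool seamcall_ok)
{
	if (seamcall_ok) {
		pg->mapped_in_sept = false;
		return;
	}
	vm->vm_dead = true;	/* KVM_BUG_ON() analogue; mapping left in place */
}

/* Step 2: guest_memfd invalidation completes and frees the page regardless. */
void toy_gmem_invalidate_done(struct toy_page *pg)
{
	assert(!pg->owned_by_host);	/* the model frees each page only once */
	pg->owned_by_host = true;
}

/* Step 3: the VM gets killed (assumed here to drop the stale mapping). */
void toy_vm_destroy(struct toy_vm *vm, struct toy_page *pg)
{
	pg->mapped_in_sept = false;
	vm->destroyed = true;
}

/* The window: the host may reuse the page while the S-EPT still maps it. */
bool toy_in_window(const struct toy_page *pg)
{
	return pg->owned_by_host && pg->mapped_in_sept;
}
```

With a failed removal, toy_in_window() becomes true once
toy_gmem_invalidate_done() runs and stays true until toy_vm_destroy(). With a
successful removal it never becomes true.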

From the TDX module and hardware's perspective, the mapping in the S-EPT for
this page remains valid. So, I'm uncertain if the TDX module might do something
creative to access the guest page after 2.

Besides, a cache flush after 2 can write back dirty cache lines, which is
effectively a memory write to the page. Though we could invoke
tdh_phymem_page_wbinvd_hkid() after the KVM_BUG_ON(), that SEAMCALL itself can
also fail.