On Mon, Aug 04, 2025, dan.j.williams@xxxxxxxxx wrote:
> Sean Christopherson wrote:
> > On Mon, Aug 04, 2025, dan.j.williams@xxxxxxxxx wrote:
> > > Xu Yilun wrote:
> > > > So my idea is to remove tdx_tsm device (thus disables tdx_tsm driver) on
> > > > vmxoff.
> > > >
> > > >   KVM                TDX core            TDX TSM driver
> > > >   -----------------------------------------------------
> > > >   tdx_disable()
> > > >                      tdx_tsm dev del
> > > >                                          driver.remove()
> > > >   vmxoff()
> > > >
> > > > An alternative is to move vmxon/off management out of KVM, that requires
> > > > a lot of complex work IMHO, Chao & I both prefer not to touch it.
> >
> > Eh, it's complex, but not _that_ complex.
> >
> > > It is fine to require that vmxon/off management remain within KVM, and
> > > tie the lifetime of the device to the lifetime of the kvm_intel module*.
> >
> > Nah, let's do this right.  Speaking from experience; horrible, make-your-eyes-bleed
> > experience; playing games with kvm-intel.ko to try to get and keep CPUs post-VMXON
> > will end in tears.
> >
> > And it's not just TDX-feature-of-the-day that needs VMXON to be handled outside
> > of KVM, I'd also like to do so to allow out-of-tree hypervisors to do the "right
> > thing"[*].  Not because I care deeply about out-of-tree hypervisors, but because
> > the lack of proper infrastructure for utilizing virtualization hardware irks me.
> >
> > The basic gist is to extract system-wide resources out of KVM and into a separate
> > module, so that e.g. tdx_tsm or whatever can take a dependency on _that_ module
> > and elevate refcounts as needed.  All things considered, there aren't so many
> > system-wide resources that it's an insurmountable task.
> >
> > I can provide some rough patches to kickstart things.  It'll probably take me a
> > few weeks to extract them from an old internal branch, and I can't promise they'll
> > compile.  But they should be good enough to serve as an RFC.
> >
> > https://lore.kernel.org/all/ZwQjUSOle6sWARsr@xxxxxxxxxx
>
> Sounds reasonable to me.
>
> Not clear on how it impacts tdx_tsm implementation. The lifetime of this
> tdx_tsm device can still be bound by tdx_enable() / tdx_cleanup(). The
> refactor removes the need for the autoprobe hack below. It may also
> preclude async vmxoff cases by pinning? Or does pinning still not solve
> the reasons for bouncing vmx on suspend/shutdown?

What exactly is the concern with suspend/shutdown?  Suspend should be a non-issue,
as userspace tasks need to be frozen before the kernel fires off the suspend
notifiers.  Ditto for a normal shutdown.  Forced shutdown will be asynchronous
with respect to running vCPUs, but all bets are off on a forced shutdown.  Ditto
for disabling VMX via NMI shootdown on a crash.
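
To make the "separate module plus refcounts" idea quoted above concrete, here is a minimal userspace sketch.  It is not taken from the patches mentioned in the thread, and every name in it (virt_get(), virt_put(), hw_enable(), hw_disable()) is hypothetical.  The hardware enable/disable steps are stand-ins that only flip a flag, and locking is omitted, so treat it as an illustration of the lifetime rule only: the first user turns virtualization on, and it stays on until the last user drops its reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state owned by the shared module, not by KVM. */
static bool virt_hw_enabled;
static int virt_users;

/* Stand-ins for enabling/disabling VMX on all CPUs; they only track state. */
static void hw_enable(void)  { virt_hw_enabled = true; }
static void hw_disable(void) { virt_hw_enabled = false; }

/* Take a reference; the first user turns the hardware on. */
void virt_get(void)
{
	if (virt_users++ == 0)
		hw_enable();
}

/* Drop a reference; the last user turns the hardware off. */
void virt_put(void)
{
	assert(virt_users > 0);	/* unbalanced put is a caller bug */
	if (--virt_users == 0)
		hw_disable();
}

bool virt_is_enabled(void)
{
	return virt_hw_enabled;
}
```

With this shape, a user such as KVM and a second user such as tdx_tsm would each call virt_get() once, and one of them going away would not disable the hardware underneath the other.  A real implementation would additionally need locking and per-CPU handling, which this sketch deliberately leaves out.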