On Wed, May 14, 2025, Amit Shah wrote:
> On Tue, 2025-05-13 at 06:28 -0700, Sean Christopherson wrote:
> > On Tue, May 13, 2025, Jon Kohler wrote:
> > > > On May 12, 2025, at 2:23 PM, Sean Christopherson
> > > > This is wrong and unnecessary. As mentioned early, the input that
> > > > matters is vmcs12. This flag should *never* be set for vmcs01.
> > >
> > > I’ll page this back in, but I’m like 75% sure it didn’t work when I
> > > did it that way.
> >
> > Then you had other bugs. The control is per-VMCS and thus needs to be
> > emulated as such. Definitely holler if you get stuck, there's no need
> > to develop this in complete isolation.
>
> Looking at this from the AMD GMET POV, here's how I think support for
> this feature for a Windows guest would be implemented:
>
> * Do not enable the GMET feature in vmcb01. Only the Windows guest (L1
> guest) sets this bit for its own guest (L2 guest). KVM (L0) should see
> the bit set in vmcb02 (and vmcb12). OTOH, pass on the CPUID bit to the
> L1 guest.
>
> * KVM needs to propagate the #NPF to Windows (instead of handling
> anything itself -- ie no shadow page table adjustments or walks
> needed). Windows spawns an L2 guest that causes the #NPF, and Windows
> is the one that needs to consume that fault.
>
> * KVM needs to differentiate an #NPF exit due to GMET or non-GMET
> condition -- check the CPL and U/S bits from the exit, and the NX bit
> from the PTE that faulted. If due to GMET, propagate it to the guest.
> If not, continue handling it

Yes, but no. KVM shouldn't need to do anything special here other than
teaching update_permission_bitmask() to understand the GMET fault case. Ditto
for MBEC. I'd type something up, but I would quickly encounter -ENOCOFFE :-)

With the correct mmu->permissions[], permission_fault() will naturally detect
that a #NPF (or EPT Violation) from L2 due to a GMET/MBEC violation is a fault
in the nNPT/nEPT domain and route the exit to L1.
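
For illustration only, here is a toy userspace model of that idea. It is not
KVM's actual code: the ACC_*/ERR_* bits, struct toy_mmu and the toy_* helpers
are all made up for this sketch, and the GMET rule it encodes (a
supervisor-mode instruction fetch faults when the nested PTE is
user-accessible) is an assumption about the architectural behaviour. The point
is only that once the table is built with the GMET case folded in, the fault
check becomes a plain lookup:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical access bits gathered from a page-table walk. */
#define ACC_EXEC  (1u << 0)
#define ACC_WRITE (1u << 1)
#define ACC_USER  (1u << 2)
#define NR_ACC    8u

/* Hypothetical bits describing the access that is being attempted. */
#define ERR_WRITE (1u << 0)
#define ERR_USER  (1u << 1)
#define ERR_FETCH (1u << 2)
#define NR_ERR    8u

struct toy_mmu {
	bool gmet;                   /* assumed: GMET enabled for this MMU */
	uint8_t permissions[NR_ERR]; /* bit N set => access combo N faults */
};

/* Precompute, for every access type, which PTE access combos fault. */
static void toy_update_permission_bitmask(struct toy_mmu *mmu)
{
	for (unsigned int err = 0; err < NR_ERR; err++) {
		uint8_t faults = 0;

		for (unsigned int acc = 0; acc < NR_ACC; acc++) {
			bool fault = false;

			/* Ordinary checks: the needed permission is absent. */
			if ((err & ERR_WRITE) && !(acc & ACC_WRITE))
				fault = true;
			if ((err & ERR_FETCH) && !(acc & ACC_EXEC))
				fault = true;
			if ((err & ERR_USER) && !(acc & ACC_USER))
				fault = true;

			/* Assumed GMET rule: supervisor fetch from a user page. */
			if (mmu->gmet && (err & ERR_FETCH) &&
			    !(err & ERR_USER) && (acc & ACC_USER))
				fault = true;

			if (fault)
				faults |= (uint8_t)(1u << acc);
		}
		mmu->permissions[err] = faults;
	}
}

/* The fault check itself is a table lookup, with no per-feature logic. */
static bool toy_permission_fault(const struct toy_mmu *mmu,
				 unsigned int err, unsigned int pte_access)
{
	assert(err < NR_ERR && pte_access < NR_ACC);
	return (mmu->permissions[err] >> pte_access) & 1u;
}
```

In this model, toy_permission_fault(&mmu, ERR_FETCH, ACC_EXEC | ACC_USER)
reports a fault only after the table has been rebuilt with gmet set; a
supervisor fetch from a non-user executable page is allowed either way.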

> (btw KVM MMU API question -- from the #NPF, I have the GPA of the L2
> guest. How to go from that guest GPA to look up the NX bit for that
> page? I skimmed and there doesn't seem to be an existing API for it -
> so is walking the tables the only solution?)

As above, KVM doesn't manually look up individual bits while handling faults.
The walk of the guest page tables (L1's NPT/EPT for this scenario) performed by
FNAME(walk_addr_generic) will gather the effective permissions in
walker->pte_access, and check for a permission_fault() after the walk is
completed.
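
A rough way to picture "gather, then check" is the toy sketch below. It is a
simplified userspace model and not the real walker: the ACC_* bits and the
toy_* helpers are hypothetical, each level's permissions are passed in as a
plain array instead of being read from page tables, and the final check is
reduced to "is any required permission missing". The effective access is the
intersection of what every level allows, and the fault decision is made once,
after the loop:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical access bits, one set per page-table level. */
#define ACC_EXEC  (1u << 0)
#define ACC_WRITE (1u << 1)
#define ACC_USER  (1u << 2)
#define ACC_ALL   (ACC_EXEC | ACC_WRITE | ACC_USER)

/*
 * Walk from the root to the leaf and accumulate the effective
 * permissions: a right survives only if every level grants it.
 */
static unsigned int toy_walk_gather_access(const unsigned int *level_access,
					   size_t levels)
{
	unsigned int pte_access = ACC_ALL;

	assert(level_access != NULL || levels == 0);

	for (size_t i = 0; i < levels; i++)
		pte_access &= level_access[i];

	return pte_access;
}

/* Checked once, after the walk: fault if a required right is missing. */
static bool toy_access_faults(unsigned int required, unsigned int pte_access)
{
	return (required & ~pte_access) != 0;
}
```

For example, if one intermediate level omits ACC_USER, the gathered access
loses ACC_USER no matter what the leaf says, so a user access faults while an
instruction fetch that only needs ACC_EXEC does not.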