## Summary

This series introduces support for Intel Mode-Based Execute Control (MBEC) to KVM and nested VMX virtualization, aiming to significantly reduce VMexits and improve performance for Windows guests running with Hypervisor-Protected Code Integrity (HVCI).

## What?

Intel MBEC is a hardware feature, introduced in the Kaby Lake generation, that allows for more granular control over execution permissions. MBEC enables the separation and tracking of execution permissions for supervisor (kernel) and user-mode code. It is used as an accelerator for Microsoft's Memory Integrity [1] (also known as hypervisor-protected code integrity or HVCI).

## Why?

The primary reason for this feature is performance.

Without hardware-level MBEC, Windows HVCI falls back to a 'software MBEC' known as Restricted User Mode, which imposes a runtime overhead due to increased state transitions between the guest's L2 root partition and the L2 secure partition for running kernel mode code integrity operations.

In practice, this results in a significant number of exits. For example, playing a YouTube video within the Edge Browser produces roughly 1.2 million VMexits/second across an 8 vCPU Windows 11 guest.

Most of these exits are VMREAD/VMWRITE operations, which can be emulated with Enlightened VMCS (eVMCS) [2]. However, even with eVMCS, this configuration still produces around 200,000 VMexits/second.

With MBEC exposed to the L1 Windows Hypervisor, the same scenario results in approximately 50,000 VMexits/second, a *24x* reduction from the 1.2 million baseline. Not a typo, 24x reduction in VMexits.

## How?

This series implements core KVM support for exposing the MBEC bit in secondary execution controls (bit 22) to L1 and L2, based on configuration from user space and a module parameter 'enable_pt_guest_exec_control'. It was inspired by Mickaël's series for Heki [3], from which we extracted, refactored, and extended the MBEC-specific use case to be general-purpose.
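
As a rough illustration of the bit layout involved, the sketch below models the gating on secondary execution control bit 22 and the split of the EPT execute permission that this letter describes (PTE bit 2 for supervisor mode, PTE bit 10 for user mode). It is a standalone user-space sketch, not code from the series: the helper names mbec_exposed() and ept_exec_allowed() and the EPT_PTE_EXEC_* macros are hypothetical, and the boolean argument merely stands in for the 'enable_pt_guest_exec_control' module parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Secondary processor-based VM-execution control bit 22, as named in this letter. */
#define SECONDARY_EXEC_MODE_BASED_EPT_EXEC (UINT32_C(1) << 22)

/* EPT PTE execute bits once MBEC is active (illustrative macro names). */
#define EPT_PTE_EXEC_SUPERVISOR (UINT64_C(1) << 2)
#define EPT_PTE_EXEC_USER       (UINT64_C(1) << 10)

static_assert(SECONDARY_EXEC_MODE_BASED_EPT_EXEC == 0x400000, "MBEC is bit 22");

/*
 * Hypothetical helper: MBEC is exposed only when the control bit is
 * available AND the (assumed) module parameter allows it.
 */
bool mbec_exposed(uint32_t secondary_ctls, bool module_param_enabled)
{
    return module_param_enabled &&
           (secondary_ctls & SECONDARY_EXEC_MODE_BASED_EPT_EXEC) != 0;
}

/*
 * Hypothetical helper: decide whether an instruction fetch is permitted.
 * Without MBEC, bit 2 governs execution in every mode. With MBEC, bit 2
 * covers supervisor-mode fetches and bit 10 covers user-mode fetches.
 */
bool ept_exec_allowed(uint64_t pte, bool mbec_active, bool user_mode)
{
    if (!mbec_active)
        return (pte & EPT_PTE_EXEC_SUPERVISOR) != 0;

    return (pte & (user_mode ? EPT_PTE_EXEC_USER
                             : EPT_PTE_EXEC_SUPERVISOR)) != 0;
}
```

For instance, with MBEC active a PTE that sets only bit 2 permits kernel-mode fetches but denies user-mode ones, which is the distinction HVCI relies on.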

MBEC, which appears in Linux /proc/cpuinfo as ept_mode_based_exec, splits the EPT exec bit (bit 2 in PTE) into two bits. When secondary execution control bit 22 is set, PTE bit 2 reflects supervisor mode executable, and PTE bit 10 reflects user mode executable.

The semantics for EPT violation qualifications also change when MBEC is enabled, with bit 5 reflecting supervisor/kernel mode execute permissions and bit 6 reflecting user mode execute permissions.

All of this serves to expose the feature to the L1 hypervisor, which consumes MBEC and tells the L2 partitions not to use the software MBEC by clearing bit 14 of CPUID 0x40000004 EAX [4].

## Where?

Enablement spans both VMX code and MMU code to teach the shadow MMU about the different execution modes, as well as the user space VMM, which must pass secondary execution control bit 22. A patch for QEMU enablement is available [5].

## Testing

Initial testing has been done on 6.12-based code with:

Guests:
 - Windows 11 24H2 26100.2894
 - Windows Server 2025 24H2 26100.2894
 - Windows Server 2022 21H2 20348.825

Processors:
 - Intel Skylake 6154
 - Intel Sapphire Rapids 6444Y

## Acknowledgements

Special thanks to all contributors and reviewers who have provided valuable feedback and support for this patch series.

[1] https://learn.microsoft.com/en-us/windows/security/hardware-security/enable-virtualization-based-protection-of-code-integrity
[2] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/nested-virtualization#enlightened-vmcs-intel
[3] https://patchwork.kernel.org/project/kvm/patch/20231113022326.24388-6-mic@xxxxxxxxxxx/
[4] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/feature-discovery#implementation-recommendations---0x40000004
[5] https://github.com/JonKohler/qemu/tree/mbec-rfc-v1

Cc: Alexander Grest <Alexander.Grest@xxxxxxxxxxxxx>
Cc: Nicolas Saenz Julienne <nsaenz@xxxxxxxxx>
Cc: Madhavan T. Venkataraman <madvenka@xxxxxxxxxxxxxxxxxxx>
Cc: Mickaël Salaün <mic@xxxxxxxxxxx>
Cc: Tao Su <tao1.su@xxxxxxxxxxxxxxx>
Cc: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>
Cc: Zhao Liu <zhao1.liu@xxxxxxxxx>

Jon Kohler (11):
  KVM: x86: Add module parameter for Intel MBEC
  KVM: x86: Add pt_guest_exec_control to kvm_vcpu_arch
  KVM: VMX: Wire up Intel MBEC enable/disable logic
  KVM: x86/mmu: Remove SPTE_PERM_MASK
  KVM: VMX: Extend EPT Violation protection bits
  KVM: x86/mmu: Introduce shadow_ux_mask
  KVM: x86/mmu: Adjust SPTE_MMIO_ALLOWED_MASK to understand MBEC
  KVM: x86/mmu: Extend make_spte to understand MBEC
  KVM: nVMX: Setup Intel MBEC in nested secondary controls
  KVM: VMX: Allow MBEC with EVMCS
  KVM: x86: Enable module parameter for MBEC

Mickaël Salaün (5):
  KVM: VMX: add cpu_has_vmx_mbec helper
  KVM: VMX: Define VMX_EPT_USER_EXECUTABLE_MASK
  KVM: x86/mmu: Extend access bitfield in kvm_mmu_page_role
  KVM: VMX: Enhance EPT violation handler for PROT_USER_EXEC
  KVM: x86/mmu: Extend is_executable_pte to understand MBEC

Nikolay Borisov (1):
  KVM: VMX: Remove EPT_VIOLATIONS_ACC_*_BIT defines

Sean Christopherson (1):
  KVM: nVMX: Decouple EPT RWX bits from EPT Violation protection bits

 arch/x86/include/asm/kvm_host.h | 13 +++++----
 arch/x86/include/asm/vmx.h      | 45 ++++++++++++++++++++---------
 arch/x86/kvm/mmu.h              |  3 +-
 arch/x86/kvm/mmu/mmu.c          | 13 +++++----
 arch/x86/kvm/mmu/mmutrace.h     | 23 ++++++++++-----
 arch/x86/kvm/mmu/paging_tmpl.h  | 19 +++++++++---
 arch/x86/kvm/mmu/spte.c         | 51 ++++++++++++++++++++++++++++-----
 arch/x86/kvm/mmu/spte.h         | 36 +++++++++++++++--------
 arch/x86/kvm/mmu/tdp_mmu.c      |  2 +-
 arch/x86/kvm/vmx/capabilities.h |  6 ++++
 arch/x86/kvm/vmx/hyperv.c       |  5 +++-
 arch/x86/kvm/vmx/hyperv_evmcs.h |  1 +
 arch/x86/kvm/vmx/nested.c       |  4 +++
 arch/x86/kvm/vmx/vmx.c          | 21 ++++++++++++--
 arch/x86/kvm/vmx/vmx.h          |  7 +++++
 arch/x86/kvm/x86.c              |  4 +++
 16 files changed, 192 insertions(+), 61 deletions(-)

--
2.43.0