Introduction
------------

Secure AVIC is a new hardware feature in the AMD64 architecture that
allows SEV-SNP guests to prevent the hypervisor from generating
unexpected interrupts to a vCPU or otherwise violating architectural
assumptions around APIC behavior.

One of the significant differences from AVIC or emulated x2APIC is that
Secure AVIC uses a guest-owned and managed APIC backing page. It also
introduces additional fields in both the VMCB and the Secure AVIC backing
page to aid the guest in limiting which interrupt vectors can be injected
into the guest.

Guest APIC Backing Page
-----------------------

Each vCPU has a guest-allocated APIC backing page of size 4K, which
maintains APIC state for that vCPU. The x2APIC MSRs are mapped at their
corresponding x2APIC MMIO offset within the guest APIC backing page. All
x2APIC accesses by the guest or Secure AVIC hardware operate on this
backing page. The backing page should be pinned, and the NPT entry for it
should always be mapped while the corresponding vCPU is running.

MSR Accesses
------------

Secure AVIC only supports x2APIC MSR accesses. xAPIC MMIO offset based
accesses are not supported.

Some of the MSR accesses, such as ICR writes (with shorthand equal to
self), SELF_IPI, EOI and TPR writes, are accelerated by Secure AVIC
hardware. Other MSR accesses generate a #VC exception. The #VC exception
handler reads/writes to the guest APIC backing page. As the guest APIC
backing page is accessible to the guest, the Secure AVIC driver code
optimizes APIC register access by directly reading/writing to the guest
APIC backing page (instead of taking the #VC exception route).

In addition to the architected MSRs, the following new fields are added
to the guest APIC backing page which can be modified directly by the
guest:

a. ALLOWED_IRR

The ALLOWED_IRR registers indicate the interrupt vectors which the guest
allows the hypervisor to send.
The combination of host-controlled REQUESTED_IRR vectors (part of VMCB)
and ALLOWED_IRR is used by hardware to update the IRR vectors of the
guest APIC backing page.

    #Offset        #bits        Description
    204h           31:0         Guest allowed vectors 0-31
    214h           31:0         Guest allowed vectors 32-63
    ...
    274h           31:0         Guest allowed vectors 224-255

ALLOWED_IRR is meant to be used specifically for vectors that the
hypervisor is allowed to inject, such as device interrupts. Interrupt
vectors used exclusively by the guest itself (like IPI vectors) should
not be allowed to be injected into the guest for security reasons.

b. NMI Request

    #Offset        #bits        Description
    278h           0            Set by Guest to request Virtual NMI

The guest needs to set the NMI Request register to allow the hypervisor
to inject a vNMI into it.

LAPIC Timer Support
-------------------

The LAPIC timer is emulated by the hypervisor. So the APIC_LVTT,
APIC_TMICT, APIC_TDCR and APIC_TMCCT registers are not read from or
written to the guest APIC backing page; accesses to them are communicated
to the hypervisor using SVM_EXIT_MSR VMGEXIT.

IPI Support
-----------

Only SELF_IPI is accelerated by Secure AVIC hardware. Other IPIs require
writing (from the Secure AVIC driver) to the IRR vector of the target
CPU's backing page and then issuing a VMGEXIT for the hypervisor to
notify the target vCPU.

KEXEC Support
-------------

A Secure AVIC enabled guest can kexec to another kernel which has Secure
AVIC enabled, as the hypervisor has the Secure AVIC feature bit set in
sev_status.

Open Points
-----------

The Secure AVIC driver only supports physical destination mode. If
logical destination mode needs to be supported, then a separate x2apic
driver would be required for supporting logical destination mode.

Testing
-------

This series is based on top of commit 4628e5bbca91 "Merge branch into
tip/master: 'x86/tdx'" of the tip/tip master branch.

Host Secure AVIC support patch series is at [1].

QEMU support patch is at [2].

QEMU command line for testing a Secure AVIC enabled guest:

  qemu-system-x86_64 <...> -object sev-snp-guest,id=sev0,policy=0xb0000,cbitpos=51,reduced-phys-bits=1,allowed-sev-features=true,secure-avic=true

The following tests were done:

1) Boot to Prompt using initramfs and ubuntu fs.
2) Verified timer and IPI as part of the guest bootup.
3) Verified long run SCF TORTURE IPI test.

[1] https://github.com/AMDESE/linux-kvm/tree/savic-host-latest
[2] https://github.com/AMDESE/qemu/tree/secure-avic

Changes since v9

v9: https://lore.kernel.org/lkml/20250811094444.203161-1-Neeraj.Upadhyay@xxxxxxx/

  - Commit log updates.
  - Update comments to be more descriptive.
  - Various coding style updates.

Changes since v8

v8: https://lore.kernel.org/lkml/20250709033242.267892-1-Neeraj.Upadhyay@xxxxxxx/

  - Removed KVM lapic refactoring patches which have been included in
    v6.17-rc1.
  - Added Tianyu's Reviewed-by's.
  - Dropped below 2 patches based on review feedback:

    x86/apic: Unionize apic regs for 32bit/64bit access w/o type casting
    x86/apic: Simplify bitwise operations on APIC bitmap

  - Misc cleanups suggested by Boris and Sean.

Changes since v7

v7: https://lore.kernel.org/lkml/20250610175424.209796-1-Neeraj.Upadhyay@xxxxxxx/

  - Commit log updates.
  - Applied Reviewed-by and Acked-by.
  - Combined few patches.

Changes since v6

v6: https://lore.kernel.org/lkml/20250514071803.209166-1-Neeraj.Upadhyay@xxxxxxx/

  - Restructured the patches to split out function/macro rename into
    separate patches.
  - Update commit logs with more details on impact to kvm.ko text size.
  - Updated the new macros in patch "x86/apic: KVM: Deduplicate APIC
    vector => register+bit math" to type cast macro parameter to
    unsigned int. This ensures better code generation for cases where
    signed int is passed to these macros. With this update, below
    patches have been removed in this version:

    x86/apic: Change apic_*_vector() vector param to unsigned
    x86/apic: Change get/set reg operations reg param to unsigned

  - Added Tianyu's Reviewed-by's.

Changes since v5

v5: https://lore.kernel.org/lkml/20250429061004.205839-1-Neeraj.Upadhyay@xxxxxxx/

  - Add back RFC tag due to new changes to share code between KVM's
    lapic emulation and Secure AVIC.
  - Minor optimizations to the apic bitwise ops and set/get reg
    operations.
  - Other misc fixes, cleanups and refactoring due to code sharing with
    KVM lapic implementation.

Changes since v4

v4: https://lore.kernel.org/lkml/20250417091708.215826-1-Neeraj.Upadhyay@xxxxxxx/

  - Add separate patch for update_vector() apic callback addition.
  - Add a cleanup patch for moving apic_update_irq_cfg() calls to
    apic_update_vector().
  - Cleaned up change logs.
  - Rebased to latest tip/tip master. Resolved merge conflicts due to
    sev code movement to sev-startup.c in mainline.
  - Other misc cleanups.

Changes since v3

v3: https://lore.kernel.org/lkml/20250401113616.204203-1-Neeraj.Upadhyay@xxxxxxx/

  - Move KVM updates to a separate patch.
  - Cleanups to use guard().
  - Refactored IPI callbacks addition.
  - Misc cleanups.

Changes since v2

v2: https://lore.kernel.org/lkml/20250226090525.231882-1-Neeraj.Upadhyay@xxxxxxx/

  - Removed RFC tag.
  - Change config rule to not select AMD_SECURE_AVIC config if
    AMD_MEM_ENCRYPT config is enabled.
  - Fix broken backing page GFP_KERNEL allocation in setup_local_APIC().
    Use alloc_percpu() for APIC backing pages allocation during Secure
    AVIC driver probe.
  - Remove code to check for duplicate APIC_ID returned by the
    Hypervisor. Topology evaluation code already does that during boot.
  - Fix missing update_vector() callback invocation during vector
    cleanup paths. Invoke update_vector() during setup and tearing down
    of a vector.
  - Reuse find_highest_vector() from kvm/lapic.c.
  - Change savic_register_gpa/savic_unregister_gpa() interface to be
    invoked only for the local CPU.
  - Misc cleanups.

Changes since v1

v1: https://lore.kernel.org/lkml/20240913113705.419146-1-Neeraj.Upadhyay@xxxxxxx/

  - Added Kexec support.
  - Instead of doing a 2M aligned allocation for backing pages, allocate
    individual PAGE_SIZE pages for vCPUs.
  - Instead of reading Extended Topology Enumeration CPUID, APIC_ID
    value is read from Hv and updated in APIC backing page. Hv returned
    ID is checked for any duplicates.
  - Propagate all LVT* register reads and writes to Hv.
  - Check that Secure AVIC control MSR is not intercepted by Hv.
  - Fix EOI handling for level-triggered interrupts.
  - Misc cleanups and commit log updates.

Kishon Vijay Abraham I (2):
  x86/sev: Initialize VGIF for secondary vCPUs for Secure AVIC
  x86/sev: Enable NMI support for Secure AVIC

Neeraj Upadhyay (16):
  x86/apic: Add new driver for Secure AVIC
  x86/apic: Initialize Secure AVIC APIC backing page
  x86/apic: Populate .read()/.write() callbacks of Secure AVIC driver
  x86/apic: Initialize APIC ID for Secure AVIC
  x86/apic: Add update_vector() callback for APIC drivers
  x86/apic: Add update_vector() callback for Secure AVIC
  x86/apic: Add support to send IPI for Secure AVIC
  x86/apic: Support LAPIC timer for Secure AVIC
  x86/apic: Add support to send NMI IPI for Secure AVIC
  x86/apic: Allow NMI to be injected from hypervisor for Secure AVIC
  x86/apic: Read and write LVT* APIC registers from HV for SAVIC guests
  x86/apic: Handle EOI writes for Secure AVIC guests
  x86/apic: Add kexec support for Secure AVIC
  x86/apic: Enable Secure AVIC in Control MSR
  x86/sev: Prevent SECURE_AVIC_CONTROL MSR interception for Secure AVIC guests
  x86/sev: Indicate SEV-SNP guest supports Secure AVIC

 arch/x86/Kconfig                    |  13 +
 arch/x86/boot/compressed/sev.c      |  10 +-
 arch/x86/coco/core.c                |   3 +
 arch/x86/coco/sev/core.c            | 103 +++++++
 arch/x86/coco/sev/vc-handle.c       |  20 +-
 arch/x86/include/asm/apic.h         |  11 +
 arch/x86/include/asm/apicdef.h      |   2 +
 arch/x86/include/asm/msr-index.h    |   9 +-
 arch/x86/include/asm/sev-internal.h |   2 +
 arch/x86/include/asm/sev.h          |   8 +
 arch/x86/include/uapi/asm/svm.h     |   4 +
 arch/x86/kernel/apic/Makefile       |   1 +
 arch/x86/kernel/apic/apic.c         |   8 +
 arch/x86/kernel/apic/vector.c       |  28 +-
 arch/x86/kernel/apic/x2apic_savic.c | 427 ++++++++++++++++++++++++++++
 include/linux/cc_platform.h         |   8 +
 16 files changed, 639 insertions(+), 18 deletions(-)
 create mode 100644 arch/x86/kernel/apic/x2apic_savic.c


base-commit: 4628e5bbca916edaf4ed55915ab399f9ba25519f
--
2.34.1