On Fri, May 23, 2025 at 4:53 AM Chao Gao <chao.gao@xxxxxxxxx> wrote:
>
> Hi Reviewers,
>
> This series adds support for runtime TDX module updates that preserve
> running TDX guests (a.k.a. TD-Preserving updates). The goal is to gather
> feedback on the feature design. Please pay attention to the following items:
>
> 1. TD-Preserving updates are done in stop_machine() context. It copy-pastes
>    part of multi_cpu_stop() to guarantee step-locked progress on all CPUs.
>    But, there are a few differences between them. I am wondering whether
>    these differences have reached a point where abstracting a common
>    function might do more harm than good. See more details in patch 10.
>
> 2. P-SEAMLDR seamcalls (specifically SEAMRET from P-SEAMLDR) clear current
>    VMCS pointers, which may disrupt KVM. To prevent VMX instructions in IRQ
>    context from encountering NULL current-VMCS pointers, P-SEAMLDR
>    seamcalls are called with IRQs disabled. I'm uncertain if NMIs could
>    cause a problem, but I believe they won't. See more information in patch 3.
>
> 3. Two helpers, cpu_vmcs_load() and cpu_vmcs_store(), are added in patch 3
>    to save and restore the current VMCS. KVM has a variant of cpu_vmcs_load(),
>    i.e., vmcs_load(). Extracting KVM's version would cause a lot of code
>    churn, and I don't think that can be justified for reducing ~16 LoC
>    duplication. Please let me know if you disagree.
>
> == Background ==
>
> Intel TDX isolates Trusted Domains (TDs), or confidential guests, from the
> host. A key component of Intel TDX is the TDX module, which enforces
> security policies to protect the memory and CPU states of TDs from the
> host. However, the TDX module is software that requires updates; it is not
> device firmware in the typical sense.
>
> == Problems ==
>
> Currently, the TDX module is loaded by the BIOS at boot time, and the only
> way to update it is through a reboot, which results in significant system
> downtime. Users expect the TDX module to be updatable at runtime without
> disrupting TDX guests.
>
> == Solution ==
>
> On TDX platforms, P-SEAMLDR[1] is a component within the protected SEAM
> range. It is loaded by the BIOS and provides the host with functions to
> install a TDX module at runtime.
>
> Implement a TDX Module update facility via the fw_upload mechanism. Given
> that there is variability in which module update to load based on features,
> fix levels, and potentially reloading the same version for error recovery
> scenarios, the explicit, userspace-chosen payload flexibility of fw_upload
> is attractive.
>
> This design allows the kernel to accept a bitstream instead of loading a
> named file from the filesystem, as the module selection and policy
> enforcement for TDX modules are quite complex (see more in patch 8). By
> doing so, much of this complexity is shifted out of the kernel. The kernel
> needs to expose information, such as the TDX module version, to userspace.
> The userspace tool must understand the TDX module versioning scheme and
> update policy to select the appropriate TDX module (see "TDX Module
> Versioning" below).
>
> In the unlikely event the update fails, for example userspace picks an
> incompatible update image, or the image is otherwise corrupted, all TDs
> will experience SEAMCALL failures and be killed. The recovery of TD
> operation from that event requires a reboot.
>
> Given there is no mechanism to quiesce SEAMCALLs, the TDs themselves must
> pause execution over an update. The most straightforward way to meet the
> 'pause TDs while update executes' constraint is to run the update in
> stop_machine() context. All other evaluated solutions export more
> complexity to KVM, or export more fragility to userspace.
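
For readers unfamiliar with fw_upload, here is a minimal userspace sketch of the generic upload handshake that the Solution section relies on. It assumes the standard fw_upload sysfs attributes (loading, data, status, error); the node path for the TDX module is not given in this cover letter, so it is left as a caller-supplied parameter, and the function name, chunk size, and timeout are purely illustrative.

```python
import time
from pathlib import Path


def upload_via_fw_upload(node: Path, image: bytes, timeout_s: float = 30.0) -> None:
    """Push `image` through a fw_upload-style sysfs node and wait for completion.

    `node` is a hypothetical directory such as /sys/class/firmware/<name>;
    the actual name registered by this series is not assumed here.
    """
    (node / "loading").write_text("1")        # announce the start of a transfer
    with open(node / "data", "wb") as f:      # stream the payload in small chunks
        for off in range(0, len(image), 4096):
            f.write(image[off:off + 4096])
    (node / "loading").write_text("0")        # end of transfer; the update begins

    # Poll until the upload state machine reports that it is idle again.
    deadline = time.monotonic() + timeout_s
    while (node / "status").read_text().strip() != "idle":
        if time.monotonic() > deadline:
            raise TimeoutError("update did not finish in time")
        time.sleep(0.01)

    # Assumption: an empty error attribute means the update succeeded.
    err = (node / "error").read_text().strip()
    if err:
        raise RuntimeError(f"update failed: {err}")
```

Because the payload is chosen and pushed by userspace, the selection policy can live entirely in the tool that calls a helper like this one.
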
>
> == How to test this series ==
>
>  # git clone https://github.com/intel/tdx-module-binaries
>  # cd tdx-module-binaries
>  # python version_select_and_load.py --update
>
>
> This series is based on Sean's kvm-x86/next branch
>
>   https://github.com/kvm-x86/linux.git next
>
>
> == Other information relevant to TD-Preserving updates ==
>
> === TDX module versioning ===
>
> Each TDX module is assigned a version number x.y.z, where x represents the
> "major" version, y the "minor" version, and z the "update" version.
>
> TD-Preserving updates are restricted to Z-stream releases.
>
> Note that Z-stream releases do not necessarily guarantee compatibility. A
> new release may not be compatible with all previous versions. To address this,
> Intel provides a separate file containing compatibility information, which
> specifies the minimum module version required for a particular update. This
> information is referenced by the tool to determine if two modules are
> compatible.
>
> === TCB Stability ===
>
> Updates change the TCB as viewed by attestation reports. In TDX there is a
> distinction between launch-time version and current version where TD-preserving
> updates cause that latter version number to change, subject to Z-stream
> constraints. The need for runtime updates and the implications of that version
> change in the attestation was previously discussed in [3].
>
> === TDX Module Distribution Model ===
>
> At a high level, Intel publishes all TDX modules on GitHub [2], along with
> a mapping_file.json which documents the compatibility information about each
> TDX module and a script to install the TDX module. OS vendors can package
> these modules and distribute them. Administrators install the package and
> use the script to select the appropriate TDX module and install it via the
> interfaces exposed by this series.
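
The versioning rules quoted above can be sketched as a small userspace check. This is only an illustration of the stated constraints (same x.y stream, plus a minimum currently-running version taken from the compatibility file); the names ModuleVersion, parse_version, and can_update are hypothetical, and nothing here reflects the actual layout of mapping_file.json or the logic of version_select_and_load.py.

```python
from typing import NamedTuple


class ModuleVersion(NamedTuple):
    major: int   # x
    minor: int   # y
    update: int  # z


def parse_version(text: str) -> ModuleVersion:
    """Parse an 'x.y.z' string into a comparable version tuple."""
    x, y, z = (int(part) for part in text.split("."))
    return ModuleVersion(x, y, z)


def can_update(current: ModuleVersion,
               target: ModuleVersion,
               min_required: ModuleVersion) -> bool:
    """Return True if `target` may be installed over `current`.

    `min_required` stands in for the minimum module version that the
    compatibility file lists for `target` (hypothetical representation).
    """
    # TD-Preserving updates are restricted to Z-stream releases,
    # so major and minor must match.
    same_stream = (current.major, current.minor) == (target.major, target.minor)
    # The running module must be at least the minimum the target requires.
    return same_stream and current >= min_required
```

Note that the check deliberately permits target == current, since the cover letter mentions reloading the same version for error recovery.
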
>
> [1]: https://cdrdv2.intel.com/v1/dl/getContent/733584
> [2]: https://github.com/intel/tdx-module-binaries
> [3]: https://lore.kernel.org/all/5d1da767-491b-4077-b472-2cc3d73246d6@xxxxxxxxxx/
>
>
> Chao Gao (20):
>   x86/virt/tdx: Print SEAMCALL leaf numbers in decimal
>   x86/virt/tdx: Prepare to support P-SEAMLDR SEAMCALLs
>   x86/virt/seamldr: Introduce a wrapper for P-SEAMLDR SEAMCALLs
>   x86/virt/tdx: Introduce a "tdx" subsystem and "tsm" device
>   x86/virt/tdx: Export tdx module attributes via sysfs
>   x86/virt/seamldr: Add a helper to read P-SEAMLDR information
>   x86/virt/tdx: Expose SEAMLDR information via sysfs
>   x86/virt/seamldr: Implement FW_UPLOAD sysfs ABI for TD-Preserving
>     Updates
>   x86/virt/seamldr: Allocate and populate a module update request
>   x86/virt/seamldr: Introduce skeleton for TD-Preserving updates
>   x86/virt/seamldr: Abort updates if errors occurred midway
>   x86/virt/seamldr: Shut down the current TDX module
>   x86/virt/tdx: Reset software states after TDX module shutdown
>   x86/virt/seamldr: Install a new TDX module
>   x86/virt/seamldr: Handle TD-Preserving update failures
>   x86/virt/seamldr: Do TDX cpu init after updates
>   x86/virt/tdx: Establish contexts for the new module
>   x86/virt/tdx: Update tdx_sysinfo and check features post-update
>   x86/virt/seamldr: Verify availability of slots for TD-Preserving
>     updates
>   x86/virt/seamldr: Enable TD-Preserving Updates
>
>  Documentation/ABI/testing/sysfs-devices-tdx |  32 ++
>  MAINTAINERS                                 |   1 +
>  arch/x86/Kconfig                            |  12 +
>  arch/x86/include/asm/tdx.h                  |  20 +-
>  arch/x86/include/asm/tdx_global_metadata.h  |  12 +
>  arch/x86/virt/vmx/tdx/Makefile              |   1 +
>  arch/x86/virt/vmx/tdx/seamldr.c             | 443 ++++++++++++++++++++
>  arch/x86/virt/vmx/tdx/seamldr.h             |  16 +
>  arch/x86/virt/vmx/tdx/tdx.c                 | 248 ++++++++++-
>  arch/x86/virt/vmx/tdx/tdx.h                 |  12 +
>  arch/x86/virt/vmx/tdx/tdx_global_metadata.c |  29 ++
>  arch/x86/virt/vmx/vmx.h                     |  40 ++
>  12 files changed, 862 insertions(+), 4 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-devices-tdx
>  create mode 100644 arch/x86/virt/vmx/tdx/seamldr.c
>  create mode 100644 arch/x86/virt/vmx/tdx/seamldr.h
>  create mode 100644 arch/x86/virt/vmx/vmx.h
>
> --
> 2.47.1
>
>

Tested-by: Sagi Shahar <sagis@xxxxxxxxxx>

I was able to update the module while several VMs were running on the
machine, using a modified version of the TDX selftests. Measuring the
update time shows that an update takes less than 10 ms, regardless of
the number of VMs running.