Hi Xiaoyao,

On Wed, 16 Jul 2025 at 06:40, Xiaoyao Li <xiaoyao.li@xxxxxxxxx> wrote:
>
> On 7/15/2025 5:33 PM, Fuad Tabba wrote:
> > Introduce the core infrastructure to enable host userspace to mmap()
> > guest_memfd-backed memory. This is needed for several evolving KVM use
> > cases:
> >
> > * Non-CoCo VM backing: Allows VMMs like Firecracker to run guests
> >   entirely backed by guest_memfd, even for non-CoCo VMs [1]. This
> >   provides a unified memory management model and simplifies guest memory
> >   handling.
> >
> > * Direct map removal for enhanced security: This is an important step
> >   for direct map removal of guest memory [2]. By allowing host userspace
> >   to fault in guest_memfd pages directly, we can avoid maintaining host
> >   kernel direct maps of guest memory. This provides additional hardening
> >   against Spectre-like transient execution attacks by removing a
> >   potential attack surface within the kernel.
> >
> > * Future guest_memfd features: This also lays the groundwork for future
> >   enhancements to guest_memfd, such as supporting huge pages and
> >   enabling in-place sharing of guest memory with the host for CoCo
> >   platforms that permit it [3].
> >
> > Therefore, enable the basic mmap and fault handling logic within
> > guest_memfd. However, this functionality is not yet exposed to userspace
> > and remains inactive until two conditions are met in subsequent patches:
> >
> > * Kconfig Gate (CONFIG_KVM_GMEM_SUPPORTS_MMAP): A new Kconfig option,
> >   KVM_GMEM_SUPPORTS_MMAP, is introduced later in this series.
>
> Well, KVM_GMEM_SUPPORTS_MMAP is actually introduced by *this* patch, not
> other patches later.
>
> >   This
> >   option gates the compilation and availability of this mmap
> >   functionality at a system level.
>
> Well, at least from this patch, it doesn't gate the compilation.

You're right. This commit changed a bit, and I should have updated the
commit message.
>
> >   While the code changes in this patch
> >   might seem small, the Kconfig option is introduced to explicitly
> >   signal the intent to enable this new capability and to provide a clear
> >   compile-time switch for it. It also helps ensure that the necessary
> >   architecture-specific glue (like kvm_arch_supports_gmem_mmap) is
> >   properly defined.
> >
> > * Per-instance opt-in (GUEST_MEMFD_FLAG_MMAP): On a per-instance basis,
> >   this functionality is enabled by the guest_memfd flag
> >   GUEST_MEMFD_FLAG_MMAP, which will be set in the KVM_CREATE_GUEST_MEMFD
> >   ioctl. This flag is crucial because when host userspace maps
> >   guest_memfd pages, KVM must *not* manage these memory regions in
> >   the same way it does for traditional KVM memory slots. The presence of
> >   GUEST_MEMFD_FLAG_MMAP on a guest_memfd instance allows mmap() and
> >   faulting of guest_memfd memory to host userspace. Additionally, it
> >   informs KVM to always consume guest faults to this memory from
> >   guest_memfd, regardless of whether it is a shared or a private fault.
> >   This opt-in mechanism ensures compatibility and prevents conflicts
> >   with existing KVM memory management. This is a per-guest_memfd flag
> >   rather than a per-memslot or per-VM capability because the ability to
> >   mmap directly applies to the specific guest_memfd object, regardless
> >   of how it might be used within various memory slots or VMs.
> >
> > [1] https://github.com/firecracker-microvm/firecracker/tree/feature/secret-hiding
> > [2] https://lore.kernel.org/linux-mm/cc1bb8e9bc3e1ab637700a4d3defeec95b55060a.camel@xxxxxxxxxx
> > [3] https://lore.kernel.org/all/c1c9591d-218a-495c-957b-ba356c8f8e09@xxxxxxxxxx/T/#u
> >
> > Reviewed-by: Gavin Shan <gshan@xxxxxxxxxx>
> > Reviewed-by: Shivank Garg <shivankg@xxxxxxx>
> > Acked-by: David Hildenbrand <david@xxxxxxxxxx>
> > Co-developed-by: Ackerley Tng <ackerleytng@xxxxxxxxxx>
> > Signed-off-by: Ackerley Tng <ackerleytng@xxxxxxxxxx>
> > Signed-off-by: Fuad Tabba <tabba@xxxxxxxxxx>
> > ---
> >  include/linux/kvm_host.h | 13 +++++++
> >  include/uapi/linux/kvm.h |  1 +
> >  virt/kvm/Kconfig         |  4 +++
> >  virt/kvm/guest_memfd.c   | 73 ++++++++++++++++++++++++++++++++++++++++
> >  4 files changed, 91 insertions(+)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index 1ec71648824c..9ac21985f3b5 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -740,6 +740,19 @@ static inline bool kvm_arch_supports_gmem(struct kvm *kvm)
> >  }
> >  #endif
> >
> > +/*
> > + * Returns true if this VM supports mmap() in guest_memfd.
> > + *
> > + * Arch code must define kvm_arch_supports_gmem_mmap if support for guest_memfd
> > + * is enabled.
>
> It describes the similar requirement as kvm_arch_has_private_mem and
> kvm_arch_supports_gmem, but it doesn't have the check of
>
>     && !IS_ENABLED(CONFIG_KVM_GMEM)
>
> So it's straightforward for people to wonder why.
>
> I would suggest just adding the check of !IS_ENABLED(CONFIG_KVM_GMEM)
> like what for kvm_arch_has_private_mem and kvm_arch_supports_gmem. So it
> will get compilation error if any ARCH enables CONFIG_KVM_GMEM without
> defining kvm_arch_supports_gmem_mmap.

Thanks!
/fuad

>
> > + */
> > +#if !defined(kvm_arch_supports_gmem_mmap)
> > +static inline bool kvm_arch_supports_gmem_mmap(struct kvm *kvm)
> > +{
> > +	return false;
> > +}
> > +#endif
> > +
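
For illustration only, here is a minimal userspace sketch of what the guard suggested in the review could look like. It is not the actual follow-up patch. The IS_ENABLED() stand-in and its KCFG_* helper macros are simplified assumptions modelled on the trick in include/linux/kconfig.h, struct kvm is left opaque, and check_fallback() is a hypothetical helper added just to exercise the result. CONFIG_KVM_GMEM is deliberately left undefined so that the fallback gets compiled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace stand-in for the kernel's IS_ENABLED() (assumption: simplified
 * from include/linux/kconfig.h). It evaluates to 1 when the option is
 * defined as 1, and to 0 when the option is not defined at all.
 */
#define KCFG_PLACEHOLDER_1 0,
#define KCFG_SECOND_ARG(ignored, val, ...) val
#define KCFG_IS_DEFINED3(arg1_or_junk) KCFG_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define KCFG_IS_DEFINED2(val) KCFG_IS_DEFINED3(KCFG_PLACEHOLDER_##val)
#define KCFG_IS_DEFINED(x) KCFG_IS_DEFINED2(x)
#define IS_ENABLED(option) KCFG_IS_DEFINED(option)

struct kvm; /* opaque in this sketch; only a pointer is passed around */

/*
 * Fallback with the additional check suggested in the review. It is only
 * emitted when the arch did not provide kvm_arch_supports_gmem_mmap AND
 * CONFIG_KVM_GMEM is disabled.
 */
#if !defined(kvm_arch_supports_gmem_mmap) && !IS_ENABLED(CONFIG_KVM_GMEM)
static inline bool kvm_arch_supports_gmem_mmap(struct kvm *kvm)
{
	(void)kvm; /* unused in the fallback */
	return false;
}
#endif

/* Hypothetical helper: with CONFIG_KVM_GMEM unset, mmap is never supported. */
static inline void check_fallback(void)
{
	assert(!kvm_arch_supports_gmem_mmap(NULL));
}
```

If CONFIG_KVM_GMEM were defined as 1 and the arch did not define kvm_arch_supports_gmem_mmap, the fallback above would not be emitted and any caller would fail to compile, which is the behaviour the review asks for.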