On 7/22/2025 8:10 PM, David Hildenbrand wrote:
> On 13.07.25 19:43, Shivank Garg wrote:
>> This series introduces NUMA-aware memory placement support for KVM guests
>> with guest_memfd memory backends. It builds upon Fuad Tabba's work that
>> enabled host-mapping for guest_memfd memory [1].
>>
>> == Background ==
>> KVM's guest-memfd memory backend currently lacks support for NUMA policy
>> enforcement, causing guest memory allocations to be distributed across host
>> nodes according to kernel's default behavior, irrespective of any policy
>> specified by the VMM. This limitation arises because conventional userspace
>> NUMA control mechanisms like mbind(2) don't work since the memory isn't
>> directly mapped to userspace when allocations occur.
>> Fuad's work [1] provides the necessary mmap capability, and this series
>> leverages it to enable mbind(2).
>>
>> == Implementation ==
>>
>> This series implements proper NUMA policy support for guest-memfd by:
>>
>> 1. Adding mempolicy-aware allocation APIs to the filemap layer.
>> 2. Introducing custom inodes (via a dedicated slab-allocated inode cache,
>>    kvm_gmem_inode_info) to store NUMA policy and metadata for guest memory.
>> 3. Implementing get/set_policy vm_ops in guest_memfd to support NUMA
>>    policy.
>>
>> With these changes, VMMs can now control guest memory placement by mapping
>> guest_memfd file descriptor and using mbind(2) to specify:
>> - Policy modes: default, bind, interleave, or preferred
>> - Host NUMA nodes: List of target nodes for memory allocation
>>
>> These Policies affect only future allocations and do not migrate existing
>> memory. This matches mbind(2)'s default behavior which affects only new
>> allocations unless overridden with MPOL_MF_MOVE/MPOL_MF_MOVE_ALL flags (Not
>> supported for guest_memfd as it is unmovable by design).
>>
>> == Upstream Plan ==
>> Phased approach as per David's guest_memfd extension overview [2] and
>> community calls [3]:
>>
>> Phase 1 (this series):
>> 1. Focuses on shared guest_memfd support (non-CoCo VMs).
>> 2. Builds on Fuad's host-mapping work.
>
> Just to clarify: this is based on Fuad's stage 1 and should probably still be
> tagged "RFC" until stage-1 is finally upstream.
>

Sure.

> (I was hoping stage-1 would go upstream in 6.17, but I am not sure yet if that is
> still feasible looking at the never-ending review)
>
> I'm surprised to see that
>
> commit cbe4134ea4bc493239786220bd69cb8a13493190
> Author: Shivank Garg <shivankg@xxxxxxx>
> Date:   Fri Jun 20 07:03:30 2025 +0000
>
>     fs: export anon_inode_make_secure_inode() and fix secretmem LSM bypass
> was merged with the kvm export
>
> EXPORT_SYMBOL_GPL_FOR_MODULES(anon_inode_make_secure_inode, "kvm");
>
> I thought I commented that this is something to done separately and not really
> "fix" material.
>
> Anyhow, good for this series, no need to touch that.
>

Yeah, V2 got merged instead of V3.
https://lore.kernel.org/all/1ab3381b-1620-485d-8e1b-fff2c48d45c3@xxxxxxx
but backporting did not give issues either.

Thank you for the reviews :)

Best Regards,
Shivank