Ackerley Tng <ackerleytng@xxxxxxxxxx> writes:

> Binbin Wu <binbin.wu@xxxxxxxxxxxxxxx> writes:
>
>> On 5/15/2025 7:41 AM, Ackerley Tng wrote:
>>> Track guest_memfd memory's shareability status within the inode as
>>> opposed to the file, since it is property of the guest_memfd's memory
>>> contents.
>>>
>>> Shareability is a property of the memory and is indexed using the
>>> page's index in the inode. Because shareability is the memory's
>>> property, it is stored within guest_memfd instead of within KVM, like
>>> in kvm->mem_attr_array.
>>>
>>> KVM_MEMORY_ATTRIBUTE_PRIVATE in kvm->mem_attr_array must still be
>>> retained to allow VMs to only use guest_memfd for private memory and
>>> some other memory for shared memory.
>>>
>>> Not all use cases require guest_memfd() to be shared with the host
>>> when first created. Add a new flag, GUEST_MEMFD_FLAG_INIT_PRIVATE,
>>> which when set on KVM_CREATE_GUEST_MEMFD, initializes the memory as
>>> private to the guest, and therefore not mappable by the
>>> host. Otherwise, memory is shared until explicitly converted to
>>> private.
>>>
>>> Signed-off-by: Ackerley Tng <ackerleytng@xxxxxxxxxx>
>>> Co-developed-by: Vishal Annapurve <vannapurve@xxxxxxxxxx>
>>> Signed-off-by: Vishal Annapurve <vannapurve@xxxxxxxxxx>
>>> Co-developed-by: Fuad Tabba <tabba@xxxxxxxxxx>
>>> Signed-off-by: Fuad Tabba <tabba@xxxxxxxxxx>
>>> Change-Id: If03609cbab3ad1564685c85bdba6dcbb6b240c0f
>>> ---
>>>  Documentation/virt/kvm/api.rst |   5 ++
>>>  include/uapi/linux/kvm.h       |   2 +
>>>  virt/kvm/guest_memfd.c         | 124 ++++++++++++++++++++++++++++++++-
>>>  3 files changed, 129 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>>> index 86f74ce7f12a..f609337ae1c2 100644
>>> --- a/Documentation/virt/kvm/api.rst
>>> +++ b/Documentation/virt/kvm/api.rst
>>> @@ -6408,6 +6408,11 @@ belonging to the slot via its userspace_addr.
>>>  The use of GUEST_MEMFD_FLAG_SUPPORT_SHARED will not be allowed for CoCo VMs.
>>>  This is validated when the guest_memfd instance is bound to the VM.
>>>
>>> +If the capability KVM_CAP_GMEM_CONVERSIONS is supported, then the 'flags' field
>>> +supports GUEST_MEMFD_FLAG_INIT_PRIVATE.
>>
>> It seems that the sentence is stale?
>> Didn't find the definition of KVM_CAP_GMEM_CONVERSIONS.
>>
>
> Thanks. This should read
>
> If the capability KVM_CAP_GMEM_SHARED_MEM is supported, and
> GUEST_MEMFD_FLAG_SUPPORT_SHARED is specified, then the 'flags' field
> supports GUEST_MEMFD_FLAG_INIT_PRIVATE.
>

My bad, saw your other email. Fixing the above:

If the capability KVM_CAP_GMEM_CONVERSION is supported, and
GUEST_MEMFD_FLAG_SUPPORT_SHARED is specified, then the 'flags' field
supports GUEST_MEMFD_FLAG_INIT_PRIVATE.

>>> Setting GUEST_MEMFD_FLAG_INIT_PRIVATE
>>> +will initialize the memory for the guest_memfd as guest-only and not faultable
>>> +by the host.
>>> +
>> [...]
>>>
>>>  static int kvm_gmem_init_fs_context(struct fs_context *fc)
>>> @@ -549,12 +645,26 @@ static const struct inode_operations kvm_gmem_iops = {
>>>  static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
>>>  						      loff_t size, u64 flags)
>>>  {
>>> +	struct kvm_gmem_inode_private *private;
>>>  	struct inode *inode;
>>> +	int err;
>>>
>>>  	inode = alloc_anon_secure_inode(kvm_gmem_mnt->mnt_sb, name);
>>>  	if (IS_ERR(inode))
>>>  		return inode;
>>>
>>> +	err = -ENOMEM;
>>> +	private = kzalloc(sizeof(*private), GFP_KERNEL);
>>> +	if (!private)
>>> +		goto out;
>>> +
>>> +	mt_init(&private->shareability);
>>
>> shareability is defined only when CONFIG_KVM_GMEM_SHARED_MEM enabled, should be done within CONFIG_KVM_GMEM_SHARED_MEM .
>>
>>
>
> Yes, thank you! Will also update this to only initialize shareability if
> (flags & GUEST_MEMFD_FLAG_SUPPORT_SHARED).
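
For illustration only, the gating described above could look roughly like
the userspace model below. This is a sketch, not the patch: fake_tree and
fake_tree_init() are stand-ins for struct maple_tree and mt_init(),
gmem_private_alloc() is a made-up helper, and the flag bit values are
placeholders (the series defines the real ones in include/uapi/linux/kvm.h).
The CONFIG_KVM_GMEM_SHARED_MEM guard is left out to keep the model small.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Placeholder bit values, assumed for this sketch only. */
#define GUEST_MEMFD_FLAG_SUPPORT_SHARED	(1ULL << 0)
#define GUEST_MEMFD_FLAG_INIT_PRIVATE	(1ULL << 1)

/* Hypothetical stand-in for struct maple_tree. */
struct fake_tree {
	int initialized;
};

/* Models the per-inode private data from the patch. */
struct gmem_inode_private {
	struct fake_tree shareability;
};

/* Hypothetical stand-in for mt_init(). */
static void fake_tree_init(struct fake_tree *t)
{
	t->initialized = 1;
}

/*
 * Allocate the private data and set up shareability tracking only when
 * the caller asked for shared-memory support. Returns 0 on success or
 * -ENOMEM if the allocation fails; *out is written only on success.
 */
static int gmem_private_alloc(uint64_t flags, struct gmem_inode_private **out)
{
	struct gmem_inode_private *private;

	private = calloc(1, sizeof(*private));	/* zeroed, like kzalloc() */
	if (!private)
		return -ENOMEM;

	if (flags & GUEST_MEMFD_FLAG_SUPPORT_SHARED)
		fake_tree_init(&private->shareability);

	*out = private;
	return 0;
}
```

With flags == 0 the tree is left zeroed and untouched, so any later user
of shareability would need to check the same flag before consulting it.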
>
>>> +	inode->i_mapping->i_private_data = private;
>>> +
>>> +	err = kvm_gmem_shareability_setup(private, size, flags);
>>> +	if (err)
>>> +		goto out;
>>> +
>>>  	inode->i_private = (void *)(unsigned long)flags;
>>>  	inode->i_op = &kvm_gmem_iops;
>>>  	inode->i_mapping->a_ops = &kvm_gmem_aops;
>>> @@ -566,6 +676,11 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
>>>  	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
>>>
>>>  	return inode;
>>> +
>>> +out:
>>> +	iput(inode);
>>> +
>>> +	return ERR_PTR(err);
>>>  }
>>>
>>>
>> [...]