Re: [PATCH v7 4/9] KVM: guest_memfd: Handle in-place shared memory as guest_memfd backed memory

On 14.04.25 18:03, Fuad Tabba wrote:
Hi David,

On Mon, 14 Apr 2025 at 12:51, David Hildenbrand <david@xxxxxxxxxx> wrote:

On 18.03.25 17:18, Fuad Tabba wrote:
For VMs that allow sharing guest_memfd backed memory in-place,
handle that memory the same as "private" guest_memfd memory. This
means that faulting that memory in the host or in the guest will
go through the guest_memfd subsystem.

Note that the word "private" in the name of the function
kvm_mem_is_private() doesn't necessarily indicate that the memory
isn't shared, but is due to the history and evolution of
guest_memfd and the various names it has received. In effect,
this function is used to multiplex between the path of a normal
page fault and the path of a guest_memfd backed page fault.

Signed-off-by: Fuad Tabba <tabba@xxxxxxxxxx>
---
   include/linux/kvm_host.h | 3 ++-
   1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 601bbcaa5e41..3d5595a71a2a 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2521,7 +2521,8 @@ static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
   #else
   static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
   {
-     return false;
+     return kvm_arch_gmem_supports_shared_mem(kvm) &&
+            kvm_slot_can_be_private(gfn_to_memslot(kvm, gfn));
   }
   #endif /* CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES */


I've been thinking long about this, and was wondering if we should instead
clean up the code to decouple the "private" from gmem handling first.

I know, this was already discussed a couple of times, but faking that
shared memory is private looks odd.

I agree. I've been wanting to do that as part of a separate series,
since renaming discussions sometimes tend to take a disproportionate
amount of time. But the confusion that the current naming (and
overloading of terms) is causing is probably worse.

Exactly my thoughts. The cleanup diff I was able to come up with is not too crazy, so it feels feasible to just include the cleanups as a preparation for mmap() where we introduce the concept of shared memory in guest_memfd.



I played with the code to start cleaning this up. I ended up with the following
gmem-terminology cleanup patches (not even compile-tested):

KVM: rename CONFIG_KVM_GENERIC_PRIVATE_MEM to CONFIG_KVM_GENERIC_GMEM_POPULATE
KVM: rename CONFIG_KVM_PRIVATE_MEM to CONFIG_KVM_GMEM
KVM: rename kvm_arch_has_private_mem() to kvm_arch_supports_gmem()
KVM: x86: rename kvm->arch.has_private_mem to kvm->arch.supports_gmem
KVM: rename kvm_slot_can_be_private() to kvm_slot_has_gmem()
KVM: x86: generalize private fault lookups to "gmem" fault lookups

https://github.com/davidhildenbrand/linux/tree/gmem_shared_prep

On top of that, I was wondering if we could look into doing something like
the following. It would also allow for pulling pages out of gmem for
existing SW-protected VMs once they enable shared memory for GMEM IIUC.


diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 08eebd24a0e18..6f878cab0f466 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4495,11 +4495,6 @@ static int kvm_mmu_faultin_pfn_gmem(struct kvm_vcpu *vcpu,
   {
          int max_order, r;

-       if (!kvm_slot_has_gmem(fault->slot)) {
-               kvm_mmu_prepare_memory_fault_exit(vcpu, fault);
-               return -EFAULT;
-       }
-
          r = kvm_gmem_get_pfn(vcpu->kvm, fault->slot, fault->gfn, &fault->pfn,
                               &fault->refcounted_page, &max_order);
          if (r) {
@@ -4518,8 +4513,19 @@ static int __kvm_mmu_faultin_pfn(struct kvm_vcpu *vcpu,
                                   struct kvm_page_fault *fault)
   {
          unsigned int foll = fault->write ? FOLL_WRITE : 0;
+       bool use_gmem = false;
+
+       if (fault->is_private) {
+               if (!kvm_slot_has_gmem(fault->slot)) {
+                       kvm_mmu_prepare_memory_fault_exit(vcpu, fault);
+                       return -EFAULT;
+               }
+               use_gmem = true;
+       } else if (kvm_slot_has_gmem_with_shared(fault->slot)) {
+               use_gmem = true;
+       }

-       if (fault->is_private)
+       if (use_gmem)
                  return kvm_mmu_faultin_pfn_gmem(vcpu, fault);

          foll |= FOLL_NOWAIT;


That is, we'd not claim that things are private when they are not, but instead
teach the code about shared memory coming from gmem.

There might be more pieces missing; I'm just throwing it out there in case I am completely off.

All in all, these changes seem reasonable to me. I might want to
suggest a couple of modifications, but I guess the bigger question is
what the KVM maintainers and guest_memfd's main contributors think.

I'm afraid we won't get a reply before we officially send it ...


Also, how do you suggest we go about this? Send out a separate series
first, before continuing with the mapping series? Or have it all as
one big series? It could be something to add to the agenda for
Thursday.

... and ideally it would be part of this series. After all, this series shrunk a bit :)

Feel free to use my commits when helpful: they are still missing descriptions and probably have other issues. Feel free to turn my SOB into a Co-developed-by+SOB and make yourself the author.

Alternatively, let me know and I can polish them up and we can discuss what you have in mind (either here or elsewhere).

I'd suggest we go full-steam on this series to finally get it over the finish line :)

--
Cheers,

David / dhildenb
