Re: [PATCH v10 13/16] KVM: arm64: Handle guest_memfd-backed guest page faults

David Hildenbrand <david@xxxxxxxxxx> · Wed, 4 Jun 2025 15:33:53 +0200

On 04.06.25 15:30, Fuad Tabba wrote:
Hi David

On Wed, 4 Jun 2025 at 14:17, David Hildenbrand <david@xxxxxxxxxx> wrote:

On 27.05.25 20:02, Fuad Tabba wrote:
Add arm64 support for handling guest page faults on guest_memfd backed
memslots. Until guest_memfd supports huge pages, the fault granule is
restricted to PAGE_SIZE.

Signed-off-by: Fuad Tabba <tabba@xxxxxxxxxx>

---

Note: This patch introduces a new function, gmem_abort() rather than
previous attempts at trying to expand user_mem_abort(). This is because
there are many differences in how faults are handled when backed by
guest_memfd vs regular memslots with anonymous memory, e.g., lack of
VMA, and for now, lack of huge page support for guest_memfd. The
function user_mem_abort() is already big and unwieldly, adding more
complexity to it made things more difficult to understand.

Once larger page size support is added to guest_memfd, we could factor
out the common code between these two functions.

---
   arch/arm64/kvm/mmu.c | 89 +++++++++++++++++++++++++++++++++++++++++++-
   1 file changed, 87 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 9865ada04a81..896c56683d88 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1466,6 +1466,87 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
       return vma->vm_flags & VM_MTE_ALLOWED;
   }

+static int gmem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
+                       struct kvm_memory_slot *memslot, bool is_perm)

TBH, I have no idea why the existing function is called "_abort". I am
sure there is a good reason :)


The reason is ARM. They're called "memory aborts", see D8.15 Memory
aborts in the ARM ARM:

https://developer.arm.com/documentation/ddi0487/latest/

Warning: PDF is 100mb+ with almost 15k pages :)

+{
+     enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_HANDLE_FAULT | KVM_PGTABLE_WALK_SHARED;
+     enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
+     bool logging, write_fault, exec_fault, writable;
+     struct kvm_pgtable *pgt;
+     struct page *page;
+     struct kvm *kvm;
+     void *memcache;
+     kvm_pfn_t pfn;
+     gfn_t gfn;
+     int ret;
+
+     if (!is_perm) {
+             int min_pages = kvm_mmu_cache_min_pages(vcpu->arch.hw_mmu);
+
+             if (!is_protected_kvm_enabled()) {
+                     memcache = &vcpu->arch.mmu_page_cache;
+                     ret = kvm_mmu_topup_memory_cache(memcache, min_pages);
+             } else {
+                     memcache = &vcpu->arch.pkvm_memcache;
+                     ret = topup_hyp_memcache(memcache, min_pages);
+             }
+             if (ret)
+                     return ret;
+     }
+
+     kvm = vcpu->kvm;
+     gfn = fault_ipa >> PAGE_SHIFT;

These two can be initialized directly above.


I was trying to go with reverse christmas tree order of declarations,
but I'll do that.

Can still do that, no? vcpu and fault_ipa are input parameters, so no 
dependency between them.


+
+     logging = memslot_is_logging(memslot);
+     write_fault = kvm_is_write_fault(vcpu);
+     exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
  > +    VM_BUG_ON(write_fault && exec_fault);

No VM_BUG_ON please.

VM_WARN_ON_ONCE() maybe. Or just handle it along the "Unexpected L2 read
permission error" below cleanly.

I'm following the same pattern as the existing user_mem_abort(), but
I'll change it.

Yeah, there are a lot of BUG_ON thingies that should be reworked.

--
Cheers,

David / dhildenb