On Fri, 11 Jul 2025 10:59:39 +0100,
"Roy, Patrick" <roypat@xxxxxxxxxxxx> wrote:
>
> Hi Fuad,
>
> On Wed, 2025-07-09 at 11:59 +0100, Fuad Tabba wrote:
> -snip-
> > +#define KVM_PGTABLE_WALK_MEMABORT_FLAGS (KVM_PGTABLE_WALK_HANDLE_FAULT | KVM_PGTABLE_WALK_SHARED)
> > +
> > +static int gmem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> > +		      struct kvm_s2_trans *nested,
> > +		      struct kvm_memory_slot *memslot, bool is_perm)
> > +{
> > +	bool write_fault, exec_fault, writable;
> > +	enum kvm_pgtable_walk_flags flags = KVM_PGTABLE_WALK_MEMABORT_FLAGS;
> > +	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
> > +	struct kvm_pgtable *pgt = vcpu->arch.hw_mmu->pgt;
> > +	struct page *page;
> > +	struct kvm *kvm = vcpu->kvm;
> > +	void *memcache;
> > +	kvm_pfn_t pfn;
> > +	gfn_t gfn;
> > +	int ret;
> > +
> > +	ret = prepare_mmu_memcache(vcpu, true, &memcache);
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (nested)
> > +		gfn = kvm_s2_trans_output(nested) >> PAGE_SHIFT;
> > +	else
> > +		gfn = fault_ipa >> PAGE_SHIFT;
> > +
> > +	write_fault = kvm_is_write_fault(vcpu);
> > +	exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
> > +
> > +	if (write_fault && exec_fault) {
> > +		kvm_err("Simultaneous write and execution fault\n");
> > +		return -EFAULT;
> > +	}
> > +
> > +	if (is_perm && !write_fault && !exec_fault) {
> > +		kvm_err("Unexpected L2 read permission error\n");
> > +		return -EFAULT;
> > +	}
> > +
> > +	ret = kvm_gmem_get_pfn(kvm, memslot, gfn, &pfn, &page, NULL);
> > +	if (ret) {
> > +		kvm_prepare_memory_fault_exit(vcpu, fault_ipa, PAGE_SIZE,
> > +					      write_fault, exec_fault, false);
> > +		return ret;
> > +	}
> > +
> > +	writable = !(memslot->flags & KVM_MEM_READONLY);
> > +
> > +	if (nested)
> > +		adjust_nested_fault_perms(nested, &prot, &writable);
> > +
> > +	if (writable)
> > +		prot |= KVM_PGTABLE_PROT_W;
> > +
> > +	if (exec_fault ||
> > +	    (cpus_have_final_cap(ARM64_HAS_CACHE_DIC) &&
> > +	     (!nested || kvm_s2_trans_executable(nested))))
> > +		prot |= KVM_PGTABLE_PROT_X;
> > +
> > +	kvm_fault_lock(kvm);
>
> Doesn't this race with gmem invalidations (e.g. fallocate(PUNCH_HOLE))?
> E.g. if between kvm_gmem_get_pfn() above and this kvm_fault_lock() a
> gmem invalidation occurs, don't we end up with stage-2 page tables
> referring to a stale host page? In user_mem_abort() there's the "grab
> mmu_invalidate_seq before dropping mmap_lock and check it hasn't
> changed after grabbing mmu_lock" which prevents this, but I don't
> really see an equivalent here.

Indeed. We have a similar construct in kvm_translate_vncr() as well,
and I'd definitely expect something of the sort 'round here. If for
some reason this is not needed, then a comment explaining why would be
welcome.

But this brings me to another interesting bit: kvm_translate_vncr() is
another path that deals with a guest translation fault (despite being
caught as an EL2 S1 fault), and calls kvm_faultin_pfn(). What happens
when the backing store is gmem? Probably nothing good.

I don't immediately see why NV and gmem should be incompatible, so
something must be done on that front too (including the return to
userspace if the page is gone).

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.