Hi James,

On Fri, 16 May 2025 at 21:22, James Houghton <jthoughton@xxxxxxxxxx> wrote:
>
> On Tue, May 13, 2025 at 11:37 AM Ackerley Tng <ackerleytng@xxxxxxxxxx> wrote:
> >
> > Fuad Tabba <tabba@xxxxxxxxxx> writes:
> > > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> > > index 6db515833f61..8e6d1866b55e 100644
> > > --- a/virt/kvm/guest_memfd.c
> > > +++ b/virt/kvm/guest_memfd.c
> > > @@ -312,7 +312,88 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
> > >  	return gfn - slot->base_gfn + slot->gmem.pgoff;
> > >  }
> > >
> > > +#ifdef CONFIG_KVM_GMEM_SHARED_MEM
> > > +
> > > +static bool kvm_gmem_supports_shared(struct inode *inode)
> > > +{
> > > +	uint64_t flags = (uint64_t)inode->i_private;
> > > +
> > > +	return flags & GUEST_MEMFD_FLAG_SUPPORT_SHARED;
> > > +}
> > > +
> > > +static vm_fault_t kvm_gmem_fault_shared(struct vm_fault *vmf)
> > > +{
> > > +	struct inode *inode = file_inode(vmf->vma->vm_file);
> > > +	struct folio *folio;
> > > +	vm_fault_t ret = VM_FAULT_LOCKED;
> > > +
> > > +	filemap_invalidate_lock_shared(inode->i_mapping);
> > > +
> > > +	folio = kvm_gmem_get_folio(inode, vmf->pgoff);
> > > +	if (IS_ERR(folio)) {
> > > +		int err = PTR_ERR(folio);
> > > +
> > > +		if (err == -EAGAIN)
> > > +			ret = VM_FAULT_RETRY;
> > > +		else
> > > +			ret = vmf_error(err);
> > > +
> > > +		goto out_filemap;
> > > +	}
> > > +
> > > +	if (folio_test_hwpoison(folio)) {
> > > +		ret = VM_FAULT_HWPOISON;
> > > +		goto out_folio;
> > > +	}
>
> nit: shmem_fault() does not include an equivalent of the above
> HWPOISON check, and __do_fault() already handles HWPOISON.
>
> It's very unlikely for `folio` to be hwpoison and not up-to-date, and
> even then, writing over poison (to zero the folio) is not usually
> fatal.

No strong preference, but the fact that it's still possible (even if
unlikely) makes me lean towards keeping it.

> > > +
> > > +	if (WARN_ON_ONCE(folio_test_large(folio))) {
> > > +		ret = VM_FAULT_SIGBUS;
> > > +		goto out_folio;
> > > +	}
>
> nit: I would prefer we remove this SIGBUS bit and change the below
> clearing logic to handle large folios. Up to you I suppose.

No strong preference here either. This is meant as a way to point out
the lack of hugepage support, based on suggestions from a previous
spin of this series. (If we do drop it, a rough sketch of per-page
clearing is at the end of this mail.)

> > > +
> > > +	if (!folio_test_uptodate(folio)) {
> > > +		clear_highpage(folio_page(folio, 0));
> > > +		kvm_gmem_mark_prepared(folio);
> > > +	}
> > > +
> > > +	vmf->page = folio_file_page(folio, vmf->pgoff);
> > > +
> > > +out_folio:
> > > +	if (ret != VM_FAULT_LOCKED) {
> > > +		folio_unlock(folio);
> > > +		folio_put(folio);
> > > +	}
> > > +
> > > +out_filemap:
> > > +	filemap_invalidate_unlock_shared(inode->i_mapping);
> >
> > Do we need to hold the filemap_invalidate_lock while zeroing? Would
> > holding the folio lock be enough?
>
> Do we need to hold the filemap_invalidate_lock for reading *at all*?
>
> I don't see why we need it. We're not checking gmem->bindings, and
> filemap_grab_folio() already synchronizes with filemap removal
> properly.

Ack. Thanks!
/fuad

> >
> > > +
> > > +	return ret;
> > > +}
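
P.S. For the large-folio clearing mentioned above, I imagine something
roughly like the following (untested sketch only; it keeps
kvm_gmem_mark_prepared() from the patch, and folio_zero_range() could
be an alternative to the open-coded loop):

	if (!folio_test_uptodate(folio)) {
		unsigned long i, nr = folio_nr_pages(folio);

		/*
		 * Zero every page of the (possibly large) folio before
		 * marking it prepared, instead of rejecting large folios
		 * with SIGBUS.
		 */
		for (i = 0; i < nr; i++)
			clear_highpage(folio_page(folio, i));
		kvm_gmem_mark_prepared(folio);
	}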