On Wed, May 14, 2025, Fuad Tabba wrote: > On Tue, 13 May 2025 at 21:31, James Houghton <jthoughton@xxxxxxxxxx> wrote: > > > > On Tue, May 13, 2025 at 9:34 AM Fuad Tabba <tabba@xxxxxxxxxx> wrote: > > > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c > > > index 8e6d1866b55e..2f499021df66 100644 > > > --- a/virt/kvm/guest_memfd.c > > > +++ b/virt/kvm/guest_memfd.c > > > @@ -556,6 +556,32 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args) > > > return __kvm_gmem_create(kvm, size, flags); > > > } > > > > > > +static bool kvm_gmem_is_same_range(struct kvm *kvm, > > > + struct kvm_memory_slot *slot, > > > + struct file *file, loff_t offset) > > > +{ > > > + struct mm_struct *mm = kvm->mm; > > > + loff_t userspace_addr_offset; > > > + struct vm_area_struct *vma; > > > + bool ret = false; > > > + > > > + mmap_read_lock(mm); > > > + > > > + vma = vma_lookup(mm, slot->userspace_addr); > > > + if (!vma) > > > + goto out; > > > + > > > + if (vma->vm_file != file) > > > + goto out; > > > + > > > + userspace_addr_offset = slot->userspace_addr - vma->vm_start; > > > + ret = userspace_addr_offset + (vma->vm_pgoff << PAGE_SHIFT) == offset; > > > +out: > > > + mmap_read_unlock(mm); > > > + > > > + return ret; > > > +} > > > + > > > int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot, > > > unsigned int fd, loff_t offset) > > > { > > > @@ -585,9 +611,14 @@ int kvm_gmem_bind(struct kvm *kvm, struct kvm_memory_slot *slot, > > > offset + size > i_size_read(inode)) > > > goto err; > > > > > > - if (kvm_gmem_supports_shared(inode) && > > > - !kvm_arch_vm_supports_gmem_shared_mem(kvm)) > > > - goto err; > > > + if (kvm_gmem_supports_shared(inode)) { > > > + if (!kvm_arch_vm_supports_gmem_shared_mem(kvm)) > > > + goto err; > > > + > > > + if (slot->userspace_addr && > > > + !kvm_gmem_is_same_range(kvm, slot, file, offset)) > > > + goto err; > > > > This is very nit-picky, but I would rather this not be -EINVAL, maybe > > -EIO instead? Or maybe a pr_warn_once() and let the call proceed? Or just omit the check entirely. The check isn't binding (ba-dump, ching!), because the mapping/VMA can change the instant mmap_read_unlock() is called. > > The userspace_addr we got isn't invalid per se, we're just trying to > > give a hint to the user that their VMAs (or the userspace address they > > gave us) are messed up. I don't really like lumping this in with truly > > invalid arguments. > > I don't mind changing the return error, but I don't think that we > should have a kernel warning (pr_warn_once) for something userspace > can trigger. This isn't a WARN, e.g. won't trip panic_on_warn. In practice, it's not meaningfully different than pr_info(). That said, I agree that printing anything is a bad approach. > It's not an IO error either. I think that this is an invalid argument > (EINVAL). I agree with James, this isn't an invalid argument. Having the validity of an input hinge on the ordering between a KVM ioctl() and mmap() is quite odd. I know KVM arm64 does exactly this for KVM_SET_USER_MEMORY_REGION{,2}, but I don't love the semantics. And unlike that scenario, where e.g. MTE tags are verified again at fault-time, KVM won't re-check the VMA when accessing guest memory via the userspace mapping, e.g. through uaccess. Unless I'm forgetting something, I'm leaning toward omitting the check entirely. > That said, other than opposing the idea of pr_warn, I am happy to change it.