On Mon, Sep 08, 2025 at 07:36:59PM +0200, David Hildenbrand wrote:
> On 08.09.25 17:56, Jason Gunthorpe wrote:
> > On Mon, Sep 08, 2025 at 05:50:18PM +0200, David Hildenbrand wrote:
> >
> > > So in practice there is indeed not a big difference between a private and
> > > cow mapping.
> >
> > Right and most drivers just check SHARED.
> >
> > But if we are being documentative why they check shared is because the
> > driver cannot tolerate COW.
> >
> > I think if someone is cargo culting a diver and sees
> > 'vma_never_cowable' they will have a better understanding of the
> > driver side issues.
> >
> > Driver's don't actually care about private vs shared, except this
> > indirectly implies something about cow.
>
> I recall some corner cases, but yes, most drivers don't clear MAP_MAYWRITE so
> is_cow_mapping() would just rule out what they wanted to rule out (no anon
> pages / cow semantics).
>
> FWIW, I recalled some VM_MAYWRITE magic in memfd, but it's really just for
> !cow mappings, so the following should likely work:

I was involved in these dark arts :)

Since we gate the check_write_seal() function (which is the one that removes
VM_MAYWRITE) on the mapping being shared, we obviously can't have removed
VM_MAYWRITE from a private mapping in the first place.

The only other way VM_MAYWRITE could be got rid of is if it is already a
MAP_SHARED or MAP_SHARED_VALIDATE mapping without write permission, and then
it'd fail this check anyway.

So I think the below patch is fine!

>
> diff --git a/mm/memfd.c b/mm/memfd.c
> index 1de610e9f2ea2..2a3aa26444bbb 100644
> --- a/mm/memfd.c
> +++ b/mm/memfd.c
> @@ -346,14 +346,11 @@ static int check_write_seal(vm_flags_t *vm_flags_ptr)
>  	vm_flags_t vm_flags = *vm_flags_ptr;
>  	vm_flags_t mask = vm_flags & (VM_SHARED | VM_WRITE);
>
> -	/* If a private mapping then writability is irrelevant. */
> -	if (!(mask & VM_SHARED))
> +	/* If a CoW mapping then writability is irrelevant. */
> +	if (is_cow_mapping(vm_flags))
>  		return 0;
>
> -	/*
> -	 * New PROT_WRITE and MAP_SHARED mmaps are not allowed when
> -	 * write seals are active.
> -	 */
> +	/* New PROT_WRITE mappings are not allowed when write-sealed. */
>  	if (mask & VM_WRITE)
>  		return -EPERM;
>
>
> --
> Cheers
>
> David / dhildenb
>

Cheers, Lorenzo
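
The equivalence argued above can be checked with a small userspace model. This
is only a sketch under stated assumptions: the flag values are copied from
include/linux/mm.h so that it builds outside the kernel,
check_write_seal_old() and check_write_seal_new() are simplified stand-ins for
the two versions in the patch, the final VM_MAYWRITE clear is modelled on the
description above rather than on lines shown in the diff, and checks_agree()
is a hypothetical helper.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumption: values mirror include/linux/mm.h, copied for userspace use. */
#define VM_WRITE    0x00000002UL
#define VM_SHARED   0x00000008UL
#define VM_MAYWRITE 0x00000020UL

typedef unsigned long vm_flags_t;

/* A mapping is CoW if it is private and may become writable. */
static bool is_cow_mapping(vm_flags_t flags)
{
	return (flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/* Model of the pre-patch check: bail out early on !VM_SHARED. */
static int check_write_seal_old(vm_flags_t *vm_flags_ptr)
{
	vm_flags_t mask = *vm_flags_ptr & (VM_SHARED | VM_WRITE);

	if (!(mask & VM_SHARED))
		return 0;
	if (mask & VM_WRITE)
		return -EPERM;
	/* Read-only shared mapping: forbid a later upgrade to writable. */
	*vm_flags_ptr &= ~VM_MAYWRITE;
	return 0;
}

/* Model of the post-patch check: bail out early on is_cow_mapping(). */
static int check_write_seal_new(vm_flags_t *vm_flags_ptr)
{
	vm_flags_t vm_flags = *vm_flags_ptr;
	vm_flags_t mask = vm_flags & (VM_SHARED | VM_WRITE);

	if (is_cow_mapping(vm_flags))
		return 0;
	if (mask & VM_WRITE)
		return -EPERM;
	*vm_flags_ptr &= ~VM_MAYWRITE;
	return 0;
}

/* Hypothetical helper: same return value and same resulting flags? */
static bool checks_agree(vm_flags_t flags)
{
	vm_flags_t a = flags, b = flags;
	int ra = check_write_seal_old(&a);
	int rb = check_write_seal_new(&b);

	return ra == rb && a == b;
}
```

In this model the two versions agree for every flag combination in which
VM_WRITE implies VM_MAYWRITE. They differ only for a private mapping that has
VM_WRITE set with VM_MAYWRITE clear, which the reasoning above assumes cannot
occur.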