On Thu 01-05-25 18:25:26, Lorenzo Stoakes wrote: > During the mmap() of a file-backed mapping, we invoke the underlying driver > file's mmap() callback in order to perform driver/file system > initialisation of the underlying VMA. > > This has been a source of issues in the past, including a significant > security concern relating to unwinding of error state discovered by Jann > Horn, as fixed in commit 5de195060b2e ("mm: resolve faulty mmap_region() > error path behaviour") which performed the recent, significant, rework of > mmap() as a whole. > > However, we have had a fly in the ointment remain - drivers have a great > deal of freedom in the .mmap() hook to manipulate VMA state (as well as > page table state). > > This can be problematic, as we can no longer reason sensibly about VMA > state once the call is complete (the ability to do - anything - here does > rather interfere with that). > > In addition, callers may choose to do odd or unusual things which might > interfere with subsequent steps in the mmap() process, and it may do so and > then raise an error, requiring very careful unwinding of state about which > we can make no assumptions. > > Rather than providing such an open-ended interface, this series provides an > alternative, far more restrictive one - we expose a whitelist of fields > which can be adjusted by the driver, along with immutable state upon which > the driver can make such decisions: > > struct vm_area_desc { > /* Immutable state. */ > struct mm_struct *mm; > unsigned long start; > unsigned long end; > > /* Mutable fields. Populated with initial state. */ > pgoff_t pgoff; > struct file *file; > vm_flags_t vm_flags; > pgprot_t page_prot; > > /* Write-only fields. */ > const struct vm_operations_struct *vm_ops; > void *private_data; > }; > > The mmap logic then updates the state used to either merge with a VMA or > establish a new VMA based upon this logic. > > This is achieved via new file hook .mmap_prepare(), which is, importantly, > invoked very early on in the mmap() process. > > If an error arises, we can very simply abort the operation with very little > unwinding of state required. Looks sensible. So is there a plan to transform existing .mmap hooks to .mmap_prepare hooks? I agree that for most filesystems this should be just easy 1:1 replacement and AFAIU this would be prefered? Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR