On Mon, May 19, 2025 at 09:11:10PM +0200, David Hildenbrand wrote: > On 19.05.25 21:02, Lorenzo Stoakes wrote: > > On Mon, May 19, 2025 at 08:04:22PM +0200, David Hildenbrand wrote: > > > > > > > > +/* > > > > > + * Are we guaranteed no driver can change state such as to preclude KSM merging? > > > > > + * If so, let's set the KSM mergeable flag early so we don't break VMA merging. > > > > > + * > > > > > + * This is applicable when PR_SET_MEMORY_MERGE has been set on the mm_struct via > > > > > + * prctl() causing newly mapped VMAs to have the KSM mergeable VMA flag set. > > > > > + * > > > > > + * If this is not the case, then we set the flag after considering mergeability, > > > > > + * which will prevent mergeability as, when PR_SET_MEMORY_MERGE is set, a new > > > > > + * VMA will not have the KSM mergeability VMA flag set, but all other VMAs will, > > > > > + * preventing any merge. > > > > > > > > Hmmm, so an ordinary MAP_PRIVATE of any file (executable etc.) will get > > > > VM_MERGEABLE set but not be able to merge? > > > > > > > > Probably these are not often expected to be merged ... > > > > > > > > Preventing merging should really only happen because of VMA flags that > > > > are getting set: VM_PFNMAP, VM_MIXEDMAP, VM_DONTEXPAND, VM_IO. > > > > > > > > > > > > I am not 100% sure why we bail out on special mappings: all we have to > > > > do is reliably identify anon pages, and we should be able to do that. > > > > > > > > GUP does currently refuses any VM_PFNMAP | VM_IO, and KSM uses GUP, > > > > which might need a tweak then (maybe the solution could be to ... not > > > > use GUP but a folio_walk). > > > > > > Oh, someone called "David" already did that. Nice :) > > > > > > So we *should* be able to drop > > > > > > * VM_PFNMAP: we correctly identify CoWed pages > > > * VM_MIXEDMAP: we correctly identify CoWed pages > > > * VM_IO: should not affect CoWed pages > > > * VM_DONTEXPAND: no idea why that should even matter here > > > > I objected in the other thread but now realise I forgot we're talking about > > MAP_PRIVATE... So we can do the CoW etc. Right. > > > > Then we just need to be able to copy the thing on CoW... but what about > > write-through etc. cache settings? I suppose we don't care once CoW'd... > > Yes. It's ordinary kernel-managed memory. Yeah, it's the CoW'd bit right? So it's fine. > > > > > But is this common enough of a use case to be worth the hassle of checking this > > is all ok? > > The reason I bring it up is because > > 1) Just because some drivers do weird mmap() things, we cannot merge any > MAP_PRIVATE file mappings (except shmem ;) and mmap_prepare). > > 2) The whole "early_ksm" checks/handling would go away, making this patch > significantly simpler :) OK you're starting to convince me... Maybe this isn't such a big deal if the KSM code handles it already anwyay. If you're sure GUP isn't relying on this... it could be an additional patch like: 'remove VM_SPECIAL limitation, KSM can already handle this' And probably we _should_ let any insane driver blow up if they change stupid things they must not change. > > -- > Cheers, > > David / dhildenb >