On 2025/5/19 16:51, Lorenzo Stoakes wrote:
If a user wishes to enable KSM mergeability for an entire process and all fork/exec'd processes that come after it, they use the prctl() PR_SET_MEMORY_MERGE operation. This defaults all newly mapped VMAs to have the VM_MERGEABLE VMA flag set (in order to indicate they are KSM mergeable), as well as setting this flag for all existing VMAs. However it also entirely and completely breaks VMA merging for the process and all forked (and fork/exec'd) processes. This is because when a new mapping is proposed, the flags specified will never have VM_MERGEABLE set. However all adjacent VMAs will already have VM_MERGEABLE set, rendering VMAs unmergeable by default.
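To make the failure mode concrete, here is a minimal userspace sketch of the flag comparison described above. It is a model under stated assumptions, not the kernel's merge code: the helper name flags_allow_merge() is hypothetical, the flag values are illustrative, and the only rule modelled is "flags must match, ignoring soft-dirty".

```c
#include <stdbool.h>

/* Illustrative flag values for this userspace model only. */
#define VM_READ       0x00000001UL
#define VM_WRITE      0x00000002UL
#define VM_SOFTDIRTY  0x08000000UL
#define VM_MERGEABLE  0x80000000UL

/*
 * Hypothetical helper: a proposed mapping is flag-compatible with an
 * adjacent VMA only if their flags are identical, ignoring soft-dirty.
 */
bool flags_allow_merge(unsigned long existing, unsigned long proposed)
{
    return ((existing ^ proposed) & ~VM_SOFTDIRTY) == 0;
}
```

With this model, an existing VMA carrying VM_MERGEABLE never matches a proposed mapping whose flags lack it, which is the situation after PR_SET_MEMORY_MERGE. Setting VM_MERGEABLE on the proposed flags before the comparison makes the two compatible again.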
Great catch! I'm wondering whether we could instead fix vma_merge_new_range() so that it can merge in this case?
To work around this, we try to set the VM_MERGEABLE flag prior to attempting a merge. In the case of brk() this can always be done. However on mmap() things are more complicated - while KSM is not supported for MAP_SHARED file-backed mappings, it is supported for MAP_PRIVATE file-backed mappings.
So we wouldn't need to set the VM_MERGEABLE flag so early, given that the corner cases you describe below would otherwise need to be considered.
And these mappings may have deprecated .mmap() callbacks specified which could, in theory, adjust flags and thus KSM merge eligibility. So we check to determine whether this is at all possible. If not, we set VM_MERGEABLE prior to the merge attempt on mmap(), otherwise we retain the previous behaviour. When .mmap_prepare() is more widely used, we can remove this precaution.
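The decision described above can be sketched roughly as follows. This is a simplified userspace model, not the patch itself: struct file_model, enum map_origin and can_set_mergeable_early() are hypothetical names, and the separate check of whether the mapping's flags are KSM-eligible at all is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the backing file; not a kernel structure. */
struct file_model {
    bool has_legacy_mmap_hook;  /* a deprecated .mmap() callback is set */
};

enum map_origin { ORIGIN_BRK, ORIGIN_MMAP };

/*
 * Hypothetical helper: may VM_MERGEABLE be set before the merge attempt?
 * merge_any models the process having opted in via PR_SET_MEMORY_MERGE.
 */
bool can_set_mergeable_early(enum map_origin origin,
                             const struct file_model *file,
                             bool merge_any)
{
    if (!merge_any)
        return false;           /* process did not opt in */
    if (origin == ORIGIN_BRK)
        return true;            /* brk(): always safe */
    if (file == NULL)
        return true;            /* anonymous mmap(): no callback involved */
    /* File-backed: only safe if no legacy hook could adjust the flags. */
    return !file->has_legacy_mmap_hook;
}
```

When the helper returns false, the model corresponds to retaining the previous behaviour, where the flag is only applied after the mapping has been set up.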
Sounds good too.
While this doesn't quite cover all cases, it covers a great many (all anonymous memory, for instance), meaning we should already see a significant improvement in VMA mergeability.
Agreed, it's a very good improvement. Thanks!