Before the change:
./run-proc-vs-map.sh --nsamples 100 --rawdata -- --busyduration 2
    0.011     0.008     0.521
    0.011     0.008     0.552
    0.011     0.008     0.590
    0.011     0.008     0.660
    ...
    0.011     0.015     2.987
    0.011     0.015     3.038
    0.011     0.016     3.431
    0.011     0.016     4.707

After the change:
./run-proc-vs-map.sh --nsamples 100 --rawdata -- --busyduration 2
    0.006     0.005     0.026
    0.006     0.005     0.029
    0.006     0.005     0.034
    0.006     0.005     0.035
    ...
    0.006     0.006     0.421
    0.006     0.006     0.423
    0.006     0.006     0.439
    0.006     0.006     0.608

The patchset also adds a number of tests to check for /proc/pid/maps data
coherency. They are designed to detect any unexpected data tearing while
performing some common address space modifications (vma split, resize and
remap). Even before these changes, reading /proc/pid/maps could return
inconsistent data because the file is read page-by-page with mmap_lock
being dropped between the pages. One user-visible inconsistency is the
same vma being printed twice: once before it was modified and once after
the modification. For example, if a vma was extended, it might be found
and reported twice. What is not expected is to see a gap where there
should have been a vma both before and after the modification. This
patchset increases the chances of such tearing, therefore it's even more
important now to test for unexpected inconsistencies.

In [3] Lorenzo identified the following possible vma merging/splitting
scenarios:

Merges with changes to existing vmas:
1. Merge both - mapping a vma over another one and between two vmas which
can be merged after this replacement;
2. Merge left full - mapping a vma at the end of an existing one and
completely over its right neighbor;
3. Merge left partial - mapping a vma at the end of an existing one and
partially over its right neighbor;
4. Merge right full - mapping a vma before the start of an existing one
and completely over its left neighbor;
5. Merge right partial - mapping a vma before the start of an existing one
and partially over its left neighbor;

Merges without changes to existing vmas:
6. Merge both - mapping a vma into a gap between two vmas which can be
merged after the insertion;
7. Merge left - mapping a vma at the end of an existing one;
8. Merge right - mapping a vma before the start of an existing one;

Splits:
9. Split with new vma at the lower address;
10. Split with new vma at the higher address;

If such merges or splits happen concurrently with the /proc/pid/maps
reading, we might report a vma twice, once before the modification and
once after it is modified:

Case 1 might report the overwritten and previous vma along with the final
merged vma;
Case 2 might report the previous and the final merged vma;
Case 3 might cause us to retry once we detect the temporary gap caused by
shrinking of the right neighbor;
Case 4 might report the overwritten and the final merged vma;
Case 5 might cause us to retry once we detect the temporary gap caused by
shrinking of the left neighbor;
Case 6 might report the previous vma and the gap along with the final
merged vma;
Case 7 might report the previous and the final merged vma;
Case 8 might report the original gap and the final merged vma covering the
gap;
Case 9 might cause us to retry once we detect the temporary gap caused by
shrinking of the original vma at the vma start;
Case 10 might cause us to retry once we detect the temporary gap caused by
shrinking of the original vma at the vma end;

In all these cases the retry mechanism prevents us from reporting possible
temporary gaps.
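
To make the distinction between tolerated and unexpected tearing concrete,
here is a minimal userspace sketch of the kind of check a reader of
/proc/pid/maps could apply. It is not taken from proc-maps-race.c; the
names struct range, parse_maps_line() and region_covered() are
hypothetical, and the sketch assumes the ranges are supplied in the order
they were read. A vma reported twice (overlapping or repeated ranges) is
accepted, while a hole inside a region that was mapped both before and
after the modification is flagged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* One "start-end" address range as printed at the start of a maps line. */
struct range {
	unsigned long start;
	unsigned long end;	/* exclusive */
};

/* Parse the leading "start-end" field; returns false on malformed input. */
static bool parse_maps_line(const char *line, struct range *r)
{
	return sscanf(line, "%lx-%lx", &r->start, &r->end) == 2 &&
	       r->start < r->end;
}

/*
 * Return true if every address in [lo, hi) is covered by the reported
 * ranges. Assumption: r[] is in the order the lines were read. Repeated
 * or overlapping ranges (a vma seen before and after a modification) are
 * tolerated; a range starting past the covered prefix means a gap.
 */
static bool region_covered(const struct range *r, size_t n,
			   unsigned long lo, unsigned long hi)
{
	unsigned long covered_to = lo;

	assert(lo < hi);
	for (size_t i = 0; i < n && covered_to < hi; i++) {
		if (r[i].end <= covered_to)
			continue;	/* adds nothing new: stale or duplicate */
		if (r[i].start > covered_to)
			return false;	/* hole between covered_to and start */
		covered_to = r[i].end;
	}
	return covered_to >= hi;
}
```

With this check, a read that reports 0x1000-0x2000 followed by
0x1000-0x3000 (a vma seen before and after being extended) passes for the
region 0x1000-0x3000, whereas 0x1000-0x2000 followed by 0x3000-0x4000
fails for the region 0x1000-0x4000.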

Changes since v6 [4]:
- Updated patch 7/8 changelog, per Lorenzo Stoakes
- Added comments, per Lorenzo Stoakes
- Added Reviewed-by, per Lorenzo Stoakes and Liam Howlett
- Replaced iter with vmi, per Lorenzo Stoakes
- Renamed from lock_vma_under_mmap_lock() to
lock_next_vma_under_mmap_lock(), per Lorenzo Stoakes
- Renamed lock_next_vma() parameter from addr to from_addr
- Renamed labels in lock_next_vma() to reflect fallback cases,
per Lorenzo Stoakes
- Handle vma_start_read_locked() failure inside
lock_next_vma_under_mmap_lock() and added fallback_to_mmap_lock() for
that, per Vlastimil Babka
- Added missing vma_iter_init() after re-entering rcu read section inside
lock_next_vma(), per Vlastimil Babka
- Replaced vma_iter_init() with vma_iter_set(), per Liam Howlett
- Removed the last patch converting PROCMAP_QUERY to use per-vma locks.
That patch will be posted separately, per David Hildenbrand,
Vlastimil Babka and Liam Howlett
- Updated performance numbers, per Paul E. McKenney

!!! NOTES FOR APPLYING THE PATCHSET !!!

Applies cleanly over mm-unstable after reverting v6 version of this
patchset (from 2771a4b86aa1 to a20b00f7cf33 in mm-unstable).

[1] https://lore.kernel.org/all/20250418174959.1431962-1-surenb@xxxxxxxxxx/
[2] https://github.com/paulmckrcu/proc-mmap_sem-test
[3] https://lore.kernel.org/all/e1863f40-39ab-4e5b-984a-c48765ffde1c@lucifer.local/
[4] https://lore.kernel.org/all/20250704060727.724817-1-surenb@xxxxxxxxxx/

Suren Baghdasaryan (7):
  selftests/proc: add /proc/pid/maps tearing from vma split test
  selftests/proc: extend /proc/pid/maps tearing test to include vma resizing
  selftests/proc: extend /proc/pid/maps tearing test to include vma remapping
  selftests/proc: test PROCMAP_QUERY ioctl while vma is concurrently modified
  selftests/proc: add verbose more for tests to facilitate debugging
  fs/proc/task_mmu: remove conversion of seq_file position to unsigned
  fs/proc/task_mmu: read proc/pid/maps under per-vma lock

 fs/proc/internal.h                            |   5 +
 fs/proc/task_mmu.c                            | 155 +++-
 include/linux/mmap_lock.h                     |  11 +
 mm/madvise.c                                  |   3 +-
 mm/mmap_lock.c                                |  93 ++
 tools/testing/selftests/proc/.gitignore       |   1 +
 tools/testing/selftests/proc/Makefile         |   1 +
 tools/testing/selftests/proc/proc-maps-race.c | 829 ++++++++++++++++++
 8 files changed, 1082 insertions(+), 16 deletions(-)
 create mode 100644 tools/testing/selftests/proc/proc-maps-race.c

--
2.50.0.727.gbf7dc18ff4-goog