On Mon, Apr 28, 2025 at 07:53:18PM +0100, Al Viro wrote: > Up to you; propagation calculations *are* hard-serialized (on namespace_sem) > and changing that is too much pain to consider, so I have no problem with > globals in that specific case (note several such in propagate_mnt() > machinery; that was a deliberate decision to avoid shitloads of arguments > that would have to be passed around otherwise), but... OK, now I finally understand what felt fishy about either solution. Back when the checks had been IS_MNT_NEW, we were guaranteed that anything on the the slave lists of new mount would be new as well. No amount of copy_tree() could change ->mnt_master of existing mounts, so anything predating the beginning of propagate_mnt() would still have ->mnt_master pointing to old mounts - no operations other than copy_tree() had been done since we have taken namespace_sem. That's where your IS_MNT_PROPAGATED breaks. It mixes "nothing useful to be found in this direction" with "don't mount anything on this one". And these are not the same now. Suppose you have mounts A, B and C, A propagating to B, B - to C. If you made B private, propagation would go directly from A to C, and mount on A/foo would result in a copy on C/foo. Suppose you've done open_tree B with OPEN_TREE_CLONE before making B private. After open_tree your propagation graph is A -> [B <-> B'] -> C with new mount B' being in your anon_ns. Making B private leaves you with A -> B' -> C and mount on A/foo still propagates to C/foo, along with foo in your anon_ns. So far, so good, but what happens if you move_mount the root of your anon_ns to A/foo? Sure, you want to suppress copying it to foo in B', but you will end suppressing the copy on C/foo as well. propagation_next() will not visit C at all - when it reaches B', it'll see IS_MNT_PROPAGATED and refuse to look what B' might be propagating to. IOW, IS_MNT_PROPAGATED in propagate_one() is fine, but in propagation_next(), skip_propagation_subtree() and next_group() we really need IS_MNT_NEW. And the check in propagate_one() should be /* skip ones added by this propagate_mnt() */ if (IS_MNT_NEW(m)) return 0; /* skip if mountpoint is outside of subtree seen in m */ if (!is_subdir(dest_mp->m_dentry, m->mnt.mnt_root)) return 0; /* skip if m is in the anon_ns we are emptying */ if (m->mnt_ns->mntns_flags & MNTNS_PROPAGATING) return 0; That part of check is really about the validity of this particular location, not the cutoff for further propagation. IS_MNT_NEW(), OTOH, is a hard cutoff. FWIW, I would take the last remaining IS_MNT_PROPAGATED() (in propagation_would_overmount()) as discussed in this thread - with - if (propagation_would_overmount(parent_mnt_to, mnt_from, mp)) + if (check_mnt(mnt_from) && + propagation_would_overmount(parent_mnt_to, mnt_from, mp)) in can_move_mount_beneath() and lose the one in propagation_would_overmount() I'll cook something along those lines (on top of "do_move_mount(): don't leak MNTNS_PROPAGATING on failures") and if it survives overnight tests post it tomorrow^Win the morning...