reproducer: ------------------------------------------------------------ # create a playground mkdir /tmp/foo mount -t tmpfs none /tmp/foo mount --make-private /tmp/foo cd /tmp/foo # set one-way propagation from A to B mkdir A mkdir B mount -t tmpfs none A mount --make-shared A mount --bind A B mount --make-slave B # A/1 -> B/1, A/1/2 -> B/1/2 mkdir A/1 mount -t tmpfs none A/1 mkdir A/1/2 mount -t tmpfs none A/1/2 # overmount the entire B/1/2 mount -t tmpfs none B/1/2 # make sure it's busy - set a mount at B/1/2/x mkdir B/1/2/x mount -t tmpfs none B/1/2/x stat B/1/x # shouldn't exist umount -l A/1 stat B/1/x # ... and now it does ------------------------------------------------------------ What happens is that mounts on B/1 and B/1/2 had been considered as victims - and taken out, since the overmount on top of B/1/2 overmounted the root of the first mount on B/1/2 and it got reparented - all the way to B/1. Correct behaviour would be to have B/1 left in place and upper B/1/2 to be reparented once. As an aside, that's a catch from several days of attempts to prove correctness of propagate_umount(); I'm still not sure there's nothing else wrong with it. _Maybe_ it's the only problem in there, but reconstructing the proof of correctness has turned out to be a real bitch ;-/ I seriously suspect that a lot of headache comes from trying to combine collecting the full set of potential victims with deciding what can and what can not be taken out - gathering all of them first would simplify things. First pass would've logics similar to your variant, but without __propagate_umount() part[*] After the set is collected, we could go through it, doing the something along the lines of how = 0 for each child in children(m) if child in set continue how = 1 if child is not mounted on root how = 2 break if how == 2 kick_out_ancestors(m) remove m itself from set // needs to cooperate with outer loop else if how == 1 for (p = m; p in set && p is mounted on root; p = p->mnt_parent) ; if p in set kick_out_ancestors(p) else if children(m) is empty && m is not locked // to optimize things a bit commit to unmounting m (again, needs to cooperate with the outer loop) "Cooperate with the outer loop" might mean something like having this per-element work leave removal of its argument to caller and report whether its argument needs to be removed. After that we'd be left with everything still in the set having no out-of-set children that would be obstacles. The only thing that remains after that is MNT_LOCKED and that's as simple as while set is not empty m = first element of set for (p = m; p is locked && p in set; p = p->mnt_parent) ; if p not in set { if p is not committed to unmount remove everything from m to p from set continue } else { p = p->mnt_parent } commit everything from m to p to unmount, removing from set I'll try to put something of that sort together, along with detailed explanation of what it's doing - in D/f/*, rather than buring it in commit messages, and along with "read and update D/f/... if you are ever touch this function" in fs/pnode.c itself; this fun is not something I would like to repeat several years down the road ;-/ We *REALLY* need a good set of regression tests for that stuff. If you have anything along those lines sitting somewhere, please post a reference. The same goes for everybody else who might have something in that general area. [*] well, that and with fixed skip_propagation_subtree() logics; it's easier to combine it with propagation_next() rather than trying to set the things up so that the next call of propagation_next() would DTRT - it's less work and yours actually has a corner case if the last element of ->mnt_slave_list has non-empty ->mnt_slave_list itself.