On Mon, Aug 18, 2025 at 09:14:28PM +0100, Al Viro wrote:

> Alternative would be to treat these races as "act as if we'd won and
> the other guy had overmounted ours", i.e. *NOT* follow mounts.  Again,
> for old syscalls that's fine - if another thread has raced with us and
> mounted something on top of the place we want to mount on, it could just
> as easily have come *after* we'd completed mount(2) and mounted their
> stuff on top of ours.  If userland is not fine with such outcome, it needs
> to provide serialization between the callers.  For move_mount(2)... again,
> the only real question is empty to_path case.
>
> Comments?

Thinking about it a bit more...

Unfortunately, there's another corner case: "." as mountpoint.  That would
affect the old syscalls as well and I'm not sure that there's no userland
code that relies upon the current behaviour.

Background: pathname resolution does *NOT* follow mounts on the starting
point and it does not follow mounts after "."

; mkdir /tmp/foo
; mount -t tmpfs none /tmp/foo
; cd /tmp/foo
; echo under > a
; cat /tmp/foo/a
under
; mount -t tmpfs none /tmp/foo
; cat a
under
; cat /tmp/foo/a
cat: /tmp/foo/a: no such file or directory
; echo under > b
; cat b
under
; cat /tmp/foo/b
cat: /tmp/foo/b: no such file or directory
;

It's been a bad decision (if it can be called that - it's been more of
an accident, AFAICT), but it's decades too late to change it.  And
interaction with mount is also fun: mount(2) *DOES* follow mounts on the
end of any pathname, no matter what.  So when we are standing in an
overmounted directory, ls . will show the contents of that directory, but
mount <something> . will mount on top of whatever's mounted there.

So the alternative I've mentioned above would change the behaviour of
old syscalls in a corner case that just might actually be used in
userland code - including scripts run at boot time, of all things ;-/

IOW, it probably falls under "can't touch that, no matter how much we'd
like to" ;-/  Pity, that...

That leaves the question of MOVE_MOUNT_BENEATH with empty pathname - do
we want a variant that would say "slide precisely under the opened
directory I gave you, no matter what might overmount it"?

At the very least this corner case needs to be documented in
move_mount(2) - behaviour of
	move_mount(_, _, dir_fd, "", MOVE_MOUNT_T_EMPTY_PATH | MOVE_MOUNT_BENEATH)
has two a priori reasonable variants ("slide right under the top of
whatever pile there might be over dir_fd" and "slide right under dir_fd
itself, no matter what pile might be on top of that") and leaving it
unspecified is not good, IMO...
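
To make the two candidate semantics for the empty-path MOVE_MOUNT_BENEATH
case concrete, here is a toy model.  It is not kernel code and it does not
call move_mount(2).  It assumes a "pile" is a plain list of mounts stacked
on one location, bottom first, with pile[0] being the mount that dir_fd
refers to.  The function names and the labels "D", "X", "Y", "N" are
made up for illustration.

```python
# Toy model only (hypothetical names): a pile is a bottom-first list of
# mounts stacked on one location; pile[0] is the mount dir_fd refers to,
# and anything after it has been mounted on top since dir_fd was opened.

def slide_under_top(pile, new):
    """Variant 1: put `new` right under the topmost mount of the pile."""
    return pile[:-1] + [new] + pile[-1:]


def slide_under_dir_fd(pile, new):
    """Variant 2: put `new` right under the mount dir_fd refers to,
    no matter what has been stacked on top of it."""
    return [new] + pile


if __name__ == "__main__":
    overmounted = ["D", "X", "Y"]   # dir_fd -> D, later overmounted by X, Y
    print(slide_under_top(overmounted, "N"))     # N ends up just below Y
    print(slide_under_dir_fd(overmounted, "N"))  # N ends up just below D
```

With nothing stacked on top of dir_fd (a pile of just ["D"]) the two
functions return the same result in this model.  They only diverge once
something has overmounted the directory, and that is exactly the case
the documentation would need to pin down.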