On Mon, Aug 18, 2025 at 09:56:06PM +0100, Al Viro wrote:
> On Mon, Aug 18, 2025 at 09:14:28PM +0100, Al Viro wrote:
>
> > Alternative would be to treat these races as "act as if we'd won and
> > the other guy had overmounted ours", i.e. *NOT* follow mounts. Again,
> > for old syscalls that's fine - if another thread has raced with us and
> > mounted something on top of the place we want to mount on, it could just
> > as easily have come *after* we'd completed mount(2) and mounted their
> > stuff on top of ours. If userland is not fine with such outcome, it needs
> > to provide serialization between the callers. For move_mount(2)... again,
> > the only real question is empty to_path case.
> >
> > Comments?
>
> Thinking about it a bit more... Unfortunately, there's another corner
> case: "." as mountpoint. That would affect that old syscalls as well
> and I'm not sure that there's no userland code that relies upon the
> current behaviour.
>
> Background: pathname resolution does *NOT* follow mounts on the starting
> point and it does not follow mounts after "."
>
> ; mkdir /tmp/foo
> ; mount -t tmpfs none /tmp/foo
> ; cd /tmp/foo
> ; echo under > a
> ; cat /tmp/foo/a
> under
> ; mount -t tmpfs none /tmp/foo
> ; cat a
> under
> ; cat /tmp/foo/a
> cat: /tmp/foo/a: no such file or directory
> ; echo under > b
> ; cat b
> under
> ; cat /tmp/foo/b
> cat: /tmp/foo/b: no such file or directory
> ;
>
> It's been a bad decision (if it can be called that - it's been more
> of an accident, AFAICT), but it's decades too late to change it.
> And interaction with mount is also fun: mount(2) *DOES* follow mounts
> on the end of any pathname, no matter what. So in case when we are
> standing in an overmounted directory, ls . will show the contents of
> that directory, but mount <something> . will mount on top of whatever's
> mounted there.
>
> So the alternative I've mentioned above would change the behaviour of
> old syscalls in a corner case that just might be actually used in userland
> code - including the scripts run at the boot time, of all things ;-/
>
> IOW, it probably falls under "can't touch that, no matter how much we'd
> like to" ;-/ Pity, that...
>
> That leaves the question of MOVE_MOUNT_BENEATH with empty pathname -
> do we want a variant that would say "slide precisely under the opened
> directory I gave you, no matter what might overmount it"?

AFAICT, right now MOVE_MOUNT_BENEATH takes the overmount into account even
for ".", just like mount(2) looks up the topmost mount no matter what.
That is what userspace expects. I don't think we need a variant where "."
ignores overmounts for MOVE_MOUNT_BENEATH, and certainly not unless someone
has a specific use-case for it. If it comes to that we should probably add
a new flag.

>
> At the very least this corner case needs to be documented in move_mount(2)
> - behaviour of
> 	move_mount(_, _, dir_fd, "",
> 		   MOVE_MOUNT_T_EMPTY | MOVE_MOUNT_BENEATH)
> has two apriori reasonable variants ("slide right under the top of
> whatever pile there might be over dir_fd" and "slide right under dir_fd

Yes, that's what's intended and documented. It's also what I wrote in my
commit messages and what the selftests should test for. I specifically did
not make it deviate from standard mount(2) behavior.

> itself, no matter what pile might be on top of that") and leaving it
> unspecified is not good, IMO...

Sure, Aleksa can pull that into his documentation patches.
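
For reference, the empty-path call discussed above could be spelled out
roughly as below. This is only a sketch, not taken from any patch in this
thread: the helper name move_mount_beneath() is hypothetical, the raw
syscall is used so it does not depend on a libc wrapper, and the fallback
syscall number is an assumption that holds on most but not all
architectures. Note that the UAPI name of the flag abbreviated as
MOVE_MOUNT_T_EMPTY in the quoted text is MOVE_MOUNT_T_EMPTY_PATH.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Flag values as found in the UAPI header <linux/mount.h>; defined here
 * only as a fallback for userspace headers that do not provide them. */
#ifndef MOVE_MOUNT_F_EMPTY_PATH
#define MOVE_MOUNT_F_EMPTY_PATH 0x00000004 /* empty from-path permitted */
#endif
#ifndef MOVE_MOUNT_T_EMPTY_PATH
#define MOVE_MOUNT_T_EMPTY_PATH 0x00000040 /* empty to-path permitted */
#endif
#ifndef MOVE_MOUNT_BENEATH
#define MOVE_MOUNT_BENEATH      0x00000200 /* mount beneath top mount */
#endif
#ifndef SYS_move_mount
#define SYS_move_mount 429 /* assumption: correct on most architectures */
#endif

/*
 * Hypothetical helper: attach the detached mount referred to by mnt_fd
 * (e.g. obtained from fsmount(2) or open_tree(2) with OPEN_TREE_CLONE)
 * at the location given by dir_fd, using an empty to-path.
 *
 * Per the discussion above, the target is resolved the mount(2) way:
 * the new mount slides under the top of whatever pile sits over dir_fd,
 * not under dir_fd itself.
 *
 * Returns 0 on success, -1 with errno set on failure.
 */
int move_mount_beneath(int mnt_fd, int dir_fd)
{
	return (int)syscall(SYS_move_mount, mnt_fd, "", dir_fd, "",
			    MOVE_MOUNT_F_EMPTY_PATH |
			    MOVE_MOUNT_T_EMPTY_PATH |
			    MOVE_MOUNT_BENEATH);
}
```

A caller would need the usual mount privileges in the relevant mount
namespace. Without them, or with invalid descriptors, the call simply
fails with -1 and errno set.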