On Fri, May 30, 2025 at 07:43:48PM +0100, Al Viro wrote: > On Fri, May 30, 2025 at 02:20:39PM +0200, Mickaël Salaün wrote: > > > Without access to mount_lock, what would be the best way to fix this > > Landlock issue while making it backportable? > > > > > > > > If we update path_parent in this patchset with choose_mountpoint(), > > > and use it in Landlock, we will close this race condition, right? > > > > choose_mountpoint() is currently private, but if we add a new filesystem > > helper, I think the right approach would be to expose follow_dotdot(), > > updating its arguments with public types. This way the intermediates > > mount points will not be exposed, RCU optimization will be leveraged, > > and usage of this new helper will be simplified. > > IMO anything that involves struct nameidata should remain inside > fs/namei.c - something public might share helpers with it, but that's > it. We had more than enough pain on changes in there, and I'm pretty > sure that we are not done yet; in the area around atomic_open, but not > only there. Parts of that are still too subtle, IMO - it got a lot > better over the years, but I would really prefer to avoid the need > to bring more code into analysis for any further massage. > > Are you sure that follow_dotdot() behaviour is what you really want? > > Note that it's not quite how the pathname resolution works. There we > have the result of follow_dotdot() fed to step_into(), and that changes > things. Try this: > > mkdir /tmp/foo > mkdir /tmp/foo/bar > cd /tmp/foo/bar > mount -t tmpfs none /tmp/foo > touch /tmp/foo/x > ls -Uldi . .. /tmp/foo ../.. /tmp ../x > > and think about the results. Traversing .. is basically "follow_up as much > as possible, then to parent, then follow_down as much as possible" and > the last part (../x) explains why we do it that way. > > Which objects would you want to iterate through when dealing with the > current directory in the experiment above? Simulation of pathwalk > would have the root of overmounting filesystem as the second object > visited; follow_dotdot() would yield the directory overmounted by > that instead. > > I'm not saying that either behaviour is right for your case - just that > they are not identical and it's something that needs to be consciously > chosen. Thanks, this helps. I didn't though about this semantic difference. We don't want the handle_dots() semantic (which call follow_dotdot() and step_into()), only the (backward) pathwalk one which is equivalent to follow_dotdot(). I'll add Landlock tests for this specific scenario.