On Fri, May 30, 2025 at 12:10:18AM +0100, Al Viro wrote: > On Thu, May 29, 2025 at 03:13:10PM -0700, Song Liu wrote: > > > Is it an issue if we only hold a reference to a MNT_LOCKED mount for > > short period of time? "Short period" means it may get interrupted, page > > faults, or wait for an IO (read xattr), but it won't hold a reference to the > > mount and sleep indefinitely. > > MNT_LOCKED mount itself is not a problem. What shouldn't be done is > looking around in the mountpoint it covers. It depends upon the things > you are going to do with that, but it's very easy to get an infoleak > that way. > > > > OTOH, there's a good cause for moving some of the flags, MNT_LOCKED > > > included, out of ->mnt_flags and into a separate field in struct mount. > > > However, that would conflict with any code using that to deal with > > > your iterator safely. > > > > > > What's more, AFAICS in case of a stack of mounts each covering the root > > > of parent mount, you stop in each of those. The trouble is, umount(2) > > > propagation logics assumes that intermediate mounts can be pulled out of > > > such stack without causing trouble. For pathname resolution that is > > > true; it goes through the entire stack atomically wrt that stuff. > > > For your API that's not the case; somebody who has no idea about an > > > intermediate mount being there might get caught on it while it's getting > > > pulled from the stack. > > > > > > What exactly do you need around the mountpoint crossing? > > > > I thought about skipping intermediate mounts (that are hidden by > > other mounts). AFAICT, not skipping them will not cause any issue. > > It can. Suppose e.g. that /mnt gets propagation from another namespace, > but not the other way round and you mount something on /mnt. > > Later, in that another namespace, somebody mounts something on wherever > your /mnt gets propagation to. A copy will be propagated _between_ > your /mnt and whatever you've mounted on top of it; it will be entirely > invisible until you umount your /mnt. At that point the propagated > copy will show up there, same as if it had appeared just after your > umount. Prior to that it's entirely invisible. If its original > counterpart in another namespace gets unmounted first, the copy will > be quietly pulled out. Fwiw, I have explained these and similar issues at length multiple times. > > Note that choose_mountpoint_rcu() callers (including choose_mountpoint()) > will have mount_lock seqcount sampled before the traversal _and_ recheck > it after having reached the bottom of stack. IOW, if you traverse .. > on the way to root, you won't get caught on the sucker being pulled out. > > Your iterator, OTOH, would stop in that intermediate mount - and get > an unpleasant surprise when it comes back to do the next step (towards > /mnt on root filesystem, that is) and finds that path->mnt points > to something that is detached from everything - no way to get from > it any further. That - despite the fact that location you've started > from is still mounted, still has the same pathname, etc. and nothing > had been disrupted for it. Same...