On Tue, May 06, 2025 at 08:34:27PM +0200, Klara Modin wrote:

> > What's more, on the overlayfs side we managed to get to
> >         upper_mnt = clone_private_mount(upperpath);
> >         err = PTR_ERR(upper_mnt);
> >         if (IS_ERR(upper_mnt)) {
> >                 pr_err("failed to clone upperpath\n");
> >                 goto out;
> > so the upper path had been resolved...
> >
> > OK, let's try to see what clone_private_mount() is unhappy about...
> > Could you try the following on top of -next + braino fix and see
> > what shows up?  Another interesting thing, assuming you can get
> > to shell after overlayfs mount failure, would be /proc/self/mountinfo
> > contents and stat(1) output for upper path of your overlayfs mount...
>
> It looks like the mount never succeded in the first place? It doesn't
> appear in /proc/self/mountinfo at all:
>
> 2 2 0:2 / / rw - rootfs rootfs rw
> 24 2 0:22 / /proc rw,relatime - proc proc rw
> 25 2 0:23 / /sys rw,relatime - sysfs sys rw
> 26 2 0:6 / /dev rw,relatime - devtmpfs dev rw,size=481992k,nr_inodes=120498,mode=755
> 27 2 259:1 / /mnt/root-ro ro,relatime - squashfs /dev/nvme0n1 ro,errors=continue
>
> I get the "kern_mount?" message.

What the... actually, the comment in front of that thing makes no sense
whatsoever - it's *not* something kernel-internal; we get there for mounts
that are absolute roots of some non-anonymous namespace; kernel-internal
ones fail on if (!is_mounted(...)) just above that.

OK, the comment came from db04662e2f4f "fs: allow detached mounts in
clone_private_mount()" and it does point in an interesting direction -
commit message there speaks of overlayfs and use of descriptors to specify
layers.

Not that check_for_nsfs_mounts() (from the same commit) made any sense
there - we don't *care* about anything mounted somewhere in that mount,
since whatever's mounted on top of it does not follow into the copy
(which is what has_locked_children() call is about - in effect, in copy
you see all mountpoints that had been covered in the original)...

Oh, well - so we are seeing an absolute root of some non-anonymous
namespace there.  Or a weird detached mount claimed to belong to some
namespace, anyway.

Let's see if that's the way upperpath comes to be (and get a bit more
information on that weird mount):

diff --git a/fs/namespace.c b/fs/namespace.c
index eb990e9a668a..9b4c4afa2b29 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2480,31 +2480,52 @@ struct vfsmount *clone_private_mount(const struct path *path)
 
 	guard(rwsem_read)(&namespace_sem);
 
-	if (IS_MNT_UNBINDABLE(old_mnt))
+	if (IS_MNT_UNBINDABLE(old_mnt)) {
+		pr_err("unbindable");
 		return ERR_PTR(-EINVAL);
+	}
 
 	if (mnt_has_parent(old_mnt)) {
-		if (!check_mnt(old_mnt))
+		if (!check_mnt(old_mnt)) {
+			pr_err("mounted, but not in our namespace");
 			return ERR_PTR(-EINVAL);
+		}
 	} else {
-		if (!is_mounted(&old_mnt->mnt))
+		if (!is_mounted(&old_mnt->mnt)) {
+			pr_err("not mounted");
 			return ERR_PTR(-EINVAL);
+		}
 
 		/* Make sure this isn't something purely kernel internal. */
-		if (!is_anon_ns(old_mnt->mnt_ns))
+		if (!is_anon_ns(old_mnt->mnt_ns)) {
+			if (old_mnt == old_mnt->mnt_ns->root)
+				pr_err("absolute root");
+			else
+				pr_err("detached, but claimed to be in some ns");
+			if (check_mnt(old_mnt))
+				pr_err("our namespace, at that");
+			else
+				pr_err("some other non-anon namespace");
 			return ERR_PTR(-EINVAL);
+		}
 
 		/* Make sure we don't create mount namespace loops. */
-		if (!check_for_nsfs_mounts(old_mnt))
+		if (!check_for_nsfs_mounts(old_mnt)) {
+			pr_err("shite with nsfs");
 			return ERR_PTR(-EINVAL);
+		}
 	}
 
-	if (has_locked_children(old_mnt, path->dentry))
+	if (has_locked_children(old_mnt, path->dentry)) {
+		pr_err("has locked children");
 		return ERR_PTR(-EINVAL);
+	}
 
 	new_mnt = clone_mnt(old_mnt, path->dentry, CL_PRIVATE);
-	if (IS_ERR(new_mnt))
+	if (IS_ERR(new_mnt)) {
+		pr_err("clone_mnt failed (%ld)", PTR_ERR(new_mnt));
 		return ERR_PTR(-EINVAL);
+	}
 
 	/* Longterm mount to be removed by kern_unmount*() */
 	new_mnt->mnt_ns = MNT_NS_INTERNAL;
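
For anyone following along without a tree handy, the order of the checks
instrumented above can be modelled in plain userspace C.  This is only a
rough sketch, not kernel code: struct mount_model, enum clone_verdict and
clone_verdict_for() are hypothetical names made up for illustration, and
each boolean merely stands in for the result of the kernel helper named in
its comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state the checks look at (not a kernel type). */
struct mount_model {
	bool unbindable;      /* IS_MNT_UNBINDABLE(old_mnt) */
	bool has_parent;      /* mnt_has_parent(old_mnt) */
	bool in_our_ns;       /* check_mnt(old_mnt) */
	bool mounted;         /* is_mounted(&old_mnt->mnt) */
	bool anon_ns;         /* is_anon_ns(old_mnt->mnt_ns) */
	bool nsfs_check_ok;   /* check_for_nsfs_mounts(old_mnt) returned true */
	bool locked_children; /* has_locked_children(old_mnt, path->dentry) */
};

/* Hypothetical verdicts, one per early return in the patched function. */
enum clone_verdict {
	CLONE_ALLOWED,
	REFUSE_UNBINDABLE,
	REFUSE_FOREIGN_NS,
	REFUSE_NOT_MOUNTED,
	REFUSE_NON_ANON_NS,
	REFUSE_NSFS,
	REFUSE_LOCKED_CHILDREN,
};

/* Walks the checks in the same order as the diff above. */
static enum clone_verdict clone_verdict_for(const struct mount_model *m)
{
	assert(m != NULL);

	if (m->unbindable)
		return REFUSE_UNBINDABLE;

	if (m->has_parent) {
		/* Attached somewhere: must be in the caller's namespace. */
		if (!m->in_our_ns)
			return REFUSE_FOREIGN_NS;
	} else {
		/* No parent: either a namespace root or a detached mount. */
		if (!m->mounted)
			return REFUSE_NOT_MOUNTED;
		if (!m->anon_ns)
			return REFUSE_NON_ANON_NS;
		if (!m->nsfs_check_ok)
			return REFUSE_NSFS;
	}

	if (m->locked_children)
		return REFUSE_LOCKED_CHILDREN;

	return CLONE_ALLOWED;
}
```

In this model, a parentless mount that is mounted but sits in a
non-anonymous namespace (the "absolute root" case suspected above) comes
out as REFUSE_NON_ANON_NS regardless of whether it is in the caller's
namespace, which is why the patch prints that detail separately.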