On Thu, Jul 31, 2025 at 3:53 PM Christian Brauner <brauner@xxxxxxxxxx> wrote: > > On Thu, Jul 31, 2025 at 10:40:40AM +0800, Pavel Tikhomirov wrote: > > If detached mounts are our only concern, it looks like the check instead of: > > > > if (!check_mnt(mnt)) { > > err = -EINVAL; > > goto out_unlock; > > } > > > > could've been a more relaxed one: > > > > if (mnt_detached(mnt)) { > > err = -EINVAL; > > goto out_unlock; > > } > > > > bool mnt_detached(struct mount *mnt) > > { > > return !mnt->mnt_ns; > > } > > > > not to allow propagation change only on detached mounts. (As > > umount_tree sets mnt_ns to NULL.) > > Changing propagation settings on detached mounts is fine and shoud work? > Changing propagation settings on unmounted mounts not so much... Sorry, it's my confused terminology, here by "detached" mounts I mean mounts which were unmounted with umount2(MNT_DETACH), maybe I should call them "unmounted" (e.g. s/mnt_detached/mnt_unmounted/). And yes, I understand why we need to allow changing propagation on mounts in anonymous namespace without being inside it, because one can't just enter anonymous namespace. I don't think that we need to change anything, just sharing my thoughts that it could be more relaxed and will still protect us from propagation setting on unmounted mounts. > > > > > Also in do_mount_setattr we have a more relaxed check too: > > > > if ((mnt_has_parent(mnt) || !is_anon_ns(mnt->mnt_ns)) && !check_mnt(mnt)) > > goto out; > > > > Best Regards, Tikhomirov Pavel. > > > > On Sun, Jul 27, 2025 at 5:01 AM Andrei Vagin <avagin@xxxxxxxxxx> wrote: > > > > > > On Sat, Jul 26, 2025 at 10:53 AM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote: > > > > > > > > On Sat, Jul 26, 2025 at 10:12:34AM -0700, Andrei Vagin wrote: > > > > > On Thu, Jul 24, 2025 at 4:00 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote: > > > > > > > > > > > > On Thu, Jul 24, 2025 at 01:02:48PM -0700, Andrei Vagin wrote: > > > > > > > Hi Al and Christian, > > > > > > > > > > > > > > The commit 12f147ddd6de ("do_change_type(): refuse to operate on > > > > > > > unmounted/not ours mounts") introduced an ABI backward compatibility > > > > > > > break. CRIU depends on the previous behavior, and users are now > > > > > > > reporting criu restore failures following the kernel update. This change > > > > > > > has been propagated to stable kernels. Is this check strictly required? > > > > > > > > > > > > Yes. > > > > > > > > > > > > > Would it be possible to check only if the current process has > > > > > > > CAP_SYS_ADMIN within the mount user namespace? > > > > > > > > > > > > Not enough, both in terms of permissions *and* in terms of "thou > > > > > > shalt not bugger the kernel data structures - nobody's priveleged > > > > > > enough for that". > > > > > > > > > > Al, > > > > > > > > > > I am still thinking in terms of "Thou shalt not break userspace"... > > > > > > > > > > Seriously though, this original behavior has been in the kernel for 20 > > > > > years, and it hasn't triggered any corruptions in all that time. > > > > > > > > For a very mild example of fun to be had there: > > > > mount("none", "/mnt", "tmpfs", 0, ""); > > > > chdir("/mnt"); > > > > umount2(".", MNT_DETACH); > > > > mount(NULL, ".", NULL, MS_SHARED, NULL); > > > > Repeat in a loop, watch mount group id leak. That's a trivial example > > > > of violating the assertion ("a mount that had been through umount_tree() > > > > is out of propagation graph and related data structures for good"). > > > > > > I wasn't referring to detached mounts. CRIU modifies mounts from > > > non-current namespaces. > > > > > > > > > > > As for the "CAP_SYS_ADMIN within the mount user namespace" - which > > > > userns do you have in mind? > > > > > > > > > > The user namespace of the target mount: > > > ns_capable(mnt->mnt_ns->user_ns, CAP_SYS_ADMIN) > > >