On Sat, Jul 26, 2025 at 10:12:34AM -0700, Andrei Vagin wrote:
> On Thu, Jul 24, 2025 at 4:00 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, Jul 24, 2025 at 01:02:48PM -0700, Andrei Vagin wrote:
> > > Hi Al and Christian,
> > >
> > > The commit 12f147ddd6de ("do_change_type(): refuse to operate on
> > > unmounted/not ours mounts") introduced an ABI backward compatibility
> > > break. CRIU depends on the previous behavior, and users are now
> > > reporting criu restore failures following the kernel update. This change
> > > has been propagated to stable kernels. Is this check strictly required?
> >
> > Yes.
> >
> > > Would it be possible to check only if the current process has
> > > CAP_SYS_ADMIN within the mount user namespace?
> >
> > Not enough, both in terms of permissions *and* in terms of "thou
> > shalt not bugger the kernel data structures - nobody's priveleged
> > enough for that".
>
> Al,
>
> I am still thinking in terms of "Thou shalt not break userspace"...
>
> Seriously though, this original behavior has been in the kernel for 20
> years, and it hasn't triggered any corruptions in all that time.

For a very mild example of fun to be had there:
	mount("none", "/mnt", "tmpfs", 0, "");
	chdir("/mnt");
	umount2(".", MNT_DETACH);
	mount(NULL, ".", NULL, MS_SHARED, NULL);
Repeat in a loop, watch mount group id leak.  That's a trivial example
of violating the assertion ("a mount that had been through umount_tree()
is out of propagation graph and related data structures for good").

As for the "CAP_SYS_ADMIN within the mount user namespace" - which
userns do you have in mind?
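
For anyone who wants to try the four-call sequence above, here is a
compilable sketch with error checking added.  The helper names
(detach_then_make_shared(), classify(), the REPRO_* values) are made up
for illustration and appear nowhere in the kernel or CRIU.  Mapping
EINVAL to "refused by the new check" is an assumption about the errno
that the check returns.  It needs CAP_SYS_ADMIN and is best run in a
throwaway mount namespace.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mount.h>
#include <unistd.h>

/* Hypothetical labels for the outcome of the final mount() call. */
enum repro_result {
	REPRO_ACCEPTED,		/* propagation change went through */
	REPRO_REFUSED,		/* EINVAL: assumed to be the new check */
	REPRO_NO_PRIVILEGE,	/* EPERM: caller lacks CAP_SYS_ADMIN */
	REPRO_OTHER,		/* anything else */
};

/* Map a 0 / -errno return value onto one of the labels above. */
enum repro_result classify(int ret)
{
	switch (ret) {
	case 0:
		return REPRO_ACCEPTED;
	case -EINVAL:
		return REPRO_REFUSED;
	case -EPERM:
		return REPRO_NO_PRIVILEGE;
	default:
		return REPRO_OTHER;
	}
}

/*
 * One pass of the sequence: mount tmpfs on dir, enter it, lazily
 * detach it, then ask for MS_SHARED on the now-detached mount.
 * Returns 0 on success or a negative errno from the first failing call.
 */
int detach_then_make_shared(const char *dir)
{
	int err = 0;

	assert(dir != NULL);

	if (mount("none", dir, "tmpfs", 0, "") < 0)
		return -errno;

	if (chdir(dir) < 0 || umount2(".", MNT_DETACH) < 0) {
		err = -errno;
		umount2(dir, MNT_DETACH);	/* best-effort cleanup */
		return err;
	}

	/* "." still pins the mount, but it is no longer in any namespace. */
	if (mount(NULL, ".", NULL, MS_SHARED, NULL) < 0)
		err = -errno;

	/* Leave the detached mount so its last reference can be dropped. */
	if (chdir("/") < 0 && !err)
		err = -errno;

	return err;
}
```

Calling detach_then_make_shared("/mnt") in a loop and feeding the result
to classify() should tell the two behaviours apart: REPRO_ACCEPTED on
every pass for the old behaviour, REPRO_REFUSED once the check is in
place (under the EINVAL assumption above).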