On 2025/05/15 4:02, Al Viro wrote: > On Wed, May 14, 2025 at 08:37:54AM +0000, KONDO KAZUMA(近藤 和真) wrote: >> On 2025/05/14 11:43, Al Viro wrote: >>> On Wed, May 14, 2025 at 12:25:58AM +0000, KONDO KAZUMA(近藤 和真) wrote: >>> >>>> @@ -2482,17 +2482,13 @@ struct vfsmount *clone_private_mount(const struct path *path) >>>> if (IS_MNT_UNBINDABLE(old_mnt)) >>>> return ERR_PTR(-EINVAL); >>>> >>>> - if (mnt_has_parent(old_mnt)) { >>>> + if (!is_mounted(&old_mnt->mnt)) >>>> + return ERR_PTR(-EINVAL); >>>> + >>>> + if (mnt_has_parent(old_mnt) || !is_anon_ns(old_mnt->mnt_ns)) { >>>> if (!check_mnt(old_mnt)) >>>> return ERR_PTR(-EINVAL); >>>> } else { >>>> - if (!is_mounted(&old_mnt->mnt)) >>>> - return ERR_PTR(-EINVAL); >>>> - >>>> - /* Make sure this isn't something purely kernel internal. */ >>>> - if (!is_anon_ns(old_mnt->mnt_ns)) >>>> - return ERR_PTR(-EINVAL); >>>> - >>>> /* Make sure we don't create mount namespace loops. */ >>>> if (!check_for_nsfs_mounts(old_mnt)) >>>> return ERR_PTR(-EINVAL); >>> >>> Not the right way to do that. What we want is >>> >>> /* ours are always fine */ >>> if (!check_mnt(old_mnt)) { >>> /* they'd better be mounted _somewhere */ >>> if (!is_mounted(old_mnt)) >>> return -EINVAL; >>> /* no other real namespaces; only anon */ >>> if (!is_anon_ns(old_mnt->mnt_ns)) >>> return -EINVAL; >>> /* ... and root of that anon */ >>> if (mnt_has_parent(old_mnt)) >>> return -EINVAL; >>> /* Make sure we don't create mount namespace loops. */ >>> if (!check_for_nsfs_mounts(old_mnt)) >>> return ERR_PTR(-EINVAL); >>> } >> >> Hello Al Viro, >> >> Thank you for your comment. >> That code can solve my problem, and it seems to be better! > > BTW, see https://lore.kernel.org/all/20250506194849.GT2023217@ZenIV/ for > discussion about a week ago when that got noticed: > > || In case of clone_private_mount(), though, there's nothing wrong > || with "clone me a subtree of absolute root", so it has to be > || done other way round - check if it's ours first, then in "not > || ours" case check that it's a root of anon namespace. > || > || Failing btrfs mount has ended up with upper layer pathname > || pointing to initramfs directory where btrfs would've been > || mounted, which had walked into that corner case. In your > || case the problem has already happened by that point, but on > || a setup a-la X Terminal it would cause trouble... > > Looks like such setups are less theoretical than I thought. > >> So, I will revise my patch and resend it. > > Probably worth gathering the comments in one place. Something like > /* > * Check if the source is acceptable; anything mounted in > * our namespace is fine, otherwise it must be the root of > * some anon namespace and we need to make sure no namespace > * loops get created. > */ > if (!check_mnt(old_mnt)) { > if (!is_mounted(&old_mnt->mnt) || > !is_anon_ns(old_mnt->mnt_ns) || > mnt_has_parent(old_mnt)) > return ERR_PTR(-EINVAL); > if (!check_for_nsfs_mounts(old_mnt)) > return ERR_PTR(-EINVAL); > } > might be easier to follow. I think it's good idea. I will send a patch with that comment. Thanks