On Tue, Jul 01, 2025 at 08:38:07PM +0200, Mickaël Salaün wrote: > We can get disconnected files or directories when they are visible and > opened from a bind mount, before being renamed/moved from the source of > the bind mount in a way that makes them inaccessible from the mount > point (i.e. out of scope). > > Until now, access rights tied to files or directories opened through a > disconnected directory were collected by walking the related hierarchy > down to the root of this filesystem because the mount point couldn't be > found. This could lead to inconsistent access results, and > hard-to-debug renames, especially because such paths cannot be printed. > > For a sandboxed task to create a disconnected directory, it needs to > have write access (i.e. FS_MAKE_REG, FS_REMOVE_FILE, and FS_REFER) to > the underlying source of the bind mount, and read access to the related > mount point. Because a sandboxed task cannot get more access than those > defined by its Landlock domain, this could only lead to inconsistent > access rights because of missing those that should be inherited from the > mount point hierarchy and inheriting from the hierarchy of the mounted > filesystem instead. > > Landlock now handles files/directories opened from disconnected > directories like the mount point these disconnected directories were > opened from. This gives the guarantee that access rights on a > file/directory cannot be more than those at open time. The rationale is > that disconnected hierarchies might not be visible nor accessible to a > sandboxed task, and relying on the collected access rights from them > could introduce unexpected results, especially for rename actions > because of the access right comparison between the source and the > destination (see LANDLOCK_ACCESS_FS_REFER). This new behavior is much > less surprising to users and safer from an access point of view. > > Unlike follow_dotdot(), we don't need to check for each directory if it > is part of the mount's root, but instead this is only checked when we > reached a root dentry (not a mount point), or when the access > request is about to be allowed. This limits the number of calls to > is_subdir() which walks down the hierarchy (again). This also avoids > checking path connection at the beginning of the walk for each mount > point, which would be racy. > > Make path_connected() public to stay consistent with the VFS. This > helper is used when we are about to allowed an access. > > This change increases the stack size with two Landlock layer masks > backups that are needed to reset the collected access rights to the > latest mount point. > > Because opened files have their access rights stored in the related file > security properties, their is no impact for disconnected or unlinked > files. > > A following commit will document handling of disconnected files and > directories. > > Cc: Günther Noack <gnoack@xxxxxxxxxx> > Cc: Song Liu <song@xxxxxxxxxx> > Reported-by: Tingmao Wang <m@xxxxxxxxxx> > Closes: https://lore.kernel.org/r/027d5190-b37a-40a8-84e9-4ccbc352bcdf@xxxxxxxxxx > Fixes: b91c3e4ea756 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER") > Fixes: cb2c7d1a1776 ("landlock: Support filesystem access-control") > Signed-off-by: Mickaël Salaün <mic@xxxxxxxxxxx> > --- > > This replaces this patch: > landlock: Remove warning in collect_domain_accesses() > https://lore.kernel.org/r/20250618134734.1673254-1-mic@xxxxxxxxxxx > > I'll probably split this commit into two to ease backport (same for > tests). > > This patch series applies on top of my next branch: > https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git/log/?h=next > > TODO: Add documentation > > TODO: Add Landlock erratum > --- > fs/namei.c | 2 +- > include/linux/fs.h | 1 + > security/landlock/fs.c | 121 +++++++++++++++++++++++++++++++++++------ > 3 files changed, 105 insertions(+), 19 deletions(-) > > diff --git a/fs/namei.c b/fs/namei.c > index 4bb889fc980b..7853a876fc1c 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -716,7 +716,7 @@ static bool nd_alloc_stack(struct nameidata *nd) > * Rename can sometimes move a file or directory outside of a bind > * mount, path_connected allows those cases to be detected. > */ > -static bool path_connected(struct vfsmount *mnt, struct dentry *dentry) > +bool path_connected(struct vfsmount *mnt, struct dentry *dentry) > { > struct super_block *sb = mnt->mnt_sb; > > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 4ec77da65f14..3c0e324a9272 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -3252,6 +3252,7 @@ extern struct file * open_exec(const char *); > /* fs/dcache.c -- generic fs support functions */ > extern bool is_subdir(struct dentry *, struct dentry *); > extern bool path_is_under(const struct path *, const struct path *); > +extern bool path_connected(struct vfsmount *mnt, struct dentry *dentry); Drop the "extern" please. We generally don't do that anymore for new additions. Otherwise, Acked-by: Christian Brauner <brauner@xxxxxxxxxx>