Hi! This is the second version of this series. The individual commits contains changelogs (most of them are in the first patch), but overall, most significantly cached mode (loose or metadata) is now unchanged, there is no longer a "don't reuse inodes at all" mode, bug fixes, using the right functions, basic rename handling, and new documentation. Thanks in advance for the review effort :) v1: https://lore.kernel.org/all/cover.1743971855.git.m@xxxxxxxxxx/ Background ---------- (This section has basically the same content as the v1 cover letter) Previously [1], I noticed that when using 9pfs filesystems, the Landlock LSM is blocking access even for files / directories allowed by rules, and that this has something to do with 9pfs creating new inodes despite Landlock holding a reference to the existing one. Because Landlock uses inodes' in-memory state (i_security) to identify allowed fs objects/hierarchies, this causes Landlock to partially break on 9pfs, at least in uncached mode, which is the default: # mount -t 9p -o trans=virtio test /mnt # env LL_FS_RO=/etc:/usr:/bin:/lib:/mnt/readme LL_FS_RW= /sandboxer bash Executing the sandboxed command... # cat /mnt/readme cat: /mnt/readme: Permission denied This, however, works if somebody is holding onto the dentry (and it also works with cache=loose), as in both cases the inode is reused: # tail -f /mnt/readme & [1] 196 # env LL_FS_RO=/etc:/usr:/bin:/lib:/mnt/readme LL_FS_RW= /sandboxer bash Executing the sandboxed command... # cat /mnt/readme aa It also works on directories if one have a shell that cd into the directory. Note that this means only certain usage of Landlock are affected - for example, sandboxing applications that takes a list of files to allow, landlocks itself, then evecve. On the other hand, this does not affect applications that opens a file, then Landlocks itself while keeping the file it needs open. While the above is a very simple example, this is problematic in real-world use cases if Landlock is used to sandox applications on system that has files mounted via 9pfs, or use 9pfs as the root filesystem. In addition, this also affects fanotify / inotify when using inode mark (for local access): root@d8c28a676d72:/# ./fanotify-basic-open /readme & # on virtiofs [1] 173 root@d8c28a676d72:/# cat readme aa FAN_OPEN: File /readme root@d8c28a676d72:/# mount -t 9p -o trans=virtio test /mnt root@d8c28a676d72:/# ./fanotify-basic-open /mnt/readme & # on 9pfs [2] 176 root@d8c28a676d72:/# cat /mnt/readme aa root@d8c28a676d72:/# Same can be demonstrated with inotifywait. The source code for fanotify-basic-open, adopted from the fanotify man page, is available at https://fileshare.maowtm.org/9pfs-landlock-fix/20250903/fanotify-basic-open.c [2]. Note that this is not a security bug for Landlock since it can only cause legitimate access to be denied, but might be a problem for fanotify perm (although I do recognize that using perm on individual inodes is already perhaps a bit unreliable?) It seems that there was an attempt at making 9pfs reuse inodes on uncached mode as well, based on qid.path, however it was reverted [3] due to issues with servers that present duplicate qids, for example on a QEMU host that has multiple filesystems mounted under a single 9pfs export without multidevs=remap, or in the case of other servers that doesn't necessarily support remapping qids ([4] and more). I've done some testing on v6.12-rc4 which has the simplified 9pfs inode code before it was reverted, and found that Landlock works (however, we of course then have the issue demonstrated in [3]). What this series do ------------------- (Changes since v1: added more reasoning for the ino_path struct) With the above in mind, I have a proposal for 9pfs to: 1. Reuse inodes even in uncached mode 2. However, reuse them based on qid.path AND the actual pathname, by doing the appropriate testing in v9fs_test_inode(_dotl)? The main problem here is how to store the pathname in a sensible way and tie it to the inode. For now I opted with an array of names acquired with take_dentry_name_snapshot, which reuses the same memory as the dcache to store the actual strings, but doesn't tie the lifetime of the dentry with the inode (I thought about holding a reference to the dentry in the v9fs_inode, but it seemed like a wrong approach and would cause dentries to not be evicted/released). Additional discussions ---------------------- (New section) >From some QEMU documentation I read [5] it seems like there is a plan to resolve these kind of problems in a new version of the protocol, by expanding the qid to include the filesystem identifier of a file on the host, so maybe this can be disabled after a successful protocol version check with the host? For now, inodeident=path will be the default for uncached filesystems, which can be set to 'qid' to instead to reuse based only on server-provided inode numbers. This patchset currently uses strncmp to compare paths but this might be able to be optimized into a hash comparison first (not done yet). Alternatively the path can be stored more compactly in the form of a single string with `/` in it (like normal paths). However, we should normally only need to do this comparison for one pair of filenames, as the test is only done if qid.path matches in the first place. This patchset currently does not support enabling path-based inodes in cached mode. Additional care needs to be taken to ensure we can refresh an inode that potentially has data cached, but since Dominique is happy with cached mode behaving as-is (reusing inodes via qid only), this is not done. The current implementation will handle client-side renames of a single file (or empty directory) correctly, but server side renames, or renaming a non-empty directory (client or server side), will cause the files being renamed (or files under the renamed directory) to use new inodes (unless they are renamed back). The decision to not update the children of a client-renamed directory is purely to reduce the complexity of this patch, but is in principle possible. Testing and explanations ------------------------ (New section) # mount -t 9p -o ... test /mnt with the following options: - trans=virtio - trans=virtio,inodeident=qid - trans=virtio,cache=loose # env LL_FS_RO=/etc:/usr:/bin:/lib:/mnt/readme LL_FS_RW= /sandboxer bash Executing the sandboxed command... # cat /mnt/readme hi ^^ landlock works # mount -t 9p -o trans=virtio test /mnt # mkdir /mnt/dir # mv /mnt/readme /mnt/dir/readme # env LL_FS_RO=/etc:/usr:/bin:/lib:/mnt/dir/readme LL_FS_RW= /sandboxer bash Executing the sandboxed command... # cat /mnt/dir/readme hi ^^ landlock works # # another terminal in guest: mv /mnt/dir/readme /mnt/dir/readme.2 # cat /mnt/dir/readme.2 hi ^^ ino_path is carried with renames # # host: mv 9pfs/dir/readme.2 9pfs/dir/readme # cat /mnt/dir/readme.2 cat: /mnt/dir/readme.2: No such file or directory # cat /mnt/dir/readme cat: /mnt/dir/readme: Permission denied ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ we can't track renames on the server side # # host: mv 9pfs/dir/readme 9pfs/dir/readme.2 # cat /mnt/dir/readme.2 hi ^^ once the file is back at its original place it works as expected. # # another terminal in guest: mv /mnt/dir/readme.2 /mnt/dir/readme # cat /mnt/dir/readme hi ^^ we can track renames of the file directly... # # another terminal in guest: mv /mnt/dir /mnt/dir.2 # cat /mnt/dir.2/readme cat: /mnt/dir.2/readme: Permission denied ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ but not renames of the parent directory, even if done client-side # # another terminal in guest: mv /mnt/dir.2 /mnt/dir # cat /mnt/dir/readme hi ^^ works once it's back # # another terminal in guest: mv /mnt/dir /mnt/dir.2 && mkdir /mnt/dir && echo hi2 > /mnt/dir/readme # cat /mnt/dir/readme cat: /mnt/dir/readme: Permission denied ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ a different file uses a different inode even if same path # # another terminal in guest: mv /mnt/dir.2/readme /mnt/dir/readme # cat /mnt/dir/readme hi # # host: rm 9pfs/dir/readme && echo hi3 > 9pfs/dir/readme # cat /mnt/dir/readme cat: /mnt/dir/readme: Permission denied ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ a different file (identified by server-side qid changes) uses different inode fanotify also works, as tested with the program attached at the end. In addition, I ran xfstests on a uncached 9pfs mount, and while there are some test failures, it is the same set of failures as on the current mainline. Test logs at https://fileshare.maowtm.org/9pfs-landlock-fix/20250903/index.html Tested also with Mickaël's new v9fs landlock tests [6] (unmerged yet): # RUN layout3_fs.v9fs.tag_inode_dir_parent ... # OK layout3_fs.v9fs.tag_inode_dir_parent ok 129 layout3_fs.v9fs.tag_inode_dir_parent # RUN layout3_fs.v9fs.tag_inode_dir_mnt ... # OK layout3_fs.v9fs.tag_inode_dir_mnt ok 130 layout3_fs.v9fs.tag_inode_dir_mnt # RUN layout3_fs.v9fs.tag_inode_dir_child ... # OK layout3_fs.v9fs.tag_inode_dir_child ok 131 layout3_fs.v9fs.tag_inode_dir_child # RUN layout3_fs.v9fs.tag_inode_file ... # OK layout3_fs.v9fs.tag_inode_file ok 132 layout3_fs.v9fs.tag_inode_file # RUN layout3_fs.v9fs.release_inodes ... # OK layout3_fs.v9fs.release_inodes ok 133 layout3_fs.v9fs.release_inodes This patch series was based on, and mostly tested on v6.17-rc1 + [7] Kind regards, Tingmao [1]: https://github.com/landlock-lsm/linux/issues/45 [2]: https://fileshare.maowtm.org/9pfs-landlock-fix/20250903/fanotify-basic-open.c [3]: https://lore.kernel.org/all/20241024-revert_iget-v1-4-4cac63d25f72@xxxxxxxxxxxxx/ [4]: https://lore.kernel.org/all/20240923100508.GA32066@willie-the-truck/ [5]: https://wiki.qemu.org/Documentation/9p#Protocol_Plans [6]: https://lore.kernel.org/all/20250704171345.1393451-1-mic@xxxxxxxxxxx/ [7]: https://lore.kernel.org/all/cover.1743956147.git.m@xxxxxxxxxx/ Tingmao Wang (7): fs/9p: Add ability to identify inode by path for .L in uncached mode fs/9p: add option for path-based inodes fs/9p: Add ability to identify inode by path for non-.L in uncached mode fs/9p: .L: Refresh stale inodes on reuse fs/9p: non-.L: Refresh stale inodes on reuse fs/9p: update the target's ino_path on rename docs: fs/9p: Document the "inodeident" option Documentation/filesystems/9p.rst | 42 +++++++ fs/9p/Makefile | 3 +- fs/9p/ino_path.c | 111 ++++++++++++++++++ fs/9p/v9fs.c | 59 +++++++++- fs/9p/v9fs.h | 87 ++++++++++---- fs/9p/vfs_inode.c | 195 ++++++++++++++++++++++++++----- fs/9p/vfs_inode_dotl.c | 171 +++++++++++++++++++++++---- fs/9p/vfs_super.c | 13 ++- 8 files changed, 611 insertions(+), 70 deletions(-) create mode 100644 fs/9p/ino_path.c base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585 prerequisite-patch-id: 3dae487a4b3d676de7c20b269553e3e2176b1e36 prerequisite-patch-id: 93ab54c52a41fa44b8d0baf55df949d0ad27e99a prerequisite-patch-id: 5f558bf969e6eaa3d011c98de0806ca8ad369efe -- 2.51.0