Hi 9p-fs maintainers,

(CC'ing Landlock and fanotify/inotify people as this affects both use cases, although most of my investigation has been on Landlock)

Previously [1], I noticed that when using 9pfs filesystems, the Landlock LSM blocks access even for files / directories allowed by rules, and that this has something to do with 9pfs creating new inodes despite Landlock holding a reference to the existing one. Because Landlock uses inodes' in-memory state (i_security) to identify allowed fs objects/hierarchies, this causes Landlock to partially break on 9pfs, at least in uncached mode, which is the default:

# mount -t 9p -o trans=virtio test /mnt
# env LL_FS_RO=/etc:/usr:/bin:/lib:/mnt/readme LL_FS_RW= /sandboxer bash
Executing the sandboxed command...
bash: cannot set terminal process group (164): Inappropriate ioctl for device
bash: no job control in this shell
# cat /mnt/readme
cat: /mnt/readme: Permission denied

This, however, works if somebody is holding onto the dentry (and it also works with cache=loose), as in both cases the inode is reused:

# tail -f /mnt/readme &
[1] 196
# env LL_FS_RO=/etc:/usr:/bin:/lib:/mnt/readme LL_FS_RW= /sandboxer bash
Executing the sandboxed command...
bash: cannot set terminal process group (164): Inappropriate ioctl for device
bash: no job control in this shell
# cat /mnt/readme
aa

It also works on directories if, for example, a shell has cd'd into the directory.

Note that this means only certain usages of Landlock are affected - for example, a sandboxer that takes a list of files to allow, landlocks itself, then execves. On the other hand, this does not affect applications that open a file, then Landlock themselves while keeping the file they need open.

While the above is a very simple example, this is problematic in real-world use cases if Landlock is used to sandbox applications on a system that has files mounted via 9pfs, or that uses 9pfs as the root filesystem.

In addition, this also affects fanotify / inotify when using inode marks (for local access):

root@d8c28a676d72:/# ./fanotify-basic-open /readme &    # on virtiofs
[1] 173
root@d8c28a676d72:/# cat readme
aa
FAN_OPEN: File /readme
root@d8c28a676d72:/# mount -t 9p -o trans=virtio test /mnt
root@d8c28a676d72:/# ./fanotify-basic-open /mnt/readme &    # on 9pfs
[2] 176
root@d8c28a676d72:/# cat /mnt/readme
aa
root@d8c28a676d72:/#

The same can be demonstrated with inotifywait. The source code for fanotify-basic-open, adapted from the fanotify man page, is available at the end of this email.

Note that this is not a security bug for Landlock, since it can only cause legitimate access to be denied, but it might be a problem for fanotify perm events (although I do recognize that using perm events on individual inodes is already perhaps a bit unreliable?)

It seems that there was an attempt at making 9pfs reuse inodes based on qid.path; however, it was reverted [2] due to issues with servers that present duplicate qids, for example a QEMU host that has multiple filesystems mounted under a single 9pfs export without multidevs=remap, or other servers that don't necessarily support remapping qids ([3] and more). I've done some testing on v6.12-rc4, which has the simplified 9pfs inode code from before the revert, and found that Landlock works there (however, we of course then have the issue demonstrated in [2]).
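To make the affected usage pattern above a bit more concrete, here is a minimal sketch of the kind of rule setup the sandboxer performs. This is illustrative only (it loosely mirrors samples/landlock/sandboxer.c but is not a copy of it), with error handling and the extra rules for /etc, /usr, /bin and /lib omitted:

#include <fcntl.h>
#include <linux/landlock.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
        struct landlock_ruleset_attr ruleset_attr = {
                .handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE,
        };
        struct landlock_path_beneath_attr path_beneath = {
                .allowed_access = LANDLOCK_ACCESS_FS_READ_FILE,
        };
        int ruleset_fd;

        ruleset_fd = syscall(__NR_landlock_create_ruleset, &ruleset_attr,
                             sizeof(ruleset_attr), 0);

        /*
         * The rule is keyed on the inode behind the O_PATH fd, not on the
         * path string.  (A real sandboxer also adds rules for /etc, /usr,
         * /bin, /lib, as in the LL_FS_RO list above.)
         */
        path_beneath.parent_fd = open("/mnt/readme", O_PATH | O_CLOEXEC);
        syscall(__NR_landlock_add_rule, ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
                &path_beneath, 0);
        /*
         * Landlock keeps its own reference to that inode, but nothing pins
         * the dentry; on uncached 9pfs the next lookup of /mnt/readme gets a
         * brand new inode whose i_security no longer matches the rule.
         */
        close(path_beneath.parent_fd);

        prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
        syscall(__NR_landlock_restrict_self, ruleset_fd, 0);
        close(ruleset_fd);

        execl("/bin/bash", "bash", (char *)NULL);
        return 1;       /* only reached if execl() fails */
}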
Unrelated to the above problem, it also seems like even with the revert in [2], because in cached mode inodes are still reused based on the qid (together with type, version (aka mtime), etc.), the setup mentioned in [2] still causes problems in the latest kernel with cache=loose:

host # cd /tmp/linux-test
host # mkdir m1 m2
host # mount -t tmpfs tmpfs m1
host # mount -t tmpfs tmpfs m2
host # mkdir m1/dir m2/dir    # needs to be done together so that they have the same mtime
host # echo foo > m1/dir/foo
host # echo bar > m2/dir/bar

guest # mount -t 9p -o trans=virtio,cache=loose test /mnt
guest # cd /mnt/m1/dir
qemu-system-x86_64: warning: 9p: Multiple devices detected in same VirtFS export, which might lead to file ID collisions and severe misbehaviours on guest! You should either use a separate export for each device shared from host or use virtfs option 'multidevs=remap'!
guest # ls
foo
guest # ls /mnt/m2/dir
foo    # <- should be bar
guest # uname -a
Linux d8c28a676d72 6.14.0-dev #92 SMP PREEMPT_DYNAMIC Sun Apr 6 18:47:54 BST 2025 x86_64 GNU/Linux

With the above in mind, I have a proposal for 9pfs to:

1. Reuse inodes even in uncached mode.
2. However, reuse them based on qid.path AND the actual pathname, by doing the appropriate testing in v9fs_test(_new)?_inode(_dotl)? (see the rough sketch at the end of this mail).

The main problem here is how to store the pathname in a sensible way and tie it to the inode. For now I opted for an array of names acquired with take_dentry_name_snapshot(), which reuses the same memory as the dcache to store the actual strings, but doesn't tie the lifetime of the dentry to the inode (I thought about holding a reference to the dentry in the v9fs_inode, but it seemed like the wrong approach and would cause dentries to never be evicted/released). Maybe ideally there could be a general way for filesystems to tell the VFS/dcache to "pin" a dentry as long as the inode is alive, while still allowing both the dentry and the inode to be evicted under memory pressure? In fact, if the dentry is alive, we might not even need to do this in 9pfs, as we will automatically get the same inode pointed to by the dentry.

Currently this pathname is immutable once attached to an inode, and therefore it is not protected by locks or RCU, but this might have to change for us to support renames that preserve the inode on next access. This is not in this patchset yet. Also let me know if I've missed any locks etc. (I'm new to VFS, or for that matter, the kernel).

Storing one pathname per inode also means we don't reuse the same inode for hardlinks -- maybe this can be fixed as well in a future version, if this approach sounds good?
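For the pathname check in point 2, the comparison I have in mind is roughly the following. This is only a sketch with hypothetical field and helper names (not the final patch); locking is omitted, and it assumes the dentry being looked up can be plumbed into the data passed to the iget5_locked() test callback:

#include <linux/dcache.h>
#include <linux/string.h>

/*
 * Hypothetical: one snapshot per path component, attached to the inode
 * when it is first set up; names[0] is the last component, names[nr - 1]
 * the component directly below the mount root.
 */
struct v9fs_inode_path {
        unsigned int            nr;
        struct name_snapshot    *names;
};

/*
 * Walk up from the dentry being looked up and compare each component
 * against the names recorded on the candidate inode.  Both must reach
 * the root after exactly 'nr' components for the paths to match.
 * No locking here: as noted above, the stored path is currently
 * immutable and renames are not handled yet.
 */
static bool v9fs_path_matches(const struct v9fs_inode_path *path,
                              const struct dentry *dentry)
{
        unsigned int i;

        for (i = 0; i < path->nr; i++, dentry = dentry->d_parent) {
                const struct qstr *a = &path->names[i].name;
                const struct qstr *b = &dentry->d_name;

                if (IS_ROOT(dentry))
                        return false;
                if (a->len != b->len || memcmp(a->name, b->name, a->len))
                        return false;
        }
        return IS_ROOT(dentry);
}

The snapshot array would be built when the inode is first instantiated for a dentry and compared in the v9fs_test(_new)?_inode(_dotl)? callbacks in addition to the existing qid checks, so that an inode found by qid.path is only reused when it was created for the same pathname.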