On 4/22/2025 10:55 PM, David Hildenbrand wrote: > On 10.04.25 10:41, Christoph Hellwig wrote: >> On Tue, Apr 08, 2025 at 11:23:57AM +0000, Shivank Garg wrote: >>> KVM guest_memfd is implementing its own inodes to store metadata for >>> backing memory using a custom filesystem. This requires the ability to >>> initialize anonymous inode using security_inode_init_security_anon(). >>> >>> As guest_memfd currently resides in the KVM module, we need to export this >>> symbol for use outside the core kernel. In the future, guest_memfd might be >>> moved to core-mm, at which point the symbols no longer would have to be >>> exported. When/if that happens is still unclear. >> >> This really should be a EXPORT_SYMBOL_GPL, if at all. >> >> But you really should look into a new interface in anon_inode.c that >> can be reused instead of duplicating anonymouns inode logic in kvm.ko. > > I assume you mean combining the alloc_anon_inode()+ > security_inode_init_security_anon(), correct? > > I can see mm/secretmem.c doing the same thing, so agreed that > we're duplicating it. > > > Regarding your other mail, I am also starting to wonder where/why > we want security_inode_init_security_anon(). At least for > mm/secretmem.c, it was introduced by: > > commit 2bfe15c5261212130f1a71f32a300bcf426443d4 > Author: Christian Göttsche <cgzones@xxxxxxxxxxxxxx> > Date: Tue Jan 25 15:33:04 2022 +0100 > > mm: create security context for memfd_secret inodes > Create a security context for the inodes created by memfd_secret(2) via > the LSM hook inode_init_security_anon to allow a fine grained control. > As secret memory areas can affect hibernation and have a global shared > limit access control might be desirable. > Signed-off-by: Christian Göttsche <cgzones@xxxxxxxxxxxxxx> > Signed-off-by: Paul Moore <paul@xxxxxxxxxxxxxx> > > > In combination with Paul's review comment [1] > > " > This seems reasonable to me, and I like the idea of labeling the anon > inode as opposed to creating a new set of LSM hooks. If we want to > apply access control policy to the memfd_secret() fds we are going to > need to attach some sort of LSM state to the inode, we might as well > use the mechanism we already have instead of inventing another one. > " > > > IIUC, we really only want security_inode_init_security_anon() when there > might be interest to have global access control. > > > Given that guest_memfd already shares many similarities with guest_memfd > (e.g., pages not swappable/migratable) and might share even more in the future > (e.g., directmap removal), I assume that we want the same thing for guest_memfd. > > > Would something like the following seem reasonable? We should be adding some > documentation for the new function, and I wonder if S_PRIVATE should actually > be cleared for secretmem + guest_memfd (I have no idea what this "fs-internal" flag > affects). > Here's my understanding of S_PRIVATE flag: 1. S_PRIVATE tells the kernel that an inode is special and it should skip the LSM permission checks (via IS_PRIVATE()): For instance, int security_inode_mknod(struct inode *dir, struct dentry *dentry, umode_t mode, dev_t dev) { if (unlikely(IS_PRIVATE(dir))) return 0; return call_int_hook(inode_mknod, dir, dentry, mode, dev); } 2. In landlock LSM: S_PRIVATE inodes are denied from opening file and getting path from fd syscall: open_by_handle_at calls do_handle_open -> handle_to_path -> get_path_from_fd static int get_path_from_fd(const s32 fd, struct path *const path) { ... /* * Forbids ruleset FDs, internal filesystems (e.g. nsfs), including * pseudo filesystems that will never be mountable (e.g. sockfs, * pipefs). */ if ((fd_file(f)->f_op == &ruleset_fops) || (fd_file(f)->f_path.mnt->mnt_flags & MNT_INTERNAL) || (fd_file(f)->f_path.dentry->d_sb->s_flags & SB_NOUSER) || d_is_negative(fd_file(f)->f_path.dentry) || IS_PRIVATE(d_backing_inode(fd_file(f)->f_path.dentry))) return -EBADFD; Using is_nouser_or_private() in is_access_to_paths_allowed() (allows accesses for requests with a common path) static bool is_access_to_paths_allowed() { ... if (is_nouser_or_private(path->dentry)) return true; 3. S_PRIVATE skips security attribute initialization in SELinux: security/selinux/hooks.c sb_finish_set_opts(){ ... if (inode) { if (!IS_PRIVATE(inode)) inode_doinit_with_dentry(inode, NULL); 4. mm/shmem.c /** * shmem_kernel_file_setup - get an unlinked file living in tmpfs which must be * kernel internal. There will be NO LSM permission checks against the * underlying inode. So users of this interface must do LSM checks at a * higher layer. The users are the big_key and shm implementations. LSM * checks are provided at the key or shm level rather than the inode. * @name: name for dentry (to be seen in /proc/<pid>/maps) * @size: size to be set for the file * @flags: VM_NORESERVE suppresses pre-accounting of the entire object size */ struct file *shmem_kernel_file_setup(const char *name, loff_t size, unsigned long flags) { return __shmem_file_setup(shm_mnt, name, size, flags, S_PRIVATE);