On 2025-05-02, Allison Karlitskaya <lis@xxxxxxxxxx> wrote:
> hi,
>
> Please excuse me if these are dumb questions. I'm not great at this stuff. :)
>
> In fuse_backing_open() there's a check with an interesting comment:
>
>     /* TODO: relax CAP_SYS_ADMIN once backing files are visible to lsof */
>     res = -EPERM;
>     if (!fc->passthrough || !capable(CAP_SYS_ADMIN))
>         goto out;
>
> I've done some research into this but I wasn't able to find any
> original discussion about what led to this, or about current plans to
> "relax" this restriction -- only speculation about it being a
> potential mechanism to "hide" open files.
>
> It would be nice to have an official story about this, on the record.
> What's the concrete problem here, and what would it take to solve it?
> Are there plans? Is help required? Would it be possible to relax the
> check to having CAP_SYS_ADMIN in the userns which owns the mount (ie:
> ns_capable(...))? What would it take to do that? It would be
> wonderful to be able to use this inside of containers.
>
> The most obvious guess about direction (based on the comment) is that
> we need to do something to make sure that fds that are registered with
> backing IDs remain visible in the output of `lsof` even after the
> original fd is closed?
>
> Thanks in advance for any information you can give. Even if the
> answer is "no, it's impossible" it would be great to have that on
> record.

My guess is that the issue is that we don't want an unprivileged process
to be able to create a file reference that cannot be found (with
something like lsof) and forcefully closed/killed by a sysadmin.
Otherwise you could end up with a DoS, with an admin being unable to
unmount a filesystem or otherwise figure out what process is holding on
to garbage.

My hot take is that this is already possible in several ways, though
admittedly the ones I can think of all require unprivileged user
namespaces.
(You can create a bind-mount that is kept alive but not visible to any
user-space process. The simplest way is to do mounts and chroot. Another
is with open_tree().)

Now, these won't block umount outright but you'll get the same effect as
umount -l, which can be a problem.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/