On Fri, Sep 12, 2025 at 10:20 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
>
> On Thu, Sep 11, 2025 at 01:36:28PM +0200, Amir Goldstein wrote:
> > On Thu, Sep 11, 2025 at 11:31 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Sep 10, 2025 at 07:21:22PM +0200, Amir Goldstein wrote:
> > > > On Wed, Sep 10, 2025 at 4:39 PM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> > > > >
> > > > > A while ago we added support for file handles to pidfs so pidfds can be
> > > > > encoded and decoded as file handles. Userspace has adopted this quickly
> > > > > and it's proven very useful.
> > > >
> > > > > Pidfd file handles are exhaustive meaning
> > > > > they don't require a handle on another pidfd to pass to
> > > > > open_by_handle_at() so it can derive the filesystem to decode in.
> > > > >
> > > > > Implement the exhaustive file handles for namespaces as well.
> > > >
> > > > I think you decided to split the "exhaustive" part to another patch,
> > > > so better drop this paragraph?
> > >
> > > Yes, good point. I've done that.
> > >
> > > > I am missing an explanation about the permissions for
> > > > opening these file handles.
> > > >
> > > > My understanding of the code is that the opener needs to meet one of
> > > > the conditions:
> > > > 1. user has CAP_SYS_ADMIN in the userns owning the opened namespace
> > > > 2. current task is in the opened namespace
> > >
> > > Yes.
> > >
> > > >
> > > > But I do not fully understand the rationale behind the 2nd condition,
> > > > that is, when is it useful?
> > >
> > > A caller is always able to open a file descriptor to its own set of
> > > namespaces. File handles will behave the same way.
> > >
> >
> > I understand why it's safe, and I do not object to it at all,
> > I just feel that I do not fully understand the use case of how ns file handles
> > are expected to be used.
> > A process can always open /proc/self/ns/mnt
> > What's the use case where a process may need to open its own ns by handle?
> >
> > I will explain. For CAP_SYS_ADMIN I can see why keeping handles that
> > do not keep an elevated refcount of ns object could be useful in the same
> > way that an NFS client keeps file handles without keeping the file object alive.
> >
> > But if you do not have CAP_SYS_ADMIN and can only open your own ns
> > by handle, what is the application that could make use of this?
> > and what's the benefit of such application keeping a file handle instead of
> > ns fd?
>
> A process is not always able to open /proc/self/ns/. That requires
> procfs to be mounted and for /proc/self/ or /proc/self/ns/ to not be
> overmounted. However, they can derive a namespace fd from their own
> pidfd. And that also always works if it's their own namespace.
>
> There's no need to introduce unnecessary behavioral differences between
> /proc/self/ns/, pidfd-derived namespace fds, and file-handle-derived
> namespace fds. That's just going to be confusing.
>
> The other thing is that there are legitimate use-cases for encoding your
> own namespace. For example, you might store file handles to your set of
> namespaces in a file on-disk so you can verify when you get rexeced that
> they're still valid and so on. This is akin to the pidfd use-case.
>
> Or just plainly for namespace comparison reasons where you keep a file
> handle to your own namespaces and can then easily check against others.

OK. As I said, no objections. I was just curious about this use case.

FWIW, comparing current ns to a stored file handle does not really require
permission to open_by_handle_at().
name_to_handle_at() the current ns and binary compare to the stored file handle
should be a viable option.

This was exactly the reason for introducing AT_HANDLE_FID, so that an
unprivileged fanotify watcher with no permission to open_by_handle_at() could
compare an fid reported in an event with another fid they obtained earlier with
name_to_handle_at() and kept in a map.

Thanks for the explanation!
Amir.
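
For illustration, the "name_to_handle_at() and binary compare" idea above could
be sketched in userspace roughly as follows. This is only a sketch under
assumptions: encode_handle() and handles_equal() are hypothetical helper names,
and whether encoding succeeds for an nsfs path depends on the running kernel
supporting file handles for namespaces. The fallback definition of
AT_HANDLE_FID is there only for older libc headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#ifndef AT_HANDLE_FID
#define AT_HANDLE_FID 0x200 /* same value as AT_REMOVEDIR in the kernel UAPI */
#endif

/*
 * Hypothetical helper: encode a file handle for @path.
 * Returns a malloc()ed handle the caller must free(), or NULL on failure
 * (errno is set by name_to_handle_at() or malloc()).
 */
struct file_handle *encode_handle(const char *path, int flags)
{
	int mount_id;
	struct file_handle *fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);

	if (!fh)
		return NULL;

	/* Tell the kernel how much room f_handle[] has. */
	fh->handle_bytes = MAX_HANDLE_SZ;

	if (name_to_handle_at(AT_FDCWD, path, fh, &mount_id, flags) < 0) {
		free(fh);
		return NULL;
	}
	return fh;
}

/*
 * Hypothetical helper: binary comparison of two encoded handles.
 * No open_by_handle_at() and therefore no decode permission is needed.
 */
bool handles_equal(const struct file_handle *a, const struct file_handle *b)
{
	assert(a && b);

	return a->handle_type == b->handle_type &&
	       a->handle_bytes == b->handle_bytes &&
	       memcmp(a->f_handle, b->f_handle, a->handle_bytes) == 0;
}
```

A caller might encode "/proc/self/ns/mnt" with
AT_HANDLE_FID | AT_SYMLINK_FOLLOW (the /proc entry is a symlink, and
name_to_handle_at() does not follow symlinks by default), store the result, and
later pass a freshly encoded handle together with the stored one to
handles_equal().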