On Fri, May 02, 2025 at 04:04:28PM +0200, Jann Horn wrote:
> On Fri, May 2, 2025 at 2:42 PM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> > I need some help with the following questions:
> >
> > (i) The core_pipe_limit setting is of vital importance to userspace
> >     because it allows it to a) limit the number of concurrent coredumps
> >     and b) causes the kernel to wait until userspace closes the pipe and
> >     thus prevents the process from being reaped, allowing userspace to
> >     parse information out of /proc/<pid>/.
> >
> >     Pipes already support this. I need to know from the networking
> >     people (or Oleg :)) how to wait for the userspace side to shutdown
> >     the socket/terminate the connection.
> >
> >     I don't want to just read() because then userspace can send us
> >     SCM_RIGHTS messages and it's really ugly anyway.
> >
> > (ii) The dumpability setting is of importance for userspace in order to
> >      know how a given binary is dumped: as regular user or as root user.
> >      This helps guard against exploits abusing set*id binaries. The
> >      setting needs to be the same as used at the time of the coredump.
> >
> >      I'm exposing this as part of PIDFD_GET_INFO. I would like some
> >      input whether it's fine to simply expose the dumpability this way.
> >      I'm pretty sure it is. But it'd be good to have @Jann give his
> >      thoughts here.
>
> My only concern here is that if we expect the userspace daemon to look
> at the dumpability field and treat nondumpable tasks as "this may
> contain secret data and resources owned by various UIDs mixed
> together, only root should see the dump", we should have at least very
> clear documentation around this.
>
> [...]
> > Userspace can get a stable handle on the task generating the coredump by
> > using the SO_PEERPIDFD socket option. SO_PEERPIDFD uses the thread-group
> > leader pid stashed during connect(). Even if the task generating the
>
> Unrelated to this series: Huh, I think I haven't seen SO_PEERPIDFD
> before. I guess one interesting consequence of that feature is that if

It's very heavily used by dbus-broker, polkit and systemd to safely
authenticate clients instead of by PIDs. (Fyi, it's even supported for
bluetooth sockets so they could benefit from this as well I'm sure.)

> you get a unix domain socket whose peer is in another PID namespace,
> you can call pidfd_getfd() on that peer, which wouldn't normally be
> possible? Though of course it'll still be subject to the normal ptrace
> checks.

I think that was already possible because you could send pidfds via
SCM_RIGHTS. That's a lot more cooperative than SO_PEERPIDFD of course
but still. But if that's an issue we could of course enforce that
pidfd_getfd() may only work if the target is within your pidns
hierarchy just as we do for the PIDFD_GET_INFO ioctl() already. But I'm
not sure it's an issue.
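
For anyone who hasn't used SO_PEERPIDFD yet, here is a rough userspace
sketch of how a receiver can get a pidfd for its peer on an AF_UNIX
socket. It is only an illustration, not code from this series. Two
things in it are assumptions: the helper name peer_pidfd() is made up,
and the fallback value 77 is the asm-generic number for SO_PEERPIDFD,
which differs on some architectures. On kernels without the option the
helper just returns None.

```python
import os
import socket
import struct

# Assumption: 77 is the asm-generic value of SO_PEERPIDFD. Some
# architectures use a different number, so prefer the constant from the
# socket module when the running Python exposes one.
SO_PEERPIDFD = getattr(socket, "SO_PEERPIDFD", 77)


def peer_pidfd(sock):
    """Return a pidfd referring to the peer of a connected AF_UNIX socket.

    Hypothetical helper for illustration. Returns None when the kernel
    (or platform) rejects the socket option.
    """
    try:
        raw = sock.getsockopt(socket.SOL_SOCKET, SO_PEERPIDFD, 4)
    except OSError:
        return None
    if len(raw) != 4:
        return None
    # The kernel hands back a new file descriptor as a native C int.
    return struct.unpack("i", raw)[0]


if __name__ == "__main__":
    # With socketpair() both ends belong to this process, so the pidfd
    # (if we get one) refers to ourselves.
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    fd = peer_pidfd(a)
    if fd is None:
        print("SO_PEERPIDFD not available here")
    else:
        print("got a pidfd for the peer")
        os.close(fd)
    a.close()
    b.close()
```

The returned descriptor is the handle you would then pass to things
like pidfd_getfd() or the PIDFD_GET_INFO ioctl(), subject to the checks
discussed above.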