On Fri, Apr 25, 2025 at 01:31:56PM +0200, Benjamin Drung wrote:
> Hi,
>
> On Mon, 2025-04-14 at 15:55 +0200, Christian Brauner wrote:
> > Give userspace a way to instruct the kernel to install a pidfd into the
> > usermode helper process. This makes coredump handling a lot more
> > reliable for userspace. In parallel with this commit we already have
> > systemd adding support for this in [1].
> >
> > We create a pidfs file for the coredumping process when we process the
> > corename pattern. When the usermode helper process is forked we then
> > install the pidfs file as file descriptor three into the usermode
> > helpers file descriptor table so it's available to the exec'd program.
> >
> > Since usermode helpers are either children of the system_unbound_wq
> > workqueue or kthreadd we know that the file descriptor table is empty
> > and can thus always use three as the file descriptor number.
> >
> > Note, that we'll install a pidfd for the thread-group leader even if a
> > subthread is calling do_coredump(). We know that task linkage hasn't
> > been removed due to delay_group_leader() and even if this @current isn't
> > the actual thread-group leader we know that the thread-group leader
> > cannot be reaped until @current has exited.
> >
> > Link: https://github.com/systemd/systemd/pull/37125 [1]
> > Tested-by: Luca Boccassi <luca.boccassi@xxxxxxxxx>
> > Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx>
> > ---
> >  fs/coredump.c            | 59 ++++++++++++++++++++++++++++++++++++++++++++----
> >  include/linux/coredump.h |  1 +
> >  2 files changed, 56 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/coredump.c b/fs/coredump.c
> > index 9da592aa8f16..403be0ff780e 100644
> > --- a/fs/coredump.c
> > +++ b/fs/coredump.c
> > @@ -43,6 +43,9 @@
> >  #include <linux/timekeeping.h>
> >  #include <linux/sysctl.h>
> >  #include <linux/elf.h>
> > +#include <linux/pidfs.h>
> > +#include <uapi/linux/pidfd.h>
> > +#include <linux/vfsdebug.h>
> >
> >  #include <linux/uaccess.h>
> >  #include <asm/mmu_context.h>
> > @@ -60,6 +63,12 @@ static void free_vma_snapshot(struct coredump_params *cprm);
> >  #define CORE_FILE_NOTE_SIZE_DEFAULT (4*1024*1024)
> >  /* Define a reasonable max cap */
> >  #define CORE_FILE_NOTE_SIZE_MAX (16*1024*1024)
> > +/*
> > + * File descriptor number for the pidfd for the thread-group leader of
> > + * the coredumping task installed into the usermode helper's file
> > + * descriptor table.
> > + */
> > +#define COREDUMP_PIDFD_NUMBER 3
> >
> >  static int core_uses_pid;
> >  static unsigned int core_pipe_limit;
> > @@ -339,6 +348,27 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
> >  			case 'C':
> >  				err = cn_printf(cn, "%d", cprm->cpu);
> >  				break;
> > +			/* pidfd number */
> > +			case 'F': {
> > +				/*
> > +				 * Installing a pidfd only makes sense if
> > +				 * we actually spawn a usermode helper.
> > +				 */
> > +				if (!ispipe)
> > +					break;
> > +
> > +				/*
> > +				 * Note that we'll install a pidfd for the
> > +				 * thread-group leader. We know that task
> > +				 * linkage hasn't been removed yet and even if
> > +				 * this @current isn't the actual thread-group
> > +				 * leader we know that the thread-group leader
> > +				 * cannot be reaped until @current has exited.
> > +				 */
> > +				cprm->pid = task_tgid(current);
> > +				err = cn_printf(cn, "%d", COREDUMP_PIDFD_NUMBER);
> > +				break;
> > +			}
> >  			default:
> >  				break;
> >  			}
> >
>
> I tried this change with Apport: I took the Ubuntu mainline kernel build
> https://kernel.ubuntu.com/mainline/daily/2025-04-24/ (that refers to
> mainline commit e54f9b0410347c49b7ffdd495578811e70d7a407) and applied
> these three patches on top. Then I modified Apport to take the
> additional `-F%F` and tested that on Ubuntu 25.04 (plucky). The result
> is the coredump failed as long as there was `-F%F` on

I have no clue what -F%F is and whether that leading -F is something
specific to Apport but the specifier is %F not -F%F. For example:

> cat /proc/sys/kernel/core_pattern
|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %F

And note that this requires the pipe logic to be used, aka "|" needs to
be specified. Without it this doesn't make sense.
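
For a helper that wants to consume %F, the userspace side could look
roughly like the sketch below. It assumes the helper gets the number the
kernel substitutes for %F as a plain decimal argument (3 with this patch)
and that the descriptor is already open when the helper is exec'd. The
names parse_pidfd_arg() and fd_is_open() are hypothetical; they are not
taken from systemd-coredump or Apport.

```c
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the decimal file descriptor number that
 * the kernel substitutes for %F. Returns the number, or -1 if the
 * argument is missing or malformed.
 */
int parse_pidfd_arg(const char *arg)
{
	char *end;
	long val;

	if (!arg || *arg == '\0')
		return -1;

	errno = 0;
	val = strtol(arg, &end, 10);
	/* Reject overflow, trailing garbage and out-of-range values. */
	if (errno || *end != '\0' || val < 0 || val > INT_MAX)
		return -1;

	return (int)val;
}

/*
 * Hypothetical helper: report whether @fd refers to an open file
 * descriptor. This only checks that the descriptor exists, not that
 * it is actually a pidfd.
 */
int fd_is_open(int fd)
{
	return fcntl(fd, F_GETFD) != -1;
}
```

A helper would call parse_pidfd_arg() on whichever argument carries %F
and only use the descriptor if the result is not -1 and fd_is_open()
returns nonzero; otherwise it can fall back to its existing PID-based
handling.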