Re: [PATCH v7 4/9] coredump: add coredump socket

Christian Brauner <brauner@xxxxxxxxxx> · Fri, 16 May 2025 12:20:59 +0200

On Fri, May 16, 2025 at 12:09:21PM +0200, Christian Brauner wrote:
> On Thu, May 15, 2025 at 10:54:14PM +0200, Jann Horn wrote:
> > On Thu, May 15, 2025 at 12:04 AM Christian Brauner <brauner@xxxxxxxxxx> wrote:
> > > diff --git a/fs/coredump.c b/fs/coredump.c
> > > index a70929c3585b..e1256ebb89c1 100644
> > > --- a/fs/coredump.c
> > > +++ b/fs/coredump.c
> > [...]
> > > @@ -393,11 +428,20 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
> > >          * If core_pattern does not include a %p (as is the default)
> > >          * and core_uses_pid is set, then .%pid will be appended to
> > >          * the filename. Do not do this for piped commands. */
> > > -       if (!(cn->core_type == COREDUMP_PIPE) && !pid_in_pattern && core_uses_pid) {
> > > -               err = cn_printf(cn, ".%d", task_tgid_vnr(current));
> > > -               if (err)
> > > -                       return err;
> > > +       if (!pid_in_pattern && core_uses_pid) {
> > > +               switch (cn->core_type) {
> > > +               case COREDUMP_FILE:
> > > +                       return cn_printf(cn, ".%d", task_tgid_vnr(current));
> > > +               case COREDUMP_PIPE:
> > > +                       break;
> > > +               case COREDUMP_SOCK:
> > > +                       break;
> > 
> > This branch is dead code, we can't get this far down with
> > COREDUMP_SOCK. Maybe you could remove the "break;" and fall through to
> > the default WARN_ON_ONCE() here. Or better, revert this hunk and
> > instead just change the check to check for "cn->core_type ==
> > COREDUMP_FILE" (in patch 1), since this whole block is legacy logic
> > specific to dumping into files (COREDUMP_FILE).
> 
> Ok, folded:
> 
> diff --git a/fs/coredump.c b/fs/coredump.c
> index 368751d98781..45725465c299 100644
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -393,11 +393,8 @@ static int format_corename(struct core_name *cn, struct coredump_params *cprm,
>          * If core_pattern does not include a %p (as is the default)
>          * and core_uses_pid is set, then .%pid will be appended to
>          * the filename. Do not do this for piped commands. */
> -       if (!(cn->core_type == COREDUMP_PIPE) && !pid_in_pattern && core_uses_pid) {
> -               err = cn_printf(cn, ".%d", task_tgid_vnr(current));
> -               if (err)
> -                       return err;
> -       }
> +       if (cn->core_type == COREDUMP_FILE && !pid_in_pattern && core_uses_pid)
> +               return cn_printf(cn, ".%d", task_tgid_vnr(current));
>         return 0;
>  }
> 
> into the first patch.
> 
> > 
> > > +               default:
> > > +                       WARN_ON_ONCE(true);
> > > +                       return -EINVAL;
> > > +               }
> > >         }
> > > +
> > >         return 0;
> > >  }
> > >
> > > @@ -801,6 +845,55 @@ void do_coredump(const kernel_siginfo_t *siginfo)
> > >                 }
> > >                 break;
> > >         }
> > > +       case COREDUMP_SOCK: {
> > > +#ifdef CONFIG_UNIX
> > > +               struct file *file __free(fput) = NULL;
> > > +               struct sockaddr_un addr = {
> > > +                       .sun_family = AF_UNIX,
> > > +               };
> > > +               ssize_t addr_len;
> > > +               struct socket *socket;
> > > +
> > > +               retval = strscpy(addr.sun_path, cn.corename, sizeof(addr.sun_path));
> > 
> > nit: strscpy() explicitly supports eliding the last argument in this
> > case, thanks to macro magic:
> > 
> >  * The size argument @... is only required when @dst is not an array, or
> >  * when the copy needs to be smaller than sizeof(@dst).
> 
> Ok.
> 
> > 
> > > +               if (retval < 0)
> > > +                       goto close_fail;
> > > +               addr_len = offsetof(struct sockaddr_un, sun_path) + retval + 1;
> > 
> > nit: On a 64-bit system, strscpy() returns a 64-bit value, and
> > addr_len is also 64-bit, but retval is 32-bit. Implicitly moving
> > length values back and forth between 64-bit and 32-bit is slightly
> > dodgy and might generate suboptimal code (it could force the compiler
> > to emit instructions to explicitly truncate the value if it can't
> > prove that the value fits in 32 bits). It would be nice to keep the
> > value 64-bit throughout by storing the return value in a ssize_t.
> > 
> > And actually, you don't have to compute addr_len here at all; that's
> > needed for abstract unix domain sockets, but for path-based unix
> > domain socket, you should be able to just use sizeof(struct
> > sockaddr_un) as addrlen. (This is documented in "man 7 unix".)
> 
> Ok, folded:
> 
> @@ -845,10 +845,10 @@ void do_coredump(const kernel_siginfo_t *siginfo)
>                 ssize_t addr_len;
>                 struct socket *socket;
> 
> -               retval = strscpy(addr.sun_path, cn.corename);
> -               if (retval < 0)
> +               addr_len = strscpy(addr.sun_path, cn.corename);
> +               if (addr_len < 0)
>                         goto close_fail;
> -               addr_len = offsetof(struct sockaddr_un, sun_path) + retval + 1;
> +               addr_len += offsetof(struct sockaddr_un, sun_path) + 1;
> 
> > 
> > > +
> > > +               /*
> > > +                * It is possible that the userspace process which is
> > > +                * supposed to handle the coredump and is listening on
> > > +                * the AF_UNIX socket coredumps. Userspace should just
> > > +                * mark itself non dumpable.
> > > +                */
> > > +
> > > +               retval = sock_create_kern(&init_net, AF_UNIX, SOCK_STREAM, 0, &socket);
> > > +               if (retval < 0)
> > > +                       goto close_fail;
> > > +
> > > +               file = sock_alloc_file(socket, 0, NULL);
> > > +               if (IS_ERR(file)) {
> > > +                       sock_release(socket);
> > 
> > I think you missed an API gotcha here. See the sock_alloc_file() documentation:
> > 
> >  * On failure @sock is released, and an ERR pointer is returned.
> 
> Thanks, fixed.
> 
> > 
> > So I think basically sock_alloc_file() always consumes the socket
> > reference provided by the caller, and the sock_release() in this
> > branch is a double-free?
> 
> > 
> > > +                       goto close_fail;
> > > +               }
> > [...]
> > > diff --git a/include/linux/net.h b/include/linux/net.h
> > > index 0ff950eecc6b..139c85d0f2ea 100644
> > > --- a/include/linux/net.h
> > > +++ b/include/linux/net.h
> > > @@ -81,6 +81,7 @@ enum sock_type {
> > >  #ifndef SOCK_NONBLOCK
> > >  #define SOCK_NONBLOCK  O_NONBLOCK
> > >  #endif
> > > +#define SOCK_COREDUMP  O_NOCTTY
> > 
> > Hrrrm. I looked through all the paths from which the ->connect() call
> > can come, and I think this is currently safe; but I wonder if it would
> 
> Yes, I made sure that unknown bits are excluded.

See the appended updated version for completeness sake.