Re: [PATCH v6 4/9] coredump: add coredump socket

Lennart Poettering <mzxreary@xxxxxxxxxxx> · Tue, 13 May 2025 10:56:03 +0200

On Mo, 12.05.25 19:14, Kuniyuki Iwashima (kuniyu@xxxxxxxxxx) wrote:

> > > Note this version does not use prefix.  Now it requires users to
> > > just pass the socket cookie via core_pattern so that the kernel
> > > can verify the peer.
> >
> > Exactly - this means the pattern cannot be static in a sysctl.d early
> > on boot anymore, and has to be set dynamically by <something>.
>
> You missed the socket has to be created dynamically by <something>.

systemd implements socket activation: the generic code in PID 1 can
bind a socket, and then generically forks off a process (or instances
of processes for connection-based sockets) once traffic is seen on
that socket. On a typical, current systemd system, PID 1 does this for
~40 sockets by default. The code to bind AF_UNIX or AF_INET/AF_INET6
sockets is entirely generic.
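
For illustration, here is a minimal Python sketch of the service side
of that hand-off, assuming the documented sd_listen_fds() convention
(LISTEN_PID and LISTEN_FDS environment variables, passed fds starting
at 3). The function name and its parameters are made up for this
example; this is not code from systemd.

```python
import os

# First inherited fd under the sd_listen_fds() convention.
SD_LISTEN_FDS_START = 3


def listen_fds(environ=None, pid=None):
    """Return the fd numbers passed in by the service manager, or []."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        # The fds are only meant for us if LISTEN_PID names our own PID.
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        # Variables unset or malformed: not socket-activated.
        return []
    if count < 0:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))
```

The service never binds anything itself; it just picks up whatever
PID 1 bound for it, which is why the binding side can stay generic.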

In the current systemd codebase, coredump handling is already implemented
via socket activation: the core_pattern handler binary quickly hands
off the coredump fds to an AF_UNIX socket bound that way, and the
service behind that does the heavy lifting. Our hope is that with
Christian's work we can make the kernel deliver the coredumps directly
to the socket PID1 generically binds, getting rid of one middle man.

By requiring userspace to echo the SO_COOKIE value into the
core_pattern sysctl in a special format, you define a bespoke
protocol: it is no longer enough to bind a socket (for which the
generic code in PID1 is good enough) and to write a fixed string into
a sysctl (for which the generic code in the current /etc/sysctl.d/
manager, i.e. systemd-sysctl, works fine). Instead, you are suddenly
asking userspace to run some specific tool at early boot that
extracts the socket cookie from PID1 somehow and writes it into the
sysctl. We'd have to come up with a new tool for that; we can no
longer use generic tools. And that's the part that Luca doesn't like.

To a large degree I agree with Luca about this. I would much prefer
Christian's earlier proposal (i.e. to simply define some prefix of
AF_UNIX abstract namespace addresses as requiring privs to bind),
because that would enable us to do generic handling in userspace: the
existing socket binding logic in PID 1, and the existing sysctl.d
handling in the systemd suite would be good enough to set up
everything for the coredump handling.

That said, I'd take what we can get. If enforcing privs on some
abstract namespace socket address prefix is not acceptable, then we
can probably make the SO_COOKIE proposal work (Luca: we'd just hook
some small tool into ExecStartPost= of the .socket unit, and make PID1
pass the cookie in some env var or so to it; the tool would then just
echo that env var into the sysctl with the fixed prefix). In my eyes,
it's not ideal though: it would mean the sysctl data on every instance
of the system image would necessarily deviate (because the
socket cookie is going to be different), which mgmt tools won't like
(as you cannot compare sysctl state anymore), and we'd have a weak
conflict of ownership: right now most sysctl settings are managed by
/etc/sysctl.d/, but core_pattern suddenly no longer would be. This
will create conflicts because suddenly two components write to the
same sysctl, and they will start fighting.
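
To make the ExecStartPost= idea concrete, a rough Python sketch of
such a helper follows. The env var name COREDUMP_SOCKET_COOKIE and
both function names are hypothetical, and the prefix is a
caller-supplied placeholder, since the exact core_pattern syntax is
whatever the patch ends up defining.

```python
import os

# Hypothetical env var in which PID 1 would pass the socket cookie.
COOKIE_ENV = "COREDUMP_SOCKET_COOKIE"


def build_core_pattern(prefix, cookie):
    """Join the fixed prefix and the decimal cookie into one string."""
    cookie = cookie.strip()
    if not (cookie.isascii() and cookie.isdigit()):
        raise ValueError("cookie is not a decimal number")
    value = int(cookie)
    # SO_COOKIE values are unsigned 64-bit integers.
    if value >= 2**64:
        raise ValueError("cookie does not fit in 64 bits")
    return f"{prefix}{value}"


def write_core_pattern(prefix, environ=None,
                       path="/proc/sys/kernel/core_pattern"):
    """Write the per-boot pattern to the sysctl and return it."""
    environ = os.environ if environ is None else environ
    pattern = build_core_pattern(prefix, environ[COOKIE_ENV])
    with open(path, "w") as f:
        f.write(pattern + "\n")
    return pattern
```

Note that the value written differs on every boot and every machine,
and that this helper writes a sysctl that /etc/sysctl.d/ otherwise
owns, which is exactly the pair of problems described above.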

Hence: I'd *much* prefer Christian's original approach as it does not
have these issues. But I'll take what I can get: we can make the
cookie approach work, it's just much uglier.

I am not sure I understand why enforcing privs on some abstract
namespace socket address prefix is such an unacceptable idea though.

Lennart

--
Lennart Poettering, Berlin