On Tue, May 13, 2025 at 10:56:03AM +0200, Lennart Poettering wrote:
> On Mo, 12.05.25 19:14, Kuniyuki Iwashima (kuniyu@xxxxxxxxxx) wrote:
>
> > > > Note this version does not use prefix. Now it requires users to
> > > > just pass the socket cookie via core_pattern so that the kernel
> > > > can verify the peer.
> > >
> > > Exactly - this means the pattern cannot be static in a sysctl.d early
> > > on boot anymore, and has to be set dynamically by <something>.
> >
> > You missed the socket has to be created dynamically by <something>.
>
> systemd implements socket activation: the generic code in PID 1 can
> bind a socket, and then generically forks off a process (or instances
> of processes for connection-based sockets) once traffic is seen on
> that socket. On a typical, current systemd system, PID 1 does this for
> ~40 sockets by default. The code to bind AF_UNIX or AF_INET/AF_INET6
> sockets is entirely generic.
>
> Currently, in the existing systemd codebase coredumping is implemented
> via socket activation: the core_pattern handler binary quickly hands
> off the coredump fds to an AF_UNIX socket bound that way, and the
> service behind that does the heavy lifting. Our hope is that with
> Christian's work we can make the kernel deliver the coredumps directly
> to the socket PID1 generically binds, getting rid of one middle man.
>
> By requiring userspace to echo the SO_COOKIE value into the
> core_pattern sysctl in a special formatting, you define a bespoke
> protocol: it's not just enough to bind a socket (for which the generic
> code in PID1 is good enough), and to write a fixed
> string into a sysctl (for which the generic code in the current
> /etc/sysctl.d/ manager, i.e. systemd-sysctl, works fine). But you
> suddenly are asking from userspace, that some specific tool runs at
> early boot, extracts the socket cookie from PID1 somehow, and writes
> that into sysctl. We'd have to come up with a new tool for that, we
> can no longer use generic tools. And that's the part that Luca doesn't
> like.
>
> To a large degree I agree with Luca about this. I would much prefer
> Christian's earlier proposal (i.e. to simply define some prefix of
> AF_UNIX abstract namespace addresses as requiring privs to bind),
> because that would enable us to do generic handling in userspace: the
> existing socket binding logic in PID 1, and the existing sysctl.d
> handling in the systemd suite would be good enough to set up
> everything for the coredump handling.
>
> That said, I'd take what we can get. If enforcing privs on some
> abstract namespace socket address prefix is not acceptable, then we
> can probably make the SO_COOKIE proposal work (Luca: we'd just hook
> some small tool into ExecStartPost= of the .socket unit, and make PID1
> pass the cookie in some env var or so to it; the tool would then just
> echo that env var into the sysctl with the fixed prefix). In my eyes,
> it's not ideal though: it would mean the sysctl data on every instance
> of the system image would necessarily deviate (because the
> socket cookie is going to be different), which mgmt tools won't like
> (as you cannot compare sysctl state anymore), and we'd have a weak
> conflict of ownership: right now most sysctl settings are managed by
> /etc/sysctl.d/, but the core_pattern suddenly wouldn't be
> anymore. This will create conflicts because suddenly two components
> write to the thing, and will start fighting.
>
> Hence: I'd *much* prefer Christian's original approach as it does not
> have these issues. But I'll take what I can get, we can make the
> cookie thing work, but it's much uglier.
>
> I am not sure I understand why enforcing privs on some abstract
> namespace socket address prefix is such an unacceptable idea though.

I prefer the prefix approach as well. It's clean, simple, elegant, and
safe by itself. And it fits into the generic socket activation and
system administration models.
I mainly showcased the cookie model as an elaborate workaround. It can
be done, but it's ugly and more difficult to use. I do have one more
idea for solving this problem cleanly using regular socket paths that
hopefully pleases everyone.
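To make the extra moving part in the cookie workaround concrete, here is
a rough sketch of the kind of small helper that would have to run from
ExecStartPost= of the .socket unit. It is only an illustration under
stated assumptions: the environment variable name COREDUMP_SOCKET_COOKIE
and the "PREFIX:" string are placeholders I made up, since neither the
variable PID 1 would export nor the exact core_pattern formatting is
fixed anywhere in this thread.

```python
#!/usr/bin/env python3
"""Sketch of a helper that copies a socket cookie into core_pattern."""
import os

# Hypothetical: name of the env var PID 1 would export the cookie in.
COOKIE_ENV = "COREDUMP_SOCKET_COOKIE"
# Hypothetical: stand-in for the fixed prefix the kernel would expect.
PLACEHOLDER_PREFIX = "PREFIX:"
# Real procfs path of the core_pattern sysctl.
CORE_PATTERN_PATH = "/proc/sys/kernel/core_pattern"


def build_core_pattern(cookie: str, prefix: str = PLACEHOLDER_PREFIX) -> str:
    """Validate the cookie as an unsigned 64-bit integer and format it."""
    value = int(cookie, 0)  # accepts decimal or 0x-prefixed hex
    if not 0 <= value < 2**64:
        raise ValueError(f"cookie out of u64 range: {cookie!r}")
    return f"{prefix}{value}"


def main(environ=os.environ, path: str = CORE_PATTERN_PATH) -> None:
    """Read the cookie from the environment and write the pattern to path."""
    pattern = build_core_pattern(environ[COOKIE_ENV])
    with open(path, "w") as f:
        f.write(pattern + "\n")


if __name__ == "__main__":
    main()
```

Because the written value depends on a per-boot cookie, the resulting
core_pattern differs on every instance and cannot live in a static
/etc/sysctl.d/ file, which is exactly the drawback described above.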