On Tue, 2025-06-03 at 15:31 +0200, Christian Brauner wrote: > In addition to the extensive selftests I've already written a > (non-production ready) simple Rust coredump server for this in > userspace: > > https://github.com/brauner/dumdum.git > > Extend the coredump socket to allow the coredump server to tell the > kernel how to process individual coredumps. This allows for fine-grained > coredump management. Userspace can decide to just let the kernel write > out the coredump, or generate the coredump itself, or just reject it. > > When the crashing task connects to the coredump socket the kernel will > send a struct coredump_req to the coredump server. The kernel will set > the size member of struct coredump_req allowing the coredump server how > much data can be read. > > The coredump server uses MSG_PEEK to peek the size of struct > coredump_req. If the kernel uses a newer struct coredump_req the > coredump server just reads the size it knows and discard any remaining > bytes in the buffer. If the kernel uses an older struct coredump_req > the coredump server just reads the size the kernel knows. > > The returned struct coredump_req will inform the coredump server what > features the kernel supports. The coredump_req->mask member is set to > the currently know features. > > The coredump server may only use features whose bits were raised by the > kernel in coredump_req->mask. > > In response to a coredump_req from the kernel the coredump server sends > a struct coredump_ack to the kernel. The kernel informs the coredump > server what version of struct coredump_ack it supports by setting struct > coredump_req->size_ack to the size it knows about. The coredump server > may only send as many bytes as coredump_req->size_ack indicates (a > smaller size is fine of course). The coredump server must set > coredump_ack->size accordingly. > > The coredump server sets the features it wants to use in struct > coredump_ack->mask. Only bits returned in struct coredump_req->mask may > be used. > > In case an invalid struct coredump_ack is sent to the kernel an > out-of-band byte will be sent by the kernel indicating the reason why > the coredump_ack was rejected. > > The out-of-band markers allow advanced userspace to infer failure. They > are optional and can be ignored by not listening for POLLPRI events and > aren't necessary for the coredump server to function correctly. > > In the initial version the following features are supported in > coredump_{req,ack}->mask: > > * COREDUMP_KERNEL > The kernel will write the coredump data to the socket. > > * COREDUMP_USERSPACE > The kernel will not write coredump data but will indicate to the > parent that a coredump has been generated. This is used when userspace > generates its own coredumps. > > * COREDUMP_REJECT > The kernel will skip generating a coredump for this task. > > * COREDUMP_WAIT > The kernel will prevent the task from exiting until the coredump > server has shutdown the socket connection. > How do you envision COREDUMP_WAIT being used? I took a look at the trivial server, but it wasn't clear to me why you'd want to block the task from exiting. > The flexible coredump socket can be enabled by using the "@@" prefix > instead of the single "@" prefix for the regular coredump socket: > > @@/run/systemd/coredump.socket > > will enable flexible coredump handling. Current kernels already enforce > that "@" must be followed by "/" and will reject anything else. So > extending this is backward and forward compatible. > > Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx> > --- > Changes in v2: > - Add epoll-based concurrent coredump handling selftests. > - Improve cover letter. > - Ensure that enum coredump_oob is packed aka a single byte and add a > static_assert() verifying that. > - Simplify helper functions making the patch even smaller. > - Link to v1: https://lore.kernel.org/20250530-work-coredump-socket-protocol-v1-0-20bde1cd4faa@xxxxxxxxxx > > --- > Christian Brauner (5): > coredump: allow for flexible coredump handling > selftests/coredump: fix build > selftests/coredump: cleanup coredump tests > tools: add coredump.h header > selftests/coredump: add coredump server selftests > > fs/coredump.c | 130 +- > include/uapi/linux/coredump.h | 104 ++ > tools/include/uapi/linux/coredump.h | 104 ++ > tools/testing/selftests/coredump/Makefile | 2 +- > tools/testing/selftests/coredump/config | 4 + > tools/testing/selftests/coredump/stackdump_test.c | 1705 ++++++++++++++++++--- > 6 files changed, 1799 insertions(+), 250 deletions(-) > --- > base-commit: 3e406741b19890c3d8a2ed126aa7c23b106ca9e1 > change-id: 20250520-work-coredump-socket-protocol-6980d1f54c2f -- Jeff Layton <jlayton@xxxxxxxxxx>