Hi Mathieu,

On Mon, 21 Jul 2025 11:20:34 -0400
Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:

> Hi!
>
> I've written up an RFC for a new system call to handle sframe registration
> for shared libraries. There has been interest to cover both sframe in
> the short term, but also JIT use-cases in the long term, so I'm
> covering both here in this RFC to provide the full context. Implementation
> wise we could start by only covering the sframe use-case.
>
> I've called it "codectl(2)" for now, but I'm of course open to feedback.

Nice idea for JIT, but I doubt we need this for ELF.

>
> For ELF, I'm including the optional pathname, build id, and debug link
> information which are really useful to translate from instruction pointers
> to executable/library name, symbol, offset, source file, line number.

For an ELF file, doesn't the kernel already know how to parse the ELF header?
I just wonder what happens if the user sends different information to the kernel.

> This is what we are using in LTTng-UST and Babeltrace debug-info filter
> plugin [1], and I think this would be relevant for kernel tracers as well
> so they can make the resulting stack traces meaningful to users.
>
> sys_codectl(2)
> =================
>
> * arg0: unsigned int @option:
>
> /* Additional labels can be added to enum code_opt, for extensibility. */
>
> enum code_opt {
>     CODE_REGISTER_ELF,
>     CODE_REGISTER_JIT,
>     CODE_UNREGISTER,
> };
>
> * arg1: void * @info
>
> /* if (@option == CODE_REGISTER_ELF) */
>
> /*
>  * text_start, text_end, sframe_start, sframe_end allow unwinding of the
>  * call stack.
>  *
>  * elf_start, elf_end, pathname, and either build_id or debug_link allows
>  * mapping instruction pointers to file, symbol, offset, and source file
>  * location.
>  */
> struct code_elf_info {
>     __u64 elf_start;
>     __u64 elf_end;
>     __u64 text_start;
>     __u64 text_end;

What happens if there are multiple .text.* sections?
Or is this used once for each text section?

>     __u64 sframe_start;
>     __u64 sframe_end;
>     __u64 pathname;             /* char *, NULL if unavailable. */
>
>     __u64 build_id;             /* char *, NULL if unavailable. */
>     __u64 debug_link_pathname;  /* char *, NULL if unavailable. */
>     __u32 build_id_len;
>     __u32 debug_link_crc;
> };
>
>
> /* if (@option == CODE_REGISTER_JIT) */
>
> /*
>  * Registration of sorted JIT unwind table: The reserved memory area is
>  * of size reserved_len. Userspace increases used_len as new code is
>  * populated between text_start and text_end. This area is populated in
>  * increasing address order, and its ABI requires to have no overlapping
>  * fre. This fits the common use-case where JITs populate code into
>  * a given memory area by increasing address order. The sorted unwind
>  * tables can be chained with a singly-linked list as they become full.
>  * Consecutive chained tables are also in sorted text address order.
>  *
>  * Note: if there is an eventual use-case for unsorted jit unwind table,
>  * this would be introduced as a new "code option".
>  */
>
> struct code_jit_info {
>     __u64 text_start;      /* text_start >= addr */
>     __u64 text_end;        /* addr < text_end */
>     __u64 unwind_head;     /* struct code_jit_unwind_table * */
> };
>
> struct code_jit_unwind_fre {
>     /*
>      * Contains info similar to sframe, allowing unwind for a given

Hmm, why not just the sframe?
(Is there any library to generate sframe online for JIT?)

Thank you,

>      * code address range.
>      */
>     __u32 size;
>     __u32 ip_off;  /* offset from text_start */
>     __s32 cfa_off;
>     __s32 ra_off;
>     __s32 fp_off;
>     __u8 info;
> };
>
> struct code_jit_unwind_table {
>     __u64 reserved_len;
>     __u64 used_len; /*
>                      * Incremented by userspace (store-release), read by
>                      * the kernel (load-acquire).
>                      */
>     __u64 next;     /* Chain with next struct code_jit_unwind_table. */
>     struct code_jit_unwind_fre fre[];
> };
>
> /* if (@option == CODE_UNREGISTER) */
>
> void *info
>
> * arg2: size_t info_size
>
> /*
>  * Size of @info structure, allowing extensibility.
>  * See copy_struct_from_user().
>  */
>
> * arg3: unsigned int flags (0)
>
> /* Flags for extensibility. */
>
> Your feedback is welcome,
>
> Thanks,
>
> Mathieu
>
> [1] https://babeltrace.org/docs/v2.0/man7/babeltrace2-filter.lttng-utils.debug-info.7/
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> https://www.efficios.com
>

--
Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>
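
P.S. To make the used_len protocol in the quoted code_jit_unwind_table concrete, here is a minimal userspace sketch of how I read it, not the RFC's implementation. It uses C11 atomics: the writer fills an entry and then publishes it with a store-release, and the reader does a load-acquire followed by a binary search over the sorted entries. Several things here are assumptions of mine that the RFC does not state: that reserved_len and used_len count bytes of fre[], that the fre "size" field is the length of the covered code range, the <stdint.h> types standing in for __u64/__u32, and the helper names jit_table_alloc(), jit_table_append() and jit_table_lookup(), which are hypothetical. The real reader would be the kernel accessing user memory, which this sketch does not model.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Mirrors struct code_jit_unwind_fre from the RFC, with <stdint.h> types. */
struct code_jit_unwind_fre {
	uint32_t size;   /* ASSUMPTION: length in bytes of the covered range */
	uint32_t ip_off; /* offset from text_start */
	int32_t cfa_off;
	int32_t ra_off;
	int32_t fp_off;
	uint8_t info;
};

struct code_jit_unwind_table {
	uint64_t reserved_len;     /* ASSUMPTION: capacity of fre[] in bytes */
	_Atomic uint64_t used_len; /* ASSUMPTION: published bytes of fre[] */
	uint64_t next;             /* chaining is not exercised in this sketch */
	struct code_jit_unwind_fre fre[];
};

/* Hypothetical helper: allocate a zeroed table with room for nr_fre entries. */
static struct code_jit_unwind_table *jit_table_alloc(size_t nr_fre)
{
	size_t len = nr_fre * sizeof(struct code_jit_unwind_fre);
	struct code_jit_unwind_table *t = calloc(1, sizeof(*t) + len);

	if (t)
		t->reserved_len = len;
	return t;
}

/* Writer side: fill the entry first, then publish it with a store-release. */
static int jit_table_append(struct code_jit_unwind_table *t,
			    const struct code_jit_unwind_fre *fre)
{
	uint64_t used = atomic_load_explicit(&t->used_len, memory_order_relaxed);
	size_t n = used / sizeof(*fre);

	if (used + sizeof(*fre) > t->reserved_len)
		return -1; /* full: a real JIT would chain a new table via next */
	if (n > 0 && (uint64_t)fre->ip_off <
		     (uint64_t)t->fre[n - 1].ip_off + t->fre[n - 1].size)
		return -1; /* would break the sorted, non-overlapping order */

	t->fre[n] = *fre;
	atomic_store_explicit(&t->used_len, used + sizeof(*fre),
			      memory_order_release);
	return 0;
}

/* Reader side: load-acquire used_len, then binary-search published entries. */
static const struct code_jit_unwind_fre *
jit_table_lookup(struct code_jit_unwind_table *t, uint32_t off)
{
	uint64_t used = atomic_load_explicit(&t->used_len, memory_order_acquire);
	size_t lo = 0, hi = used / sizeof(t->fre[0]);
	const struct code_jit_unwind_fre *fre;

	/* Find the first entry whose ip_off is greater than off. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (t->fre[mid].ip_off <= off)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo == 0)
		return NULL;

	/* The candidate is the entry just before; check that it covers off. */
	fre = &t->fre[lo - 1];
	return (off - fre->ip_off < fre->size) ? fre : NULL;
}
```

The release/acquire pairing is what lets the reader trust every entry below the used_len it observed without taking a lock, and the sorted, non-overlapping requirement is what makes the binary search valid.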