Hi Thierry,

Great progress! Some high level notes below.

On Wed, Apr 16, 2025 at 07:20:34PM +0000, Thierry Treyer via B4 Relay wrote:
> This proposal extends BTF to list the locations of inlined functions and
> their arguments in a new `.BTF.inline` section.
>
> == Background ==
>
> Inline functions are often a blind spot for profiling and tracing tools:
> * They cannot probe fully inlined functions.
>   The BTF contains no data about them.
> * They miss calls to partially inlined functions,
>   where a function has a symbol, but is also inlined in some callers.
> * They cannot account for time spent in inlined calls.
>   Instead, they report the time to the caller.
> * They don't provide a way to access the arguments of an inlined call.
>
> The issue is exacerbated by Link-Time Optimization, which enables more
> inlining across Object files. One workaround is to disable inlining for
> the profiled functions, but that requires a whole kernel compilation and
> doesn't allow for iterative exploration.
>
> The information required to solve the above problems is not easily
> accessible. It requires parsing most of the DWARF's `.debug_info` section,
> which is time consuming and not trivial.
> Instead, this proposal leverages and extends the existing information
> contained in `.BTF` (for typing) and `.BTF.ext` (for caller location),
> with information from a new section called `.BTF.inline`,
> listing inlined instances.
>
> == .BTF.inline Section ==
>
> The new `.BTF.inline` section has a layout similar to `.BTF`.
>
>  off |0-bit      |16-bits  |24-bits  |32-bits                           |
> -----+-----------+---------+---------+----------------------------------+
> 0x00 | magic     | version | flags   | header length                    |
> 0x08 | inline info offset            | inline info length               |
> 0x10 | location offset               | location length                  |
> -----+------------------------------------------------------------------+
>      ~                       inline info section                        ~
> -----+------------------------------------------------------------------+
>      ~                         location section                         ~
> -----+------------------------------------------------------------------+
>
> It starts with a header (see `struct btf_inline_header`),
> followed by two subsections:
> 1. The 'Inline Info' section contains an entry for each inlined function.
>    Each entry describes the instance's location in its caller and is
>    followed by the offsets in the 'Location' section of the parameters
>    location expressions. See `struct btf_inline_instance`.
> 2. The 'Location' section contains location expressions describing how
>    to retrieve the value of a parameter. The expressions are
>    NULL-terminated and are addressed similarly to `.BTF`'s string table.
>
> struct btf_inline_header {
>   uint16_t magic;
>   uint8_t  version, flags;
>   uint32_t header_length;
>   uint32_t inline_info_offset, inline_info_length;
>   uint32_t location_offset, location_length;
> };
>
> struct btf_inline_instance {
>   type_id_t callee_id;                // BTF id of the inlined function
>   type_id_t caller_id;                // BTF id of the caller
>   uint32_t  caller_offset;            // offset of the callee within the caller
>   uint16_t  nr_parms;                 // number of parameters
> //uint32_t  parm_location[nr_parms];  // offset of the location expression
> };                                    // in 'Location' for each parameter
>
> == Location Expressions ==
>
> We looked at the DWARF location expressions for the arguments of inlined
> instances having <= 100 instances, on a production kernel v6.9.0. This
> yielded 176,800 instances with 269,327 arguments.
> We learned that most
> expressions are simple register access, perhaps with an offset. We would
> get access to 87% of the arguments by implementing literal and register.
>
> Op. Category      Expr. Count    Expr. %
> ----------------------------------------
> literal                 10714      3.98%
> register+above         234698     87.14%
> arithmetic+above       239444     88.90%
> composite+above        240394     89.26%
> stack+above            242075     89.88%
> empty                   27252     10.12%
>
> We propose to re-encode DWARF location expressions into a custom BTF
> location expression format. It operates on a stack of values, similar to
> DWARF's location expressions, but is stripped of unused operators,
> while allowing future expansions.

A stack machine seems overkill. I'm certainly not an expert on DWARF
location expressions, but I think we need to get away from arbitrarily
complex expressions, even if they are simpler than DWARF ones. I don't
think we want consumers implementing any kind of interpreter or VM. I'd
vote for something extremely prescriptive, even if it means adding a lot
of enum variants. At least this way, consumers can be sure they've fully
implemented the spec and detect when more complex support is added.

>
> A location expression is composed of a series of operations, terminated
> by a NULL-byte/LOC_END_OF_EXPR operator. The very first expression in the
> 'Location' subsection must be an empty expression consisting only of
> LOC_END_OF_EXPR.
>
> An operator is a tagged union: the tag describes the operation to carry
> out and the union contains the operands.
>
>  ID | Operator Name        | Operands[...]
> ----+----------------------+-------------------------------------------
>   0 | LOC_END_OF_EXPR      | _none_
>   1 | LOC_SIGNED_CONST_1   | s8:  constant's value
>   2 | LOC_SIGNED_CONST_2   | s16: constant's value
>   3 | LOC_SIGNED_CONST_4   | s32: constant's value
>   4 | LOC_SIGNED_CONST_8   | s64: constant's value
>   5 | LOC_UNSIGNED_CONST_1 | u8:  constant's value
>   6 | LOC_UNSIGNED_CONST_2 | u16: constant's value
>   7 | LOC_UNSIGNED_CONST_4 | u32: constant's value
>   8 | LOC_UNSIGNED_CONST_8 | u64: constant's value
>   9 | LOC_REGISTER         | u8:  DWARF register number from the ABI
>  10 | LOC_REGISTER_OFFSET  | u8:  DWARF register number from the ABI
>     |                      | s64: offset added to the register's value
>  11 | LOC_DEREF            | u8:  size of the deref'd type
>
> This list should be further expanded to include arithmetic operations.
>
> Example: accessing a field at offset 12B from a struct whose address is
> in the `%rdi` register, on amd64, has the following encoding:
>
> [0x0a 0x05 0x000000000000000c] [0x0b 0x04] [0x00]
>  |    |    ` Offset Added       |    |      ` LOC_END_OF_EXPR
>  |    ` Register Number         |    ` Size of Deref.
>  ` LOC_REGISTER_OFFSET          ` LOC_DEREF
>
> == Summary ==
>
> Combining the new information from `.BTF.inline` with the existing data
> from `.BTF` and `.BTF.ext`, tools will be able to locate inline functions
> and their arguments. Symbolizer can also use the data to display the
> functions inlined at a given address.
>
> Fully inlined functions are not part of the BTF and thus are not covered
> by this proposal. Adding them to the BTF would enable their coverage and
> should be considered.

I think supporting fully inlined functions as part of this work would
add a lot of value to users. It doesn't necessarily have to happen in
the first patchset, but we probably want to plan on doing it.

Regarding BTF, another option is to just leave `callee_id` unset, right?
Consumers should be able to recognize that BTF for the inlined function
isn't available and then act accordingly.
For bpftrace, that probably means not allowing function argument access.

[...]

Thanks,
Daniel
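
P.S. To make the "prescriptive" point a bit more concrete, here is a
minimal sketch (not part of the proposal) of a consumer-side check for
the quoted encoding. It only measures one location expression and
rejects any opcode it does not know, so no evaluation or stack is
involved. Assumptions: operands are packed after the opcode with no
padding, as the amd64 example suggests, and the names
loc_operand_size() and loc_expr_len() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Operator IDs copied from the table in the quoted proposal. */
enum loc_op {
	LOC_END_OF_EXPR      = 0,
	LOC_SIGNED_CONST_1   = 1,
	LOC_SIGNED_CONST_2   = 2,
	LOC_SIGNED_CONST_4   = 3,
	LOC_SIGNED_CONST_8   = 4,
	LOC_UNSIGNED_CONST_1 = 5,
	LOC_UNSIGNED_CONST_2 = 6,
	LOC_UNSIGNED_CONST_4 = 7,
	LOC_UNSIGNED_CONST_8 = 8,
	LOC_REGISTER         = 9,
	LOC_REGISTER_OFFSET  = 10,
	LOC_DEREF            = 11,
};

/*
 * Number of operand bytes that follow an opcode, or -1 if the opcode is
 * not one of the twelve listed above.
 * ASSUMPTION: operands are packed back to back without alignment padding.
 */
static int loc_operand_size(uint8_t op)
{
	switch (op) {
	case LOC_END_OF_EXPR:
		return 0;
	case LOC_SIGNED_CONST_1:
	case LOC_UNSIGNED_CONST_1:
		return 1;
	case LOC_SIGNED_CONST_2:
	case LOC_UNSIGNED_CONST_2:
		return 2;
	case LOC_SIGNED_CONST_4:
	case LOC_UNSIGNED_CONST_4:
		return 4;
	case LOC_SIGNED_CONST_8:
	case LOC_UNSIGNED_CONST_8:
		return 8;
	case LOC_REGISTER:
		return 1;		/* u8 register number */
	case LOC_REGISTER_OFFSET:
		return 1 + 8;		/* u8 register number + s64 offset */
	case LOC_DEREF:
		return 1;		/* u8 size of the deref'd type */
	default:
		return -1;		/* unknown: newer or corrupt data */
	}
}

/*
 * Length in bytes of the expression starting at expr, including the
 * LOC_END_OF_EXPR terminator. Returns -1 if an unknown opcode is met or
 * the buffer ends before the terminator.
 */
static int loc_expr_len(const uint8_t *expr, size_t len)
{
	size_t pos = 0;

	assert(expr != NULL);

	while (pos < len) {
		uint8_t op = expr[pos];
		int size = loc_operand_size(op);

		/* Reject unknown opcodes and truncated operands. */
		if (size < 0 || (size_t)size > len - pos - 1)
			return -1;

		pos += 1 + (size_t)size;
		if (op == LOC_END_OF_EXPR)
			return (int)pos;
	}
	return -1;	/* ran off the end without a terminator */
}
```

For the 13-byte example quoted above (opcode 0x0a, register 0x05, an
8-byte offset, then 0x0b 0x04 and the 0x00 terminator), loc_expr_len()
returns 13. Any expression using an opcode outside the table returns
-1, which is the "detect when more complex support is added" behaviour.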