Hi folks,

I just wanted to try and capture some of the discussion from last week's
BPF office hours where we talked about this, and hopefully we can
together plot a path forward that supports inline representation and
helps us fix some other long-standing issues with more complex function
representation. If I've missed anything important or if anything looks
wrong, please do chime in!

In discussing this, we concluded that

- separating the complex function representations into a separate .BTF
  section (.BTF.func_aux or something like it) would be valuable since
  it means tracers can continue to interact with existing function
  representations that have a straightforward relationship between
  their parameters and calling conventions stored in the .BTF section,
  and can optionally also utilize the auxiliary function information in
  .BTF.func_aux
- this gives us a bit more freedom to add new kinds etc to that
  auxiliary function info, and also to control unauthorized access that
  might be able to retrieve a function address or other potentially
  sensitive info from the aux function data
- it also means that the only kernel support we would likely initially
  need to add would be to allow reading of
  /sys/kernel/btf/vmlinux.func_aux, likely via a dummy module supporting
  sysfs read.
- for modules, we would need to support multi-split BTF, i.e. split BTF
  in .BTF.func_aux in the module that sits atop the .BTF section of the
  module which in turn sits atop the vmlinux BTF. Again only userspace
  and tooling support would likely be needed as a first step. I'm
  looking at this now and it may require no or minimal code changes to
  libbpf, just testing of the feature. bpftool and pahole would need to
  support a means of specifying multiple base BTFs in order, but that
  seems doable too.

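To make the multi-split layering a bit more concrete, here is a toy
model of how type IDs could be resolved through a stack of BTF objects
(vmlinux, then module, then module func_aux). This is only a sketch: the
struct and function names are made up for illustration and are not
libbpf APIs. The one thing it relies on is the existing split BTF
convention that a split object's type IDs continue where its base ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a BTF object; NOT libbpf's struct btf. */
struct toy_btf {
	const char *name;           /* e.g. "vmlinux" (illustrative only) */
	const struct toy_btf *base; /* NULL for the bottom-most BTF */
	uint32_t start_id;          /* first type ID owned by this object */
	uint32_t nr_types;          /* number of types owned by this object */
};

/* Stack a new split BTF on top of base; its IDs start where base ends. */
static struct toy_btf toy_btf_split(const char *name,
				    const struct toy_btf *base,
				    uint32_t nr_types)
{
	struct toy_btf btf = {
		.name = name,
		.base = base,
		/* ID 0 is void, so a base-less BTF owns IDs from 1. */
		.start_id = base ? base->start_id + base->nr_types : 1,
		.nr_types = nr_types,
	};

	return btf;
}

/* Walk down the chain to find which object owns type_id. */
static const struct toy_btf *toy_btf_owner(const struct toy_btf *btf,
					   uint32_t type_id)
{
	assert(btf);
	if (type_id == 0 || type_id >= btf->start_id + btf->nr_types)
		return NULL; /* void, or beyond the top of the stack */
	while (btf && type_id < btf->start_id)
		btf = btf->base;
	return btf;
}
```

With a 1000-type vmlinux, a 50-type module and a 10-type func_aux
stacked in that order, the func_aux object would own IDs 1051 to 1060,
and a lookup of any lower ID falls through to the module or to vmlinux.
Nothing in the walk cares how deep the stack is, which is why I'm
hopeful the libbpf side is mostly a matter of testing.
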
We were less conclusive on the final form of the representation, but it
would ideally help support fully and partially inlined representations
and other situations we have today where the calling
convention-specified registers and the function parameters do not
cleanly line up. Today we leave such representations out of BTF but a
location representation would allow us to add them back in. Similarly
for functions with the same name but different signatures, having a
function address to clarify which signature goes with which site will
help. Again we don't have to solve all these problems at once but having
them in mind as we figure out the right form of the representation will
help.

Something along the lines of the variable section where we have triples
of <function type id, site address, location BTF id> for each function
site will play a role. Again the exact form of the location data is TBD,
but we can experiment here to maximize compactness. Andrii pointed out a
BTF kind representation may waste bytes; for example a location will
likely not require a name offset string representation. Could be an
index into an array of location descriptions perhaps. Would be nice to
make use of dedup for locations too, likely within pahole rather than
BTF dedup proper. An empirical question is how much dedup will help;
likely we will just have to try and see.

So based on this I think our next steps are:

1. add address info to pahole; I'm working on a proof-of-concept on this
   and hope to have a newer version out this week. Address info would be
   needed for functions that we wish to represent in the aux section as
   a way of associating a function site with a location representation.
2. refine the representation of inline info, exploring adding new
   kind(s) to UAPI btf.h if needed. This would likely mean new APIs in
   libbpf to add locations and function site info.
3. explore multi-split BTF, adding libbpf-related tests for
   creation/manipulation of split BTF where the base is another split
   BTF. Multi-split BTF would be needed for module function aux info.

I'm hoping we can remove any blocks to further progress; task 3 above
can be tackled in parallel while we explore vmlinux inline
representation (multi-split is only needed for the module case where the
aux info is created atop the module split BTF). I'm hoping to have a bit
more done on task 1 later this week. So hopefully there's nothing here
that impedes making progress on the inline problem.

Again if there's anything I've missed above or that seems unclear,
please do follow up. It's really positive that we're tackling this issue
so I want to make sure that nothing gets in the way of progressing this.

Thanks again!

Alan

On 16/04/2025 20:20, Thierry Treyer via B4 Relay wrote:
> This proposal extends BTF to list the locations of inlined functions and
> their arguments in a new '.BTF.inline` section.
>
> == Background ==
>
> Inline functions are often a blind spot for profiling and tracing tools:
> * They cannot probe fully inlined functions.
>   The BTF contains no data about them.
> * They miss calls to partially inlined functions,
>   where a function has a symbol, but is also inlined in some callers.
> * They cannot account for time spent in inlined calls.
>   Instead, they report the time to the caller.
> * They don't provide a way to access the arguments of an inlined call.
>
> The issue is exacerbated by Link-Time Optimization, which enables more
> inlining across Object files. One workaround is to disable inlining for
> the profiled functions, but that requires a whole kernel compilation and
> doesn't allow for iterative exploration.
>
> The information required to solve the above problems is not easily
> accessible. It requires parsing most of the DWARF's '.debug_info` section,
> which is time consuming and not trivial.
> Instead, this proposal leverages and extends the existing information
> contained in '.BTF` (for typing) and '.BTF.ext` (for caller location),
> with information from a new section called '.BTF.inline`,
> listing inlined instances.
>
> == .BTF.inline Section ==
>
> The new '.BTF.inline` section has a layout similar to '.BTF`.
>
>  off |0-bit      |16-bits  |24-bits  |32-bits                           |
> -----+-----------+---------+---------+----------------------------------+
> 0x00 |   magic   | version |  flags  |          header length           |
> 0x08 |      inline info offset       |        inline info length        |
> 0x10 |        location offset        |         location length          |
> -----+------------------------------------------------------------------+
>      ~                       inline info section                        ~
> -----+------------------------------------------------------------------+
>      ~                         location section                         ~
> -----+------------------------------------------------------------------+
>
> It starts with a header (see 'struct btf_inline_header`),
> followed by two subsections:
>  1. The 'Inline Info' section contains an entry for each inlined function.
>     Each entry describes the instance's location in its caller and is
>     followed by the offsets in the 'Location' section of the parameters
>     location expressions. See 'struct btf_inline_instance`.
>  2. The 'Location' section contains location expressions describing how
>     to retrieve the value of a parameter. The expressions are
>     NULL-terminated and are addressed similarly to '.BTF`'s string table.
>
> struct btf_inline_header {
>   uint16_t magic;
>   uint8_t version, flags;
>   uint32_t header_length;
>   uint32_t inline_info_offset, inline_info_length;
>   uint32_t location_offset, location_length;
> };
>
> struct btf_inline_instance {
>   type_id_t callee_id;     // BTF id of the inlined function
>   type_id_t caller_id;     // BTF id of the caller
>   uint32_t caller_offset;  // offset of the callee within the caller
>   uint16_t nr_parms;       // number of parameters
> //uint32_t parm_location[nr_parms];  // offset of the location expression
> };                                   // in 'Location' for each parameter
>
> == Location Expressions ==
>
> We looked at the DWARF location expressions for the arguments of inlined
> instances having <= 100 instances, on a production kernel v6.9.0. This
> yielded 176,800 instances with 269,327 arguments. We learned that most
> expressions are simple register access, perhaps with an offset. We would
> get access to 87% of the arguments by implementing literal and register.
>
> Op. Category      Expr. Count    Expr. %
> ----------------------------------------
> literal                 10714      3.98%
> register+above         234698     87.14%
> arithmetic+above       239444     88.90%
> composite+above        240394     89.26%
> stack+above            242075     89.88%
> empty                   27252     10.12%
>
> We propose to re-encode DWARF location expressions into a custom BTF
> location expression format. It operates on a stack of values, similar to
> DWARF's location expressions, but is stripped of unused operators,
> while allowing future expansions.
>
> A location expression is composed of a series of operations, terminated
> by a NULL-byte/LOC_END_OF_EXPR operator. The very first expression in the
> 'Location' subsection must be an empty expression consisting only of
> LOC_END_OF_EXPR.
>
> An operator is a tagged union: the tag describes the operation to carry
> out and the union contains the operands.
>
>  ID | Operator Name        | Operands[...]
> ----+----------------------+-------------------------------------------
>   0 | LOC_END_OF_EXPR      | _none_
>   1 | LOC_SIGNED_CONST_1   | s8: constant's value
>   2 | LOC_SIGNED_CONST_2   | s16: constant's value
>   3 | LOC_SIGNED_CONST_4   | s32: constant's value
>   4 | LOC_SIGNED_CONST_8   | s64: constant's value
>   5 | LOC_UNSIGNED_CONST_1 | u8: constant's value
>   6 | LOC_UNSIGNED_CONST_2 | u16: constant's value
>   7 | LOC_UNSIGNED_CONST_4 | u32: constant's value
>   8 | LOC_UNSIGNED_CONST_8 | u64: constant's value
>   9 | LOC_REGISTER         | u8: DWARF register number from the ABI
>  10 | LOC_REGISTER_OFFSET  | u8: DWARF register number from the ABI
>     |                      | s64: offset added to the register's value
>  11 | LOC_DEREF            | u8: size of the deref'd type
>
> This list should be further expanded to include arithmetic operations.
>
> Example: accessing a field at offset 12B from a struct whose address is
> in the '%rdi` register, on amd64, has the following encoding:
>
> [0x0a 0x05 0x000000000000000c] [0x0b 0x04] [0x00]
>  |    |    ` Offset Added       |    |      ` LOC_END_OF_EXPR
>  |    ` Register Number         |    ` Size of Deref.
>  ` LOC_REGISTER_OFFSET          ` LOC_DEREF
>
> == Summary ==
>
> Combining the new information from '.BTF.inline` with the existing data
> from '.BTF` and '.BTF.ext`, tools will be able to locate inline functions
> and their arguments. Symbolizer can also use the data to display the
> functions inlined at a given address.
>
> Fully inlined functions are not part of the BTF and thus are not covered
> by this proposal. Adding them to the BTF would enable their coverage and
> should be considered.
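
As a rough illustration of how a consumer might walk the location
expression encoding proposed above, here is a small sketch that renders
one expression as text. The operator numbering comes from the table in
the proposal; everything else is an assumption on my part: the helper
names are invented, operands are taken to be little-endian and
unaligned (the proposal does not specify byte order), and the textual
output format is arbitrary.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Operator IDs, numbered as in the proposal's operator table. */
enum loc_op {
	LOC_END_OF_EXPR      = 0,
	LOC_SIGNED_CONST_1   = 1,
	LOC_SIGNED_CONST_8   = 4,
	LOC_UNSIGNED_CONST_1 = 5,
	LOC_UNSIGNED_CONST_8 = 8,
	LOC_REGISTER         = 9,
	LOC_REGISTER_OFFSET  = 10,
	LOC_DEREF            = 11,
};

/* ASSUMPTION: multi-byte operands are little-endian and unaligned. */
static uint64_t read_le(const uint8_t *p, size_t n)
{
	uint64_t v = 0;

	for (size_t i = 0; i < n; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/* Sign-extend the low n bytes of v to 64 bits. */
static int64_t sext(uint64_t v, size_t n)
{
	if (n < 8 && ((v >> (8 * n - 1)) & 1))
		v |= ~0ULL << (8 * n);
	return (int64_t)v;
}

/*
 * Hypothetical helper: render one expression as text into out.
 * Returns the number of bytes consumed (including the terminator),
 * or -1 if the expression is truncated, uses an unknown operator,
 * or does not fit in out.
 */
static int loc_expr_format(const uint8_t *expr, size_t len,
			   char *out, size_t outsz)
{
	size_t pos = 0, used = 0;

	assert(out && outsz > 0);
	out[0] = '\0';
	while (pos < len) {
		uint8_t op = expr[pos++];
		const char *sep = used ? " " : "";
		char *dst = out + used;
		size_t room = outsz - used;
		size_t n; /* operand bytes for this operator */
		int w;

		if (op == LOC_END_OF_EXPR)
			return (int)pos;
		if (op >= LOC_SIGNED_CONST_1 && op <= LOC_UNSIGNED_CONST_8)
			n = (size_t)1 << ((op - 1) % 4); /* 1, 2, 4 or 8 */
		else if (op == LOC_REGISTER || op == LOC_DEREF)
			n = 1;
		else if (op == LOC_REGISTER_OFFSET)
			n = 1 + 8; /* u8 register, s64 offset */
		else
			return -1;
		if (len - pos < n)
			return -1;

		if (op <= LOC_SIGNED_CONST_8)
			w = snprintf(dst, room, "%sconst %lld", sep,
				     (long long)sext(read_le(expr + pos, n), n));
		else if (op <= LOC_UNSIGNED_CONST_8)
			w = snprintf(dst, room, "%sconst %llu", sep,
				     (unsigned long long)read_le(expr + pos, n));
		else if (op == LOC_REGISTER)
			w = snprintf(dst, room, "%sreg%u", sep,
				     (unsigned)expr[pos]);
		else if (op == LOC_REGISTER_OFFSET)
			w = snprintf(dst, room, "%sreg%u%+lld", sep,
				     (unsigned)expr[pos],
				     (long long)sext(read_le(expr + pos + 1, 8), 8));
		else
			w = snprintf(dst, room, "%sderef%u", sep,
				     (unsigned)expr[pos]);
		if (w < 0 || (size_t)w >= room)
			return -1;
		used += (size_t)w;
		pos += n;
	}
	return -1; /* ran off the end without LOC_END_OF_EXPR */
}
```

Feeding it the %rdi example from the proposal (with the s64 offset laid
out little-endian, per the assumption above) consumes 13 bytes and
produces "reg5+12 deref4". Because each expression reports how many
bytes it consumed, the same loop could be used to step through the
whole 'Location' subsection.
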
>
> Signed-off-by: Thierry Treyer <ttreyer@xxxxxxxx>
> ---
> Thierry Treyer (3):
>       dwarf_loader: Add parameters list to inlined expansion
>       dwarf_loader: Add name to inline expansion
>       inline_encoder: Introduce inline encoder to emit BTF.inline
>
>  CMakeLists.txt   |   3 +-
>  btf_encoder.c    |   5 +
>  btf_encoder.h    |   2 +
>  btf_inline.pk    |  55 ++++++
>  dwarf_loader.c   | 176 ++++++++++++--------
>  dwarves.c        |  26 +++
>  dwarves.h        |   7 +
>  inline_encoder.c | 496 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  inline_encoder.h |  25 +++
>  pahole.c         |  40 ++++-
>  10 files changed, 765 insertions(+), 70 deletions(-)
> ---
> base-commit: 4ef47f84324e925051a55de10f9a4f44ef1da844
> change-id: 20250416-btf_inline-e5047eea9b6f
>
> Best regards,