On Thu, May 22, 2025 at 10:56 AM Thierry Treyer <ttreyer@xxxxxxxx> wrote:
>
> Hello everyone,
>
> Here are the estimates for the different encoding schemes we discussed:
> - parameters' location takes ~1MB without de-duplication,
> - parameters' location shrinks to ~14kB when de-duplicated,
> - instead of de-duplicating the individual locations,
>   de-duplicating functions' parameter lists yields 187kB of locations data.
>
> We also need to take into account the size of the corresponding funcsec
> table, which starts at 3.6MB. The full details follows:
>
>   1) // params_offset points to the first parameter's location
>      struct fn_info { u32 type_id, offset, params_offset; };
>   2) // param_offsets point to each parameters' location
>      struct fn_info { u32 type_id, offset; u16 param_offsets[proto.arglen]; };
>   3) // locations are stored inline, in the funcsec table
>      struct fn_info { u32 type_id, offset; loc inline_locs[proto.arglen]; };
>
>   Params encoding             Locations Size   Funcsec Size   Total Size
>   ======================================================================
>   (1) param list, no dedup         1,017,654      5,467,824    6,485,478
>   (1) param list, w/ dedup           187,379      5,467,824    5,655,203
>   (2) param offsets, w/ dedup         14,526      4,808,838    4,823,364

This one is almost as good as (3) below, but fits better into the
existing kind+vlen model where there is a variable number of fixed
sized elements (but locations can still be variable-sized and keep
evolving much more easily). I'd go with this one, unless I'm missing
some important benefit of other representations.

>   (3) param list inline            1,017,654      3,645,216    4,662,870
>
> Estimated size in bytes of the new .BTF.func_aux section, from a
> production kernel v6.9. It includes both partially and fully inlined
> functions in the funcsec tables, with all their parameters, either inline
> or in their own sub-section.
> It does not include type information that
> would be required to handle fully inlined functions, functions with
> conflicting name, and functions with conflicting prototypes.
>
> The deduplicated locations in 2) are small enough to be indexed by a u16.
>
> Storing the locations inline uses the least amount of space. Followed by
> storing inline a list of offsets to the locations. Neither of these
> approaches have fixed size records in funcsec. "param list, w/ dedup" is
> ~1MB larger than inlined locations, but has fixed size records.
>
> In all cases, the funcsec table uses the most space, compared to the
> locations. The size of the `type` sub-section will also grow when we add
> the missing type information for fully inlined functions, functions with
> conflicting name, and functions with conflicting prototypes.
>
> With fixed size records in the funcsec table, we'd get faster lookup by
> sorting by `type_id` or `offset`. bpftrace could efficiently search the
> lower bound of a `type_id` to instrument all its inline instances.
> Symbolication tools could efficiently search for inline functions at a
> given offset.
>
> However, it would rule out the most efficient encoding.
> How do we want to approach this tradeoff?
>
> > 2. refine the representation of inline info, exploring adding new
> > kind(s) to UAPI btf.h if needed. This would likely mean new APIs in
> > libbpf to add locations and function site info.
>
> I currently have a pahole prototype to emit "param list, no dedup" and am
> close to a patch adding FUNCSEC to libbpf. I was wondering if it would make
> sense for FUNCSEC to be a DATASEC with its 'kind_flag' set?

Why abuse DATASEC if we are extending BTF with new types anyways? I'd
go with a dedicated FUNCSEC (or FUNCSET, maybe?..)

BTW, Alan, you've been working on self-describing BTF (size per fixed
part of kind + size per vlen items). Any update on that one? Did you
get blocked on it somewhere?
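
For reference, the fixed-size-record lookup mentioned in the quoted text
could look roughly like the sketch below. It assumes layout (1) with the
funcsec table sorted by type_id; fn_info_lower_bound() and
fn_info_count_sites() are made-up names for illustration only, not
existing libbpf or pahole API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout (1) from the quoted message: fixed-size 12-byte records. */
struct fn_info {
	uint32_t type_id;
	uint32_t offset;
	uint32_t params_offset;
};

/*
 * Hypothetical helper: index of the first record whose type_id is
 * >= want, or n if there is none. Assumes tbl is sorted by type_id.
 */
static size_t fn_info_lower_bound(const struct fn_info *tbl, size_t n,
				  uint32_t want)
{
	size_t lo = 0, hi = n;

	assert(tbl || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tbl[mid].type_id < want)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/*
 * Hypothetical helper: count the records (inline sites) that share
 * one type_id by walking forward from the lower bound.
 */
static size_t fn_info_count_sites(const struct fn_info *tbl, size_t n,
				  uint32_t type_id)
{
	size_t i = fn_info_lower_bound(tbl, n, type_id);
	size_t cnt = 0;

	for (; i < n && tbl[i].type_id == type_id; i++)
		cnt++;
	return cnt;
}
```

If I'm reading the table right, 5,467,824 bytes of 12-byte records is
455,652 entries, so a lookup like this is about 19 comparisons. With
variable-sized records, as in (2), the same search would presumably
need a separate index of record offsets or a linear walk.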
>
> Let me know if you have any questions or have new ideas for the encoding!
>
> Have a great day,
> Thierry
>
>
> > On 19 May 2025, at 13:02, Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
> >
> > hi folks
> >
> > I just wanted to try and capture some of the discussion from last week's
> > BPF office hours where we talked about this and hopefully we can
> > together plot a path forward that supports inline representation and
> > helps us fix some other long-standing issues with more complex function
> > representation. If I've missed anything important or if anything looks
> > wrong, please do chime in!
> >
> > In discussing this, we concluded that
> >
> > - separating the complex function representations into a separate .BTF
> > section (.BTF.func_aux or something like it) would be valuable since it
> > means tracers can continue to interact with existing function
> > representations that have a straightforward relationship between their
> > parameters and calling conventions stored in the .BTF section, and can
> > optionally also utilize the auxiliary function information in .BTF.func_aux
> >
> > - this gives us a bit more freedom to add new kinds etc to that
> > auxiliary function info, and also to control unauthorized access that
> > might be able to retrieve a function address or other potentially
> > sensitive info from the aux function data
> >
> > - it also means that the only kernel support we would likely initially
> > need to add would be to allow reading of
> > /sys/kernel/btf/vmlinux.func_aux , likely via a dummy module supporting
> > sysfs read.
> >
> > - for modules, we would need to support multi-split BTF, i.e split BTF
> > in .BTF.func_aux in the module that sits atop the .BTF section of the
> > module which in turn sits atop the vmlinux BTF. Again only userspace
> > and tooling support would likely be needed as a first step. I'm looking
> > at this now and it may require no or minimal code changes to libbpf,
> > just testing of the feature.
> > bpftool and pahole would need to support a
> > means of specifying multiple base BTFs in order, but that seems doable too.
> >
> > We were less conclusive on the final form of the representation, but it
> > would ideally help support fully and partially inlined representations
> > and other situations we have today where the calling
> > convention-specified registers and the function parameters do not
> > cleanly line up. Today we leave such representations out of BTF but a
> > location representation would allow us to add them back in. Similarly
> > for functions with the same name but different signatures, having a
> > function address to clarify which signature goes with which site will help.
> >
> > Again we don't have to solve all these problems at once but having them
> > in mind as we figure out the right form of the representation will help.
> >
> > Something along the lines of the variable section where we have triples
> > of <function type id, site address, location BTF id> for each function
> > site will play a role. Again the exact form of the location data is TBD,
> > but we can experiment here to maximize compactness. Andrii pointed out a
> > BTF kind representation may waste bytes; for example a location will
> > likely not require a name offset string representation. Could be an
> > index into an array of location descriptions perhaps. Would be nice to
> > make use of dedup for locations too, likely within pahole rather than
> > BTF dedup proper. An empirical question is how much dedup will help,
> > likely we will just have to try and see.
> >
> > So based on this I think our next steps are:
> >
> > 1. add address info to pahole; I'm working on a proof-of-concept on this
> > hope to have a newer version out this week. Address info would be needed
> > for functions that we wish to represent in the aux section as a way of
> > associating a function site with a location representation.
> > 2. refine the representation of inline info, exploring adding new
> > kind(s) to UAPI btf.h if needed. This would likely mean new APIs in
> > libbpf to add locations and function site info.
> > 3. explore multi-split BTF, adding libbpf-related tests for
> > creation/manipulation of split BTF where the base is another split BTF.
> > Multi-split BTF would be needed for module function aux info
> >
> > I'm hoping we can remove any blocks to further progress; task 3 above
> > can be tackled in parallel while we explore vmlinux inline
> > representation (multi-split is only needed for the module case where the
> > aux info is created atop the module split BTF). I'm hoping to have a bit
> > more done on task 1 later this week. So hopefully there's nothing here
> > that impedes making progress on the inline problem.
> >
> > Again if there's anything I've missed above or that seems unclear,
> > please do follow up. It's really positive that we're tackling this issue
> > so I want to make sure that nothing gets in the way of progressing this.
> > Thanks again!
> >
> > Alan
>
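
On the location dedup idea quoted above ("an index into an array of
location descriptions"), here is a rough sketch of what interning
locations into a pool addressed by u16 offsets might look like. The
location format is still TBD, so the sketch treats each location as an
opaque byte string stored behind a one-byte length prefix. That
encoding, struct loc_pool, and loc_pool_intern() are all assumptions
made up for illustration. A real implementation in pahole would
presumably use a hash table rather than the linear scan here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pool is capped at 64kB so every entry offset fits in a u16. */
#define LOC_POOL_MAX 65536u

struct loc_pool {
	uint8_t data[LOC_POOL_MAX];
	uint32_t len;	/* bytes used so far */
};

/*
 * Hypothetical helper: return the offset of an identical location
 * already in the pool, or append a new entry and return its offset.
 * Entries are stored as <u8 length><payload bytes>.
 * Returns -1 if the pool is full.
 */
static int loc_pool_intern(struct loc_pool *p, const uint8_t *loc,
			   uint8_t loc_len)
{
	uint32_t off = 0;

	assert(p && loc);

	/* Linear scan over existing entries, looking for a duplicate. */
	while (off < p->len) {
		uint8_t cur_len = p->data[off];

		if (cur_len == loc_len &&
		    memcmp(&p->data[off + 1], loc, loc_len) == 0)
			return (int)off;
		off += 1u + cur_len;
	}

	/* Not found: append, if the new entry still fits. */
	if (p->len + 1u + loc_len > LOC_POOL_MAX)
		return -1;
	off = p->len;
	p->data[off] = loc_len;
	memcpy(&p->data[off + 1], loc, loc_len);
	p->len += 1u + loc_len;
	return (int)off;
}
```

Each param_offsets[] slot in layout (2) would then hold the returned
offset, so identical locations across functions share one pool entry.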