Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline

Thierry Treyer <ttreyer@xxxxxxxx> · Thu, 22 May 2025 17:56:29 +0000

Hello everyone,

Here are the estimates for the different encoding schemes we discussed:
- parameter locations take ~1MB without de-duplication,
- parameter locations shrink to ~14kB when de-duplicated,
- de-duplicating each function's parameter list, instead of the individual
  locations, yields ~187kB of location data.

We also need to take into account the size of the corresponding funcsec
table, which starts at 3.6MB. The full details follow:

  1) // params_offset points to the first parameter's location
     struct fn_info { u32 type_id, offset, params_offset; };
  2) // param_offsets point to each parameter's location
     struct fn_info { u32 type_id, offset; u16 param_offsets[proto.arglen]; };
  3) // locations are stored inline, in the funcsec table
     struct fn_info { u32 type_id, offset; loc inline_locs[proto.arglen]; };

  Params encoding             Locations Size   Funcsec Size   Total Size
  ======================================================================
  (1) param list, no dedup         1,017,654      5,467,824    6,485,478
  (1) param list, w/ dedup           187,379      5,467,824    5,655,203
  (2) param offsets, w/ dedup         14,526      4,808,838    4,823,364
  (3) param list inline            1,017,654      3,645,216    4,662,870

  Estimated size in bytes of the new .BTF.func_aux section, from a
  production kernel v6.9. It includes both partially and fully inlined
  functions in the funcsec tables, with all their parameters, either inline
  or in their own sub-section. It does not include type information that
  would be required to handle fully inlined functions, functions with
  conflicting names, and functions with conflicting prototypes.

  The deduplicated locations in 2) are small enough to be indexed by a u16.
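
To illustrate how de-duplicated locations could be referenced through a
u16, here is a minimal sketch. It assumes each location fits in an opaque
64-bit value, which is only a placeholder since the location encoding is
still undecided. `struct loc_tab` and `loc_tab_intern()` are hypothetical
names, not pahole or libbpf APIs, and the sketch returns a table index
rather than a byte offset for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table of unique locations; storage is provided by the caller. */
struct loc_tab {
	uint64_t *locs;	/* placeholder encoding: one opaque u64 per location */
	size_t nr;	/* number of unique locations stored so far */
	size_t cap;	/* capacity of locs[] */
};

/*
 * Return the index of loc in the table, appending it if it is new.
 * Returns -1 if the table is full or the index would not fit in a u16.
 */
static int loc_tab_intern(struct loc_tab *t, uint64_t loc)
{
	assert(t != NULL && t->locs != NULL);

	/* Linear scan for an existing identical location. */
	for (size_t i = 0; i < t->nr; i++)
		if (t->locs[i] == loc)
			return (int)i;

	if (t->nr >= t->cap || t->nr > UINT16_MAX)
		return -1;

	t->locs[t->nr] = loc;
	return (int)t->nr++;
}
```

The linear scan keeps the example short; a real tool would more likely
hash the locations.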

Storing the locations inline uses the least space, followed by storing an
inline list of offsets to the locations. Neither of these approaches has
fixed-size records in funcsec. "param list, w/ dedup" is ~1MB larger than
inlined locations, but has fixed-size records.
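
The size differences follow from the record layouts. The sketch below
assumes records are packed without padding and uses C99 types in place of
the u32/u16 shorthand; `struct fn_info_list`, `struct fn_info_hdr`, and
both helpers are illustrative names only. Under that assumption, the
funcsec sizes in the table are consistent with 455,652 records
(5,467,824 / 12 = 3,645,216 / 8) and 581,811 parameter slots
((4,808,838 - 3,645,216) / 2).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout (1): fixed-size record pointing at the first parameter's location. */
struct fn_info_list {
	uint32_t type_id;
	uint32_t offset;
	uint32_t params_offset;
};

/* Header shared by layouts (2) and (3); the per-parameter data follows it. */
struct fn_info_hdr {
	uint32_t type_id;
	uint32_t offset;
};

/* Size of one layout-(2) record: header plus one u16 offset per parameter. */
static size_t fn_info_offsets_size(size_t nr_params)
{
	return sizeof(struct fn_info_hdr) + nr_params * sizeof(uint16_t);
}

/* Number of records in a section made only of fixed-size records. */
static size_t nr_records(size_t section_bytes, size_t record_bytes)
{
	assert(record_bytes != 0 && section_bytes % record_bytes == 0);
	return section_bytes / record_bytes;
}
```

Only layout (1) lets a reader compute the position of record N directly;
in layouts (2) and (3) the record size depends on the prototype's
parameter count.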

In all cases, the funcsec table uses more space than the locations. The
size of the `type` sub-section will also grow when we add the missing type
information for fully inlined functions, functions with conflicting names,
and functions with conflicting prototypes.

With fixed-size records in the funcsec table, we'd get faster lookup by
sorting by `type_id` or `offset`. bpftrace could efficiently search for the
lower bound of a `type_id` to instrument all its inline instances.
Symbolication tools could efficiently search for inline functions at a
given offset.
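
As a rough sketch of that lookup, assuming the fixed-size record of
encoding (1) and a table already sorted by `type_id`, a lower-bound binary
search could look like this. `fn_info_lower_bound()` and
`fn_info_count_instances()` are hypothetical helpers, not existing libbpf
or bpftrace functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-size record from encoding (1), with C99 types for u32. */
struct fn_info {
	uint32_t type_id;
	uint32_t offset;
	uint32_t params_offset;
};

/*
 * Return the index of the first record whose type_id is >= the key,
 * or n if there is none. tab[] must be sorted by type_id.
 */
static size_t fn_info_lower_bound(const struct fn_info *tab, size_t n,
				  uint32_t type_id)
{
	size_t lo = 0, hi = n;

	assert(tab != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tab[mid].type_id < type_id)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Count the inline instances recorded for one type_id. */
static size_t fn_info_count_instances(const struct fn_info *tab, size_t n,
				      uint32_t type_id)
{
	size_t first = fn_info_lower_bound(tab, n, type_id);
	size_t i = first;

	while (i < n && tab[i].type_id == type_id)
		i++;
	return i - first;
}
```

A consumer would start at the lower bound and visit records while
`type_id` still matches. The same search works on `offset` if the table is
sorted by that field instead.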

However, requiring fixed-size records would rule out the most compact
encoding. How do we want to approach this tradeoff?

> 2. refine the representation of inline info, exploring adding new
> kind(s) to UAPI btf.h if needed. This would likely mean new APIs in
> libbpf to add locations and function site info.

I currently have a pahole prototype to emit "param list, no dedup" and am
close to a patch adding FUNCSEC to libbpf. I was wondering if it would make
sense for FUNCSEC to be a DATASEC with its `kind_flag` set?

Let me know if you have any questions or have new ideas for the encoding!

Have a great day,
Thierry

> On 19 May 2025, at 13:02, Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
> 
> hi folks
> 
> I just wanted to try and capture some of the discussion from last week's
> BPF office hours where we talked about this and hopefully we can
> together plot a path forward that supports inline representation and
> helps us fix some other long-standing issues with more complex function
> representation. If I've missed anything important or if anything looks
> wrong, please do chime in!
> 
> In discussing this, we concluded that
> 
> - separating the complex function representations into a separate .BTF
> section (.BTF.func_aux or something like it) would be valuable since it
> means tracers can continue to interact with existing function
> representations that have a straightforward relationship between their
> parameters and calling conventions stored in the .BTF section, and can
> optionally also utilize the auxiliary function information in .BTF.func_aux
> 
> - this gives us a bit more freedom to add new kinds etc to that
> auxiliary function info, and also to control unauthorized access that
> might be able to retrieve a function address or other potentially
> sensitive info from the aux function data
> 
> - it also means that the only kernel support we would likely initially
> need to add would be to allow reading of
> /sys/kernel/btf/vmlinux.func_aux , likely via a dummy module supporting
> sysfs read.
> 
> - for modules, we would need to support multi-split BTF, i.e split BTF
> in .BTF.func_aux in the module that sits atop the .BTF section of the
> module which in turn sits atop the vmlinux BTF.  Again only userspace
> and tooling support would likely be needed as a first step. I'm looking
> at this now and it may require no or minimal code changes to libbpf,
> just testing of the feature.  bpftool and pahole would need to support a
> means of specifying multiple base BTFs in order, but that seems doable too.
> 
> We were less conclusive on the final form of the representation, but it
> would ideally help support fully and partially inlined representations
> and other situations we have today where the calling
> convention-specified registers and the function parameters do not
> cleanly line up. Today we leave such representations out of BTF but a
> location representation would allow us to add them back in. Similarly
> for functions with the same name but different signatures, having a
> function address to clarify which signature goes with which site will help.
> 
> Again we don't have to solve all these problems at once but having them
> in mind as we figure out the right form of the representation will help.
> 
> Something along the lines of the variable section where we have triples
> of <function type id, site address, location BTF id> for each function
> site will play a role. Again the exact form of the location data is TBD,
> but we can experiment here to maximize compactness. Andrii pointed out a
> BTF kind representation may waste bytes; for example a location will
> likely not require a name offset string representation. Could be an
> index into an array of location descriptions perhaps. Would be nice to
> make use of dedup for locations too, likely within pahole rather than
> BTF dedup proper. An empirical question is how much dedup will help,
> likely we will just have to try and see.
> 
> So based on this I think our next steps are:
> 
> 1. add address info to pahole; I'm working on a proof-of-concept on this
> hope to have a newer version out this week. Address info would be needed
> for functions that we wish to represent in the aux section as a way of
> associating a function site with a location representation.
> 2. refine the representation of inline info, exploring adding new
> kind(s) to UAPI btf.h if needed. This would likely mean new APIs in
> libbpf to add locations and function site info.
> 3. explore multi-split BTF, adding libbpf-related tests for
> creation/manipulation of split BTF where the base is another split BTF.
> Multi-split BTF would be needed for module function aux info
> 
> I'm hoping we can remove any blocks to further progress; task 3 above
> can be tackled in parallel while we explore vmlinux inline
> representation (multi-split is only needed for the module case where the
> aux info is created atop the module split BTF). I'm hoping to have a bit
> more done on task 1 later this week. So hopefully there's nothing here
> that impedes making progress on the inline problem.
> 
> Again if there's anything I've missed above or that seems unclear,
> please do follow up. It's really positive that we're tackling this issue
> so I want to make sure that nothing gets in the way of progressing this.
> Thanks again!
> 
> Alan