Re: [PATCH RFC 0/3] list inline expansions in .BTF.inline

Thierry Treyer <ttreyer@xxxxxxxx> · Thu, 1 May 2025 19:38:32 +0000

> 1. multiple functions with the same name and different function
> signatures. Since we have no mechanism currently to associate function
> and site we simply refuse to encode them in BTF today
> 2. functions with inconsistent representations. If a function does not
> use the expected registers for its function signature due to
> optimizations we leave it out of BTF representation; and of course
> 3. inline functions are not currently represented at all.
> 
> I think we can do a better job with 1 and 2 while solving 3 as well.
> Here's my suggestion.

I see how your approach covers all these problems!
I would also add the following issue, which is a variant of 2 and 3:

4. Partially inlined functions: functions that have a symbol but are also
        inlined at some call sites. Currently not represented either.

> First, we separate functions which have complicated relationships with
> their parameters (cases 1, 2 and 3) into a separate .BTF.func_aux
> section or similar. That can be delivered in vmlinux or via a
> special-purpose module; for modules it would be just a separate ELF
> section as it would likely be small. We can control access to ensure
> unprivileged users cannot get address information, hence the separation
> from vmlinux BTF. But it is just (split) BTF, so no new format required.
> 
> The advantage of this is that tracers today can do the straightforward
> tracing of functions from /sys/kernel/btf/vmlinux, and if a function is
> not there and the tracer supports handling more complex cases, it can
> know to look in /sys/kernel/btf/vmlinux.func_aux.

Sounds good to me!
Laying out the format of this new .BTF.func_aux section:

+---------------------+
| BTF.func_aux header |
+---------------------+
~  type info section  ~
+---------------------+
~   string section    ~
+---------------------+
~  location section   ~
+---------------------+
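To make the layout concrete, here is a minimal sketch of what the header could look like, assuming it reuses the fields of the existing struct btf_header and appends an offset/length pair for the location section. The struct name, loc_off, loc_len and the helper func_aux_loc_section() are hypothetical and only meant for discussion.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical header for .BTF.func_aux: the fields of struct btf_header,
 * plus an assumed offset/length pair for the location section.
 */
struct btf_func_aux_header {
	uint16_t magic;
	uint8_t  version;
	uint8_t  flags;
	uint32_t hdr_len;

	/* All offsets are in bytes, relative to the end of this header. */
	uint32_t type_off;	/* offset of type info section */
	uint32_t type_len;	/* length of type info section */
	uint32_t str_off;	/* offset of string section */
	uint32_t str_len;	/* length of string section */
	uint32_t loc_off;	/* assumed: offset of location section */
	uint32_t loc_len;	/* assumed: length of location section */
};

/*
 * Return a pointer to the location section inside a blob of 'size' bytes
 * and store its length in '*len'. Returns NULL if the header is truncated
 * or the section would run past the end of the blob.
 */
static const uint8_t *func_aux_loc_section(const void *data, size_t size,
					   uint32_t *len)
{
	const struct btf_func_aux_header *hdr = data;
	uint64_t start, end;

	if (size < sizeof(*hdr) || hdr->hdr_len < sizeof(*hdr))
		return NULL;

	start = (uint64_t)hdr->hdr_len + hdr->loc_off;
	end = start + hdr->loc_len;
	if (end > size)
		return NULL;

	*len = hdr->loc_len;
	return (const uint8_t *)data + start;
}
```

Keeping the first fields identical to struct btf_header would let existing parsers read the type and string sections unchanged and simply ignore the trailing location section.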

> In that section we have a split BTF representation in which function
> signatures for cases 1, 2, and 3 are represented in the usual way (FUNC
> pointing at FUNC_PROTO). However since we know that the relationship
> between function and its site(s) is complex, we need some extra info.

We have the same base as the BTF section, so we can encode FUNC and
FUNC_PROTO in the 'type info section'. The strings for the new functions'
names get deduplicated and stored in the 'string section'.

The 'location section' lists location expressions to locate the arguments.
As discussed with Alexei, _one_ LOC_* operation will describe the location
of _one_ argument; there is no series of operations to carry out in order
to retrieve the argument's value. This also makes it possible to re-use
location expressions across multiple arguments/functions through de-duplication.
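As an illustration of the de-duplication, here is a rough sketch that treats each location expression as an opaque byte string and interns it into the location section. The LOC_* encoding is not defined yet, so the length-prefixed record format, struct loc_section and loc_intern() are all assumptions made up for this example.

```c
#include <stdint.h>
#include <string.h>

#define LOC_SEC_MAX 256

/*
 * Hypothetical location section: a sequence of records, each made of one
 * length byte followed by that many bytes of (opaque) location expression.
 */
struct loc_section {
	uint8_t  data[LOC_SEC_MAX];
	uint32_t len;		/* bytes used in data[] */
};

/*
 * Return the offset of an identical expression already in the section, or
 * append the expression and return its new offset. Returns UINT32_MAX if
 * the section is full.
 *
 * Note: offset 0 is a valid record here; a real format would probably
 * reserve it if 'loc = 0' keeps meaning "arguments in expected locations".
 */
static uint32_t loc_intern(struct loc_section *sec, const uint8_t *expr,
			   uint8_t expr_len)
{
	uint32_t off = 0;

	/* Linear scan over existing records, looking for an exact match. */
	while (off < sec->len) {
		uint8_t cur_len = sec->data[off];

		if (cur_len == expr_len &&
		    memcmp(&sec->data[off + 1], expr, expr_len) == 0)
			return off;
		off += 1 + (uint32_t)cur_len;
	}

	if (sec->len + 1 + (uint32_t)expr_len > LOC_SEC_MAX)
		return UINT32_MAX;

	/* No match: append a new record at the end of the section. */
	off = sec->len;
	sec->data[off] = expr_len;
	memcpy(&sec->data[off + 1], expr, expr_len);
	sec->len += 1 + (uint32_t)expr_len;
	return off;
}
```

Because one record describes one argument, two functions whose first argument lives in the same place would end up sharing a single offset. A real encoder would likely use a hash table rather than a linear scan.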

> I'd propose we add a DATASEC containing functions and their addresses. A
> FUNC datasec it could be laid out as follows
> 
> struct btf_func_secinfo {
> __u32 type;
> __u32 offset;
> __u32 loc;
> };

We'd have a new BTF_KIND_FUNCSEC type followed by 'vlen' btf_func_secinfo entries.
I see how 'type' and 'offset' can be used to disambiguate between functions
sharing the same name, but I'm confused by 'loc'. Functions with multiple
arguments will need a location expression for each of them.
How about having another 'vlen', followed by the offsets into the location
section?

struct btf_func_secinfo {
__u32 type;
__u32 offset;
__u32 vlen;
// Followed by: __u32 locs[vlen];
};

Or did you have something else in mind?
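For reference, here is a small sketch of how a consumer might walk such variable-length entries, assuming the 'vlen' variant above with locs[] stored as a flexible array member. The helpers func_secinfo_size() and func_secinfo_find() are hypothetical names, and the sketch trusts 'count' instead of bounds-checking against the section size.

```c
#include <stddef.h>
#include <stdint.h>

/* The 'vlen' variant proposed above, with locs[] as a flexible array. */
struct btf_func_secinfo {
	uint32_t type;		/* type id of the BTF_KIND_FUNC */
	uint32_t offset;	/* address of the function or inline site */
	uint32_t vlen;		/* number of entries in locs[] */
	uint32_t locs[];	/* offsets into the location section */
};

/* Size in bytes of one entry, including its trailing locs[] array. */
static size_t func_secinfo_size(const struct btf_func_secinfo *si)
{
	return sizeof(*si) + (size_t)si->vlen * sizeof(uint32_t);
}

/*
 * Find the entry whose offset equals 'addr' among 'count' consecutive
 * entries starting at 'first'. Returns NULL if there is no such entry.
 */
static const struct btf_func_secinfo *
func_secinfo_find(const void *first, uint32_t count, uint32_t addr)
{
	const uint8_t *p = first;
	uint32_t i;

	for (i = 0; i < count; i++) {
		const struct btf_func_secinfo *si = (const void *)p;

		if (si->offset == addr)
			return si;
		/* Entries differ in size, so step by the computed size. */
		p += func_secinfo_size(si);
	}
	return NULL;
}
```

One trade-off worth noting: with variable-length entries, a consumer can no longer index or binary-search the array directly by address, so it would need a linear walk like this one or a side index.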

> In the case of 1 (multiple signatures for a function name) the DATASEC
> will have entries for each site which tie it to its associated FUNC.
> This allows us to clarify which function is associated with which
> address. So the type is the BTF_KIND_FUNC, the offset the address and
> the loc is 0 since we don't need it for this case since the functions
> have parameters in expected locations.
> 
> In the case of 2 (functions with inconsistent representations) we use
> the type to point at the FUNC, the offset the address of the function
> and the loc to represent location info for that site. By leaving out
> caller/callee info from location data we could potentially exploit the
> fact that a lot of locations have similar layouts in terms of where
> parameters are available, making dedup of location info possible.
> Caller/callee relationship can still be inferred via the site address.
> 
> Finally in case 3 we have inlines which would be represented similarly
> to case 2; i.e. we marry a representation of the function (the type) to
> the associated inline site via location data in the loc field.

Here's what it could look like:

[1] FUNC_PROTO ...
      ...args
[2] FUNC 'foo' type_id=1   # 1. name collision with [4]
[3] FUNC_PROTO ...
      ...args
[4] FUNC 'foo' type_id=3   # 1. name collision with [2]
[5] FUNC_PROTO ...
      ...args
[6] FUNC 'bar' type_id=5   # 2. non-standard arguments location
[7] FUNC_PROTO ...
      ...args
[8] FUNC 'baz' type_id=7   # 3-4. partially/fully inlined function
[9] FUNCSEC '.text', vlen=5
  - type_id=2, offset=0x1000, loc=0 # 1. share the same name, but
  - type_id=4, offset=0x2000, loc=0 #    differentiate with the offset
  - type_id=6, offset=0x3000, loc=??? # 2. non-standard args location
    * offset of arg0 locexpr: 0x1234  #    each arg gets a loc offset
    * offset of arg1 locexpr: 0x5678  #    or some other encoding?
  - type_id=8, offset=0x4000, loc=0   # 4. non-inlined instance
  - type_id=8, offset=0x1050, loc=??? # 3. inlined instance
    * # ...args loc offsets

> If so, the question becomes what are we missing today? As far as I can
> see we need
> 
> - support for new kinds BTF_KIND_FUNC_DATASEC, or simply use the kind
> flag for existing BTF datasec to indicate function info
> - support for new location kind
> - pahole support to generate address-based datasec and location separately
> - for modules, we would eventually need multi-split BTF that would allow
> the func aux section to be split BTF on top of existing module BTF, i.e.
> a 3-level split BTF

Do you think locations should be part of the 'type info section'?
Or should they have their own 'location section'?

I'm less familiar with modules.
Would you have some guidance about their requirements?

> As I think some of the challenges you ran into implementing this
> indicate, the current approach of matching ELF and DWARF info via name
> only is creaking at the seams, and needs to be reworked (in fact it is
> the source of a bug Alexei ran into around missing kfuncs). So I'm
> hoping to get a patch out this week that uses address info to aid the
> matching between ELF/DWARF, and from there it's a short jump to using it
> in DATASEC representations.
> 
> Anyway let me know what you think. If it sounds workable we could
> perhaps try prototyping the pieces and see if we can get them working
> with location info.

I'll look into emitting functions that are currently not represented,
because they fall into pitfalls 1-4. That will give us the base for
the new .BTF.func_aux section.
I'm looking forward to using your patch to simplify the linking between
DWARF and BTF.

Thanks for your time and have a great day,
Thierry