On 23/05/2025 19:57, Thierry Treyer wrote: >>>> 2) // param_offsets point to each parameters' location >>>> struct fn_info { u32 type_id, offset; u16 param_offsets[proto.arglen]; }; >>>> [...] >>>> (2) param offsets, w/ dedup 14,526 4,808,838 4,823,364 >>> >>> This one is almost as good as (3) below, but fits better into the >>> existing kind+vlen model where there is a variable number of fixed >>> sized elements (but locations can still be variable-sized and keep >>> evolving much more easily). I'd go with this one, unless I'm missing >>> some important benefit of other representations. >> >> Thierry, could you please provide some details for the representation >> of both fn_info and parameters for this case? > > The locations are stored in their own sub-section, like strings, using the > encoding described previously. A location is a tagged union of an operation > and its operands describing how to find to parameter’s value. > > The locations for nil, ’%rdi’ and ’*(%rdi + 32)’ are encoded as follow: > > [0x00] [0x09 0x05] [0x0a 0x05 0x00000020] > # `NIL `REG #5 | `Reg#5 `Offset added to Reg’s value > # `ADDR_REG_OFF > > The funcsec table starts with a `struct btf_type` of type FUNCSEC, followed by > vlen `struct btf_func_secinfo` (referred previously as fn_info): > > .align(4) > struct btf_func_secinfo { > __u32 type_id; // Type ID of FUNC > __u32 offset; // Offset in section > __u16 parameter_offsets[proto.vlen]; // Offsets to params’ location > }; > > To know how many parameters a function has, you’d use its type_id to retrieve > its FUNC, then its FUNC_PROTO to finally get the FUNC_PROTO vlen. > Optimized out parameters won’t have a location, so we need a NIL to skip them. > > > Given a function with arg0 optimized out, arg1 at *(%rdi + 32) and arg2 in %rdi. > You’d get the following encoding: > > [1] FUNC_PROTO, vlen=3 > ...args > [2] FUNC 'foo' type_id=1 > [3] FUNCSEC '.text', vlen=1 # ,NIL ,*(%rdi + 32) > - type_id=n, offset=0x1234, params=[0x0, 0x3, 0x1] > # `%rdi > > # Regular BTF encoding for 1 and 2 > ... > # ,FUNCSEC ’.text’, vlen=1 > [0x000001 0x14000001 0x00000000] > # ,btf_func_secinfo ,params=[0x0, 0x3, 0x1] + extra nil for alignment > [0x00000002 0x00001234 0x0000 0x0003 0x0001 0x0000] > > Note: I didn’t take into account the 4-bytes padding requirement of BTF. > I’ve sent the correct numbers when responding to Alexei. > >> I'm curious how far this version is from exhausting u16 limit. > > > We’re already using 22% of the 64 kiB addressable by u16. > >> Why abuse DATASEC if we are extending BTF with new types anyways? I'd >> go with a dedicated FUNCSEC (or FUNCSET, maybe?..) > > I'm not sure that a 'set' describes the table best, since a function > can have multiple entries in the table. > FUNCSEC is ugly, but it conveys that the offsets are from a section’s base. I totally agree that we have more freedom to define new representations here, so don't feel too constrained by existing representations like DATASEC if they are not helpful. One thing I hadn't really thought about before you suggested it is having the locations in a separate section from types as we have for strings. Do we need that? Or could we have a BTF_KIND_LOC_SEC that is associated with the FUNC_SEC via a type id (loc sec points at the type of the associated func sec) and contains the packed location info? In other words [3] FUNCSEC '.text', vlen= ... <func_id, offset, param_location_offsets[]> ... [4] LOCSEC '.text', type_id=3 <packed locations> ...