On Thu, May 22, 2025 at 10:56 AM Thierry Treyer <ttreyer@xxxxxxxx> wrote:
>
> Hello everyone,
>
> Here are the estimates for the different encoding schemes we discussed:
> - parameters' location takes ~1MB without de-duplication,
> - parameters' location shrinks to ~14kB when de-duplicated,
> - instead of de-duplicating the individual locations,
>   de-duplicating functions' parameter lists yields 187kB of locations data.
>
> We also need to take into account the size of the corresponding funcsec
> table, which starts at 3.6MB. The full details follows:
>
> 1) // params_offset points to the first parameter's location
>    struct fn_info { u32 type_id, offset, params_offset; };
> 2) // param_offsets point to each parameters' location
>    struct fn_info { u32 type_id, offset; u16 param_offsets[proto.arglen]; };
> 3) // locations are stored inline, in the funcsec table
>    struct fn_info { u32 type_id, offset; loc inline_locs[proto.arglen]; };
>
> Params encoding             Locations Size   Funcsec Size   Total Size
> ======================================================================
> (1) param list, no dedup         1,017,654      5,467,824    6,485,478
> (1) param list, w/ dedup           187,379      5,467,824    5,655,203
> (2) param offsets, w/ dedup         14,526      4,808,838    4,823,364
> (3) param list inline            1,017,654      3,645,216    4,662,870

I feel u16 offset isn't really viable. Sooner or later we'd need to
bump it, and then we will have a mix of u32 and u16 offsets.

The main question I have is: why is the funcsec size bigger for (1)?

   struct fn_info { u32 type_id, offset, params_offset; };

This is a fixed-size record, and the number of them should be the same
as in (2) and (3), so a single u32 params_offset should be no larger
than u16[arg_cnt], assuming that on average arg_cnt >= 2.
Or did you mean that the average arg_cnt is <= 1? But then the math is
suspicious, since struct fn_info should be 4-byte aligned, like
everything in BTF.

Also for (3), if locs are inlined, why is "Locations Size" not zero?
Or is the math for (3) actually:
   struct fn_info { u32 type_id, offset } * num_of_funcs ?
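
To make the question concrete, here is a rough back-of-the-envelope check in C that uses only the numbers from the quoted table. It assumes (this is a guess, not something the table states) that the "Funcsec Size" for (3) counts only the fixed 8-byte { type_id, offset } part per function, and that the (2) figure is that same fixed part plus unpadded u16 offsets. The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Byte counts copied from the "Funcsec Size" column of the quoted table. */
enum {
    FUNCSEC_PARAM_LIST    = 5467824, /* scheme (1) */
    FUNCSEC_PARAM_OFFSETS = 4808838, /* scheme (2) */
    FUNCSEC_INLINE        = 3645216  /* scheme (3) */
};

/* Assumed size of the fixed part: u32 type_id + u32 offset. */
enum { FIXED_PART_SIZE = 2 * sizeof(uint32_t) };

/* Number of functions, IF scheme (3)'s funcsec size is 8 bytes per function. */
uint32_t func_count(void)
{
    assert(FUNCSEC_INLINE % FIXED_PART_SIZE == 0);
    return FUNCSEC_INLINE / FIXED_PART_SIZE;
}

/* Implied per-function record size for scheme (1). */
uint32_t param_list_record_size(void)
{
    return FUNCSEC_PARAM_LIST / func_count();
}

/* Implied total number of u16 param offsets in scheme (2),
 * IF the records are not padded to 4 bytes. */
uint32_t total_param_offsets(void)
{
    return (FUNCSEC_PARAM_OFFSETS - FUNCSEC_INLINE) / sizeof(uint16_t);
}
```

Under those assumptions the numbers are self-consistent: (3) gives 455,652 functions, (1) comes out at exactly 12 bytes per record, and (2) implies 581,811 u16 offsets in total, i.e. fewer than 2 parameters per function on average and no padding to 4 bytes. That would explain why (1) looks bigger than (2), but it is exactly the unaligned math questioned above.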