On 29/04/2025 16:37, Andrii Nakryiko wrote: > On Mon, Apr 28, 2025 at 11:59 PM Andrii Nakryiko > <andrii.nakryiko@xxxxxxxxx> wrote: >> >> On Mon, Apr 28, 2025 at 5:33 PM Andrii Nakryiko >> <andrii.nakryiko@xxxxxxxxx> wrote: >>> >>> On Mon, Apr 28, 2025 at 3:12 PM Alexei Starovoitov >>> <alexei.starovoitov@xxxxxxxxx> wrote: >>>> >>>> On Mon, Apr 28, 2025 at 8:21 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote: >>>>> >>>>> <1><4bd05>: Abbrev Number: 58 (DW_TAG_pointer_type) >>>>> <4bd06> DW_AT_byte_size : 8 >>>>> <4bd07> DW_AT_address_class: 2 >>>>> <4bd08> DW_AT_type : <0x301cd> >>>>> >>>>> ...which points at an int >>>>> >>>>> <1><301cd>: Abbrev Number: 214 (DW_TAG_base_type) >>>>> <301cf> DW_AT_byte_size : 4 >>>>> <301d0> DW_AT_encoding : 5 (signed) >>>>> <301d1> DW_AT_name : int >>>>> <301d5> DW_AT_name : int >>>>> >>>>> ...but note the the DW_AT_address_class attribute in the latter case and >>>>> the two DW_AT_name values. We don't use that address attribute in pahole >>>>> as far as I can see, but it might be enough to cause problems. >>>> >>>> DW_AT_address_class is there because it's an actual address space >>>> qualifier in C. The dwarf is correct, but I thought pahole >>>> will ignore it while converting to BTF, so it shouldn't matter >>>> from dedup pov. >>>> >>>> And since dedup is working for vmlinux BTF, I doubt there are CUs >>>> where the same type is represented with different dwarf id-s. >>>> Otherwise dedup wouldn't have worked for vmlinux. >>>> >>>> DW_AT_name is concerning. Sounds like it's a gcc bug, but it >>>> shouldn't be causing dedup issues for modules. >>>> >>>> So what is the workaround? >>> >>> I'm thinking of generalizing Alan's proposed fix so that all our >>> existing special equality cases (arrays, identical structs, and now >>> pointers to identical types) are handled a bit more generically. I'll >>> try to get a patch out later tonight. >> >> So I ran out of time, but I'm thinking something like below. It >> results in identical bpf_testmod.ko compared to Alan's proposed fix, >> so perhaps we should just go with the simpler approach. But this one >> should stand the test of time a bit better. >> >> In any case, I'd like to be able to handle not just PTR -> TYPE chain, >> but also some PTR -> MODIFIER* -> TYPE chains at the very least. >> Because any const in the chain will throw off Alan's heuristic. >> > > Ok, so sleeping on this a bit more, I'm hesitant to do this more > generic approach, as now we'll be running a risk of potentially > looping indefinitely (the max_depth check I added doesn't completely > prevent this). > > So, Alan, do you mind sending your proposed patch for formal review > and BPF CI testing? Just please use btf_is_ptr() check instead of > explicit kind equality, thanks! > Sure, will do! Was about to reply that I'd tested your patch and all worked well. I'm a bit worried we may end up playing whack-a-mole with issues in this area so I do like the idea of a more generic approach, but that doesn't have to happen immediately. Note that the missing kfunc issue that Alexei also reported appears to be a separate problem, and I'm seeing that despite having dedup-related fixes. My suspicion that the associated .cold functions were triggering our inconsistent function detection in pahole seems more likely given that the following change: diff --git a/btf_encoder.c b/btf_encoder.c index 0bc2334..a72c9c3 100644 --- a/btf_encoder.c +++ b/btf_encoder.c @@ -1456,7 +1456,7 @@ static void elf_functions__collect_function(struct elf_functions *functions, GEl return; name = elf_sym__name(sym, functions->symtab); - if (!name) + if (!name || strstr(name, ".cold")) return; func = &functions->entries[functions->cnt]; ...appears to be enough to make those kfuncs reappear in BTF. I think ideally we should have a better means of mapping from ELF function -> DWARF rather than purely via name prefix. That part requires a bit more thought, hopefuly we can get that fixed ASAP too. Alan >> I'll try to benchmark and polish the patch tomorrow and post it for >> proper review. >> >> >> diff --git a/src/btf.c b/src/btf.c >> index e9673c0ecbe7..e4a3e3183742 100644 >> --- a/src/btf.c >> +++ b/src/btf.c >> @@ -4310,6 +4310,8 @@ static bool btf_dedup_identical_arrays(struct >> btf_dedup *d, __u32 id1, __u32 id2 >> return btf_equal_array(t1, t2); >> } >> >> +static bool btf_dedup_identical_types(struct btf_dedup *d, __u32 id1, >> __u32 id2); >> + >> /* Check if given two types are identical STRUCT/UNION definitions */ >> static bool btf_dedup_identical_structs(struct btf_dedup *d, __u32 >> id1, __u32 id2) >> { >> @@ -4329,14 +4331,93 @@ static bool btf_dedup_identical_structs(struct >> btf_dedup *d, __u32 id1, __u32 id >> m1 = btf_members(t1); >> m2 = btf_members(t2); >> for (i = 0, n = btf_vlen(t1); i < n; i++, m1++, m2++) { >> - if (m1->type != m2->type && >> - !btf_dedup_identical_arrays(d, m1->type, m2->type) && >> - !btf_dedup_identical_structs(d, m1->type, m2->type)) >> + if (m1->type != m2->type && !btf_dedup_identical_types(d, m1->type, m2->type)) >> + return false; >> + } >> + return true; >> +} >> + >> +static bool btf_dedup_identical_fnprotos(struct btf_dedup *d, __u32 >> id1, __u32 id2) >> +{ >> + const struct btf_param *p1, *p2; >> + struct btf_type *t1, *t2; >> + int n, i; >> + >> + t1 = btf_type_by_id(d->btf, id1); >> + t2 = btf_type_by_id(d->btf, id2); >> + >> + if (!btf_is_func_proto(t1) || !btf_is_func_proto(t2)) >> + return false; >> + >> + if (!btf_compat_fnproto(t1, t2)) >> + return false; >> + >> + if (!btf_dedup_identical_types(d, t1->type, t2->type)) >> + return false; >> + >> + p1 = btf_params(t1); >> + p2 = btf_params(t2); >> + for (i = 0, n = btf_vlen(t1); i < n; i++, p1++, p2++) { >> + if (p1->type != p2->type && !btf_dedup_identical_types(d, p1->type, p2->type)) >> return false; >> } >> return true; >> } >> >> +static bool btf_dedup_identical_types(struct btf_dedup *d, __u32 id1, >> __u32 id2) >> +{ >> + int max_depth = 32; >> + >> + while (max_depth-- > 0) { >> + struct btf_type *t1, *t2; >> + int k1, k2; >> + >> + t1 = btf_type_by_id(d->btf, id1); >> + t2 = btf_type_by_id(d->btf, id2); >> + >> + k1 = btf_kind(t1); >> + k2 = btf_kind(t2); >> + if (k1 != k2) >> + return false; >> + >> + switch (k1) { >> + case BTF_KIND_UNKN: /* VOID */ >> + return true; >> + case BTF_KIND_INT: >> + return btf_equal_int_tag(t1, t2); >> + case BTF_KIND_ENUM: >> + case BTF_KIND_ENUM64: >> + return btf_compat_enum(t1, t2); >> + case BTF_KIND_FWD: >> + case BTF_KIND_FLOAT: >> + return btf_equal_common(t1, t2); >> + case BTF_KIND_CONST: >> + case BTF_KIND_VOLATILE: >> + case BTF_KIND_RESTRICT: >> + case BTF_KIND_PTR: >> + case BTF_KIND_TYPEDEF: >> + case BTF_KIND_FUNC: >> + case BTF_KIND_TYPE_TAG: >> + if (t1->info != t2->info) >> + return 0; >> + id1 = t1->type; >> + id2 = t2->type; >> + continue; >> + case BTF_KIND_ARRAY: >> + return btf_equal_array(t1, t2); >> + case BTF_KIND_STRUCT: >> + case BTF_KIND_UNION: >> + return btf_dedup_identical_structs(d, id1, id2); >> + case BTF_KIND_FUNC_PROTO: >> + return btf_dedup_identical_fnprotos(d, id1, id2); >> + default: >> + return false; >> + } >> + } >> + return false; >> +} >> + >> + >> /* >> * Check equivalence of BTF type graph formed by candidate struct/union (we'll >> * call it "candidate graph" in this description for brevity) to a type graph >> @@ -4458,8 +4539,6 @@ static int btf_dedup_is_equiv(struct btf_dedup >> *d, __u32 cand_id, >> * types within a single CU. So work around that by explicitly >> * allowing identical array types here. >> */ >> - if (btf_dedup_identical_arrays(d, hypot_type_id, cand_id)) >> - return 1; >> /* It turns out that similar situation can happen with >> * struct/union sometimes, sigh... Handle the case where >> * structs/unions are exactly the same, down to the referenced >> @@ -4467,7 +4546,7 @@ static int btf_dedup_is_equiv(struct btf_dedup >> *d, __u32 cand_id, >> * types are different, but equivalent) is *way more* >> * complicated and requires a many-to-many equivalence mapping. >> */ >> - if (btf_dedup_identical_structs(d, hypot_type_id, cand_id)) >> + if (btf_dedup_identical_types(d, hypot_type_id, cand_id)) >> return 1; >> return 0; >> } >> >>> >>>> >>>> We need to find it asap. Since at present we cannot build >>>> kernels with gcc-14, since modules won't dedup BTF. >>>> Hence a bunch of selftests/bpf are failing. >>>> We want to upgrade BPF CI to gcc-14 to catch nginx-like issues, >>>> but we cannot until this pahole/dedup issue is resolved. >