Re: pahole and gcc-14 issues

On Fri, Apr 25, 2025 at 10:50 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
>
> On 25/04/2025 15:50, Alexei Starovoitov wrote:
> > Hi All,
> >
> > Looks like pahole fails to deduplicate BTF when kernel and
> > kernel module are built with gcc-14.
> > I see this issue with various kernel .config-s on bpf and
> > bpf-next trees.
> > I tried pahole 1.28 and the latest master. Same issues.
> >
> > BTF in bpf_testmod.ko built with gcc-14 has 2849 types.
> > When built with gcc-13 it has 454 types.
> > So something is confusing dedup logic.
> > Would be great if dedup experts can take a look,
> > since this dedup issue is breaking a lot of selftests/bpf.
> >
> > Also vmlinux.h generated out of the kernel compiled with gcc-13
> > and out of the kernel compiled with gcc-14 shows these differences:
> >
> > --- vmlinux13.h    2025-04-24 21:33:50.556884372 -0700
> > +++ vmlinux14.h    2025-04-24 21:39:10.310488992 -0700
> > @@ -148815,7 +148815,6 @@
> >  extern int hid_bpf_input_report(struct hid_bpf_ctx *ctx, enum
> > hid_report_type type, u8 *buf, const size_t buf__sz) __weak __ksym;
> >  extern void hid_bpf_release_context(struct hid_bpf_ctx *ctx) __weak __ksym;
> >  extern int hid_bpf_try_input_report(struct hid_bpf_ctx *ctx, enum
> > hid_report_type type, u8 *buf, const size_t buf__sz) __weak __ksym;
> > -extern bool scx_bpf_consume(u64 dsq_id) __weak __ksym;
> >  extern int scx_bpf_cpu_node(s32 cpu) __weak __ksym;
> >  extern struct rq *scx_bpf_cpu_rq(s32 cpu) __weak __ksym;
> >  extern u32 scx_bpf_cpuperf_cap(s32 cpu) __weak __ksym;
> > @@ -148825,12 +148824,8 @@
> >  extern void scx_bpf_destroy_dsq(u64 dsq_id) __weak __ksym;
> >  extern void scx_bpf_dispatch(struct task_struct *p, u64 dsq_id, u64
> > slice, u64 enq_flags) __weak __ksym;
> >  extern void scx_bpf_dispatch_cancel(void) __weak __ksym;
> > -extern bool scx_bpf_dispatch_from_dsq(struct bpf_iter_scx_dsq
> > *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak
> > __ksym;
> > -extern void scx_bpf_dispatch_from_dsq_set_slice(struct
> > bpf_iter_scx_dsq *it__iter, u64 slice) __weak __ksym;
> >  extern void scx_bpf_dispatch_from_dsq_set_vtime(struct
> > bpf_iter_scx_dsq *it__iter, u64 vtime) __weak __ksym;
> >  extern u32 scx_bpf_dispatch_nr_slots(void) __weak __ksym;
> > -extern void scx_bpf_dispatch_vtime(struct task_struct *p, u64 dsq_id,
> > u64 slice, u64 vtime, u64 enq_flags) __weak __ksym;
> > -extern bool scx_bpf_dispatch_vtime_from_dsq(struct bpf_iter_scx_dsq
> > *it__iter, struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak
> > __ksym;
> >  extern void scx_bpf_dsq_insert(struct task_struct *p, u64 dsq_id, u64
> > slice, u64 enq_flags) __weak __ksym;
> >  extern void scx_bpf_dsq_insert_vtime(struct task_struct *p, u64
> > dsq_id, u64 slice, u64 vtime, u64 enq_flags) __weak __ksym;
> >  extern bool scx_bpf_dsq_move(struct bpf_iter_scx_dsq *it__iter,
> > struct task_struct *p, u64 dsq_id, u64 enq_flags) __weak __ksym;
> >
> > gcc-14's kernel is clearly wrong.
> > These 5 kfuncs still exist in the kernel.
> > I manually checked there is no if __GNUC__ > 13 in the code.
> > Also:
> > nm bld/vmlinux|grep -w scx_bpf_consume
> > ffffffff8159d4b0 T scx_bpf_consume
> > ffffffff8120ea81 t scx_bpf_consume.cold
> >
> > I suspect the second issue is not related to the dedup problem.
> > All 5 missing kfuncs have ".cold" optimized bodies.
> > But ".cold" may be a red herring, since
> > nm bld/vmlinux|grep -w scx_bpf_dispatch
> > ffffffff8159d020 T scx_bpf_dispatch
> > ffffffff8120ea0f t scx_bpf_dispatch.cold
> > but this kfunc is present in vmlinux14.h
> >
> > If it makes a difference I have these configs:
> > # CONFIG_DEBUG_INFO_DWARF4 is not set
> > # CONFIG_DEBUG_INFO_DWARF5 is not set
> > # CONFIG_DEBUG_INFO_REDUCED is not set
> > CONFIG_DEBUG_INFO_COMPRESSED_NONE=y
> > # CONFIG_DEBUG_INFO_COMPRESSED_ZLIB is not set
> > # CONFIG_DEBUG_INFO_SPLIT is not set
> > CONFIG_DEBUG_INFO_BTF=y
> > CONFIG_PAHOLE_HAS_SPLIT_BTF=y
> > CONFIG_DEBUG_INFO_BTF_MODULES=y
>
> thanks for the report! I've just reproduced this now with gcc 14; my
> initial theory was it might be DWARF5-related, but dedup issues occur
> for modules with CONFIG_DEBUG_INFO_DWARF4=y also. I'm seeing task_struct
> duplicates in module BTF among other things, so I will try and dig
> further and report back when I find something. Like you I suspect the

This is a bizarre case. I have a small custom tool ([0]) that recursively
traverses two parallel subgraphs of BTF types and prints anything that
differs between them. (I had to disable distilled BTF to use it; the
issue is present both with and without distilled BTF.)

I see that struct sock in vmlinux and struct sock in bpf_testmod.ko are
*IDENTICAL*. There is no difference I could detect, which is very weird.
I'm thinking of bisecting: this didn't happen before with exactly the
same compiler and pahole, so it must be a kernel-side change.

  [0] https://github.com/anakryiko/libbpf-bootstrap/tree/btfdiff-hack
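For readers unfamiliar with the approach, the idea of walking two type
graphs in lockstep can be sketched roughly as below. This is not the tool
at [0] and does not parse real BTF: the dict-based type records, the field
names (kind, name, size, refs), the toy "sock" layout, and the function
name diff_types are all assumptions made up for illustration.

```python
# Toy sketch: compare two type graphs in parallel and report differences.
# Each graph is a hypothetical dict: type id -> record with
#   kind, name, size, and refs (list of (member_name, referenced_type_id)).

def diff_types(types_a, types_b, root_a, root_b):
    """Walk both graphs in lockstep from the given roots; return a list of differences."""
    diffs = []
    seen = set()  # (id_a, id_b) pairs already compared; this breaks reference cycles
    stack = [(root_a, root_b, "<root>")]
    while stack:
        id_a, id_b, path = stack.pop()
        if (id_a, id_b) in seen:
            continue
        seen.add((id_a, id_b))
        ta, tb = types_a[id_a], types_b[id_b]
        # Compare the scalar properties of the two types at this position.
        for field in ("kind", "name", "size"):
            if ta.get(field) != tb.get(field):
                diffs.append(f"{path}: {field} {ta.get(field)!r} != {tb.get(field)!r}")
        refs_a, refs_b = ta.get("refs", []), tb.get("refs", [])
        if len(refs_a) != len(refs_b):
            diffs.append(f"{path}: {len(refs_a)} references vs {len(refs_b)}")
        # Descend into referenced types pairwise (type ids may differ; structure should not).
        for (name_a, ref_a), (name_b, ref_b) in zip(refs_a, refs_b):
            if name_a != name_b:
                diffs.append(f"{path}: member {name_a!r} != {name_b!r}")
            stack.append((ref_a, ref_b, f"{path}.{name_a}"))
    return diffs


# Made-up data: a self-referential struct as seen in a "base" and a "module" graph.
base = {
    1: {"kind": "struct", "name": "sock", "size": 16,
        "refs": [("sk_next", 2), ("sk_flags", 3)]},
    2: {"kind": "ptr", "name": None, "size": 8, "refs": [("*", 1)]},
    3: {"kind": "int", "name": "unsigned int", "size": 4, "refs": []},
}
module_same = {
    10: {"kind": "struct", "name": "sock", "size": 16,
         "refs": [("sk_next", 11), ("sk_flags", 12)]},
    11: {"kind": "ptr", "name": None, "size": 8, "refs": [("*", 10)]},
    12: {"kind": "int", "name": "unsigned int", "size": 4, "refs": []},
}
# Same graph, but with one member's type changed to a different size.
module_diff = {**module_same, 12: {**module_same[12], "size": 8}}

for line in diff_types(base, module_diff, 1, 10):
    print(line)
```

An empty result for a pair of roots means the walk found the two
subgraphs structurally identical under this toy comparison, which
mirrors the "no difference I could detect" observation for struct sock.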

> issues with missing kfuncs are different; may be an issue with our logic
> handling inconsistent functions getting confused by the .cold
> components. But right now understanding dedup issues is the top priority.
>
> Alan




