On Thu, Apr 03, 2025 at 07:40:19PM +0100, Alan Maguire wrote:
> On 02/04/2025 21:35, Arnaldo Carvalho de Melo wrote:
> > +	// Right now encoding BTF has to be from DWARF, so enforce that, otherwise
> > +	// the loading process can fall back to other formats, BTF being the case
> > +	// and as this is at this point unintended, avoid that.
> > +	// Next we need to just skip object files that don't have the format we
> > +	// expect as the source for BTF encoding, i.e. no DWARF, no BTF, no problema.
> > +	if (btf_encode && conf_load.format_path == NULL)
> > +		conf_load.format_path = "dwarf";
> >
> >  	if (show_running_kernel_vmlinux) {
> >  		const char *vmlinux = vmlinux_path__find_running_kernel();
>
> LGTM
>
> Acked-by: Alan Maguire <alan.maguire@xxxxxxxxxx>
>
> In the same spirit, should we error out if the format path is explicitly
> set to anything other than DWARF for now when btf_encode is set?

Probably, but I focused just on this part as I'm trying to follow up on
an idea I described to folks about doing the BTF encoding right after
the .o file is generated, to try and exploit all being in memory and the
natural parallelism of the kernel build process.

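
To make the "error out" question concrete, here is a minimal standalone
sketch of what such a check could look like. This is not pahole code:
resolve_format_path() is a hypothetical helper made up for illustration,
and it assumes the format path is a single name rather than a
comma-separated list, which the real option handling may not guarantee.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of pahole: decide which debug format to
 * load from when BTF encoding was requested.
 *
 * Returns 0 on success and -1 when btf_encode is set but the caller
 * explicitly asked for a source format other than DWARF.
 */
static int resolve_format_path(bool btf_encode, const char **format_path)
{
	if (!btf_encode)
		return 0;		/* nothing to enforce */

	if (*format_path == NULL) {
		/* Same default as in the quoted patch. */
		*format_path = "dwarf";
		return 0;
	}

	/*
	 * The stricter variant being asked about: reject anything that is
	 * not exactly "dwarf" (assumption: no comma-separated lists).
	 */
	return strcmp(*format_path, "dwarf") == 0 ? 0 : -1;
}
```

A caller would presumably print an error and bail out when this returns
-1, instead of silently loading from some other format.
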
If DWARF isn't asked for, which some developers do to avoid having tons
of things hitting the disk, then this gets discarded right away. The
kernel build patch I have right now is super simple and doesn't throw
away DWARF yet:

acme@x1:~/git/linux$ git diff
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 4d543054f72356a4..02a595b82b299151 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -313,7 +313,7 @@ cmd_ld_single = $(if $(objtool-enabled)$(is-single-obj-m), ; $(LD) $(ld_flags) -
 endif

 quiet_cmd_cc_o_c = CC $(quiet_modtag)  $@
-      cmd_cc_o_c = $(CC) $(c_flags) -c -o $@ $< \
+      cmd_cc_o_c = $(CC) $(c_flags) -c -o $@ $< && ${PAHOLE} --btf_encode ${PAHOLE_FLAGS} $@ \
 		$(cmd_ld_single) \
 		$(cmd_objtool)

acme@x1:~/git/linux$

With the above patch and the following:

acme@x1:~/git/pahole$ git show
commit ee3e75da682eb6bcc191127ea4324677ca2ab7c4 (HEAD -> master)
Author: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Date:   Thu Apr 3 15:28:35 2025 -0300

    pahole: Don't fail if no DWARF is available when encoding BTF, just skip

    While building the kernel there are a few files that have no DWARF and
    thus needs to get skipped while doing early BTF generation, i.e. per .o
    file instead of vmlinux. Things like arch/x86/purgatory/purgatory.c.

    Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>

diff --git a/pahole.c b/pahole.c
index f34e1ecd5542c245..59b7418e3eff89e5 100644
--- a/pahole.c
+++ b/pahole.c
@@ -3609,6 +3609,11 @@ try_sole_arg_as_class_names:
 	err = cus__load_files(cus, &conf_load, argv + remaining);
 	if (err != 0) {
+		if (btf_encode && strcmp(conf_load.format_path, "dwarf") == 0) {
+			// Assume no DWARF is available and thus just skip this object file
+			goto out_ok;
+		}
+
 		if (class_name == NULL && !btf_encode && !ctf_encode) {
 			class_name = argv[remaining];
acme@x1:~/git/pahole$

I end up with a vmlinux.o that has a .BTF section that combines all the
.BTF sections of the .o files that comprise vmlinux. The next step is to
use some --btf_feature=dedup_archive that will basically notice that no
DWARF to BTF conversion is needed: just make btf_parse_elf() see that
the .BTF section is bigger than what is in the btf_header and then go on
in a loop using btf__add_btf() to merge all the BTFs and then do the
dedup.

This process is kinda what will happen when gcc -gbtf takes place, i.e.
it's generated by the compiler, then after the linker merges all of the
.BTF sections (as it does with my patches) it will do the dedup, be it
in the linker, compiler or using libbpf or libctf or whatever other
method.

So doing it now is paving the way for this future and may end up
speeding up building kernels while we wait for the whole cc+ld doing it
to get in place and be dependable.

There are issues with the above that need to be addressed after we have
some basic stuff in place, which I'm getting to, but let's see if I (or
somebody else with fewer distractions than myself) do it.

Some numbers:

acme@x1:~/git/pahole$ readelf -SW ../build/v6.14.0+/vmlinux.o | grep .BTF
  [125] .BTF_ids          PROGBITS        0000000000000000 2a701f8 001178 00   A  0   0  1
  [253] .BTF              PROGBITS        0000000000000000 3ac86c8 15b07fe1 00      0   0  1
acme@x1:~/git/pahole$

0x15b07fe1 is 363.888.609 bytes, versus:

acme@number:~$ ls -lah /sys/kernel/btf/vmlinux
-r--r--r--. 1 root root 6.466.451 Apr  4 14:49 /sys/kernel/btf/vmlinux
acme@number:~$

Way more, as it's not deduped. If we ask pahole to pretty print it, it
will see just the first BTF header:

acme@x1:~/git/pahole$ pahole -F btf ../build/v6.14.0+/vmlinux.o | wc -l
12080
acme@x1:~/git/pahole$ pahole -F dwarf ../build/v6.14.0+/vmlinux.o | wc -l
167204
acme@x1:~/git/pahole$

I.e. not dedup'ed.

For builds without CONFIG_DEBUG_INFO=y, just with
CONFIG_DEBUG_INFO_BTF=y, which isn't possible now, but should be for
people not wanting DWARF, when using a compiler that doesn't produce BTF
for vmlinux (or produces bad BTF and you want to avoid it, a
transitional, point in time situation), this:

-      cmd_cc_o_c = $(CC) $(c_flags) -c -o $@ $< \
+      cmd_cc_o_c = $(CC) $(c_flags) -c -o $@ $< && ${PAHOLE} --btf_encode ${PAHOLE_FLAGS} $@ \

should additionally strip DWARF after --btf_encode.

- Arnaldo