Re: [PATCH 1/1] pahole: When trying to encode BTF avoid DWARF less files

On Thu, Apr 03, 2025 at 07:40:19PM +0100, Alan Maguire wrote:
> On 02/04/2025 21:35, Arnaldo Carvalho de Melo wrote:
> > +	// Right now encoding BTF has to be from DWARF, so enforce that, otherwise
> > +	// the loading process can fall back to other formats, BTF being the case
> > +	// and as this is at this point unintended, avoid that.
> > +	// Next we need to just skip object files that don't have the format we
> > +	// expect as the source for BTF encoding, i.e. no DWARF, no BTF, no problema.
> > +	if (btf_encode && conf_load.format_path == NULL)
> > +		conf_load.format_path = "dwarf";

> >  	if (show_running_kernel_vmlinux) {
> >  		const char *vmlinux = vmlinux_path__find_running_kernel();
 
> LGTM
 
> Acked-by: Alan Maguire <alan.maguire@xxxxxxxxxx>

> In the same spirit, should we error out if the format path is explicitly
> set to anything other than DWARF for now when btf_encode is set?

Probably, but I focused just on this part as I'm trying to follow up on
an idea I described to folks about doing the BTF encoding right after
the .o file is generated, to try and exploit all being in memory and the
natural parallelism of the kernel build process.

If DWARF isn't asked for, which some developers do to avoid having tons
of things hitting the disk, then it gets discarded right away. The
kernel build patch I have right now is super simple and doesn't throw
away DWARF yet:

acme@x1:~/git/linux$ git diff
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 4d543054f72356a4..02a595b82b299151 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -313,7 +313,7 @@ cmd_ld_single = $(if $(objtool-enabled)$(is-single-obj-m), ; $(LD) $(ld_flags) -
 endif
 
 quiet_cmd_cc_o_c = CC $(quiet_modtag)  $@
-      cmd_cc_o_c = $(CC) $(c_flags) -c -o $@ $< \
+      cmd_cc_o_c = $(CC) $(c_flags) -c -o $@ $< && ${PAHOLE} --btf_encode ${PAHOLE_FLAGS} $@ \
                $(cmd_ld_single) \
                $(cmd_objtool)
 
acme@x1:~/git/linux$

With the above patch and the following:

acme@x1:~/git/pahole$ git show
commit ee3e75da682eb6bcc191127ea4324677ca2ab7c4 (HEAD -> master)
Author: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Date:   Thu Apr 3 15:28:35 2025 -0300

    pahole: Don't fail if no DWARF is available when encoding BTF, just skip
    
    While building the kernel there are a few files that have no DWARF and
    thus needs to get skipped while doing early BTF generation, i.e. per .o
    file instead of vmlinux. Things like arch/x86/purgatory/purgatory.c.
    
    Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>

diff --git a/pahole.c b/pahole.c
index f34e1ecd5542c245..59b7418e3eff89e5 100644
--- a/pahole.c
+++ b/pahole.c
@@ -3609,6 +3609,11 @@ try_sole_arg_as_class_names:
 
        err = cus__load_files(cus, &conf_load, argv + remaining);
        if (err != 0) {
+               if (btf_encode && strcmp(conf_load.format_path, "dwarf") == 0) {
+                       // Assume no DWARF is available and thus just skip this object file
+                       goto out_ok;
+               }
+
                if (class_name == NULL && !btf_encode && !ctf_encode) {
                        class_name = argv[remaining];
 
acme@x1:~/git/pahole$ 
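
To make the combined effect of the two pahole changes concrete, here is a
minimal sketch of the decision logic, not the actual pahole code. The helper
names effective_format_path() and classify_load() and the enum are
hypothetical; only the btf_encode flag, the format_path field and the "dwarf"
string come from the patches. LOAD_FAIL just stands for "fall through to
pahole's existing error handling".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type, for illustration only. */
enum load_action { LOAD_OK, LOAD_SKIP, LOAD_FAIL };

/* First patch: when encoding BTF and no format was requested, use DWARF. */
static const char *effective_format_path(bool btf_encode, const char *format_path)
{
	if (btf_encode && format_path == NULL)
		return "dwarf";
	return format_path;
}

/* Second patch: a load failure while encoding BTF from DWARF means
 * "this object has no DWARF", so the file is skipped instead of failing. */
static enum load_action classify_load(int err, bool btf_encode, const char *format_path)
{
	if (err == 0)
		return LOAD_OK;
	/* The NULL check is defensive; with the first patch applied,
	 * format_path is never NULL when btf_encode is set. */
	if (btf_encode && format_path != NULL && strcmp(format_path, "dwarf") == 0)
		return LOAD_SKIP;
	return LOAD_FAIL;
}
```

Note that the strcmp() in the second patch is only safe because the first one
guarantees a non-NULL format_path whenever btf_encode is set.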

I end up with a vmlinux.o that has a .BTF section that combines all the
.BTF sections of the .o files that comprise vmlinux. The next step is to
add some --btf_feature=dedup_archive that will basically notice that no
DWARF to BTF conversion is needed: just make btf_parse_elf() see that
the .BTF section is bigger than what is described in the btf_header and
then loop, using btf__add_btf() to merge all the BTFs, and then do the
dedup.
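
The "section is bigger than what the header describes" check can be sketched
as below. This is only an illustration under stated assumptions: the blobs
are taken to be concatenated back to back with no padding (the section
alignment of 1 suggests that, but it is not verified here), the data is
assumed to be in native byte order, and btf_blob_size() and
btf_section_count_blobs() are hypothetical helper names. The struct mirrors
the layout of struct btf_header from include/uapi/linux/btf.h.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BTF_MAGIC 0xeB9F

/* Local copy of the struct btf_header layout, to keep the sketch self-contained. */
struct btf_hdr {
	uint16_t magic;
	uint8_t  version;
	uint8_t  flags;
	uint32_t hdr_len;
	uint32_t type_off;	/* type section offset, relative to end of header */
	uint32_t type_len;
	uint32_t str_off;	/* string section offset, relative to end of header */
	uint32_t str_len;
};

/* Size in bytes of the BTF blob starting at data, or 0 if it looks invalid. */
static size_t btf_blob_size(const uint8_t *data, size_t left)
{
	struct btf_hdr hdr;
	uint64_t types_end, strs_end, size;

	if (left < sizeof(hdr))
		return 0;
	memcpy(&hdr, data, sizeof(hdr));	/* data may be unaligned */
	if (hdr.magic != BTF_MAGIC || hdr.hdr_len < sizeof(hdr))
		return 0;
	types_end = (uint64_t)hdr.type_off + hdr.type_len;
	strs_end = (uint64_t)hdr.str_off + hdr.str_len;
	size = hdr.hdr_len + (types_end > strs_end ? types_end : strs_end);
	return size <= left ? (size_t)size : 0;
}

/* Walk a .BTF section and count the blobs in it; returns -1 on garbage. */
static int btf_section_count_blobs(const uint8_t *sec, size_t sec_size)
{
	size_t off = 0;
	int nr = 0;

	while (off < sec_size) {
		size_t size = btf_blob_size(sec + off, sec_size - off);

		if (size == 0)
			return -1;
		/* A real implementation would parse [sec + off, sec + off + size)
		 * here and merge it into the base BTF with btf__add_btf(). */
		off += size;
		nr++;
	}
	return nr;
}
```

A result greater than 1 would be the hint that the section holds
per-object BTF that still has to be merged and deduplicated.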

This process is kinda what will happen when gcc -gbtf takes place, i.e.
it's generated by the compiler, then after the linker merges all of the
.BTF sections (as it does with my patches) the dedup is done, be it in
the linker, compiler or using libbpf or libctf or whatever other method.

So doing it now is paving the way for this future and may end up
speeding up building kernels while we wait for the whole cc+ld approach
to get in place and become dependable.

There are issues with the above that need to be addressed after we have
some basic stuff in place, which I'm getting to, but let's see if I (or
somebody else with fewer distractions than myself) get to it.

Some numbers:

acme@x1:~/git/pahole$ readelf -SW ../build/v6.14.0+/vmlinux.o  | grep .BTF
  [125] .BTF_ids          PROGBITS        0000000000000000 2a701f8 001178 00   A  0   0  1
  [253] .BTF              PROGBITS        0000000000000000 3ac86c8 15b07fe1 00      0   0  1
acme@x1:~/git/pahole$

That .BTF section is 0x15b07fe1 = 363.888.609 bytes. Compare with the
deduplicated BTF of a running kernel:

acme@number:~$ ls -lah /sys/kernel/btf/vmlinux 
-r--r--r--. 1 root root 6.466.451 Apr  4 14:49 /sys/kernel/btf/vmlinux
acme@number:~$

Way more, as it's not deduped. If we ask pahole to pretty print it, it
will see just the first BTF header:

acme@x1:~/git/pahole$ pahole -F btf ../build/v6.14.0+/vmlinux.o | wc -l
12080
acme@x1:~/git/pahole$ pahole -F dwarf ../build/v6.14.0+/vmlinux.o | wc -l
167204
acme@x1:~/git/pahole$

I.e. not dedup'ed
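
For the record, the size comparison above works out as in the sketch below.
The two constants are copied from the readelf and ls outputs; the helper name
btf_size_ratio() is made up for illustration. Keep in mind that the two
numbers come from different machines, so this is only a rough order of
magnitude.

```c
#include <stdint.h>

/* Size of the .BTF section in vmlinux.o, from the readelf output. */
#define MERGED_BTF_BYTES  UINT64_C(0x15b07fe1)
/* Size of /sys/kernel/btf/vmlinux, from the ls output. */
#define DEDUPED_BTF_BYTES UINT64_C(6466451)

/* Integer ratio between the merged and the deduplicated BTF sizes. */
static uint64_t btf_size_ratio(uint64_t merged, uint64_t deduped)
{
	return deduped ? merged / deduped : 0;
}
```

With these inputs the non-deduplicated section is roughly 56 times the size
of the deduplicated BTF.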

For builds without CONFIG_DEBUG_INFO=y, just with
CONFIG_DEBUG_INFO_BTF=y, which isn't possible now but should be for
people not wanting DWARF, when using a compiler that doesn't produce BTF
for vmlinux (or produces bad BTF that you want to avoid, a transitional,
point-in-time situation), this:

-      cmd_cc_o_c = $(CC) $(c_flags) -c -o $@ $< \
+      cmd_cc_o_c = $(CC) $(c_flags) -c -o $@ $< && ${PAHOLE} --btf_encode ${PAHOLE_FLAGS} $@ \

should additionally strip DWARF after --btf_encode.

- Arnaldo



