On 8 Aug 2025, Arnaldo Carvalho de Melo told this:

> On August 8, 2025 3:28:13 PM GMT-03:00, Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>>On Thu, 2025-08-07 at 19:09 -0700, Alexei Starovoitov wrote:
>>
>>> Before you jump into 1,2,3 let's discuss the end goal.
>>> I think the assumption here is that this btf-for-each-.o approach
>>> is supposed to speed up the build, right ?

Generating BTF directly in the compiler certainly does, in situations
where we can avoid DWARF. We reduce the amount of data written out by
something like 11GiB (!) in my tests.

>>I'd like to second Alexei's question.
>>In the cover letter Arnaldo points out that un-deduplicated BTF
>>amounts for 325Mb, while total DWARF size is 365Mb.

That very much depends on the kernels you build. In my tests of
enterprise kernels (including modules) with the GCC+btfarchive
toolchain (not feeding it to pahole yet), I found total DWARF of
11.2GiB, undeduplicated BTF of 550MiB (counting raw .o compiler output
alone), and a final deduplicated BTF size (including all modules) of
about 38MiB (which I'm sure I can reduce).

>>The size of DWARF sections in the final vmlinux is comparable to yours: 307Mb.
>>The total size of the generated binaries is 905Mb.
>>So, unless the above calculations are messed up, the total gain here is:
>>- save ~500Mb generated during build

For me, 11GiB :)

>>- save some time on pahole not needing to parse/convert DWARF

In my tests, a *lot*. I think Arnaldo has recently improved this, but
back in April when I was comparing things, I had to kill pahole when it
was dedupping an allmodconfig kernel-plus-modules because it ate more
than 70GiB of RAM and was still chewing on all 20 cores of my machine
after two hours. btfdedup (which uses the libctf deduplicator used by
GNU ld), despite being single-threaded and doing things like ambiguous
type detection as well, used 12GiB and took 19 minutes. (Multithreading
it is in progress, too.) allyesconfig is faster. Anything sane is
faster yet.
Enterprise kernels take about four minutes, which is not too different
from pahole. I was shocked by this: I thought libctf would be slower
than pahole, but it turned out to be faster, sometimes much faster.

I suspect much of this frankly ridiculous difference was DWARF
conversion, and so would be improved by doing it in parallel (as here),
but... still. Not having to generate and consume all that DWARF is
bound to help! It's something like 95% less work...

>>So, I see several drawbacks:
>>- As you note, there would be two avenues to generate BTF now:
>>  - DWARF + pahole
>>  - BTF + pahole (replaced by BTF + ld at some point?)

The code exists... BTF + ld + dedupping the resulting ld-dedupped
output together. Note that the code used to deduplicate BTF with libctf
(as used by ld) is not large. Look:

https://github.com/nickalcock/linux/blob/nix/btfa/scripts/btf/btfarchive.c

(Of those functions, you don't need transform_module_names(),
suck_in_modules(), or suck_in_lines(): it's really no more code than is
needed to tell libctf which inputs map to which modules, then a couple
of lines to trigger dedup and emit the resulting BTF archive.)

It's entirely reasonable for pahole in future to simply call libctf's
deduplicator to dedup BTF if it sees that the linker hasn't done it, or
to do what btfarchive does here itself: dedup the linker-deduplicated
per-module output and the vmlinux BTF against each other (and then we
don't need btfarchive at all, which means fewer build system changes).
This would let pahole dedup BTF if needed while not wasting time on it
if the linker already did it, *and* let you ditch the pahole
deduplicator so you don't need to maintain it any more, even when clang
et al are being used. (Obviously, you'd only do this once libctf's
dedup is up to scratch and once it's in a released binutils, since I'm
sure there will be bugs I need to fix!)

>> This is a potential source of bugs.

That's not a very good argument.
*Everything* is a potential source of bugs. I will of course prioritize
fixing any bugs in libctf that affect pahole's operation: not breaking
pahole matters!

>> Is the goal to forgo DWARF+pahole at some point in the future?
>
> I think the goal is to allow DWARF-less builds, which can probably
> save time even if we do use pahole to convert DWARF generated from
> the compiler into BTF and right away strip DWARF.
>
> This is for use cases where DWARF isn't needed and we want to, for
> example, have CI systems running faster.

Yep! This also means that you can get new features like type and decl
tags into BTF faster, because it's much quicker to get them into GCC
and libctf (at least for recent compiler releases) than it is to get
them into DWARF just so you can get them out of DWARF again and
translate them into BTF. DWARF simply has many more consumers to think
about, while the kernel is obviously a critical consumer of GCC's and
libctf's generated BTF. (We do need to consider userspace, but we don't
need to be as conservative as a giant behemoth like DWARF must be. I'm
confident enough in my testing to be willing to backport things to
binutils release branches as needed, though probably not to points
before the first release where BTF support is added to libctf, because
that change is pretty massive.)

> My initial interest was to do minimal changes to pave the way for BTF
> generated for vmlinux directly from the compiler, but the realization
> that DWARF still has a lot of mileage, meaning distros will continue
> to enable it for the foreseeable future, makes me think that maybe
> doing nothing and continuing to use the current method is the
> sensible thing to do.

Speaking purely selfishly, I would be... unhappy to find that I'd spent
all this effort on a BTF-capable deduplicator only to find you didn't
want to use it no matter how good it ended up being :( This seems like
a rather sudden change of heart...
>>- I assume that it is much faster to land changes in pahole compared
>>  to changes in gcc, so future btf modifications/features might be a
>>  bit harder to execute. Wdyt?

As noted, I think this is not really true, at least once the core BTF
dedup stuff has landed: I can backport stuff on top of it without doing
releases, and distros usually pick it up within a few days. The
principal delay is testing...

> Right, that too; even if we enable generation of BTF for native .o
> files by the compiler, we would still want to use pahole to augment
> it with new features or to fix up compiler BTF generation bugs. And
> maybe for generating tags where the necessary info is only available
> at the last moment.

Well, yes. I thought it was always the plan for pahole to keep
consuming and augmenting BTF! Among other things, the kernel uses a
bunch of additional sections that reference BTF types that GNU ld has
no idea how to generate, and which nobody is planning to use outside
the kernel. That's also where a lot of the innovation is happening, and
GCC and GNU ld don't need to get involved in that at all (unless and
until you want them to).

I can say that changing libctf to support *every difference from CTF
that BTF has got* and teaching GNU ld to handle that took about two
months, so implementing single changes in future doesn't seem like an
insurmountable burden (and much of those two months was spent on
infrastructural adjustments to allow easier changes in future -- the
hardest single BTF feature to support was probably datasecs and vars,
and that took about a week including deduplication). Obviously there
will be bugs, but when they show up I'll fix them.

I am not worried about the maintenance burden of supporting new BTF
stuff in binutils libctf, and I don't think Jose is worried about it in
GCC either. I mean, it's not like it's going to be an extra burden for
long: the medium-term goal is to replace CTF with BTF entirely, even
for userspace consumption.
There are surprisingly few new features needed before we can consign
CTF to history and converge on one type format to rule them all. (I
think they're all entirely nondisruptive, too.)

> Now if we could have hooks in the linker associated with a given ELF
> section name (.BTF) to use instead of just concatenating, and then at
> the end have another hook that would finish the process by doing the
> dedup, just like I do in this series, that would save one of those
> linker calls.

Yeah, we looked at that, but GNU ld's plugin support is totally focused
on the needs of LTO and can't really handle what dedup needs at all:
fixing that would likely be a substantial and fiddly change. As part of
the CTF and BTF work there *are* internal hooks in ld and libbfd that
do what is needed, but they're not exported outside the linker, and
exporting them looks to be... painful. (But it seems unnecessary for
GNU ld, since it will after all be able to dedup BTF with no plugins at
all, and already can in my proof-of-concept branch on binutils-gdb
git.)

-- 
NULL && (void)