On Wed, 2025-02-12 at 10:09 +0100, Basile Starynkevitch wrote:
> On Wed, 2025-02-12 at 09:25 +0100, Florian Weimer via Gcc-help wrote:
> > * Bento Borges Schirmer:
> >
> > > [3]
> > > https://github.com/bottle2/swf2c/blob/88f9ccb7912d55002e87f1efb11f21720d97e4ec/tests/thousands-of-functions.c
> >
> > You should turn L and B into proper functions instead of macros, then
> > compilation time will decrease significantly. If compilation time is
> > still too high, consider adopting a table-based approach.
>
>
> And your compiled C or C++ code should preferably be made of translation units
> (C or C++ files) not bigger than about ten thousands lines each.
>
> Observe that the C++ code of GCC don't have any source file bigger than 60KLOC
> (the biggest one being ./gcc/cp/parser.cc and
> ./libstdc++-v3/testsuite/20_util/to_chars/double.cc ...) and that generated
> C++ code (e.g. _GccTrunk/gcc/insn-dfatab.c ...) has at most 210KLOC.
>
>
> See also https://arxiv.org/abs/1109.0779 and (if you want to generate C code
> which is huge to benchmark compile time)
> https://github.com/bstarynk/misc-basile/blob/master/manydl.c
>
> My advice is to refactor your human written C or C++ files to have no more
> than ten thousands lines each. (with C++ templates the compilation time can
> still take dozen of minutes in pathological cases).

You also mention that:

   b) one 307725-line-long function [4] takes 2 hours on `clang -Oz` and
      10 minutes on `clang -O0`

But such a huge function cannot be written by a human being: it is neither
understandable nor readable. So that 307725-line-long function is
necessarily generated by some software tool.

In the (now obsolete) GCC MELT plugin described in
https://arxiv.org/abs/1109.0779 I encountered the same issue, and I had to
rework the code generator to split huge generated functions into more
manageable pieces. IIRC my C++ code generator (the GCC MELT plugin) split
huge blocks, at C++ generation time, into separate static functions.
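To make the idea concrete, here is a minimal sketch of splitting at
generation time. It is not the MELT generator (whose details I no longer
remember). All names in it (chunk_count, emit_split, part_N, big_init,
state) are hypothetical, and each emitted "statement" is only a placeholder
assignment. A real generator has the harder job of passing the locals
shared between chunks explicitly, which this sketch avoids by using a
file-scope array.

```c
#include <assert.h>
#include <stdio.h>

/* Number of static functions needed to hold total_stmts statements
   when each function may hold at most max_per_func of them. */
long chunk_count(long total_stmts, long max_per_func)
{
    assert(total_stmts >= 0 && max_per_func > 0);
    return (total_stmts + max_per_func - 1) / max_per_func;
}

/* Emit C code to `out`: the statements are spread over static functions
   part_0, part_1, ... and a driver big_init calls them in order.
   Returns the number of static functions emitted.
   Assumption: every statement is a placeholder assignment into a
   file-scope array, so no local variable crosses a chunk boundary. */
long emit_split(FILE *out, long total_stmts, long max_per_func)
{
    long nparts = chunk_count(total_stmts, max_per_func);

    fprintf(out, "static int state[%ld];\n",
            total_stmts > 0 ? total_stmts : 1);

    for (long p = 0; p < nparts; p++) {
        long first = p * max_per_func;
        long last = first + max_per_func;
        if (last > total_stmts)
            last = total_stmts;

        fprintf(out, "static void part_%ld(void)\n{\n", p);
        for (long s = first; s < last; s++)
            fprintf(out, "    state[%ld] = %ld;\n", s, s);
        fprintf(out, "}\n");
    }

    /* The driver stays tiny: one call per chunk. */
    fprintf(out, "void big_init(void)\n{\n");
    for (long p = 0; p < nparts; p++)
        fprintf(out, "    part_%ld();\n", p);
    fprintf(out, "}\n");

    return nparts;
}
```

With these assumptions, emit_split(stdout, 307725, 10000) would emit 31
static functions of at most ten thousand statements each, plus the small
big_init driver.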
That splitting took me two weeks of work (or maybe three). It was almost
15 years ago, so I forgot the details.

If the C++ code generator is not your own, you should report this as a bug
to its supplier.

https://stackoverflow.com/a/36474352/841108 is indeed relevant.

Notice also that huge C++ functions obviously stress the compiler. In my
past informal experience of compilation time with gcc -O2 on randomly
generated functions (produced by
https://github.com/bstarynk/misc-basile/blob/master/manydl.c ...), the
compilation time seems quadratic in the number of source lines of a single
C or C++ function (more exactly, quadratic in the number of GIMPLE
statements). And register allocators cannot behave well on such huge
functions. Actually, any large function (more than a few thousand GIMPLE
statements or lines) is very likely to be compiled to "slow" code (even by
gcc -O3 or by clang -O3).

So my advice is really to work on the generating tool (the software
emitting C or C++ code) to make it generate functions not bigger than about
ten thousand lines (or GIMPLE statements) each. This even applies to
"obvious" C++ generators like https://www.fltk.org/doc-2.0/html/fluid.html

FWIW, Jacques Pitrat (the French symbolic AI research pioneer) also
generated C code (see his last book, Artificial Beings - The conscience of
a conscious machine, ISTE, Wiley, March 2009, ISBN 978-1848211018) and
observed the same issue. His generated code is on the web at
https://github.com/bstarynk/caia-pitrat and he had to spend time improving
his code generator (the same system itself, an expert system in 1990
terminology) to lower the size of every generated routine to a reasonable
number of lines. His blog is still online, though sadly he has died:
http://bootstrappingartificialintelligence.fr/WordPress3/

If you want to generate a few huge functions and don't care about
performance (e.g.
because they are initialization functions running once in a Linux process),
you might use a simpler compiler (maybe nwcc or tinycc) to compile them, or
generate naive machine code with libraries like asmjit.com or GNU lightning
(see https://www.gnu.org/software/lightning/).

Alternatively, also emit GCC-specific #pragma-s to force that single huge C
function to be compiled with the equivalent of -O0.

I do recommend emitting its C or C++ code in a different translation unit,
which would be compiled without optimization, e.g. with -O0. And you could
use a "worse" C compiler like tinycc (see
http://download.savannah.gnu.org/releases/tinycc/ ...) or nwcc (see
https://nwcc.sourceforge.net/ ...).

The same advice applies to using libgccjit, see
https://gcc.gnu.org/onlinedocs/jit/

My summary: don't expect the GCC (or Clang) compiler to compile "well"
(e.g. quickly and with effective optimization) a generated C or C++ (or
Fortran) function with a hundred thousand statements.

Regards.
-- 
Basile STARYNKEVITCH   <basile@xxxxxxxxxxxxxxxxx>
8 rue de la Faïencerie
92340 Bourg-la-Reine, France
http://starynkevitch.net/Basile & https://github.com/bstarynk
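
PS. A minimal sketch of the #pragma approach mentioned above, assuming GCC.
The function generated_init is a hypothetical stand-in for the huge
generated function. The pragmas are guarded so that Clang and other
compilers simply skip them. A separate translation unit compiled with -O0
remains the simpler route.

```c
#include <assert.h>

/* GCC-specific: save the current options, then compile what follows
   with the equivalent of -O0. Other compilers skip this part. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC optimize ("O0")
#endif

/* Hypothetical stand-in for a huge generated initialization function:
   fills table[0..n-1] and returns the number of entries written. */
int generated_init(int *table, int n)
{
    int filled = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        table[i] = 3 * i + 1;
        filled++;
    }
    return filled;
}

/* Restore the options given on the command line for the rest of the file. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options
#endif
```

The generator would emit the first guarded block just before the huge
function and the second just after it, so the remaining functions in the
file are still compiled with the command-line optimization level.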