Re: Question about difference in assembly generation

On Wed, 2025-05-14 at 10:26 +0200, Laurent Carlier via Gcc-help wrote:
> Hello,
> 
> 
> 
> First of all I’m not sure where I should ask this question so I’m using
> this list. Please kindly direct me to the right place if asking here is not
> appropriate.
> 
> 
> 
> I have a question regarding GCC's behavior with respect to tail call
> optimization (TCO). I understand that the C++ standard does not mandate
> TCO, but I'm curious about the specific conditions under which GCC applies
> it in practice.


You could:

* try several versions of GCC on your example. Note that GCC 15.1 was released
recently (you will probably need to configure and build it from source), while
GCC 14 is packaged in most Linux distributions.

* compile your example with: g++ -O2 -fverbose-asm -S; also try -O3 and/or
-Os instead of -O2 (-fverbose-asm adds comments to the generated assembly that
help you understand what is happening)

* use a GCC trunk build (a.k.a. the future GCC 16) and also try past versions

* note that GCC now has a musttail statement attribute, and use it:
https://gcc.gnu.org/onlinedocs/gcc-15.1.0/gcc/Statement-Attributes.html#index-musttail-statement-attribute

* In some cases, use `asm volatile`

* compile your example with the clang compiler to see whether the generated
code differs much

* use the many GCC developer options
https://gcc.gnu.org/onlinedocs/gcc/Developer-Options.html

* use the GCC static analysis options (they explain what GCC "understands" about
your code)

* add some GCC-specific builtins (if runtime performance matters a lot).

* use link time optimizations and (when relevant) the -fwhole-program flag

* develop your GCC plugin (see https://arxiv.org/abs/1109.0779 for ideas) 

* perhaps generate more efficient code at runtime using GNU lightning or
libgccjit or asmjit.com; on Linux remember that you can make a great many
dlopen & dlsym calls.
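
To illustrate the musttail item in the list above, here is a minimal sketch.
The function name sum_to and the MUSTTAIL macro are hypothetical, chosen only
for this example. The macro assumes that __has_cpp_attribute reports the
attribute (gnu::musttail on GCC 15, clang::musttail on recent clang) and
expands to nothing on older compilers, in which case the tail call is merely a
candidate for optimization rather than guaranteed.

```cpp
#include <cstdint>

// Pick a spelling of the attribute the compiler claims to support.
// Assumption: on compilers without musttail, MUSTTAIL expands to nothing.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  elif __has_cpp_attribute(gnu::musttail)
#    define MUSTTAIL [[gnu::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

// Hypothetical example: sums n + (n-1) + ... + 1 into an accumulator.
// The recursive call is the last action, so it is in tail position.
std::uint64_t sum_to(std::uint64_t n, std::uint64_t acc) {
    if (n == 0)
        return acc;
    // With the attribute present, the compiler should either emit a jump
    // instead of a call here or reject the code with a diagnostic.
    MUSTTAIL return sum_to(n - 1, acc + n);
}
```

Compiling this with g++ -O2 -fverbose-asm -S and looking for a jmp (on x86-64)
in place of a call to sum_to is one way to check what the compiler did.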

In my experience GCC performs well on tail-call optimization. In the general
case it is known to be a very difficult (perhaps intractable or undecidable)
problem. In some cases it might not even improve runtime performance (e.g.
because of odd cache effects).
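
As a small sketch of why the conditions matter, compare a call that is in tail
position with one that is not. The names fact_plain and fact_acc are
hypothetical and not taken from the original question. Whether a given GCC
version turns either of them into a loop depends on the version and on the
optimization level, so inspect the generated assembly rather than assuming.

```cpp
#include <cstdint>

// Not a tail call as written: the multiplication by n still has to happen
// after the recursive call returns, so the caller's frame is still needed
// unless the optimizer rewrites the function.
std::uint64_t fact_plain(std::uint64_t n) {
    if (n <= 1)
        return 1;
    return n * fact_plain(n - 1);
}

// Tail call: the result of the recursive call is returned unchanged, so the
// call can in principle be replaced by a jump that reuses the frame.
std::uint64_t fact_acc(std::uint64_t n, std::uint64_t acc = 1) {
    if (n <= 1)
        return acc;
    return fact_acc(n - 1, acc * n);
}
```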


-- 
Basile STARYNKEVITCH                            <basile@xxxxxxxxxxxxxxxxx>
8 rue de la Faïencerie                       http://starynkevitch.net/Basile/  
92340 Bourg-la-Reine                         https://github.com/bstarynk
France                                https://github.com/RefPerSys/RefPerSys



