Re: Questions about optimisation options

Segher Boessenkool <segher@xxxxxxxxxxxxxxxxxxx> · Wed, 4 Jun 2025 13:19:45 -0500

Hi!

On Wed, Jun 04, 2025 at 09:10:02AM +0100, Tim Froggatt wrote:
> On 03/06/2025 21:57, Segher Boessenkool wrote:
> >It might pay off to write that one directly in assembler
> >code.  Don't try to manipulate the compiler into ending up with the code
> >you want; just write it!
> 
> Using assembler is certainly one thing I am considering, but it will not be
> possible for everything as this is 100,000s of lines of code.

I suggested focussing on the one or few functions that dominate
performance, for this reason :-)

> But at the moment, I am not yet interested in solving any specific problem -
> that will be a job for later. Before I'm ready for that, first I want to
> understand - what do these optimisation options actually do?

As the documentation says:
     Align the start of functions to the next power-of-two greater than
     or equal to N, skipping up to M-1 bytes.  This ensures that at
     least the first M bytes of the function can be fetched by the CPU
     without crossing an N-byte alignment boundary.  This is an
     optimization of code performance and alignment is ignored for
     functions considered cold.  If alignment is required for all
     functions, use '-fmin-function-alignment'.
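
To make that wording concrete, here is a minimal sketch in C that models a
literal reading of the paragraph above.  The function names are made up for
illustration, and this is only arithmetic on addresses, not what GCC or the
assembler actually runs.  It assumes M >= 1 and ignores any minimum
alignment the target imposes, so edge cases (M=1 in particular) may well
behave differently in practice; only the generated code can settle that.

```c
#include <stdio.h>

/* Round n up to the next power of two greater than or equal to n (n >= 1). */
unsigned long round_up_pow2(unsigned long n)
{
    unsigned long p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Hypothetical model of -falign-functions=N:M, read literally from the
 * documentation: pad up to the next boundary only if that costs at most
 * M-1 bytes, otherwise leave the address alone.
 * Assumes m >= 1; target-specific minimum alignment is NOT modelled.
 */
unsigned long model_function_start(unsigned long addr,
                                   unsigned long n, unsigned long m)
{
    unsigned long align = round_up_pow2(n);
    unsigned long aligned = (addr + align - 1) & ~(align - 1);
    unsigned long pad = aligned - addr;

    return (pad <= m - 1) ? aligned : addr;
}

/* Print the modelled start address for one combination of inputs. */
void show(unsigned long addr, unsigned long n, unsigned long m)
{
    printf("addr=0x%lx N=%lu M=%lu -> 0x%lx\n",
           addr, n, m, model_function_start(addr, n, m));
}
```

Calling show(0x1004, 16, 16) and show(0x1004, 16, 8) from a small main()
prints the two modelled start addresses side by side, which makes it easy
to see when the M limit suppresses the padding in this model.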

> So my two questions are:
> 
> 1) Why is m=1 different from m=2,3,4? Because I thought ARM32 instructions
> were always 4-byte aligned. Shouldn't they produce the same code?

You will need to look at the generated code to see the actual
differences.  Knowing that option A results in slightly faster code than
option B does not tell us why.

> 2) And why are m=1 and m=2,3,4 different to -fno-align-functions? Because I
> would think m=1,2,3,4 would do no alignment.

Show us the generated code.  Or investigate that yourself.  But without
seeing it, we cannot say much that is useful.

Segher