On 18/07/2025 18:04, Florian Weimer via Gcc-help wrote:
> * David Brown via Gcc-help:
>> On 18/07/2025 16:32, Florian Weimer via Gcc-help wrote:
>>> * Segher Boessenkool:
>>>> On Mon, Jul 14, 2025 at 03:03:46PM -0700, Florian Weimer wrote:
>>>>> * Segher Boessenkool:
>>>>>> -fwrapv is a great way to get slower code, too.  Is there something in
>>>>>> your code that does not work without this reality-distorting flag?
>>>>> It really depends on the code.  In many cases, -fwrapv enables
>>>>> additional optimizations.  For example, it's much easier to use (in C
>>>>> code) the implicit sign bit many CPUs compute for free.
>> "-fwrapv" in itself does not enable any additional optimisations as
>> far as I know.  In particular, any time you don't have that flag
>> activated, and the compiler could generate more efficient code by
>> using wrapping behaviour for two's complement arithmetic, then it is
>> free to do so - since signed overflow is undefined behaviour in C, the
>> compiler can treat it as defined to wrap if that's what suits.
> While this is true in principle, it's not how -fwrapv (or undefined
> signed overflow) is implemented in GCC.  When writing optimizations,
> you have to be careful not to introduce signed overflow that was not
> present in the original code, because there aren't separate tree
> operations for wrapping and overflowing operations.
>
> There aren't many examples like this in the code base today, perhaps
> because -fwrapv is not the default and any such optimization would not
> get used much.  But here's one:
>     /* The last case is if we are a multiply.  In that case, we can
>        apply the distributive law to commute the multiply and addition
>        if the multiplication of the constants doesn't overflow
>        and overflow is defined.  With undefined overflow
>        op0 * c might overflow, while (op0 + orig_op1) * c doesn't.
>        But fold_plusminus_mult_expr would factor back any power-of-two
>        value so do not distribute in the first place in this case.  */
>     if (code == MULT_EXPR
>         && TYPE_OVERFLOW_WRAPS (ctype)
>         && !(tree_fits_shwi_p (c) && pow2p_hwi (absu_hwi (tree_to_shwi (c)))))
>       return fold_build2 (tcode, ctype,
>                           fold_build2 (code, ctype,
>                                        fold_convert (ctype, op0),
>                                        fold_convert (ctype, c)),
>                           op1);
I am not at all well-versed in the internals of GCC, so I don't know
what is going on in that code. But I am not aware of any situation
where using wrapping instructions could introduce new overflows that
made it through to the final answer. In any combination of additions
and multiplications, it doesn't matter when you (logically) apply the
modulo operation to limit your range to your bit size - the result is
the same.
But what /could/ happen is that you have extra intermediary overflows.
If you have "-ftrapv" in action, then "a * (x - y)" and "(a * x) - (a *
y)" can have different behaviour if there are overflows in the
intermediary parts.
However, when "-ftrapv" is not in effect, I cannot see how "-fwrapv"
allows any extra optimisations.
Are you able to give an example of the C code for which the optimisation
above applies, and values for which the result is affected? (When
thinking about overflows, I always like to use 16-bit int because the
numbers are smaller and easier to work with.)
>>> It would be just another extension, and one that many compilers already
>>> enable by default.  Even GCC makes casting from unsigned to int defined
>>> in all cases because doing that in a standard-conforming way is way too
>>> painful.
>> I may be misunderstanding what you wrote here.  In cases where
>> something is undefined in the C standards, a compiler can define the
>> behaviour if it wants - that does not break standards conformance in
>> any way.
>>
>> Converting from an unsigned integer type to a signed integer type is
>> fully defined in the C standards if the value can be represented - the
>> value does not change.  If not (because it is too big), the result is
>> implementation-defined (or an implementation-defined trap).  gcc
>> defines this as two's complement wrapping, i.e. reduction modulo 2^N
>> (in practice it generally needs to do nothing at all, as all its
>> targets are two's complement) - that is entirely standard-conforming.
> It's standard-conforming, but GCC forgoes a lot of optimization
> opportunities as a result.  Like not doing -fwrapv by default, this
> breaks quite a bit of code, of course.  It's also a missed opportunity
> for telling more programmers that they can't write correct C code.
Conversion to signed integer types is implementation-defined behaviour
in the C standards, not undefined behaviour. That means the compiler
must pick a specific tactic which is documented (in section 4.5 of the
gcc manual) and applied consistently. It is not undefined behaviour -
code that relies on two's complement conversion of unsigned types to
signed types is not incorrect code, merely non-portable code. (In
practice, of course, it is portable, as all real-world compilers use the
same tactic on two's complement targets.)
If conversion to signed integer types had had some undefined behaviour,
in the manner of signed integer arithmetic overflow, it would be a
different matter - then picking two's complement conversion would have
reduced optimisation opportunities and encouraged incorrect code.
David