Re: Is PRE architecture dependent? aarch64 vs x86_64

On 18/07/2025 19:17, Segher Boessenkool wrote:
> Hi!
>
> On Fri, Jul 18, 2025 at 07:11:12PM +0200, David Brown wrote:
>> I'm getting the feeling that we've got our wires crossed somewhere.
>>
>> Signed integer /arithmetic/ overflow is UB in the C standards and in gcc
>> (unless "-fwrapv" is in effect).
>
> Yup.
>
>> Conversion to a signed integer type, when the value cannot be preserved, is
>> implementation-defined behaviour in the C standards, and in gcc (gcc defines
>> it to be two's complement wrapping).
>
> And leaving it UB is a valid implementation.  An implementation is not
> required to specify anything in particular.


The standard is pretty clear about what the different classes of "behaviour" mean - it's right there in section 3.4 of "Terms, definitions and symbols":

"""
3.4.1
1 implementation-defined behavior

unspecified behavior where each implementation documents how the choice is made

2 Note 1 to entry: J.3 gives an overview over properties of C programs that lead to implementation-defined behavior.

3 EXAMPLE An example of implementation-defined behavior is the propagation of the high-order bit when a signed integer is shifted right.

3.4.3
1 undefined behavior

behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this document imposes no requirements

2 Note 1 to entry: Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

3 Note 2 to entry: J.2 gives an overview over properties of C programs that lead to undefined behavior.

4 EXAMPLE An example of undefined behavior is the behavior on integer overflow.

3.4.4
1 unspecified behavior

behavior, that results from the use of an unspecified value, or other behavior upon which this document provides two or more possibilities and imposes no further requirements on which is chosen in any instance

2 Note 1 to entry: J.1 gives an overview over properties of C programs that lead to unspecified behavior.

3 EXAMPLE An example of unspecified behavior is the order in which the arguments to a function are evaluated.

"""


When the standard says something is "undefined behaviour", the compiler can treat it any way it wants - including assuming it never happens for optimisation purposes, or giving trap instructions, or giving clear, documented and specific behaviour, or making the behaviour depend on compiler flags, or having the code email your boss and tell them that you can't program for peanuts. UB is a great idea that lets compilers optimise more, avoids asking them to do the impossible (such as ensure that an arbitrary pointer is "valid" before dereferencing), and encourages programmers to understand the "garbage in, garbage out" principle.
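
To make the "assuming it never happens" point concrete, here is a minimal sketch. The function names are made up for illustration. Without "-fwrapv", a compiler is entitled to treat the naive version as always returning true, because the only input for which it could be false is one that overflows:

```c
#include <limits.h>
#include <stdbool.h>

/* Naive check: only gives "false" if x + 1 wraps around, which is
   signed arithmetic overflow and therefore UB.  The compiler may
   assume that never happens and fold this to "return true". */
bool next_is_larger_naive(int x)
{
    return x + 1 > x;
}

/* Well-defined check: compare against INT_MAX, with no arithmetic
   that could overflow. */
bool next_is_larger(int x)
{
    return x < INT_MAX;
}
```

Calling next_is_larger_naive(INT_MAX) has no predictable result; next_is_larger(INT_MAX) is simply false.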

When the standard says something is "unspecified behaviour", the compiler can choose the behaviour from amongst several options. It does not need to document these choices, or be consistent about them. This lets the compiler re-arrange many things (such as order of evaluation) for optimisation purposes, without changing the result of code that does not depend on the choice. Padding bits and bytes are, in many circumstances, unspecified values - the compiler doesn't need to store particular values, but it does have to ensure they are not "trap representations", and it must maintain a certain level of consistency (i.e., the values can vary between calls to the same code, but at any given time, the values have to compare equal to themselves; that is not required for UB).
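
As a small illustration of the order-of-evaluation case (again with invented names, just a sketch): the two calls to next_id() below are made in an unspecified order, so the first function may return either of two values, and nothing else. Sequencing the calls explicitly removes the choice.

```c
static int counter;

/* Helper with a side effect: returns 1, 2, 3, ... on successive calls. */
static int next_id(void)
{
    return ++counter;
}

static int sub(int a, int b)
{
    return a - b;
}

/* Unspecified: the compiler may evaluate either argument first, so the
   result is -1 (left first) or 1 (right first).  It need not document
   the choice or make the same one every time. */
int order_dependent(void)
{
    counter = 0;
    return sub(next_id(), next_id());
}

/* Fully specified: the two calls are sequenced by separate statements,
   so the result is always 1 - 2 = -1. */
int order_independent(void)
{
    counter = 0;
    int a = next_id();
    int b = next_id();
    return sub(a, b);
}
```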

"Implementation-defined" behaviour is like "unspecified behaviour", except that the implementation must document what it does. The compiler can make strange choices of behaviour if it likes - it could say that "x >> y", where "x" is a signed integer type with negative value, sign-extends when "x" is an "int", zero-extends when "x" is a "long int", and gives the result 42 when "x" is a "long long int". That's allowed - but must be documented and consistent. Similarly, converting an out-of-range value to a signed integer type is allowed to be done in different ways in different circumstances, but it must be documented and it must be consistent. If the documentation says odd numbers use saturation and even numbers are converted to 0, that's conforming. If the documentation says it is "undefined", that is /not/ conforming. The programmer must be able to read the documentation and predict the effects of the implementation-defined behaviour, and rely on that behaviour working consistently.

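A short sketch of the two implementation-defined cases mentioned above, with the results that gcc documents (out-of-range conversion reduces the value modulo 2^N; ">>" on negative signed values sign-extends). The function names are just for illustration, and another conforming compiler could document different results:

```c
#include <limits.h>

/* Implementation-defined: UINT_MAX cannot be represented in an int.
   With gcc's documented modulo-2^N conversion the result is -1. */
int convert_out_of_range(void)
{
    unsigned int u = UINT_MAX;
    return (int)u;
}

/* Implementation-defined: right shift of a negative signed value.
   With gcc's documented sign extension, -8 >> 1 is -4. */
int shift_negative(void)
{
    int x = -8;
    return x >> 1;
}
```

Neither function involves signed arithmetic overflow, so neither has undefined behaviour.
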
"IB" and "UB" are a world apart. "IB" lets you write efficient low-level code tuned to a specific compiler, target or system, at the price of portability. It is perfectly fine - indeed crucial to most C programs - to rely on "IB". It is never appropriate to rely on the effects of "UB".

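For example, if you actually want wrapping addition on ints without "-fwrapv", one way (a sketch, assuming gcc's documented modulo conversion to signed types; wrapping_add is a made-up name) is to do the arithmetic in unsigned, where wrapping is fully defined, and rely on the implementation-defined conversion back:

```c
#include <limits.h>

/* Unsigned arithmetic wraps modulo 2^N by definition, so there is no UB.
   The conversion of the sum back to int is implementation-defined; gcc
   documents it as reduction modulo 2^N, i.e. two's complement wrapping. */
int wrapping_add(int a, int b)
{
    return (int)((unsigned int)a + (unsigned int)b);
}
```

With gcc, wrapping_add(INT_MAX, 1) gives INT_MIN. Writing "a + b" directly with the same arguments would be UB.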

>> A "cast from unsigned to int" is a conversion, thus it is
>> implementation-defined.
>
> Yup.
>
>> I think somewhere along the line in this thread, the conversions and signed
>> integer arithmetic overflows have been mixed together.
>
> Oh, people do that all the time!
>
>
> Segher




