"brian m. carlson" <sandals@xxxxxxxxxxxxxxxxxxxx> writes: > On 2025-06-11 at 22:14:40, Sebastian Andrzej Siewior wrote: >> The ntohl and htonl macros are redefined because the provided macros were >> not always optimal. Sometimes it was a function call, sometimes it was a >> macro which did the shifting. Using the 'bswap' opcode on x86 provides >> probably better performance than performing the shifting. > > I believe that the peephole optimizer will almost always optimize them > to the bswap or equivalent opcode, much like it recognizes how to > generate rotate opcodes from two shifts and an or, so they should > actually be equivalent. > > GCC and clang both emit simple bswap instructions with `-O2`, which is > the optimization level we use: https://godbolt.org/z/1r8P1Pqo7. Good observation. In short, we do not have to redefine these in terms of bswap32/64 for performance as the compilers should do a reasonable job. The updated organization to separate the two concerns in this file, namely, (1) figure out the best way to write bswap32/64, and (2) override (or supply on platforms that do not offer) host-network byte order helpers, does clean things up a lot, so I still like to see it, though.