On Thu, Apr 17, 2025 at 10:15 PM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> > - There’s no clean way to force the optimized path - GCC only emits
> > fast unaligned loads if tuned for a specific CPU (e.g., -mtune=size or
> > -mtune=thead-c906), which the kernel doesn't typically do, even with
> > HAVE_EFFICIENT_UNALIGNED_ACCESS.
> >
> > Maybe we should raise this with the GCC maintainers. An explicit
> > option to enable optimized unaligned access could help.
> >
>
> HAVE_EFFICIENT_UNALIGNED_ACCESS is a build time setting, so the
> resulting kernel only runs correctly on hardware that implements
> unaligned accesses in hardware.
>
> So that means you could pass this -mtune= option too in that case, no?

The GCC docs [1] say -mtune=size is internal to -Os and not meant for
direct use. So while it does enable optimized unaligned access, relying
on it feels a bit hacky.

Clang is more explicit here: -mno-strict-align cleanly enables
optimized unaligned accesses. It'd be great if GCC had something
similar.

[1] https://gcc.gnu.org/onlinedocs/gcc-14.2.0/gcc/RISC-V-Options.html#index-mtune-12

> Then, you can just use a packed struct or an __aligned(1) annotation
> and the compiler will emit the correct code for you, depending on
> whether unaligned accesses are permitted.
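
For reference, here is a minimal user-space sketch of the packed struct /
__aligned(1) idea quoted above, as I understand it. The helper names are
made up for illustration (they are not the kernel's get_unaligned()
helpers), and it assumes GCC/Clang attribute syntax. A memcpy-based load
is included only as a reference to compare against.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for illustration; not the kernel's get_unaligned() API. */

/* Variant 1: wrap the value in a packed struct, which drops its alignment to 1. */
struct una_u32 {
	uint32_t v;
} __attribute__((packed));

static inline uint32_t load_u32_packed(const void *p)
{
	return ((const struct una_u32 *)p)->v;
}

/* Variant 2: a typedef with alignment lowered to 1 (what __aligned(1) expands to). */
typedef uint32_t u32_una __attribute__((aligned(1)));

static inline uint32_t load_u32_aligned1(const void *p)
{
	return *(const u32_una *)p;
}

/* Reference: memcpy is always a valid way to read possibly unaligned data. */
static inline uint32_t load_u32_memcpy(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Load a 32-bit value at every byte offset and check that all variants agree. */
void check_unaligned_loads(void)
{
	unsigned char buf[8] = { 0x01, 0x23, 0x45, 0x67, 0x89, 0xab, 0xcd, 0xef };

	for (size_t off = 0; off + sizeof(uint32_t) <= sizeof(buf); off++) {
		uint32_t ref = load_u32_memcpy(buf + off);

		assert(load_u32_packed(buf + off) == ref);
		assert(load_u32_aligned1(buf + off) == ref);
	}
}
```

All three loads return the same value at any offset. What differs per
target and tuning is only the code the compiler generates for them (a
single word load versus a sequence of byte loads), which is the part
being discussed in this thread.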