On 02/06/2025 15:49, Jonathan Wakely wrote:
> On Mon, 2 Jun 2025 at 14:24, Richard Earnshaw (lists) wrote:
>>
>> On 27/05/2025 09:52, Jonathan Wakely via Gcc-help wrote:
>>> On Tue, 27 May 2025 at 09:38, Basile Starynkevitch wrote:
>>>>
>>>> On Tue, 2025-05-27 at 09:33 +0100, Jonathan Wakely wrote:
>>>>> GCC assumes certain functions are always present, e.g. memcpy is needed for
>>>>> copying large structs on the stack. The question is which functions does it
>>>>> need. Your answer isn't really relevant to the question.
>>>>
>>>>
>>>> memcpy is actually a builtin inside GCC, and has to be one because many
>>>> processors (including x86-64 and ARM) have specific machine instructions to
>>>> implement it efficiently.
>>>
>>> It *is* a built-in, but that doesn't mean it's implemented inside GCC.
>>> You can verify this easily with the following code:
>>>
>>> #include <string.h>
>>> void f(void* to, const void* from, unsigned long n)
>>> {
>>>   memcpy(to, from, n);
>>> }
>>>
>>> Even at -O3 this will be compiled to a call to the memcpy function in
>>> libc. GCC doesn't expand it to machine instructions unless the size is
>>> known (and small).
>>
>> No, it's up to the implementation to choose what limit there is, if any. On aarch64 with the FEAT_MOPS your example gives:
>>
>> $ /work/rearnsha/scratch/gnu/gcc/aarch64/master/gcc/xgcc -B /work/rearnsha/scratch/gnu/gcc/aarch64/master/gcc/ -I ~/gnusrc/newlib/master/newlib/libc/include/ -O2 -march=armv8-a+mops -o - -S /tmp/mem.c
>>         .arch armv8-a+mops
>> f:
>>         cpyfp   [x0]!, [x1]!, x2!
>>         cpyfm   [x0]!, [x1]!, x2!
>>         cpyfe   [x0]!, [x1]!, x2!
>>         ret
>
> Ah, thanks for the correction!
>
> For x86_64 both gcc and clang emit a call to memcpy:
>
> https://godbolt.org/z/hGvbM4df8

AArch64 would do so as well if you don't have the MOPS extension. As I said, the limit, if any, is a target implementation choice; it's generally driven by the amount of code bloat that picking the best strategy would require.

R.
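
For anyone who wants to compare the two cases discussed in this thread side by side, here is a minimal sketch. The names (struct small, copy_fixed, copy_variable, self_check) and the 16-byte size are made up for illustration and do not come from the thread. Whether the fixed-size copy is actually expanded inline, and up to what size, depends on the target and the options used, so treat this as something to experiment with rather than a statement of what any particular GCC will emit.

```c
#include <string.h>

/* Hypothetical fixed-size payload; 16 bytes is an arbitrary "small" size. */
struct small { unsigned char bytes[16]; };

/* The size is a compile-time constant, so this call is a candidate for
   inline expansion (subject to the target's own limits). */
void copy_fixed(struct small *to, const struct small *from)
{
    memcpy(to, from, sizeof *to);
}

/* The size is only known at run time: same shape as f() in the thread. */
void copy_variable(void *to, const void *from, unsigned long n)
{
    memcpy(to, from, n);
}

/* Sanity check: returns 0 if both helpers copied the data correctly,
   1 otherwise. */
int self_check(void)
{
    struct small src, a, b;
    unsigned i;

    for (i = 0; i < sizeof src.bytes; i++)
        src.bytes[i] = (unsigned char)(i + 1);
    memset(&a, 0, sizeof a);
    memset(&b, 0, sizeof b);

    copy_fixed(&a, &src);
    copy_variable(&b, &src, sizeof src);

    return memcmp(&a, &src, sizeof src) != 0
        || memcmp(&b, &src, sizeof src) != 0;
}
```

Compiling this with -O2 -S -o - (as in the command quoted above) and comparing the bodies of copy_fixed and copy_variable should show how a given target treats each case; adding -march=armv8-a+mops on an AArch64 compiler is the variation shown earlier in the thread.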