On 8/26/25 11:52 PM, Eduard Zingerman wrote:
On Wed, 2025-08-27 at 07:41 +0200, Andrea Righi wrote:
On Tue, Aug 26, 2025 at 10:02:31PM -0700, Eduard Zingerman wrote:
On Tue, 2025-08-26 at 13:17 -0700, Yonghong Song wrote:
[...]
I tried with gcc14 and can reproduced the issue described in the above.
I build the kernel like below with gcc14
make KCFLAGS='-O3' -j
and get the following build error
WARN: resolve_btfids: unresolved symbol bpf_strnchr
make[2]: *** [/home/yhs/work/bpf-next/scripts/Makefile.vmlinux:91: vmlinux] Error 255
make[2]: *** Deleting file 'vmlinux'
Checking the symbol table:
22276: ffffffff81b15260 249 FUNC LOCAL DEFAULT 1 bpf_strnchr.cons[...]
235128: ffffffff81b1f540 296 FUNC GLOBAL DEFAULT 1 bpf_strnchr
and the disasm code:
bpf_strnchr:
...
bpf_strchr:
...
bpf_strnchr.constprop.0
...
So in symbol table, we have both bpf_strnchr.constprop.0 and bpf_strnchr.
For such case, pahole will skip func bpf_strnchr hence the above resolve_btfids
failure.
The solution in this patch can indeed resolve this issue.
It looks like instead of adding __noclone there is an option to
improve pahole's filtering of ambiguous functions.
Abstractly, there is nothing wrong with having a clone of a global
function that has undergone additional optimizations. As long as the
original symbol exists, everything should be fine.
Since kfuncs are global, this should guarantee that the compiler does not
change their signature, correct? Does this also hold for LTO builds?
If so, when pahole sees a set of symbols like [foo, foo.1, foo.2, ...],
with 'foo' being global and the rest local, then there is no real need
to filter out 'foo'.
Wdyt?
I think we should do both: fix resolve_btfids to ignore compiler
optimization suffixes (.isra., .constprop., .part., .cold, ...) and add
__noclone.
This feels like the safest path IMHO. Fixing resolve_btfids alone works
with current compilers, but future compiler versions, under aggressive
IPA/LTO optimizations, might decide that the main global symbol is
redundant and drop it altogether, leading to similar issues.
Basically, fixing the tool makes the BTF pipeline more robust, adding
__noclone also makes the exported symbols themselves more robust,
regardless of compiler optimizations.
If we are being really paranoid about LTO builds, is __noclone sufficient?
E.g. [1] does not imply that signature can't be changed.
We currently apply only __retain__, here is a little test with both attributes:
$ cat foo.c
__attribute__((__noclone__, __retain__))
int foo(int a) {
return a;
}
$ cat main.c
int foo(int);
int main(int argc, char **argv) {
return foo(0);
}
$ gcc -O3 -Wall -flto foo.c main.c -o a.out
Currently kernel does not support LTO for gcc build.
$ nm a.out | grep foo
$ objdump -Sdr a.out | grep foo
$ objdump -Sdr a.out | less
$ nm a.out | grep foo | wc -l
0
$ objdump -Sdr a.out | grep foo | wc -l
0
export.h:EXPORT_SYMBOL does the following trick:
extern typeof(cachemode2protval) cachemode2protval;
static void * __attribute__((__used__))
__attribute__((__section__(".discard.addressable")))
__UNIQUE_ID___addressable_cachemode2protval489 = (void *)(uintptr_t)&cachemode2protval;
asm(".section \".export_symbol\",\"a\" ;\
__export_symbol_cachemode2protval: ;\
.asciz \"\" ;\
.ascii \"\" \"\\0\" ;\
.balign 8 ;\
.quad cachemode2protval ;\
.previous");
Should we employ something similar?
[1] https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-noclone-function-attribute