On 7/10/25 10:54 AM, Puranjay Mohan wrote:
As BPF doesn't include any barrier instructions, smp_mb() is implemented
by doing a dummy value-returning atomic operation. Such an operation
acts as a full barrier, as enforced by LKMM and also by the
work-in-progress BPF memory model.

If the returned value is not used, clang[1] can optimize the
value-returning atomic instruction into a normal atomic instruction,
which provides no ordering guarantees.
Mark the variable as volatile so the above optimization is never
performed and smp_mb() works as expected.
[1] https://godbolt.org/z/qzze7bG6z
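A minimal sketch of the pattern, for illustration only (barrier_demo()
and the stack offsets are made up, not from the patch); the instruction
mnemonics in the comments are as llvm-objdump renders BPF atomics:

	/* barrier.c: build for the BPF target, e.g. -O2 -mcpu=v3 */
	void barrier_demo(void)
	{
		unsigned long __val;

		/*
		 * The fetched value is never read, so clang <= 19 may
		 * lower this to the non-fetching insn
		 *   lock *(u64 *)(r10 - 8) += r1
		 * instead of the value-returning insn
		 *   r1 = atomic_fetch_add((u64 *)(r10 - 8), r1)
		 * and only the latter acts as a full barrier.
		 */
		__sync_fetch_and_add(&__val, 0);
	}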
You are using llvm19 in the above godbolt run. But from llvm20, instead
of the 'lock ...' insn, 'atomic_fetch_or' will be generated, so barrier
semantics will be preserved. Since CI is using llvm20, we should not
have any problem. But for llvm19 or lower, the patch does fix a problem
for arm64 etc.
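For anyone who wants to double-check a local toolchain, one way to
inspect the emitted insn (assuming the barrier_demo() sketch above is
saved as barrier.c):

	clang --target=bpf -mcpu=v3 -O2 -c barrier.c -o barrier.o
	llvm-objdump -d barrier.o

	# llvm19 and lower: non-fetching 'lock ... += ...' insn
	# llvm20: value-returning atomic_fetch_or(), barrier preserved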
So in case the maintainer agrees with this patch, my ACK is below:
Acked-by: Yonghong Song <yonghong.song@xxxxxxxxx>
Fixes: 88d706ba7cc5 ("selftests/bpf: Introduce arena spin lock")
Signed-off-by: Puranjay Mohan <puranjay@xxxxxxxxxx>
---
tools/testing/selftests/bpf/bpf_atomic.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/bpf_atomic.h b/tools/testing/selftests/bpf/bpf_atomic.h
index a9674e544322..c550e5711967 100644
--- a/tools/testing/selftests/bpf/bpf_atomic.h
+++ b/tools/testing/selftests/bpf/bpf_atomic.h
@@ -61,7 +61,7 @@ extern bool CONFIG_X86_64 __kconfig __weak;
 #define smp_mb()				\
 	({					\
-		unsigned long __val;		\
+		volatile unsigned long __val;	\
 		__sync_fetch_and_add(&__val, 0); \
 	})