This patch set introduces the BPF_F_CPU flag for percpu_array maps, as
discussed in the thread of
"[PATCH bpf-next v3 0/4] bpf: Introduce global percpu data"[1].

The goal is to reduce data caching overhead in light skeletons by allowing
a single value to be reused across all CPUs. This avoids the M:N problem,
where M cached values are used to update a map on a kernel with N CPUs.

The BPF_F_CPU flag is accompanied by a cpu field, which specifies the
target CPUs for the operation:

* For lookup operations: the flag and cpu field enable querying a value
  on the specified CPU.
* For update operations:
  * If cpu == 0xFFFFFFFF, the provided value is copied to all CPUs.
  * Otherwise, the value is copied to the specified CPU only.

Currently, this functionality is only supported for percpu_array maps.

Links:
[1] https://lore.kernel.org/bpf/20250526162146.24429-1-leon.hwang@xxxxxxxxx/

Leon Hwang (3):
  bpf: Introduce BPF_F_CPU flag for percpu_array map
  bpf, libbpf: Support BPF_F_CPU for percpu_array map
  selftests/bpf: Add case to test BPF_F_CPU

 include/linux/bpf.h                         |   5 +-
 include/uapi/linux/bpf.h                    |   6 +
 kernel/bpf/arraymap.c                       |  46 ++++-
 kernel/bpf/syscall.c                        |  56 ++++--
 tools/include/uapi/linux/bpf.h              |   6 +
 tools/lib/bpf/bpf.c                         |  37 ++++
 tools/lib/bpf/bpf.h                         |  35 +++-
 tools/lib/bpf/libbpf.c                      |  56 ++++++
 tools/lib/bpf/libbpf.h                      |  45 +++++
 tools/lib/bpf/libbpf.map                    |   4 +
 tools/lib/bpf/libbpf_common.h               |  12 ++
 .../selftests/bpf/prog_tests/percpu_alloc.c | 169 ++++++++++++++++++
 .../selftests/bpf/progs/percpu_array_flag.c |  24 +++
 13 files changed, 473 insertions(+), 28 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/percpu_array_flag.c

--
2.49.0
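
For readers who want the lookup/update semantics in concrete form, here is a
minimal user-space model of them. It is only a sketch and not the code from
these patches: NR_CPUS, ALL_CPUS, struct percpu_slot, model_update() and
model_lookup() are hypothetical names made up for illustration, and the
rejection of out-of-range cpu values (including a lookup with 0xFFFFFFFF) is
an assumption, since the cover letter does not describe error handling.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS  4u           /* hypothetical CPU count for this model */
#define ALL_CPUS 0xFFFFFFFFu  /* cpu value meaning "all CPUs" (from the cover letter) */

/* Model of one percpu_array element: one value slot per CPU. */
struct percpu_slot {
	uint64_t val[NR_CPUS];
};

/*
 * Update with the flag set: cpu == ALL_CPUS copies the single provided
 * value to every CPU's slot; otherwise only the named CPU's slot changes.
 * Returns 0 on success, -1 for an out-of-range cpu (assumed behaviour).
 */
static int model_update(struct percpu_slot *s, uint32_t cpu, uint64_t value)
{
	assert(s);
	if (cpu == ALL_CPUS) {
		for (uint32_t i = 0; i < NR_CPUS; i++)
			s->val[i] = value;
		return 0;
	}
	if (cpu >= NR_CPUS)
		return -1;
	s->val[cpu] = value;
	return 0;
}

/*
 * Lookup with the flag set: read the value stored for one specific CPU.
 * Returns 0 on success, -1 for an out-of-range cpu (assumed behaviour).
 */
static int model_lookup(const struct percpu_slot *s, uint32_t cpu, uint64_t *out)
{
	assert(s && out);
	if (cpu >= NR_CPUS)
		return -1;
	*out = s->val[cpu];
	return 0;
}
```

In this model, model_update(&slot, ALL_CPUS, v) followed by
model_update(&slot, 2, w) leaves CPU 2 holding w and every other CPU holding
v, so the caller supplies one value per update rather than a buffer of
NR_CPUS values.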