This patch set introduces the BPF_F_CPU flag for percpu_array maps, as
discussed in the thread of
"[PATCH bpf-next v3 0/4] bpf: Introduce global percpu data"[1].

The goal is to reduce data caching overhead in light skeletons by allowing
a single value to be reused across all CPUs. This avoids the M:N problem,
where M cached values are used to update a map on a kernel with N CPUs.

The BPF_F_CPU flag is accompanied by a cpu field, which specifies the
target CPUs for the operation:

* For lookup operations: the flag and cpu field enable querying a value
  on the specified CPU.
* For update operations:
  * If cpu == (u32)~0, the provided value is copied to all CPUs.
  * Otherwise, the value is copied to the specified CPU only.

Currently, this functionality is only supported for percpu_array maps.

Links:
[1] https://lore.kernel.org/bpf/20250526162146.24429-1-leon.hwang@xxxxxxxxx/

Changes:
RFC v2 -> v1:
* Address comments from Andrii:
  * Use '&=' and '|='.
  * Replace 'reuse_value' with simpler, less duplicated code.
  * Replace 'ASSERT_FALSE' with two 'ASSERT_OK_PTR's in the selftest.

RFC v1 -> RFC v2:
* Address comments from Andrii:
  * Embed cpu in flags on the kernel side.
  * Change the BPF_ALL_CPU macro to a BPF_ALL_CPUS enum.
  * Copy/update the element within RCU protection.
  * Update bpf_map_value_size() to cover the BPF_F_CPU case.
  * Use zero as the default value to get the cpu option.
  * Update the API documentation to be generic.
  * Add size_t:0 to the opts definitions.
  * Update validate_map_op() to cover the BPF_F_CPU case.
  * Use LIBBPF_OPTS instead of DECLARE_LIBBPF_OPTS.
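
For readers who want the lookup/update semantics above in concrete form,
here is a minimal user-space model of them. It is only a sketch: every
sketch_* / SKETCH_* name is hypothetical and made up for illustration, the
CPU count is fixed at 4 as an assumption, and none of this is the kernel
or libbpf code from the patches. It models only the rule "cpu == (u32)~0
means all CPUs, otherwise one CPU".

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for illustration only; not the kernel or libbpf API. */
#define SKETCH_NR_CPUS  4u              /* assumed CPU count */
#define SKETCH_ALL_CPUS ((uint32_t)~0)  /* mirrors cpu == (u32)~0 */

/* One element of a per-CPU array: one value slot per CPU. */
struct sketch_percpu_elem {
	uint64_t val[SKETCH_NR_CPUS];
};

/*
 * Update: if cpu is SKETCH_ALL_CPUS, copy the single value to every CPU;
 * otherwise copy it to the specified CPU only.
 * Returns 0 on success, -1 if cpu is out of range.
 */
static int sketch_update(struct sketch_percpu_elem *e, uint32_t cpu,
			 uint64_t value)
{
	uint32_t i;

	if (cpu == SKETCH_ALL_CPUS) {
		for (i = 0; i < SKETCH_NR_CPUS; i++)
			e->val[i] = value;
		return 0;
	}
	if (cpu >= SKETCH_NR_CPUS)
		return -1;
	e->val[cpu] = value;
	return 0;
}

/*
 * Lookup: read the value stored for one specified CPU.
 * Returns 0 on success, -1 if cpu is out of range.
 */
static int sketch_lookup(const struct sketch_percpu_elem *e, uint32_t cpu,
			 uint64_t *out)
{
	if (cpu >= SKETCH_NR_CPUS)
		return -1;
	*out = e->val[cpu];
	return 0;
}

/* Usage: one value fans out to all CPUs, then a single CPU is overridden. */
void sketch_demo(void)
{
	struct sketch_percpu_elem e;
	uint64_t v = 0;

	memset(&e, 0, sizeof(e));
	assert(sketch_update(&e, SKETCH_ALL_CPUS, 7) == 0);
	assert(sketch_update(&e, 1, 9) == 0);
	assert(sketch_lookup(&e, 1, &v) == 0 && v == 9);
	assert(sketch_lookup(&e, 0, &v) == 0 && v == 7);
}
```

In this model the caller keeps a single value and fans it out with one
update call, rather than preparing one cached value per CPU, which is the
M:N cost described above.
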

Leon Hwang (3):
  bpf: Introduce BPF_F_CPU flag for percpu_array map
  bpf, libbpf: Support BPF_F_CPU for percpu_array map
  selftests/bpf: Add case to test BPF_F_CPU

 include/linux/bpf.h                           |   3 +-
 include/uapi/linux/bpf.h                      |   7 +
 kernel/bpf/arraymap.c                         |  54 ++++--
 kernel/bpf/syscall.c                          |  52 ++++--
 tools/include/uapi/linux/bpf.h                |   7 +
 tools/lib/bpf/bpf.c                           |  23 +++
 tools/lib/bpf/bpf.h                           |  36 +++-
 tools/lib/bpf/libbpf.c                        |  56 +++++-
 tools/lib/bpf/libbpf.h                        |  53 +++++-
 tools/lib/bpf/libbpf.map                      |   5 +
 tools/lib/bpf/libbpf_common.h                 |  14 ++
 .../selftests/bpf/prog_tests/percpu_alloc.c   | 172 ++++++++++++++++++
 .../selftests/bpf/progs/percpu_array_flag.c   |  24 +++
 13 files changed, 459 insertions(+), 47 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/percpu_array_flag.c

--
2.50.1