Re: [PATCH bpf-next v3 5/6] libbpf: Support BPF_F_CPU for percpu maps

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Fri, 22 Aug 2025 15:20:05 -0700

On Thu, Aug 21, 2025 at 9:09 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
>
> Add libbpf support for the BPF_F_CPU flag for percpu maps by embedding the
> cpu info into the high 32 bits of:
>
> 1. **flags**: bpf_map_lookup_elem_flags(), bpf_map__lookup_elem(),
>    bpf_map_update_elem() and bpf_map__update_elem()
> 2. **opts->elem_flags**: bpf_map_lookup_batch() and
>    bpf_map_update_batch()
>
> And the flag can be BPF_F_ALL_CPUS, but cannot be
> 'BPF_F_CPU | BPF_F_ALL_CPUS'.
>
> Behavior:
>
> * If the flag is BPF_F_ALL_CPUS, the update is applied across all CPUs.
> * If the flag is BPF_F_CPU, it updates value only to the specified CPU.
> * If the flag is BPF_F_CPU, lookup value only from the specified CPU.
> * lookup does not support BPF_F_ALL_CPUS.
>
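For reference, the encoding described above can be sketched as follows. This is a minimal illustration, not code from the patch: the helper name and the SKETCH_F_* constants (names and values) are placeholders for this example only, and the real flag definitions come from the UAPI header added earlier in the series.

```c
#include <stdint.h>

/* Placeholder flag bits for illustration only (assumption, not the UAPI values). */
#define SKETCH_F_CPU      (1ULL << 3)
#define SKETCH_F_ALL_CPUS (1ULL << 4)

/* Hypothetical helper: put the target CPU number into the high 32 bits
 * of the 64-bit flags word and set the "single CPU" flag bit in the low half.
 */
static inline uint64_t sketch_cpu_flags(uint32_t cpu)
{
	return ((uint64_t)cpu << 32) | SKETCH_F_CPU;
}

/* Hypothetical helper: recover the CPU number from a flags word. */
static inline uint32_t sketch_flags_cpu(uint64_t flags)
{
	return (uint32_t)(flags >> 32);
}
```

A caller would then pass something like sketch_cpu_flags(2) as the flags argument to update or look up only CPU 2's value.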
> Signed-off-by: Leon Hwang <leon.hwang@xxxxxxxxx>
> ---
>  tools/lib/bpf/bpf.h    |  8 ++++++++
>  tools/lib/bpf/libbpf.c | 25 +++++++++++++++++++------
>  tools/lib/bpf/libbpf.h | 21 ++++++++-------------
>  3 files changed, 35 insertions(+), 19 deletions(-)
>
> diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
> index 7252150e7ad35..28acb15e982b3 100644
> --- a/tools/lib/bpf/bpf.h
> +++ b/tools/lib/bpf/bpf.h
> @@ -286,6 +286,14 @@ LIBBPF_API int bpf_map_lookup_and_delete_batch(int fd, void *in_batch,
>   *    Update spin_lock-ed map elements. This must be
>   *    specified if the map value contains a spinlock.
>   *
> + * **BPF_F_CPU**
> + *    As for percpu maps, update value on the specified CPU. And the cpu
> + *    info is embedded into the high 32 bits of **opts->elem_flags**.
> + *
> + * **BPF_F_ALL_CPUS**
> + *    As for percpu maps, update value across all CPUs. This flag cannot
> + *    be used with BPF_F_CPU at the same time.
> + *
>   * @param fd BPF map file descriptor
>   * @param keys pointer to an array of *count* keys
>   * @param values pointer to an array of *count* values
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index fe4fc5438678c..c949281984880 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -10603,7 +10603,7 @@ bpf_object__find_map_fd_by_name(const struct bpf_object *obj, const char *name)
>  }
>
>  static int validate_map_op(const struct bpf_map *map, size_t key_sz,
> -                          size_t value_sz, bool check_value_sz)
> +                          size_t value_sz, bool check_value_sz, __u64 flags)
>  {
>         if (!map_is_created(map)) /* map is not yet created */
>                 return -ENOENT;
> @@ -10630,6 +10630,19 @@ static int validate_map_op(const struct bpf_map *map, size_t key_sz,
>                 int num_cpu = libbpf_num_possible_cpus();
>                 size_t elem_sz = roundup(map->def.value_size, 8);
>
> +               if (flags & (BPF_F_CPU | BPF_F_ALL_CPUS)) {
> +                       if ((flags & BPF_F_CPU) && (flags & BPF_F_ALL_CPUS))
> +                               return -EINVAL;
> +                       if ((flags >> 32) >= num_cpu)
> +                               return -ERANGE;

The point of validate_map_op() is to help users understand what is
wrong with how they are using the map, rather than leaving them with
an indiscriminate -EINVAL from the kernel.

So please add a human-readable pr_warn() explanation for each of the
new conditions you are detecting here. Without one, these checks add
nothing over the kernel's own error.
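Something along these lines, as a standalone sketch only. It mirrors the two checks quoted above, but fprintf(stderr, ...) stands in for pr_warn(), the function name is hypothetical, and the SKETCH_F_* values are placeholders rather than the real flag definitions. The message wording is just a suggestion.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Placeholder flag bits for illustration only (assumption, not the UAPI values). */
#define SKETCH_F_CPU      (1ULL << 3)
#define SKETCH_F_ALL_CPUS (1ULL << 4)

/* Hypothetical stand-in for the new flag checks in validate_map_op().
 * Assumes num_cpu is positive. Each rejected condition prints a reason
 * before returning, so the caller sees why the flags were refused.
 */
static int sketch_check_cpu_flags(const char *map_name, uint64_t flags, int num_cpu)
{
	uint64_t cpu = flags >> 32;

	/* Neither flag set: nothing to validate here. */
	if (!(flags & (SKETCH_F_CPU | SKETCH_F_ALL_CPUS)))
		return 0;

	if ((flags & SKETCH_F_CPU) && (flags & SKETCH_F_ALL_CPUS)) {
		fprintf(stderr, "map '%s': BPF_F_CPU and BPF_F_ALL_CPUS are mutually exclusive\n",
			map_name);
		return -EINVAL;
	}

	if (cpu >= (uint64_t)num_cpu) {
		fprintf(stderr, "map '%s': CPU %llu in flags is out of range, expected < %d\n",
			map_name, (unsigned long long)cpu, num_cpu);
		return -ERANGE;
	}

	return 0;
}
```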

> +                       if (value_sz != elem_sz) {
> +                               pr_warn("map '%s': unexpected value size %zu provided for per-CPU map, expected %zu\n",
> +                                       map->name, value_sz, elem_sz);
> +                               return -EINVAL;
> +                       }
> +                       break;
> +               }
> +
>                 if (value_sz != num_cpu * elem_sz) {
>                         pr_warn("map '%s': unexpected value size %zu provided for per-CPU map, expected %d * %zu = %zd\n",
>                                 map->name, value_sz, num_cpu, elem_sz, num_cpu * elem_sz);
> @@ -10654,7 +10667,7 @@ int bpf_map__lookup_elem(const struct bpf_map *map,
>  {
>         int err;
>
> -       err = validate_map_op(map, key_sz, value_sz, true);
> +       err = validate_map_op(map, key_sz, value_sz, true, flags);
>         if (err)
>                 return libbpf_err(err);
>
> @@ -10667,7 +10680,7 @@ int bpf_map__update_elem(const struct bpf_map *map,
>  {
>         int err;
>
> -       err = validate_map_op(map, key_sz, value_sz, true);
> +       err = validate_map_op(map, key_sz, value_sz, true, flags);
>         if (err)
>                 return libbpf_err(err);
>
> @@ -10679,7 +10692,7 @@ int bpf_map__delete_elem(const struct bpf_map *map,
>  {
>         int err;
>
> -       err = validate_map_op(map, key_sz, 0, false /* check_value_sz */);
> +       err = validate_map_op(map, key_sz, 0, false /* check_value_sz */, 0);

Why a hard-coded 0 here instead of passing flags through?

>         if (err)
>                 return libbpf_err(err);
>
> @@ -10692,7 +10705,7 @@ int bpf_map__lookup_and_delete_elem(const struct bpf_map *map,
>  {
>         int err;
>
> -       err = validate_map_op(map, key_sz, value_sz, true);
> +       err = validate_map_op(map, key_sz, value_sz, true, 0);

Same question here: why 0 instead of flags?

>         if (err)
>                 return libbpf_err(err);
>
> @@ -10704,7 +10717,7 @@ int bpf_map__get_next_key(const struct bpf_map *map,
>  {
>         int err;
>
> -       err = validate_map_op(map, key_sz, 0, false /* check_value_sz */);
> +       err = validate_map_op(map, key_sz, 0, false /* check_value_sz */, 0);
>         if (err)
>                 return libbpf_err(err);
>
> diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> index 2e91148d9b44d..6a972a8d060c3 100644
> --- a/tools/lib/bpf/libbpf.h
> +++ b/tools/lib/bpf/libbpf.h
> @@ -1196,12 +1196,13 @@ LIBBPF_API struct bpf_map *bpf_map__inner_map(struct bpf_map *map);
>   * @param key_sz size in bytes of key data, needs to match BPF map definition's **key_size**
>   * @param value pointer to memory in which looked up value will be stored
>   * @param value_sz size in byte of value data memory; it has to match BPF map
> - * definition's **value_size**. For per-CPU BPF maps value size has to be
> - * a product of BPF map value size and number of possible CPUs in the system
> - * (could be fetched with **libbpf_num_possible_cpus()**). Note also that for
> - * per-CPU values value size has to be aligned up to closest 8 bytes for
> - * alignment reasons, so expected size is: `round_up(value_size, 8)
> - * * libbpf_num_possible_cpus()`.
> + * definition's **value_size**. For per-CPU BPF maps, value size can be
> + * definition's **value_size** if **BPF_F_CPU** or **BPF_F_ALL_CPUS** is
> + * specified in **flags**, otherwise a product of BPF map value size and number
> + * of possible CPUs in the system (could be fetched with
> + * **libbpf_num_possible_cpus()**). Note else that for per-CPU values value
> + * size has to be aligned up to closest 8 bytes for alignment reasons, so

nit: "aligned up ... for alignment reasons" is redundant; drop "for alignment reasons", I guess?

> + * expected size is: `round_up(value_size, 8) * libbpf_num_possible_cpus()`.
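To make the two size cases concrete, here is a small sketch of the arithmetic. The helper names are hypothetical, and the single-CPU case follows the check in validate_map_op() quoted earlier, which compares against the value size rounded up to 8 bytes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Round v up to the next multiple of 8. */
static size_t sketch_round_up8(size_t v)
{
	return (v + 7) & ~(size_t)7;
}

/* Hypothetical helper: buffer size a caller would need for a per-CPU map value.
 * cpu_flag_set is true when BPF_F_CPU or BPF_F_ALL_CPUS is passed in flags,
 * in which case a single element is exchanged instead of one per possible CPU.
 */
static size_t sketch_expected_value_sz(size_t value_size, int num_cpus, bool cpu_flag_set)
{
	size_t elem_sz = sketch_round_up8(value_size);

	return cpu_flag_set ? elem_sz : elem_sz * (size_t)num_cpus;
}
```

With a 4-byte map value on a system with 8 possible CPUs, this gives 64 bytes without the flags and 8 bytes with either flag set.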
>   * @flags extra flags passed to kernel for this operation
>   * @return 0, on success; negative error, otherwise
>   *
> @@ -1219,13 +1220,7 @@ LIBBPF_API int bpf_map__lookup_elem(const struct bpf_map *map,
>   * @param key pointer to memory containing bytes of the key
>   * @param key_sz size in bytes of key data, needs to match BPF map definition's **key_size**
>   * @param value pointer to memory containing bytes of the value
> - * @param value_sz size in byte of value data memory; it has to match BPF map
> - * definition's **value_size**. For per-CPU BPF maps value size has to be
> - * a product of BPF map value size and number of possible CPUs in the system
> - * (could be fetched with **libbpf_num_possible_cpus()**). Note also that for
> - * per-CPU values value size has to be aligned up to closest 8 bytes for
> - * alignment reasons, so expected size is: `round_up(value_size, 8)
> - * * libbpf_num_possible_cpus()`.
> + * @param value_sz refer to **bpf_map__lookup_elem**'s description.'
>   * @flags extra flags passed to kernel for this operation
>   * @return 0, on success; negative error, otherwise
>   *
> --
> 2.50.1
>