Re: [PATCH bpf-next v4 4/7] bpf: Introduce BPF_F_CPU and BPF_F_ALL_CPUS flags for percpu_hash and lru_percpu_hash maps

On Wed, Aug 27, 2025 at 9:45 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
>
> Introduce BPF_F_ALL_CPUS flag support for percpu_hash and lru_percpu_hash
> maps to allow updating values for all CPUs with a single value for both
> update_elem and update_batch APIs.
>
> Introduce BPF_F_CPU flag support for percpu_hash and lru_percpu_hash
> maps to allow:
>
> * update value for specified CPU for both update_elem and update_batch
>   APIs.
> * lookup value for specified CPU for both lookup_elem and lookup_batch
>   APIs.
>
> The BPF_F_CPU flag is passed via:
>
> * map_flags along with embedded cpu info.
> * elem_flags along with embedded cpu info.
>
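
For reference, a minimal user-space sketch of what this enables,
assuming only what the commit message states (the cpu number embedded
in the upper 32 bits of the flags) and the existing libbpf wrappers
bpf_map_update_elem() and bpf_map_lookup_elem_flags():

	/* update, then read back, one CPU's slot of a percpu_hash entry */
	__u32 cpu = 3;
	__u64 value = 42;
	__u64 flags = BPF_F_CPU | ((__u64)cpu << 32);
	int err;

	err = bpf_map_update_elem(map_fd, &key, &value, flags);
	if (!err)
		err = bpf_map_lookup_elem_flags(map_fd, &key, &value, flags);

	/* spray a single value into every CPU's slot */
	if (!err)
		err = bpf_map_update_elem(map_fd, &key, &value, BPF_F_ALL_CPUS);
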
> Signed-off-by: Leon Hwang <leon.hwang@xxxxxxxxx>
> ---
>  include/linux/bpf.h  |  4 +-
>  kernel/bpf/hashtab.c | 95 +++++++++++++++++++++++++++-----------------
>  kernel/bpf/syscall.c |  2 +-
>  3 files changed, 63 insertions(+), 38 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index bc44b72129e59..c120b00448a13 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -2745,7 +2745,7 @@ int map_set_for_each_callback_args(struct bpf_verifier_env *env,
>                                    struct bpf_func_state *caller,
>                                    struct bpf_func_state *callee);
>
> -int bpf_percpu_hash_copy(struct bpf_map *map, void *key, void *value);
> +int bpf_percpu_hash_copy(struct bpf_map *map, void *key, void *value, u64 flags);
>  int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value,
>                           u64 flags);
>  int bpf_percpu_hash_update(struct bpf_map *map, void *key, void *value,
> @@ -3763,6 +3763,8 @@ static inline bool bpf_map_supports_cpu_flags(enum bpf_map_type map_type)
>  {
>         switch (map_type) {
>         case BPF_MAP_TYPE_PERCPU_ARRAY:
> +       case BPF_MAP_TYPE_PERCPU_HASH:
> +       case BPF_MAP_TYPE_LRU_PERCPU_HASH:
>                 return true;
>         default:
>                 return false;
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index 71f9931ac64cd..031a74c1b7fd7 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -937,24 +937,39 @@ static void free_htab_elem(struct bpf_htab *htab, struct htab_elem *l)
>  }
>
>  static void pcpu_copy_value(struct bpf_htab *htab, void __percpu *pptr,
> -                           void *value, bool onallcpus)
> +                           void *value, bool onallcpus, u64 map_flags)
>  {
> +       int cpu = map_flags & BPF_F_CPU ? map_flags >> 32 : 0;
> +       int current_cpu = raw_smp_processor_id();
> +
>         if (!onallcpus) {
>                 /* copy true value_size bytes */
> -               copy_map_value(&htab->map, this_cpu_ptr(pptr), value);
> +               copy_map_value(&htab->map, (map_flags & BPF_F_CPU) && cpu != current_cpu ?
> +                              per_cpu_ptr(pptr, cpu) : this_cpu_ptr(pptr), value);

FWIW, I still feel like the this_cpu_ptr() micro-optimization is
unnecessary and is just a distraction. This code runs when user space
updates or looks up a per-CPU value; it's not a hot path by any means,
so this_cpu_ptr() vs per_cpu_ptr() makes no measurable difference.
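
Something like the below (untested, purely illustrative) would drop
the special-casing entirely, since per_cpu_ptr() for the current CPU
resolves to the same slot as this_cpu_ptr():

	int cpu = map_flags & BPF_F_CPU ? map_flags >> 32 :
					  raw_smp_processor_id();

	copy_map_value(&htab->map, per_cpu_ptr(pptr, cpu), value);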

>         } else {
>                 u32 size = round_up(htab->map.value_size, 8);
> -               int off = 0, cpu;
> +               int off = 0;
> +
> +               if (map_flags & BPF_F_CPU) {
> +                       copy_map_value_long(&htab->map, cpu != current_cpu ?
> +                                           per_cpu_ptr(pptr, cpu) : this_cpu_ptr(pptr), value);
> +                       return;
> +               }
>
>                 for_each_possible_cpu(cpu) {
>                         copy_map_value_long(&htab->map, per_cpu_ptr(pptr, cpu), value + off);
> -                       off += size;
> +                       /* same user-provided value is used if
> +                        * BPF_F_ALL_CPUS is specified, otherwise value is
> +                        * an array of per-cpu values.
> +                        */
> +                       if (!(map_flags & BPF_F_ALL_CPUS))
> +                               off += size;
>                 }
>         }

Couldn't this have been replaced by bpf_percpu_copy_to_user()?.. (And I
just want to emphasize how hard you've made all of this to review by
putting bpf_percpu_copy_to_user() and bpf_percpu_copy_from_user() on
their own in patch #2, instead of having a proper refactoring patch...)
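
To spell out the suggestion: assuming the patch #2 helper takes
roughly (map, pptr, value, flags), which I'm inferring from its name
(the real signature may well differ), the loop above could collapse
into a single call:

	/* hypothetical call; the _from_user direction, since this path
	 * copies the user-provided value into the per-CPU slots
	 */
	bpf_percpu_copy_from_user(&htab->map, pptr, value, map_flags);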


[...]
