On 8/20/25 7:23 PM, Maciej Żenczykowski wrote:
On Mon, Aug 18, 2025 at 1:58 PM Yonghong Song
<yonghong.song@xxxxxxxxx> wrote:
> On 8/13/25 12:39 AM, Maciej Żenczykowski wrote:
> > BPF_MAP_LOOKUP_AND_DELETE_BATCH keys & values == NULL
> > seems like a nice way to simply quickly clear a map.
>
> This will change existing API as users will expect
> some error (e.g., -EFAULT) return when keys or values is NULL.
No reasonable user will call the current api with NULLs.
I do agree it is really unlikely users will have NULL keys or values.
This is a similar API change to adding a new system call
(where previously it returned -ENOSYS) - which *is* also a UAPI
change, but obviously allowed.
Or adding support for a new address family / protocol (where
previously it returned -EAFNOSUPPORT)
Or adding support for a new flag (where previously it returned -EINVAL)
Consider why userspace would ever pass in NULL, two possibilities:
(a) explicit NULL - you'd never do this since it would (till now)
always -EFAULT,
so this would only possibly show up in a very thorough test suite
(b) you're using dynamically allocated memory and it failed allocation.
that's already a program bug, you should catch that before you call
bpf().
Okay. What you describe makes sense.
Could you add a selftest for this?
Could you also add some comments to the uapi bpf.h header below describing
the new functionality?
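
A rough userspace sketch of what such a test might exercise, assuming a
kernel that carries the patch under discussion. clear_map_sketch() and the
batch size of 4096 are hypothetical and not taken from the patch. The u32
batch token reflects my reading of the hashtab batch code and should be
checked against the tree. Only the raw bpf() syscall and the 'batch' fields
quoted below are used.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

/* Hypothetical helper: drain a hash map by issuing
 * BPF_MAP_LOOKUP_AND_DELETE_BATCH with keys == values == NULL.
 * ASSUMPTION: the kernel has the proposed patch applied; per the thread,
 * an unpatched kernel reports -EFAULT once it has entries to copy out.
 * Returns 0 once the whole map has been walked, -1 on any other error. */
static int clear_map_sketch(int map_fd)
{
	/* ASSUMPTION: the hashtab batch token is a 32-bit bucket index. */
	uint32_t in_token = 0, out_token = 0;
	int first = 1;

	for (;;) {
		union bpf_attr attr;

		memset(&attr, 0, sizeof(attr));
		attr.batch.map_fd = map_fd;
		/* NULL in_batch means "start from the beginning". */
		attr.batch.in_batch = first ? 0 : (uint64_t)(uintptr_t)&in_token;
		attr.batch.out_batch = (uint64_t)(uintptr_t)&out_token;
		attr.batch.keys = 0;     /* NULL: do not copy keys out */
		attr.batch.values = 0;   /* NULL: do not copy values out */
		attr.batch.count = 4096; /* arbitrary upper bound per call */

		if (syscall(SYS_bpf, BPF_MAP_LOOKUP_AND_DELETE_BATCH,
			    &attr, sizeof(attr)) < 0)
			/* ENOENT signals that the traversal reached the end. */
			return errno == ENOENT ? 0 : -1;

		in_token = out_token;
		first = 0;
	}
}
```

A real selftest would presumably also populate the map first and verify
afterwards that it is empty. This sketch only shows the call pattern.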
> We have a 'flags' field in uapi header in
>
> struct { /* struct used by BPF_MAP_*_BATCH commands */
>         __aligned_u64   in_batch;       /* start batch,
>                                          * NULL to start from beginning
>                                          */
>         __aligned_u64   out_batch;      /* output: next start batch */
>         __aligned_u64   keys;
>         __aligned_u64   values;
>         __u32           count;          /* input/output:
>                                          * input: # of key/value
>                                          * elements
>                                          * output: # of filled elements
>                                          */
>         __u32           map_fd;
>         __u64           elem_flags;
>         __u64           flags;
> } batch;
>
> we can add a flag in 'flags' like BPF_F_CLEAR_MAP_IF_KV_NULL with a comment
> that if keys or values is NULL, the batched elements will be cleared.
I just don't see what value this provides.
> > BPF_MAP_LOOKUP keys/values == NULL might be useful if we just want
> > the values/keys and don't want to bother copying the keys/values...
> >
> > BPF_MAP_LOOKUP keys & values == NULL might be useful to count
> > the number of populated entries.
>
> bpf_map_lookup_elem() does not have a flags field, so we probably should not
> change existing semantics.
This is unrelated to this patch, since this only touches 'batch'
operation.
(unless I'm missing something)
> > Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
> > Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> > Cc: Stanislav Fomichev <sdf@xxxxxxxxxxx>
> > Signed-off-by: Maciej Żenczykowski <maze@xxxxxxxxxx>
> > ---
> > kernel/bpf/hashtab.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> > index 5001131598e5..8fbdd000d9e0 100644
> > --- a/kernel/bpf/hashtab.c
> > +++ b/kernel/bpf/hashtab.c
> > @@ -1873,9 +1873,9 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
> >
> >  		rcu_read_unlock();
> >  		bpf_enable_instrumentation();
> > -		if (bucket_cnt && (copy_to_user(ukeys + total * key_size, keys,
> > +		if (bucket_cnt && (ukeys && copy_to_user(ukeys + total * key_size, keys,
> >  		    key_size * bucket_cnt) ||
> > -		    copy_to_user(uvalues + total * value_size, values,
> > +		    uvalues && copy_to_user(uvalues + total * value_size, values,
> >  		    value_size * bucket_cnt))) {
> >  			ret = -EFAULT;
> >  			goto after_loop;
>
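
To make the effect of the hunk above easier to see, here is a small
userspace model of the condition. It is only a sketch: mock_copy_to_user()
is a stand-in that treats a NULL destination as a faulting address, and
copy_bucket_sketch() with its skip_null switch is hypothetical, added to
contrast the old and the proposed behaviour.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's copy_to_user(): returns 0 on success, or the
 * number of bytes that could not be copied. ASSUMPTION: only a NULL
 * destination faults in this model. */
static unsigned long mock_copy_to_user(void *dst, const void *src,
				       unsigned long n)
{
	if (!dst)
		return n;
	memcpy(dst, src, n);
	return 0;
}

/* Models the copy-out step for one bucket.
 * skip_null == 0: old behaviour, both buffers are always written.
 * skip_null != 0: proposed behaviour, a NULL buffer is simply skipped.
 * Returns 0 on success or -EFAULT, mirroring the quoted kernel code. */
static int copy_bucket_sketch(void *ukeys, void *uvalues,
			      const void *keys, const void *values,
			      size_t key_size, size_t value_size,
			      size_t total, size_t bucket_cnt, int skip_null)
{
	int want_keys = !skip_null || ukeys != NULL;
	int want_values = !skip_null || uvalues != NULL;

	if (bucket_cnt &&
	    ((want_keys &&
	      mock_copy_to_user(ukeys ? (char *)ukeys + total * key_size : NULL,
				keys, key_size * bucket_cnt)) ||
	     (want_values &&
	      mock_copy_to_user(uvalues ? (char *)uvalues + total * value_size : NULL,
				values, value_size * bucket_cnt))))
		return -EFAULT;
	return 0;
}
```

The patch relies on && binding tighter than ||, so the unparenthesised form
in the hunk evaluates the same way as the explicit grouping used here. The
extra parentheses only avoid a compiler warning about mixed operators.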
--
Maciej Żenczykowski, Kernel Networking Developer @ Google