On Tue, May 27, 2025 at 7:35 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Tue, May 27, 2025 at 4:25 PM Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> >
> > On Tue, May 27, 2025 at 3:40 PM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Mon, May 26, 2025 at 9:22 AM Leon Hwang <leon.hwang@xxxxxxxxx> wrote:
> > > > +
> > > > +       data_sz = map->def.value_size;
> > > > +       if (is_percpu) {
> > > > +               num_cpus = libbpf_num_possible_cpus();
> > > > +               if (num_cpus < 0) {
> > > > +                       err = num_cpus;
> > > > +                       return err;
> > > > +               }
> > > > +
> > > > +               data_sz = data_sz * num_cpus;
> > > > +               data = malloc(data_sz);
> > > > +               if (!data) {
> > > > +                       err = -ENOMEM;
> > > > +                       return err;
> > > > +               }
> > > > +
> > > > +               elem_sz = map->def.value_size;
> > > > +               for (i = 0; i < num_cpus; i++)
> > > > +                       memcpy(data + i * elem_sz, map->mmaped, elem_sz);
> > > > +       } else {
> > > > +               data = map->mmaped;
> > > > +       }
> > > >
> > > >         if (obj->gen_loader) {
> > > >                 bpf_gen__map_update_elem(obj->gen_loader, map - obj->maps,
> > > > -                                        map->mmaped, map->def.value_size);
> > > > +                                        data, data_sz);
> > >
> > > I missed it earlier, but now I wonder how this is supposed to work?
> > > skel and lskel may be generated on a system with N cpus,
> > > but loaded with M cpus.
> > >
> > > Another concern is that the num_cpus multiplier can be huge.
> > > lskel adds all that init data into a global array.
> > > Pls avoid this multiplier.
> >
> > Hm... For skel, the number of CPUs at runtime isn't a problem, it's
> > only memory waste for this temporary data. But it is forced on us by
> > the kernel contract for MAP_UPDATE_ELEM for per-CPU maps.
> >
> > Should we have a flag for the map update command for per-CPU maps
> > that would mean "use this data as a value for each CPU"? Then we
> > could provide just a small piece of initialization data and not have
> > to rely on the number of CPUs. This would also make the lskel part
> > very simple.
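For anyone skimming the patch context above: the kernel's current contract for updating per-CPU maps from userspace is a value buffer holding one value_size-sized image per possible CPU, which is why the patch replicates the single mmaped image num_cpus times. A minimal standalone sketch of that replication pattern (plain C, no libbpf involved; the function name is made up for illustration):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sketch of the replication loop from the patch: copy one
 * value_sz-sized initialization image into every per-CPU slot of a
 * buffer that is value_sz * num_cpus bytes long, which is the layout
 * the per-CPU map update path expects. Returns NULL on allocation
 * failure; caller frees.
 */
static void *replicate_percpu_value(const void *init, size_t value_sz,
				    int num_cpus)
{
	char *data;
	int i;

	data = malloc(value_sz * (size_t)num_cpus);
	if (!data)
		return NULL;

	for (i = 0; i < num_cpus; i++)
		memcpy(data + (size_t)i * value_sz, init, value_sz);

	return data;
}
```

This is exactly the multiplier being objected to: the temporary buffer (and, for lskel, the embedded init data) scales with the possible-CPU count even though the payload is one image.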
>
> Initially it felt too specific, but I think it makes sense.
> The contract was too restrictive. Let's add the flag.
>
> > Alternatively (and perhaps more flexibly) we can extend
> > MAP_UPDATE_ELEM with the ability to specify a specific CPU for
> > per-CPU maps. I'd probably have a MAP_LOOKUP_ELEM counterpart for
> > this as well. Then skeleton/light skeleton code can iterate the
> > given number of times to initialize all CPUs using a small initial
> > data image.
>
> I guess this can be a follow-up.
> With the extra flag, lookup/update/delete can look into a new field
> in that anonymous struct:
>
> struct { /* anonymous struct used by BPF_MAP_*_ELEM and
>           * BPF_MAP_FREEZE commands */
>         __u32           map_fd;
>         __aligned_u64   key;
>         union {
>                 __aligned_u64 value;
>                 __aligned_u64 next_key;
>         };
>         __u64           flags;
> };

Yep, we'd have two flags: one meaning "apply across all CPUs", and
another meaning "apply for the specified CPU" plus a new CPU number
field. Or the same flag with a special CPU number value (0xffffffff?).

> There is also a "batch" version of lookup/update/delete.
> They will probably need to be extended as well, for consistency?
>
> So I'd only go with the "use data to update all CPUs" flag for now.

Agreed. But also, looking at generic_map_update_batch(), it seems like
it just routes everything through single-element updates, so it
shouldn't be hard to add batch support for all of this either.
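To make the batch point concrete: the observation above is that the generic batch path is essentially a loop over the single-element update, so any new "apply to all CPUs" flag passed by the caller would reach every per-element update unchanged. A toy model of that routing (everything here is a made-up stand-in for illustration; the real code is generic_map_update_batch() in kernel/bpf/syscall.c, and the flag bit is hypothetical, not an existing uapi flag):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit, NOT a real uapi flag; chosen high to avoid
 * clashing with real BPF_* update flags in this illustration. */
#define FAKE_F_ALL_CPUS (1ULL << 40)

/* Stand-in for the single-element update op; counts calls and records
 * the flags it was handed, so we can check that the batch caller's
 * flags reach every per-element update. */
static int update_calls;
static unsigned long long last_flags;

static int map_update_elem_stub(const int *key, const int *value,
				unsigned long long flags)
{
	(void)key;
	(void)value;
	update_calls++;
	last_flags = flags;
	return 0;
}

/* Toy model of the batch path: route each element through the
 * single-element update, carrying the caller's flags along unchanged. */
static int batch_update_model(const int *keys, const int *values,
			      size_t count, unsigned long long flags)
{
	for (size_t i = 0; i < count; i++) {
		int err = map_update_elem_stub(&keys[i], &values[i], flags);

		if (err)
			return err;
	}
	return 0;
}
```

Under that structure, extending the batch commands for the new flag is mostly a matter of not rejecting it in flag validation, which supports the "flag only, batch later" plan.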