On Tue, Jul 29, 2025 at 6:56 AM Matt Fleming <matt@xxxxxxxxxxxxxxxx> wrote: > > On Mon, Jul 28, 2025 at 3:35 PM Alexei Starovoitov > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > Please make a full description of what the test does, > > since it's not trivial to decipher from the code. > > If I'm reading it correctly, first, the user space > > makes the LPM completely full and then lookup/update > > use random key-s within range. > > Yep, that's correct. I'll provide a more comprehensive description in v4. > > > But delete looks different. It seems the kernel delete > > operation can interleave with user space refilling the LPM, > > so ... > > > > > lookup: throughput 7.423 ± 0.023 M ops/s ( 7.423M ops/prod), latency 134.710 ns/op > > > update: throughput 2.643 ± 0.015 M ops/s ( 2.643M ops/prod), latency 378.310 ns/op > > > delete: throughput 0.712 ± 0.008 M ops/s ( 0.712M ops/prod), latency 1405.152 ns/op > > > > this comparison doesn't look like apples to apples, > > since delete will include user space refill time. > > Yeah, you're right. Though we only measure the delete time from in the > BPF prog, delete throughput is essentially blocked while the refill > happens and because things are measured with a 1-second timer > (bench.c:sigalarm_handler) the refill time gets accounted for anyway. > I could try extrapolating the delete time like I've done for the free > op, i.e. we calculate how many ops were completed in what fraction of > a second. > > > > free: throughput 0.574 ± 0.003 K ops/s ( 0.574K ops/prod), latency 1.743 ms/op > > > > Does this measure the free-ing of full LPM ? > > Yes, this measures the total time to free every element in the trie. > > > > +static void __lpm_validate(void) > > > > why double underscore ? > > Just lpm_validate() ? > > The double underscore is used for the common validation parts, but > I'll rename this to include "_common()" (along with all other uses of > __). No. It's also wrong. Double underscore or _common suffix are both meaningless. The helper name should describe what it's for. > > > + /* > > > + * Ideally we'd have access to the map ID but that's already > > > + * freed before we enter trie_free(). > > > + */ > > > + BPF_CORE_READ_STR_INTO(&name, map, name); > > > + if (bpf_strncmp(name, BPF_OBJ_NAME_LEN, "trie_free_map")) > > > + return 0; > > > + > > > + val = bpf_ktime_get_ns(); > > > + bpf_map_update_elem(&latency_free_start, &map, &val, BPF_ANY); > > > > Looks like there is only one lpm map. > > What's the point of that extra map ? > > Just have a global var for start time ? > > Sure, I can make this a global variable for the start time instead. > > > bpf_get_prandom_u32() is not free > > and modulo operation isn't free either. > > The benchmark includes their time. > > It's ok to have it, but add a mode where the bench > > tests linear lookup/update too with simple key.data++ > > Good idea. > > > Since the map suppose to full before we start all keys > > should be there, right? > > Yes. > > > Let's add a sanity check that update() succeeds. > > Will do. > > > > +static int delete (__u32 index, bool *need_refill) > > > +{ > > > + struct trie_key key = { > > > + .data = deleted_entries, > > > + .prefixlen = prefixlen, > > > + }; > > > + > > > + bpf_map_delete_elem(&trie_map, &key); > > > + > > > + /* Do we need to refill the map? */ > > > + if (++deleted_entries == nr_entries) { > > > + /* > > > + * Atomicity isn't required because DELETE only supports > > > + * one producer running concurrently. What we need is a > > > + * way to track how many entries have been deleted from > > > + * the trie between consecutive invocations of the BPF > > > + * prog because a single bpf_loop() call might not > > > + * delete all entries, e.g. when NR_LOOPS < nr_entries. > > > + */ > > > + deleted_entries = 0; > > > + *need_refill = true; > > > + return 1; > > > > This early break is another reason that makes > > 'delete' op different from 'lookup/update'. > > Pls make all 3 consistent, so they can be compared to each other. > > Hmm.. I'm not quite sure how to do that. lookup/update don't modify > the number of entries in the map whereas delete does (update only > updates existing entries, it doesn't create new ones). So when the map > is empty it needs to be refilled. You're right that somehow the time > to refill needs to be removed from the delete throughput/latency > numbers but fundamentally these 3 ops are not the same and I don't see > how to treat them as such. well, random-key update when the map is full is also quite different from random-key update when the map is empty. Instead doing an update from user space do timed ops: 1 start with empty map, update (aka insert) all keys sequentially 2 lookup all sequentially 3 delete all sequentially 4 update (aka insert) all sequentially 5 lookup random 6 update random 7 delete all random The elapsed time for 1 and 4 should be exactly the same. While all others might have differences, but all can be compared to each other and reasoned about.