On Wed, Jun 18, 2025 at 7:27 AM Ignat Korchagin <ignat@xxxxxxxxxxxxxx> wrote:
>
> On Wed, Jun 18, 2025 at 3:01 PM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Wed, Jun 18, 2025 at 5:29 AM Matt Fleming <matt@xxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Tue, Jun 17, 2025 at 4:55 PM Alexei Starovoitov
> > > <alexei.starovoitov@xxxxxxxxx> wrote:
> > > >
> > > > On Tue, Jun 17, 2025 at 2:43 AM Matt Fleming <matt@xxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > soft lockup - CPU#41 stuck for 76s
> > > >
> > > > How many elements are in the trie that it takes 76 seconds??
> > >
> > > We run our maps with potentially millions of entries, so it's the size
> > > of the map plus the fact that kfree() does more work with KASAN that
> > > triggers this for us.
> > >
> > > > I feel the issue is different.
> > > > It seems the trie_free() algorithm doesn't scale.
> > > > Pls share a full reproducer.
> > >
> > > Yes, the scalability of the algorithm is also an issue. Jesper (CC'd)
> > > had some thoughts on this.
> > >
> > > But regardless, it seems like a bad idea to have an unbounded loop
> > > inside the kernel that processes user-controlled data.
> >
> > 1M kfree should still be very fast even with kasan, lockdep, etc.
> > 76 seconds is an algorithm problem. Address the root cause.
>
> What if later we have 1G? 100G? Apart from the root cause we still
> have "scalability concerns" unless we can somehow reimplement this as
> O(1)

Do your homework pls.
Set max_entries to 100G and report back.
Then set max_entries to 1G _with_ cond_resched() hack and report back.
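
A side note on the first experiment, as a rough sketch only: max_entries
is a 32-bit field (__u32) in union bpf_attr in the kernel UAPI, so a
request of 100G cannot be represented in it at all, while 1G can.
fits_max_entries() below is a hypothetical helper written for this
illustration, not a kernel or libbpf function, and "G" is taken to mean
10^9 entries, which is an assumption.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if the requested number of map entries
 * can be stored in a 32-bit max_entries field, 0 otherwise.
 */
static int fits_max_entries(uint64_t requested)
{
	return requested <= UINT32_MAX;
}

/* Entry counts from the thread, assuming G means 10^9. */
static const uint64_t ONE_G = 1000000000ULL;
static const uint64_t HUNDRED_G = 100ULL * 1000000000ULL;
```

With these definitions, fits_max_entries(ONE_G) is true and
fits_max_entries(HUNDRED_G) is false. This only checks what the field
can hold; it says nothing about how long trie_free() takes for a map of
a given size.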