On Wed, 20 Aug 2025 16:47:36 +0200
Florian Westphal <fw@xxxxxxxxx> wrote:

> Always prefer the avx2 implementation if its available.
> This greatly improves insertion performance (each insertion
> checks if the new element would overlap with an existing one):
>
> time nft -f - <<EOF
> table ip pipapo {
>     set s {
>         typeof ip saddr . tcp dport
>         flags interval
>         size 800000
>         elements = { 10.1.1.1 - 10.1.1.4 . 3996,
> [.. 800k entries elided .. ]
>
> before:
> real 1m55.993s
> user 0m2.505s
> sys 1m53.296s
>
> after:
> real 0m42.586s
> user 0m2.554s
> sys 0m39.811s
>
> Fold patch from Sebastian:
>
> kernel_fpu_begin_mask()/ _end() remains in pipapo_get_avx2() where it is
> required.
>
> A followup patch will add local_lock_t to struct nft_pipapo_scratch in
> order to protect the map pointer. The lock can not be acquired in
> preemption disabled context which is what kernel_fpu_begin*() does.
>
> Link: https://lore.kernel.org/netfilter-devel/20250818110213.1319982-2-bigeasy@xxxxxxxxxxxxx/
> Co-developed-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> Signed-off-by: Florian Westphal <fw@xxxxxxxxx>

Reviewed-by: Stefano Brivio <sbrivio@xxxxxxxxxx>

-- 
Stefano