Re: [PATCH RFC nf-next 1/2] netfilter: nft_set_pipapo_avx2: fix skip of expired entries

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 12 Sep 2025 15:19:59 +0200
Florian Westphal <fw@xxxxxxxxx> wrote:

> KASAN reports following splat:
> BUG: KASAN: slab-out-of-bounds in pipapo_get_avx2+0x941/0x25d0
> Read of size 1 at addr ffff88814c561be0 by task nft/3944
> 
> CPU: 7 UID: 0 PID: 3944 Comm: nft Not tainted 6.17.0-rc4+ #637 PREEMPT(full)
> Call Trace:
>  pipapo_get_avx2+0x941/0x25d0
>  ? __local_bh_enable_ip+0x116/0x1a0
>  ? pipapo_get_avx2+0xee/0x25d0
>  ? nft_pipapo_insert+0x22b/0x11b0
>  nft_pipapo_insert+0x440/0x11b0
>  nf_tables_newsetelem+0x220a/0x3a00
>  ..
> 
> This bisects down to
> 84c1da7b38d9 ("netfilter: nft_set_pipapo: use AVX2 algorithm for insertions too"),
> however, it merely uncovers this bug.
> 
> When we find a match but that match has expired or timed out, the AVX2
> implementation restarts the full match loop.
> 
> At that point, data (key element or start of register space with the key)
> has already been incremented to point to the last key field:
> out-of-bounds access occurs.

Oops.

By the way, you're referring to 'rp' here, and nothing else, right?
Could you mention that explicitly if that's the case? I have to say
that "data (...) has already been incremented" took me a while to
understand.

> The restart logic in AVX2 is different compared to the plain C
> implementation, but both should follow the same logic.

That's because I wanted to avoid calling pipapo_refill() from the AVX2
lookup path, as it was significantly slower than re-doing the full
lookup, at least for "net, port" sets which I assumed would be the most
common.

On the other hand, it should always be a corner case (right?), so I
guess simplicity / consistency should prevail.

> The C implementation just calls pipapo_refill() again to check the next
> entry.  Do the same in the AVX2 implementation.

An alternative would be to reset rp = key, but maybe what you're doing
is actually saner, see above.

> Note that with this change, due to implementation differences of
> pipapo_refill vs. nft_pipapo_avx2_refill, the refill call will return
> the same element again, then, on the next call it will move to the next
> entry as expected.  This is because avx2_refill doesn't clear the bitmap
> in the 'last' conditional.  This is harmless.
> 
> A selftest test case comes in a followup patch.
> 
> Sent as RFC tag because it needs to be revamped after net -> net-next
> merge, there are conflicting changes in these two trees at the moment.
> 
> Another alternative is to retarget this patch to nf, but given its
> a day-0 bug that only got exposed due to the use of AVX2 in insertion
> path added recently I think -next is fine.
> 
> Cc: Stefano Brivio <sbrivio@xxxxxxxxxx>
> Fixes: 7400b063969b ("nft_set_pipapo: Introduce AVX2-based lookup implementation")
> Signed-off-by: Florian Westphal <fw@xxxxxxxxx>
> ---
>  net/netfilter/nft_set_pipapo_avx2.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c
> index 7559306d0aed..d97b67a4de16 100644
> --- a/net/netfilter/nft_set_pipapo_avx2.c
> +++ b/net/netfilter/nft_set_pipapo_avx2.c
> @@ -1179,7 +1179,6 @@ struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m,
>  
>  	nft_pipapo_avx2_prepare();
>  
> -next_match:
>  	nft_pipapo_for_each_field(f, i, m) {
>  		bool last = i == m->field_count - 1, first = !i;
>  		int ret = 0;
> @@ -1226,6 +1225,7 @@ struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m,
>  
>  #undef NFT_SET_PIPAPO_AVX2_LOOKUP
>  
> +next_match:
>  		if (ret < 0) {
>  			scratch->map_index = map_index;
>  			kernel_fpu_end();
> @@ -1238,8 +1238,10 @@ struct nft_pipapo_elem *pipapo_get_avx2(const struct nft_pipapo_match *m,
>  
>  			e = f->mt[ret].e;
>  			if (unlikely(__nft_set_elem_expired(&e->ext, tstamp) ||
> -				     !nft_set_elem_active(&e->ext, genmask)))
> -				goto next_match;
> +				     !nft_set_elem_active(&e->ext, genmask))) {
> +				ret = pipapo_refill(res, f->bsize, f->rules, fill, f->mt, last);

It could be wrapped like this:

				ret = pipapo_refill(res, f->bsize, f->rules,
						    fill, f->mt, last);

> +				goto next_match:
> +			}
>  
>  			scratch->map_index = map_index;
>  			kernel_fpu_end();

In any case:

Reviewed-by: Stefano Brivio <sbrivio@xxxxxxxxxx>

-- 
Stefano





[Index of Archives]     [Netfitler Users]     [Berkeley Packet Filter]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux