Re: [PATCH nf 1/2] netfilter: nf_nat: also check reverse tuple to obtain clashing entry

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 30, 2025 at 6:34 PM Florian Westphal <fw@xxxxxxxxx> wrote:
>
> The logic added in the blamed commit was supposed to only omit nat source
> port allocation if neither the existing nor the new entry are subject to
> NAT.
>
> However, its not enough to lookup the conntrack based on the proposed
> tuple, we must also check the reverse direction.
>
> Otherwise there are esoteric cases where the collision is in the reverse
> direction because that colliding connection has a port rewrite, but the
> new entry doesn't.  In this case, we only check the new entry and then
> erronously conclude that no clash exists anymore.
>
>  The existing (udp) tuple is:
>   a:p -> b:P, with nat translation to s:P, i.e. pure daddr rewrite,
>   reverse tuple in conntrack table is s:P -> a:p.
>
> When another UDP packet is sent directly to s, i.e. a:p->s:P, this is
> correctly detected as a colliding entry: tuple is taken by existing reply
> tuple in reverse direction.
>
> But the colliding conntrack is only searched for with unreversed
> direction, and we can't find such entry matching a:p->s:P.
>
> The incorrect conclusion is that the clashing entry has timed out and
> that no port address translation is required.
>
> Such conntrack will then be discarded at nf_confirm time because the
> proposed reverse direction clashes with an existing mapping in the
> conntrack table.
>
> Search for the reverse tuple too, this will then check the NAT bits of
> the colliding entry and triggers port reallocation.
>
> Followp patch extends nft_nat.sh selftest to cover this scenario.
>
> The IPS_SEQ_ADJUST change is also a bug fix:
> Instead of checking for SEQ_ADJ this tested for SEEN_REPLY and ASSURED
> by accident -- _BIT is only for use with the test_bit() API.
>
> This bug has little consequence in practice, because the sequence number
> adjustments are only useful for TCP which doesn't support clash resolution.
>
> The existing test case (conntrack_clash_clash.sh) exercise a race
> condition path (parallel conntrack creation on different CPUs), so
> the colliding entries have neither SEEN_REPLY nor ASSURED set.
>
> Thanks to Yafang Shao and Shaun Brady for an initial investigation
> of this bug.
>
> Fixes: d8f84a9bc7c4 ("netfilter: nf_nat: don't try nat source port reallocation for reverse dir clash")
> Reported-by: Yafang Shao <laoar.shao@xxxxxxxxx>
> Reported-by: Shaun Brady <brady.1345@xxxxxxxxx>
> Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1795
> Signed-off-by: Florian Westphal <fw@xxxxxxxxx>

Tested-by: Yafang Shao <laoar.shao@xxxxxxxxx>

> ---
>  net/netfilter/nf_nat_core.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
> index aad84aabd7f1..f391cd267922 100644
> --- a/net/netfilter/nf_nat_core.c
> +++ b/net/netfilter/nf_nat_core.c
> @@ -248,7 +248,7 @@ static noinline bool
>  nf_nat_used_tuple_new(const struct nf_conntrack_tuple *tuple,
>                       const struct nf_conn *ignored_ct)
>  {
> -       static const unsigned long uses_nat = IPS_NAT_MASK | IPS_SEQ_ADJUST_BIT;
> +       static const unsigned long uses_nat = IPS_NAT_MASK | IPS_SEQ_ADJUST;
>         const struct nf_conntrack_tuple_hash *thash;
>         const struct nf_conntrack_zone *zone;
>         struct nf_conn *ct;
> @@ -287,8 +287,14 @@ nf_nat_used_tuple_new(const struct nf_conntrack_tuple *tuple,
>         zone = nf_ct_zone(ignored_ct);
>
>         thash = nf_conntrack_find_get(net, zone, tuple);
> -       if (unlikely(!thash)) /* clashing entry went away */
> -               return false;
> +       if (unlikely(!thash)) {
> +               struct nf_conntrack_tuple reply;
> +
> +               nf_ct_invert_tuple(&reply, tuple);
> +               thash = nf_conntrack_find_get(net, zone, &reply);
> +               if (!thash) /* clashing entry went away */
> +                       return false;
> +       }
>
>         ct = nf_ct_tuplehash_to_ctrack(thash);
>
> --
> 2.49.0
>


-- 
Regards
Yafang





[Index of Archives]     [Netfitler Users]     [Berkeley Packet Filter]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux