Re: [BUG REPORT] netfilter: DNS/SNAT Issue in Kubernetes Environment

On Wed, May 28, 2025 at 05:03:56PM +0800, Yafang Shao wrote:
> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> index 7bee5bd22be2..3481e9d333b0 100644
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -1245,9 +1245,9 @@ __nf_conntrack_confirm(struct sk_buff *skb)
> 
>         chainlen = 0;
>         hlist_nulls_for_each_entry(h, n, &nf_conntrack_hash[reply_hash], hnnode) {
> -               if (nf_ct_key_equal(h, &ct->tuplehash[IP_CT_DIR_REPLY].tuple,
> -                                   zone, net))
> -                       goto out;
> +               //if (nf_ct_key_equal(h, &ct->tuplehash[IP_CT_DIR_REPLY].tuple,
> +               //                  zone, net))
> +               //      goto out;
>                 if (chainlen++ > max_chainlen) {
>  chaintoolong:
>                         NF_CT_STAT_INC(net, chaintoolong);

Forgive me for jumping in with very little information, but on a hunch I
tried something.  I applied the above patch to another bug I've been
investigating:

https://bugzilla.netfilter.org/show_bug.cgi?id=1795
and Ubuntu reference
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2109889

The Ubuntu reproduction steps were easier to follow, so I mimicked
them:

# cat add_ip.sh 
ip addr add 10.0.1.200/24 dev enp1s0
# cat nft.sh 
nft -f - <<EOF
table ip dnat-test {
 chain prerouting {
  type nat hook prerouting priority dstnat; policy accept;
  ip daddr 10.0.1.200 udp dport 1234 counter dnat to 10.0.1.180:1234
 }
}
EOF
# cat listen.sh 
echo pong|nc -l -u 10.0.1.180 1234
# ./add_ip.sh ; ./nft.sh ; ./listen.sh (and then just ./listen.sh again)

On a client machine I ran:
$ echo ping|nc -u -p 4321 10.0.1.200 1234
$ echo ping|nc -u -p 4321 10.0.1.180 1234

And sure enough, listen.sh never completes, which demonstrates the bug.
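
A rough sketch of why these two client commands may collide in conntrack,
offered as one possible reading rather than a confirmed diagnosis.  It
only models tuples as strings; it does not touch netfilter.  The client
address 10.0.1.50 is hypothetical (the report does not give one), and the
dnat and reply_tuple helpers are illustrative names, not kernel functions.

```shell
#!/bin/sh
# Hypothetical client address; not stated in the report.
CLIENT=10.0.1.50

# dnat DST_IP: mimic the single rule from nft.sh
# (10.0.1.200 is rewritten to 10.0.1.180, anything else passes through).
dnat() {
    case "$1" in
        10.0.1.200) echo 10.0.1.180 ;;
        *)          echo "$1" ;;
    esac
}

# reply_tuple CLIENT_IP CLIENT_PORT ORIG_DST_IP DST_PORT
# Print the reply-direction tuple expected for the flow: source is the
# post-DNAT destination, destination is the client.
reply_tuple() {
    printf 'udp src=%s sport=%s dst=%s dport=%s' \
        "$(dnat "$3")" "$4" "$1" "$2"
}

# Flow 1: nc -u -p 4321 10.0.1.200 1234 (DNATed to 10.0.1.180)
flow1=$(reply_tuple "$CLIENT" 4321 10.0.1.200 1234)
# Flow 2: nc -u -p 4321 10.0.1.180 1234 (no NAT)
flow2=$(reply_tuple "$CLIENT" 4321 10.0.1.180 1234)

echo "flow1 reply: $flow1"
echo "flow2 reply: $flow2"
if [ "$flow1" = "$flow2" ]; then
    echo "reply tuples are identical"
fi
```

Under these assumptions both flows end up with the same reply tuple, which
is the condition the nf_ct_key_equal() check on the reply hash chain in
the quoted hunk tests for.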

When I apply the above patch, the problem goes away.

What I _also_ was able to do to make the problem go away was to apply
the following patch:

diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index aad84aabd7f1..fecf5591f424 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -727,7 +727,7 @@ get_unique_tuple(struct nf_conntrack_tuple *tuple,
            !(range->flags & NF_NAT_RANGE_PROTO_RANDOM_ALL)) {
                /* try the original tuple first */
                if (nf_in_range(orig_tuple, range)) {
-                       if (!nf_nat_used_tuple_new(orig_tuple, ct)) {
+                       if (!nf_nat_used_tuple(orig_tuple, ct)) {
                                *tuple = *orig_tuple;
                                return;
                        }

This was suggested to me by the bug report.  I had not brought this up
yet, as I have little understanding of why it helps or what else is
broken by reverting from nf_nat_used_tuple_new to nf_nat_used_tuple.

I thought the fact that both patches fix the problem might be of
interest.  I'll keep digging to improve my understanding.



SB



