On Thu, Aug 28, 2025 at 11:14:35AM +0200, Fabian Bläse wrote:
> The icmp_ndo_send function was originally introduced to ensure proper
> rate limiting when icmp_send is called by a network device driver,
> where the packet's source address may have already been transformed
> by SNAT.
>
> However, the original implementation only considers the
> IP_CT_DIR_ORIGINAL direction for SNAT and always replaced the packet's
> source address with that of the original-direction tuple. This causes
> two problems:
>
> 1. For SNAT:
>    Reply-direction packets were incorrectly translated using the source
>    address of the CT original direction, even though no translation is
>    required.
>
> 2. For DNAT:
>    Reply-direction packets were not handled at all. In DNAT, the original
>    direction's destination is translated. Therefore, in the reply
>    direction the source address must be set to the reply-direction
>    source, so rate limiting works as intended.
>
> Fix this by using the connection direction to select the correct tuple
> for source address translation, and adjust the pre-checks to handle
> reply-direction packets in case of DNAT.
>
> Additionally, wrap the `ct->status` access in READ_ONCE(). This avoids
> possible KCSAN reports about concurrent updates to `ct->status`.

I think such a concurrent update cannot happen: the NAT bits are only
set for the first packet of a connection, which sets up the NAT
configuration, so READ_ONCE() can go away.

Florian?

> Fixes: 0b41713b6066 ("icmp: introduce helper for nat'd source address in network device context")
>
> Signed-off-by: Fabian Bläse <fabian@xxxxxxxxx>
> Cc: Jason A. Donenfeld <Jason@xxxxxxxxx>
> Cc: Florian Westphal <fw@xxxxxxxxx>
> ---
> Changes v1->v2:
> - Implement fix for ICMPv6 as well
>
> Changes v2->v3:
> - Collapse conditional tuple selection into a single direction lookup [Florian]
> - Always apply source address translation if IPS_NAT_MASK is set [Florian]
> - Wrap ct->status in READ_ONCE()
> - Add a clearer explanation of the behaviour change for DNAT
> ---
>  net/ipv4/icmp.c     | 6 ++++--
>  net/ipv6/ip6_icmp.c | 6 ++++--
>  2 files changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
> index 2ffe73ea644f..c48c572f024d 100644
> --- a/net/ipv4/icmp.c
> +++ b/net/ipv4/icmp.c
> @@ -799,11 +799,12 @@ void icmp_ndo_send(struct sk_buff *skb_in, int type, int code, __be32 info)
>  	struct sk_buff *cloned_skb = NULL;
>  	struct ip_options opts = { 0 };
>  	enum ip_conntrack_info ctinfo;
> +	enum ip_conntrack_dir dir;
>  	struct nf_conn *ct;
>  	__be32 orig_ip;
>
>  	ct = nf_ct_get(skb_in, &ctinfo);
> -	if (!ct || !(ct->status & IPS_SRC_NAT)) {
> +	if (!ct || !(READ_ONCE(ct->status) & IPS_NAT_MASK)) {
>  		__icmp_send(skb_in, type, code, info, &opts);
>  		return;
>  	}
> @@ -818,7 +819,8 @@ void icmp_ndo_send(struct sk_buff *skb_in, int type, int code, __be32 info)
>  		goto out;
>
>  	orig_ip = ip_hdr(skb_in)->saddr;
> -	ip_hdr(skb_in)->saddr = ct->tuplehash[0].tuple.src.u3.ip;
> +	dir = CTINFO2DIR(ctinfo);
> +	ip_hdr(skb_in)->saddr = ct->tuplehash[dir].tuple.src.u3.ip;
>  	__icmp_send(skb_in, type, code, info, &opts);
>  	ip_hdr(skb_in)->saddr = orig_ip;
>  out:
> diff --git a/net/ipv6/ip6_icmp.c b/net/ipv6/ip6_icmp.c
> index 9e3574880cb0..233914b63bdb 100644
> --- a/net/ipv6/ip6_icmp.c
> +++ b/net/ipv6/ip6_icmp.c
> @@ -54,11 +54,12 @@ void icmpv6_ndo_send(struct sk_buff *skb_in, u8 type, u8 code, __u32 info)
>  	struct inet6_skb_parm parm = { 0 };
>  	struct sk_buff *cloned_skb = NULL;
>  	enum ip_conntrack_info ctinfo;
> +	enum ip_conntrack_dir dir;
>  	struct in6_addr orig_ip;
>  	struct nf_conn *ct;
>
>  	ct = nf_ct_get(skb_in, &ctinfo);
> -	if (!ct || !(ct->status & IPS_SRC_NAT)) {
> +	if (!ct || !(READ_ONCE(ct->status) & IPS_NAT_MASK)) {
>  		__icmpv6_send(skb_in, type, code, info, &parm);
>  		return;
>  	}
> @@ -73,7 +74,8 @@ void icmpv6_ndo_send(struct sk_buff *skb_in, u8 type, u8 code, __u32 info)
>  		goto out;
>
>  	orig_ip = ipv6_hdr(skb_in)->saddr;
> -	ipv6_hdr(skb_in)->saddr = ct->tuplehash[0].tuple.src.u3.in6;
> +	dir = CTINFO2DIR(ctinfo);
> +	ipv6_hdr(skb_in)->saddr = ct->tuplehash[dir].tuple.src.u3.in6;
>  	__icmpv6_send(skb_in, type, code, info, &parm);
>  	ipv6_hdr(skb_in)->saddr = orig_ip;
>  out:
> --
> 2.51.0
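
For anyone following along, the direction-dependent tuple selection in
the patch can be illustrated with a small userspace sketch. This is not
kernel code: every demo_* / DEMO_* name, the simplified tuple layout and
the example addresses are assumptions made up for illustration. It only
models which tuple's source address is picked for rate limiting, before
and after the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the conntrack types; NOT the kernel definitions. */
enum demo_dir { DEMO_DIR_ORIGINAL = 0, DEMO_DIR_REPLY = 1 };

struct demo_tuple { uint32_t src, dst; };
struct demo_conn { struct demo_tuple tuplehash[2]; };

#define DEMO_IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Made-up example addresses. */
#define DEMO_CLIENT  DEMO_IP4(192, 168, 1, 10)  /* host behind the router  */
#define DEMO_ROUTER  DEMO_IP4(198, 51, 100, 1)  /* router's public address */
#define DEMO_SERVER  DEMO_IP4(203, 0, 113, 5)   /* remote peer             */
#define DEMO_VIP     DEMO_IP4(198, 51, 100, 2)  /* address DNAT'ed away    */
#define DEMO_BACKEND DEMO_IP4(10, 0, 0, 20)     /* real DNAT target        */

/* SNAT: client -> server, source rewritten to the router's address. */
static struct demo_conn demo_snat_conn(void)
{
	struct demo_conn ct = { {
		{ DEMO_CLIENT, DEMO_SERVER },  /* original direction */
		{ DEMO_SERVER, DEMO_ROUTER },  /* reply direction    */
	} };
	return ct;
}

/* DNAT: client -> VIP, destination rewritten to the backend. */
static struct demo_conn demo_dnat_conn(void)
{
	struct demo_conn ct = { {
		{ DEMO_CLIENT, DEMO_VIP },      /* original direction */
		{ DEMO_BACKEND, DEMO_CLIENT },  /* reply direction    */
	} };
	return ct;
}

/* Pre-patch lookup: always the original-direction source (index 0). */
static uint32_t demo_saddr_old(const struct demo_conn *ct, enum demo_dir dir)
{
	(void)dir;
	return ct->tuplehash[DEMO_DIR_ORIGINAL].src;
}

/* Patched lookup: source of the tuple matching the packet's direction. */
static uint32_t demo_saddr_new(const struct demo_conn *ct, enum demo_dir dir)
{
	return ct->tuplehash[dir].src;
}

static void demo_self_check(void)
{
	struct demo_conn snat = demo_snat_conn();

	/* Original direction: both variants restore the pre-SNAT source. */
	assert(demo_saddr_old(&snat, DEMO_DIR_ORIGINAL) == DEMO_CLIENT);
	assert(demo_saddr_new(&snat, DEMO_DIR_ORIGINAL) == DEMO_CLIENT);

	/* Reply direction: the old lookup picks the client, not the sender. */
	assert(demo_saddr_old(&snat, DEMO_DIR_REPLY) == DEMO_CLIENT);
	assert(demo_saddr_new(&snat, DEMO_DIR_REPLY) == DEMO_SERVER);
}
```

In the DNAT case, demo_saddr_new() on a reply-direction packet yields
the backend address rather than the VIP that the packet carries after
reverse translation, which is the address the commit message says rate
limiting should use.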