On Mon, Jul 28, 2025 at 06:21:50PM -0700, Martin KaFai Lau wrote:
> On 7/28/25 2:43 AM, Mahe Tardy wrote:
> > Hello,
> >
> > This is v3 of adding the icmp_send_unreach kfunc, as suggested during
> > LSF/MM/BPF 2025[^1]. The goal is to allow cgroup_skb programs to
> > actively reject east-west traffic, similarly to what is possible to do
> > with netfilter reject target.
> >
> > The first step to implement this is using ICMP control messages, with
> > the ICMP_DEST_UNREACH type with various code ICMP_NET_UNREACH,
> > ICMP_HOST_UNREACH, ICMP_PROT_UNREACH, etc. This is easier to implement
> > than a TCP RST reply and will already hint the client TCP stack to abort
> > the connection and not retry extensively.
> >
> > Note that this is different than the sock_destroy kfunc, that along
> > calls tcp_abort and thus sends a reset, destroying the underlying
> > socket.
> >
> > Caveats of this kfunc design are that a cgroup_skb program can call this
> > function N times, thus send N ICMP unreach control messages and that the
> > program can return from the BPF filter with SK_PASS leading to a
> > potential confusing situation where the TCP connection was established
> > while the client received ICMP_DEST_UNREACH messages.
> >
> > Another more sophisticated design idea would be for the kfunc to set the
> > kernel to send an ICMP_HOST_UNREACH control message with the appropriate
> > code when the cgroup_skb program terminates with SK_DROP. Creating a new
> > 'SK_REJECT' return code for cgroup_skb program was generally rejected
> > and would be too limited for other program types support.
> >
> > We should bear in mind that we want to add a TCP reset kfunc next and
> > also could extend this kfunc to other program types if wanted.
>
> Some high level questions.
>
> Which other program types do you need this kfunc to send icmp and the future
> tcp rst?
I don't really know. I mostly need this in cgroup_skb for my use case, but
I could see other program types using it, either for simplification (for
progs that can already rewrite the packet, like tc) or for other program
types that, like cgroup_skb, can't touch the packet themselves.

>
> This cover letter mentioned sending icmp unreach is easier than sending tcp
> rst. What problems do you see in sending tcp rst?
>

Yes, I based these patches on what the 'reject_tg' function in
net/ipv4/netfilter/ipt_REJECT.c does. In the case of sending an ICMP
unreach ('nf_send_unreach'), the routing step is quite straightforward, as
it only inverts the daddr and the saddr (that's what my renamed/moved
ip_route_reply_fetch_dst helper does). In the case of sending a RST
('nf_send_reset'), there are extra steps: first the same routing mechanism
is applied by just inverting the daddr and the saddr, but later
'ip_route_me_harder' is called, which does a lot more. I'm currently not
sure which parts of this must be ported to work in our BPF use case, so I
wanted to start with unreach.
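
To make the "inverting the daddr and the saddr" step concrete, here is a
rough userspace sketch of the idea only. It is not the kernel code: the
struct and function names (pkt_addrs, reply_key, reply_key_from_pkt) are
hypothetical stand-ins invented for illustration, and a real route lookup
takes more inputs than these two addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the IPv4 addresses of the received packet. */
struct pkt_addrs {
	uint32_t saddr; /* sender of the packet being rejected */
	uint32_t daddr; /* original destination of that packet */
};

/* Hypothetical stand-in for a route lookup key (not struct flowi4). */
struct reply_key {
	uint32_t saddr; /* address the reply is sent from */
	uint32_t daddr; /* address the reply is sent to */
};

/*
 * Build the lookup key for a reply by swapping the two addresses:
 * the reply goes back to the original sender, and it is sourced from
 * the address the original packet was destined to.
 */
static struct reply_key reply_key_from_pkt(const struct pkt_addrs *pkt)
{
	struct reply_key key;

	assert(pkt != NULL);
	key.saddr = pkt->daddr;
	key.daddr = pkt->saddr;
	return key;
}
```

For a packet going from 10.0.0.1 to 10.0.0.2, the key describes a reply
from 10.0.0.2 to 10.0.0.1. Whatever 'ip_route_me_harder' adds on top of
this for the RST case is deliberately left out here, since which parts of
it matter for BPF is still the open question.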