On Tue, Jul 29, 2025 at 06:54:58PM -0700, Martin KaFai Lau wrote:
> On 7/29/25 2:53 AM, Mahe Tardy wrote:
> > > Which other program types do you need this kfunc to send icmp and the future
> > > tcp rst?
> >
> > I don't really know, I mostly need this in cgroup_skb for my use case
> > but I could see other program types using this either for simplification
> > (for progs that can already rewrite the packet, like tc) or other
> > program types like cgroup_skb, because they can't touch the packet
> > themselves.
>
> I also don't think the tc needs this kfunc either. The tc should already
> have ways to do this now.
>
> > >
> > > This cover letter mentioned sending icmp unreach is easier than sending tcp
> > > rst. What problems do you see in sending tcp rst?
> > >
> >
> > Yes, I based these patches on what net/ipv4/netfilter/ipt_REJECT.c's
> > 'reject_tg' function does. In the case of sending ICMP unreach
> > 'nf_send_unreach', the routing step is quite straightforward as they are
> > only inverting the daddr and the saddr (that's what my renamed/moved
> > ip_route_reply_fetch_dst helper does).
> >
> > In the case of sending RST 'nf_send_reset', there are extra steps, first
> > the same routing mechanism is done by just inverting the daddr and the
> > saddr but later 'ip_route_me_harder' is called which is doing a lot
> > more. I'm currently not sure which parts of this must be ported to work
> > in our BPF use case so I wanted to start with unreach.
>
> I don't think we necessarily need to completely borrow from nf, the hooks'
> locations are different and the use case may be different.
>
> A concern that I have is the icmp6_send called by the kfunc. The icmp6_send
> should eventually call to ip6_finish_output which may call the very same
> "cgroup/egress" program again in a recursive way. The same for v4 icmp_send.
>
> The icmp packet is sent from an internal kernel sk. I suspect you will see
> this recursive behavior if the test is done in the default cgroup
> (/sys/fs/cgroup). I think the is_ineligible(skb) should have stopped the
> second icmpv6_send from replying to an icmp error and the cgroup hook cannot
> change the skb. However, I am not sure I want to cross this bridge. Is there
> a way to avoid the recursive bpf prog?
>

Thanks Martin for the review. Indeed the recursive BPF prog call is a
concerning issue. I'll take some time to think about it and hopefully
propose something.
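
To make the recursion concern concrete, here is a minimal user-space
sketch in plain C. It is not kernel code and not a proposed fix: every
name in it (struct pkt, reject_with_icmp, egress_hook, the in_reject
flag, simulate) is a hypothetical stand-in. It only assumes what the
thread describes: the reply generated by the kfunc goes through the same
egress hook, and an is_ineligible()-style check refuses to answer an ICMP
error with another ICMP error. The guard flag is one possible way to skip
the nested program run, shown purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an skb: only the field this sketch needs. */
struct pkt {
	bool is_icmp_error;
};

static int prog_runs;       /* times the simulated egress program ran */
static int icmp_sent;       /* ICMP errors emitted by the simulated kfunc */
static bool in_reject;      /* stand-in for a per-CPU "kfunc active" flag */
static bool guard_enabled;  /* toggles the hypothetical recursion guard */

static int egress_hook(struct pkt *p);

/* Hypothetical kfunc: answer packet p with an ICMP error. */
static void reject_with_icmp(const struct pkt *p)
{
	/* Same idea as is_ineligible(): never answer an ICMP error. */
	if (p->is_icmp_error)
		return;

	struct pkt reply = { .is_icmp_error = true };

	icmp_sent++;
	in_reject = true;
	egress_hook(&reply);    /* the reply traverses the same egress path */
	in_reject = false;
}

/* Simulated "cgroup/egress" program: reject every packet it sees. */
static int bpf_prog(struct pkt *p)
{
	prog_runs++;
	reject_with_icmp(p);
	return 0;               /* 0 = drop */
}

/* Simulated hook: runs the program unless the guard says we are nested. */
static int egress_hook(struct pkt *p)
{
	if (guard_enabled && in_reject)
		return 1;       /* 1 = pass, program not run again */
	return bpf_prog(p);
}

/* Send one ordinary packet through the hook and report the counters. */
static void simulate(bool guard, int *runs, int *sent)
{
	struct pkt p = { .is_icmp_error = false };

	assert(runs && sent);
	prog_runs = 0;
	icmp_sent = 0;
	in_reject = false;
	guard_enabled = guard;
	egress_hook(&p);
	*runs = prog_runs;
	*sent = icmp_sent;
}
```

With the guard off, simulate() reports two program runs for a single
ICMP error: the program is re-entered on the reply, and only the
ineligibility check stops it from going further. With the guard on, the
program runs once and the reply passes through untouched. Whether a flag
like this is acceptable in the real egress path is exactly the open
question.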