lvxiafei <xiafei_xupt@xxxxxxx> wrote:
> Florian Westphal <fw@xxxxxxxxx> wrote:
> > What's the function of nf_conntrack_max?
> > After this change it's always 0?
>
> nf_conntrack_max is a global (ancestor) limit, by default
> nf_conntrack_max = max_factor * nf_conntrack_htable_size.

Argh.

net.netfilter.nf_conntrack_max is replaced by init_net.nf_conntrack_max
in your patch.

But not net.nf_conntrack_max, so they are now different and not
related at all anymore, except that the latter overrides the former
even in init_net.

I'm not sure this is sane. And it needs an update to
Documentation/networking/nf_conntrack-sysctl.rst in any case.

Also:

-	if (nf_conntrack_max && unlikely(ct_count > nf_conntrack_max)) {
+	if (net->ct.sysctl_max && unlikely(ct_count > min(nf_conntrack_max, net->ct.sysctl_max))) {

... can't be right, this allows a 0 setting in the netns.
So, setting 0 in non-init-net must be disallowed.

I suggest removing nf_conntrack_max as a global variable and
making net.nf_conntrack_max use init_net.nf_conntrack_max
internally too, so that in the init_net both sysctls remain the same.

Then, change __nf_conntrack_alloc() to do:

	unsigned int nf_conntrack_max = min(net->ct.sysctl_max, init_net.ct.sysctl_max);

and leave the if-condition as is, i.e.:

	if (nf_conntrack_max && unlikely(ct_count > nf_conntrack_max)) { ...

It means: each netns can pick an arbitrary value (but not 0, this
ability needs to be removed).

When a new conntrack is allocated, then:

If the limit in the init_net is lower than the netns limit, then
the init_net limit applies, so it provides an upper cap.

If the limit in the init_net is higher, the lower pernet limit
is applied.

If the init_net has a 0 setting, no limit is applied.

This also needs an update to Documentation/networking/nf_conntrack-sysctl.rst
to explain the restrictions.

Or, alternatively, try the other suggestion I made
(memcg charge at sysctl change time,
https://lore.kernel.org/netfilter-devel/20250408095854.GB536@xxxxxxxxxxxxx/).

Or come up with a better proposal.
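
To make the intended semantics concrete, here is a rough userspace model of the limit selection described above. It is only a sketch, not kernel code: the helper names effective_ct_max(), ct_alloc_allowed() and netns_max_write_ok() are made up for illustration, and the plain unsigned int arguments stand in for init_net.ct.sysctl_max and net->ct.sysctl_max.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the proposed check; none of these
 * helpers exist in the kernel.  init_net_max and netns_max stand in
 * for init_net.ct.sysctl_max and net->ct.sysctl_max.
 */

/* The lower of the two limits wins; 0 (only possible via init_net) stays 0. */
static unsigned int effective_ct_max(unsigned int init_net_max,
				     unsigned int netns_max)
{
	return init_net_max < netns_max ? init_net_max : netns_max;
}

/* Mirrors the unchanged if-condition: a 0 limit means "no limit". */
static bool ct_alloc_allowed(unsigned int ct_count,
			     unsigned int init_net_max,
			     unsigned int netns_max)
{
	unsigned int max = effective_ct_max(init_net_max, netns_max);

	if (max && ct_count > max)
		return false;	/* table full, allocation refused */

	return true;
}

/* Assumed sysctl write rule: only init_net may set 0. */
static bool netns_max_write_ok(bool is_init_net, unsigned int new_max)
{
	return is_init_net || new_max != 0;
}
```

With init_net at 5000 and a netns at 1000, the netns is capped at 1000. With init_net at 1000 and a netns at 5000, the init_net cap of 1000 applies. With init_net at 0, min() yields 0 and no limit is enforced, which is why 0 has to be rejected for every netns other than init_net.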