Hi Pablo,

On 25/07/2025 18:03, Pablo Neira Ayuso wrote:
> The seqcount xt_recseq is used to synchronize the replacement of
> xt_table::private in xt_replace_table() against all readers such as
> ipt_do_table()
>
> To ensure that there is only one writer, the writing side disables
> bottom halves. The sequence counter can be acquired recursively. Only the
> first invocation modifies the sequence counter (signaling that a writer
> is in progress) while the following (recursive) writer does not modify
> the counter.
> The lack of a proper locking mechanism for the sequence counter can lead
> to live lock on PREEMPT_RT if the high prior reader preempts the
> writer. Additionally if the per-CPU lock on PREEMPT_RT is removed from
> local_bh_disable() then there is no synchronisation for the per-CPU
> sequence counter.
>
> The affected code is "just" the legacy netfilter code which is replaced
> by "netfilter tables". That code can be disabled without sacrificing
> functionality because everything is provided by the newer
> implementation. This will only requires the usage of the "-nft" tools
> instead of the "-legacy" ones.
> The long term plan is to remove the legacy code so lets accelerate the
> progress.
>
> Relax dependencies on iptables legacy, replace select with depends on,
> this should cause no harm to existing kernel configs and users can still
> toggle IP{6}_NF_IPTABLES_LEGACY in any case.
> Make EBTABLES_LEGACY, IPTABLES_LEGACY and ARPTABLES depend on
> NETFILTER_XTABLES_LEGACY. Hide xt_recseq and its users,
> xt_register_table() and xt_percpu_counter_alloc() behind
> NETFILTER_XTABLES_LEGACY. Let NETFILTER_XTABLES_LEGACY depend on
> !PREEMPT_RT.
>
> This will break selftest expecing the legacy options enabled and will be
> addressed in a following patch.
>
> Co-developed-by: Florian Westphal <fw@xxxxxxxxx>
> Co-developed-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> Signed-off-by: Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx>
> ---
>  net/bridge/netfilter/Kconfig | 10 +++++-----
>  net/ipv4/netfilter/Kconfig   | 24 ++++++++++++------------
>  net/ipv6/netfilter/Kconfig   | 19 +++++++++----------
>  net/netfilter/Kconfig        | 10 ++++++++++
>  net/netfilter/x_tables.c     | 16 +++++++++++-----
>  5 files changed, 47 insertions(+), 32 deletions(-)

[...]

> +config NETFILTER_XTABLES_LEGACY
> +	bool "Netfilter legacy tables support"
> +	depends on !PREEMPT_RT
> +	help
> +	  Say Y here if you still require support for legacy tables. This is
> +	  required by the legacy tools (iptables-legacy) and is not needed if
> +	  you use iptables over nftables (iptables-nft).
> +	  Legacy support is not limited to IP, it also includes EBTABLES and
> +	  ARPTABLES.
> +

This has caused some minor pain for me using Docker on Ubuntu 22.04, which I
guess is still using iptables-legacy. I had to debug why Docker stopped
working and eventually ended up here. Explicitly enabling
NETFILTER_XTABLES_LEGACY solved the problem.

I thought I'd try my luck at convincing you to default this to enabled for
!PREEMPT_RT, to save others from such issues?

Thanks,
Ryan
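
For illustration, the change being asked for would look roughly like the
fragment below. This is an untested sketch based on the hunk quoted above, not
a formal patch. The only new line is "default y"; the help text is omitted.
Since the symbol already depends on !PREEMPT_RT, a plain "default y" would
only take effect on non-RT configurations.

```
config NETFILTER_XTABLES_LEGACY
	bool "Netfilter legacy tables support"
	depends on !PREEMPT_RT
	# Assumption: proposed addition, not part of the posted patch.
	default y
```

Until something like that lands, setting CONFIG_NETFILTER_XTABLES_LEGACY=y in
the kernel config is the workaround that restored Docker here.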