Re: [PATCH nf-next,v1 0/6] revisiting nf_tables ruleset validation

Florian Westphal <fw@xxxxxxxxx> · Tue, 20 May 2025 12:42:50 +0200

Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> On Thu, May 15, 2025 at 08:03:13AM +0200, Florian Westphal wrote:
> > >   netfilter: nf_tables: honor EINTR in ruleset validation from commit/abort path
> > 
> > Do this via nf.git?
> 
> nf-next.git should be fine, this late in the development cycle.

Right, agreed.

> > >   netfilter: nf_tables: honor validation state in preparation phase
> > >   netfilter: nf_tables: add infrastructure for chain validation on updates
> > >   netfilter: nf_tables: add new binding infrastructure
> > >   netfilter: nf_tables: use new binding infrastructure
> > >   netfilter: nf_tables: add support for validating incremental ruleset updates
> > > 
> > >  include/net/netfilter/nf_tables.h |  52 +-
> > >  net/netfilter/nf_tables_api.c     | 800 ++++++++++++++++++++++++++++--
> > >  net/netfilter/nft_immediate.c     |  25 +-
> > >  3 files changed, 844 insertions(+), 33 deletions(-)
> > 
> > This is a lot of new code but no explanation as to why is given.
> 
> These are the results of a test program I made, which is incrementally
> adding elements to an already populated verdict map with 100k elements.
> 
> The ruleset validation shows in the perf callgraph, for each new
> element that is added:
> 
>     55.06%  nft-buffer       [kernel.kallsyms]           [k] nft_chain_validate
>      7.68%  nft-buffer       [kernel.kallsyms]           [k] nf_tables_commit
>      7.19%  nft-buffer       [kernel.kallsyms]           [k] nft_table_validate
>      5.34%  nft-buffer       [kernel.kallsyms]           [k] nft_rhash_walk
>      2.82%  swapper          [kernel.kallsyms]           [k] mwait_idle_with_hints.constprop.0
>      1.26%  nft-buffer       [kernel.kallsyms]           [k] __rhashtable_walk_find_next
>      1.09%  nft-buffer       [kernel.kallsyms]           [k] nft_setelem_validate
>      0.94%  nft-buffer       [kernel.kallsyms]           [k] rhashtable_walk_next
>      0.54%  nft-buffer       libnftables.so.1.1.0        [.] nft_parse
> 
> This is a test adding 100 new elements including jump to chain in a
> existing 100k verdict map.
> 
> With this approach, there is no need to fully validate the chain
> graph.
> 
> > Does this fix bugs with the existing scheme?
> 
> The two initial patches are targetted at fixing minor issues. The
> remaining patches are new code, they are passing tests/shell with
> CONFIG_KASAN and CONFIG_KMEMLEAK. I will post v2 but I would like to
> run more fuzz test on the error path.
> 
> > Or is this an optimization? If so, how big is the speedup?
> 
> Optimization. After this series, validation does not show up near the
> top 10; this is the first symbol in the perf call graph:
> 
>      0.03%  nft-buffer       [nf_tables]               [k] __nft_chain_validate
> 
> and it is far from the top 10.
> 
> I can include this information in v2 so they can sit in the mailing
> list for a while, there is a few bugs in v1 that I have addressed.
> Phil has spotted one in them.
> 
> Moreover, I can move the bindings hashtable to make it per-net to
> control the maximum number of jump/goto per family. This is a lot
> larger than Shaun's update, you have to tell me your preference on
> this.

I think Shauns approach is better due to backporting reasons.

Once the new code is stable enough you can move the jump limitation
check there.

Makes sense?

> Old binding code can possibly go away, but that requires closer look
> too.

Yes, lets keep it for now until there is confidence that the split
approach catches all issues.

I suggest to do this by adding a WARN_ON_ONCE() in the old path once
you think the new code should have prevented cycles in 100% of cases,
then remove it after e.g. 6-12 months or so.