Re: [nf-next RFC] netfilter: nf_tables: Feature ifname-based hook registration

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 03, 2025 at 04:33:49PM +0200, Phil Sutter wrote:
[...]
> > > Pablo, WDYT? Feasible alternative to the feature flag?

That is more simple, it puts a bit more pressure on netlink_dump().
Worst case fully duplicates the information, I think we discussed this
already.

If my code reading is correct, maximum size of the area to add netlink
messages in the dump path is SKB_WITH_OVERHEAD(32768) according to
netlink_recvmsg(), there is a fallback to NLMSG_GOODSIZE when 3-order
allocation fails.

In the event path, memory budget is already used up since
NLMSG_GOODSIZE (4096) even before this proposal because

256 devices * IFNAMSIZ = 4096

This is beyond the limit.

This can be fixed by splitting the report in two events of NEWCHAIN,
but if there is another large attribute reaching the worst case, the
splitting will get more complicated.

With your new attribute, worst case scenario means duplicating _DEVS
with the same devices.

There is a memory budget for the netlink message, and the ADDCHAIN
netlink message with the netdev family is already pushing it to the
limit when *I rised* the maximum to 256 devices per request of a user.
I can post a patch to reduce it to 128 devices now that your device
wildcard feature is available.

I regret _DEVS attribute only includes DEV_NAME, it should have been
possible to add a flag and provide a more efficient representation
than your proposal.

A possible fugly hack would be to stuffed this information after \0 in
DEV_NAME, but that would feel like the iptables revision
infrastructure, better not to go that way.

Maybe all this is not worth to worry and we can assume in 2025 that
when 3-order allocation fails netlink dump will simply fail? Probably
this already is right for other existing netlink subsystems.

And this effort is to provide a way for the user to know that the
device that has been specified has actually registered a hook so the
chain will see traffic.

So far we only have events via NEWDEV to report new hooks. Maybe
GETDEV to consult the hooks that are attached to what chains is
needed? That would solve this usability issue.

But that is _a lot more work_, stuffing more information into the
ADDCHAIN netlink message is easier. GETDEV means more netlink boiler
plate code to avoid this simple extra attribute you propose.

GETDEV would be paired with NEWDEV events to determine which device
the base chain is hooked to.

Maybe it is not for users to know that that dummy* is matching
_something_ but also they want to know what device is matching such
pattern for debugging purpose.

It boils down to "should we care to provide facility to allow for more
instrospection in this box"?

If the answer is "no, what we have is sufficient" then let's not worry
about mistypes. GETDEV facility would be rarely used, then skip.

If you want to complete this picture, then add GETDEV, because NEWDEV
events without GETDEV command looks incomplete.

> > If my understanding is correct, I think this approach will break new
> > nft binary with old kernel.
> 
> If not present (old kernel), new nft would assume all device specs
> matched an interface. I would implement this as comment, something like:
> "# not bound: ethX, wlan*" which would be missing with old kernels.

If after all this, you decide to go for this approach with the new
attribute into ADDCHAIN, maybe a more compact representation,
add ? notation:

table ip x {
        chain y {
                type filter hook ingress devices = { "eth*"?, "vlan0"?, "wan1" } priority 0;
        }
}

This notation can be removed if -s/--stateless is used and ignored if
ruleset is loaded and it is more compact.

It feels like we are trying to stuff too much information in the
existing output, and my ? notation is just trying to find a "smart"
way to make thing not look bloated. Then, GETDEV comes again and this
silly notation is really not needed if a more complete view is
provided.

BTW, note there is one inconsistency that need to be addressed in the
listing, currently 'devices' does not enclose the names in quotes
while 'device' does it.

I mean, with device:

table netdev x {
        chain y {
                type filter hook ingress device "dummy0" priority filter; policy accept;
        }
}

with devices:

table netdev x {
        chain y {
                type filter hook ingress devices = { dummy0, dummy1 } priority filter; policy accept;
        }
}

It would be great to fix this.

P.S: This still leaves room to discuss if comments are the best way to
go to display handle and set count, but we can start a new thread to
discuss this.




[Index of Archives]     [Netfitler Users]     [Berkeley Packet Filter]     [LARTC]     [Bugtraq]     [Yosemite Forum]

  Powered by Linux