On 9/4/25 20:07, Petr Machata wrote:
The bridge FDB contains one local entry per port per VLAN, for the MAC of the port in question, and likewise for the bridge itself. This allows bridge to locally receive and punt "up" any packets whose destination MAC address matches that of one of the bridge interfaces or of the bridge itself.

The number of these local "service" FDB entries grows linearly with number of bridge-global VLAN memberships, but that in turn will tend to grow quadratically with number of ports and per-port VLAN memberships. While that does not cause issues during forwarding lookups, it does make dumps impractically slow.

As an example, with 100 interfaces, each on 4K VLANs, a full dump of FDB that just contains these 400K local entries, takes 6.5s. That's _without_ considering iproute2 formatting overhead, this is just how long it takes to walk the FDB (repeatedly), serialize it into netlink messages, and parse the messages back in userspace.

This is to illustrate that with growing number of ports and VLANs, the time required to dump this repetitive information blows up. Arguably 4K VLANs per interface is not a very realistic configuration, but then modern switches can instead have several hundred interfaces, and we have fielded requests for >1K VLAN memberships per port among customers.
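To make the arithmetic above concrete, here is a rough sketch that counts local entries as one per (port, VLAN) pair. The helper name `toy_count_local_entries` is hypothetical and not part of the bridge code, and the bridge's own entries are left out for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: sums the per-port VLAN membership
 * counts, i.e. one local FDB entry per (port, VLAN) pair. Entries for the
 * bridge device itself are deliberately ignored here.
 */
static uint64_t toy_count_local_entries(const uint16_t *vlans_per_port,
					size_t n_ports)
{
	uint64_t total = 0;

	assert(vlans_per_port != NULL || n_ports == 0);
	for (size_t i = 0; i < n_ports; i++)
		total += vlans_per_port[i];
	return total;
}
```

Taking "4K" as 4096, 100 ports with 4096 memberships each give 409,600 entries, in line with the roughly 400K quoted above.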
[snip]
All this FDB duplication is there merely to make things snappy during forwarding. But high-radix switches with thousands of VLANs typically do not process much traffic in the SW datapath at all, but rather offload vast majority of it. So we could exchange some of the runtime performance for a neater FDB.

To that end, in this patchset, introduce a new bridge option, BR_BOOLOPT_FDB_LOCAL_VLAN_0, which when enabled, has local FDB entries installed only on VLAN 0, instead of duplicating them across all VLANs. Then to maintain the local termination behavior, on FDB miss, the bridge does a second lookup on VLAN 0.

Enabling this option changes the bridge behavior in expected ways. Since the entries are only kept on VLAN 0, FDB get, flush and dump will not perceive them on non-0 VLANs. And deleting the VLAN 0 entry affects forwarding on all VLANs.

This patchset is loosely based on a privately circulated patch by Nikolay Aleksandrov.
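The "second lookup on VLAN 0" idea can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical, the table is a flat array rather than the kernel's hash table, and it does not model RCU or any flag checks the real patches may apply to the VLAN 0 hit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_FDB_MAX  16
#define TOY_ETH_ALEN 6

/* Hypothetical toy types; not the kernel's FDB structures. */
struct toy_fdb_entry {
	uint8_t mac[TOY_ETH_ALEN];
	uint16_t vid;
};

struct toy_bridge {
	struct toy_fdb_entry fdb[TOY_FDB_MAX];
	size_t n_entries;
	bool fdb_local_vlan_0;	/* stands in for the new bridge option */
};

static void toy_fdb_add(struct toy_bridge *br, const uint8_t *mac,
			uint16_t vid)
{
	assert(br->n_entries < TOY_FDB_MAX);
	memcpy(br->fdb[br->n_entries].mac, mac, TOY_ETH_ALEN);
	br->fdb[br->n_entries].vid = vid;
	br->n_entries++;
}

/* Exact-match lookup on (MAC, VLAN). */
static const struct toy_fdb_entry *
toy_fdb_find(const struct toy_bridge *br, const uint8_t *mac, uint16_t vid)
{
	for (size_t i = 0; i < br->n_entries; i++) {
		const struct toy_fdb_entry *e = &br->fdb[i];

		if (e->vid == vid && memcmp(e->mac, mac, TOY_ETH_ALEN) == 0)
			return e;
	}
	return NULL;
}

/* On a miss in the packet's VLAN, retry on VLAN 0 if the option is set. */
static const struct toy_fdb_entry *
toy_fdb_lookup(const struct toy_bridge *br, const uint8_t *mac, uint16_t vid)
{
	const struct toy_fdb_entry *e = toy_fdb_find(br, mac, vid);

	if (!e && vid != 0 && br->fdb_local_vlan_0)
		e = toy_fdb_find(br, mac, 0);
	return e;
}
```

With a port MAC installed only on VLAN 0, a lookup on VLAN 10 misses while `fdb_local_vlan_0` is false and returns the VLAN 0 entry once it is set to true. The extra lookup only happens on a miss, which is the runtime cost traded for the smaller FDB.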
I knew this sounded familiar, I actually did try to upstream the original patch[1] way back in 2015 and was rejected, at the time that led to the vlan rhashtable code. :-) By the way the original idea and change predate me and were by Wilson Kok, I just polished them and took over the patch while at Cumulus.

Now, this is presented in a much shinier new option manner with selftests which is great. I think we can take the new option this time around, it will be very helpful for some setups as explained.

The code looks good to me, I appreciate how well split it is. For the series:
Acked-by: Nikolay Aleksandrov <razor@xxxxxxxxxxxxx>

Thanks,
 Nik

[1] https://lore.kernel.org/netdev/1440549295-3979-1-git-send-email-razor@xxxxxxxxxxxxx/
The patchset progresses as follows:

- Patch #1 introduces a bridge option to enable the above feature. Then patches #2 to #5 gradually patch the bridge to do the right thing when the option is enabled. Finally patch #6 adds the UAPI knob and the code for when the feature is enabled or disabled.

- Patches #7, #8 and #9 contain fixes and improvements to selftest libraries

- Patch #10 contains a new selftest

The corresponding iproute2 support is at:
https://github.com/pmachata/iproute2/commits/fdb_local_vlan_0/

Petr Machata (10):
  net: bridge: Introduce BROPT_FDB_LOCAL_VLAN_0
  net: bridge: BROPT_FDB_LOCAL_VLAN_0: Look up FDB on VLAN 0 on miss
  net: bridge: BROPT_FDB_LOCAL_VLAN_0: On port changeaddr, skip per-VLAN FDBs
  net: bridge: BROPT_FDB_LOCAL_VLAN_0: On bridge changeaddr, skip per-VLAN FDBs
  net: bridge: BROPT_FDB_LOCAL_VLAN_0: Skip local FDBs on VLAN creation
  net: bridge: Introduce UAPI for BR_BOOLOPT_FDB_LOCAL_VLAN_0
  selftests: defer: Allow spaces in arguments of deferred commands
  selftests: defer: Introduce DEFER_PAUSE_ON_FAIL
  selftests: net: lib.sh: Don't defer failed commands
  selftests: forwarding: Add test for BR_BOOLOPT_FDB_LOCAL_VLAN_0

 include/uapi/linux/if_bridge.h                |   3 +
 net/bridge/br.c                               |  22 ++
 net/bridge/br_fdb.c                           | 114 +++++-
 net/bridge/br_input.c                         |   8 +
 net/bridge/br_private.h                       |   3 +
 net/bridge/br_vlan.c                          |  10 +-
 .../testing/selftests/net/forwarding/Makefile |   1 +
 .../net/forwarding/bridge_fdb_local_vlan_0.sh | 374 ++++++++++++++++++
 tools/testing/selftests/net/lib.sh            |  32 +-
 tools/testing/selftests/net/lib/sh/defer.sh   |  20 +-
 10 files changed, 559 insertions(+), 28 deletions(-)
 create mode 100755 tools/testing/selftests/net/forwarding/bridge_fdb_local_vlan_0.sh