Re: [PATCH net-next 00/10] bridge: Allow keeping local FDB entries only on VLAN 0

On 9/4/25 20:07, Petr Machata wrote:
The bridge FDB contains one local entry per port per VLAN, for the MAC of
the port in question, and likewise for the bridge itself. This allows the
bridge to locally receive and punt "up" any packets whose destination MAC
address matches that of one of the bridge interfaces or of the bridge
itself.

The number of these local "service" FDB entries grows linearly with the number
of bridge-global VLAN memberships, but that in turn will tend to grow
quadratically with the number of ports and per-port VLAN memberships. While
that does not cause issues during forwarding lookups, it does make dumps
impractically slow.

As an example, with 100 interfaces, each on 4K VLANs, a full dump of an FDB
that contains just these 400K local entries takes 6.5s. That's _without_
considering iproute2 formatting overhead; this is just how long it takes to
walk the FDB (repeatedly), serialize it into netlink messages, and parse
the messages back in userspace.
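
A rough sketch of the arithmetic behind that figure, in Python. It assumes
"4K" means a nominal 4000 VLANs per interface and ignores the bridge's own
local entries; the function name is hypothetical and is not part of the
patchset.

```python
def local_fdb_entries(ports: int, vlans_per_port: int) -> int:
    """Count local FDB entries when each port keeps one entry per VLAN it is in."""
    return ports * vlans_per_port


# Assumption: "4K" is taken as a nominal 4000 so the result matches the quoted 400K.
total = local_fdb_entries(ports=100, vlans_per_port=4000)
print(total)  # prints 400000
```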

This is to illustrate that with a growing number of ports and VLANs, the time
required to dump this repetitive information blows up. Arguably 4K VLANs
per interface is not a very realistic configuration, but then modern
switches can instead have several hundred interfaces, and we have fielded
requests for >1K VLAN memberships per port among customers.

[snip]
All this FDB duplication is there merely to make things snappy during
forwarding. But high-radix switches with thousands of VLANs typically do
not process much traffic in the SW datapath at all, but rather offload the vast
majority of it. So we could exchange some of the runtime performance for a
neater FDB.

To that end, in this patchset, introduce a new bridge option,
BR_BOOLOPT_FDB_LOCAL_VLAN_0, which when enabled, has local FDB entries
installed only on VLAN 0, instead of duplicating them across all VLANs.
Then to maintain the local termination behavior, on FDB miss, the bridge
does a second lookup on VLAN 0.

Enabling this option changes the bridge behavior in expected ways. Since
the entries are only kept on VLAN 0, FDB get, flush and dump will not
perceive them on non-0 VLANs. And deleting the VLAN 0 entry affects
forwarding on all VLANs.
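
The behavior described above can be modeled with a toy FDB in Python. This is
only an illustrative sketch, not the kernel code: the dictionary layout, the
function names, the MAC address and the port name "swp1" are all assumptions
made for the example.

```python
from typing import Dict, List, Optional, Tuple

# Toy FDB keyed by (MAC, VLAN ID); the value names the local interface.
# This models the described behavior, not the kernel's data structures.
Fdb = Dict[Tuple[str, int], str]


def fdb_lookup(fdb: Fdb, mac: str, vid: int, local_vlan_0: bool) -> Optional[str]:
    """Look up (mac, vid); with the option enabled, retry on VLAN 0 after a miss."""
    entry = fdb.get((mac, vid))
    if entry is None and local_vlan_0 and vid != 0:
        entry = fdb.get((mac, 0))
    return entry


def fdb_dump(fdb: Fdb, vid: int) -> List[str]:
    """Return the MACs stored on exactly one VLAN, as a VLAN-filtered dump would."""
    return sorted(mac for (mac, v) in fdb if v == vid)


MAC = "00:11:22:33:44:55"          # hypothetical port MAC
fdb: Fdb = {(MAC, 0): "swp1"}      # with the option on, the entry lives on VLAN 0 only

hit_vlan_10 = fdb_lookup(fdb, MAC, 10, local_vlan_0=True)    # falls back to VLAN 0
hit_vlan_20 = fdb_lookup(fdb, MAC, 20, local_vlan_0=True)    # same single entry
no_fallback = fdb_lookup(fdb, MAC, 10, local_vlan_0=False)   # no second lookup
dump_vlan_10 = fdb_dump(fdb, 10)                             # entry not visible here

# Deleting the single VLAN 0 entry removes local termination on every VLAN.
del fdb[(MAC, 0)]
after_delete = fdb_lookup(fdb, MAC, 10, local_vlan_0=True)
```

In this model, both VLAN 10 and VLAN 20 resolve to the same VLAN 0 entry, a
dump restricted to VLAN 10 comes back empty, and after the deletion the lookup
misses on every VLAN.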

This patchset is loosely based on a privately circulated patch by Nikolay
Aleksandrov.


I knew this sounded familiar. I actually tried to upstream the original patch[1] way back
in 2015 and it was rejected; at the time, that led to the vlan rhashtable code. :-)

By the way, the original idea and change predate me and were by Wilson Kok; I just polished
them and took over the patch while at Cumulus.

Now this is presented much more cleanly, as a new option with selftests, which is great.
I think we can take the new option this time around; it will be very helpful for some
setups, as explained.

The code looks good to me, I appreciate how well split it is.
For the series:

Acked-by: Nikolay Aleksandrov <razor@xxxxxxxxxxxxx>

Thanks,
 Nik

[1] https://lore.kernel.org/netdev/1440549295-3979-1-git-send-email-razor@xxxxxxxxxxxxx/

The patchset progresses as follows:

- Patch #1 introduces a bridge option to enable the above feature. Then
   patches #2 to #5 gradually patch the bridge to do the right thing when
   the option is enabled. Finally patch #6 adds the UAPI knob and the code
   for when the feature is enabled or disabled.
- Patches #7, #8 and #9 contain fixes and improvements to selftest
   libraries
- Patch #10 contains a new selftest

The corresponding iproute2 support is at:
https://github.com/pmachata/iproute2/commits/fdb_local_vlan_0/

Petr Machata (10):
   net: bridge: Introduce BROPT_FDB_LOCAL_VLAN_0
   net: bridge: BROPT_FDB_LOCAL_VLAN_0: Look up FDB on VLAN 0 on miss
   net: bridge: BROPT_FDB_LOCAL_VLAN_0: On port changeaddr, skip per-VLAN
     FDBs
   net: bridge: BROPT_FDB_LOCAL_VLAN_0: On bridge changeaddr, skip
     per-VLAN FDBs
   net: bridge: BROPT_FDB_LOCAL_VLAN_0: Skip local FDBs on VLAN creation
   net: bridge: Introduce UAPI for BR_BOOLOPT_FDB_LOCAL_VLAN_0
   selftests: defer: Allow spaces in arguments of deferred commands
   selftests: defer: Introduce DEFER_PAUSE_ON_FAIL
   selftests: net: lib.sh: Don't defer failed commands
   selftests: forwarding: Add test for BR_BOOLOPT_FDB_LOCAL_VLAN_0

  include/uapi/linux/if_bridge.h                |   3 +
  net/bridge/br.c                               |  22 ++
  net/bridge/br_fdb.c                           | 114 +++++-
  net/bridge/br_input.c                         |   8 +
  net/bridge/br_private.h                       |   3 +
  net/bridge/br_vlan.c                          |  10 +-
  .../testing/selftests/net/forwarding/Makefile |   1 +
  .../net/forwarding/bridge_fdb_local_vlan_0.sh | 374 ++++++++++++++++++
  tools/testing/selftests/net/lib.sh            |  32 +-
  tools/testing/selftests/net/lib/sh/defer.sh   |  20 +-
  10 files changed, 559 insertions(+), 28 deletions(-)
  create mode 100755 tools/testing/selftests/net/forwarding/bridge_fdb_local_vlan_0.sh





