Linux bridge features introduction
Contents
A Linux bridge behaves like a network switch. It forwards packets between interfaces that are connected to it.
The Linux bridge has added basic STP, multicast, netfilter support since the 2.4 and 2.6 kernel series. And it added more features after that. e.g.
- Config via netlink
- VLAN filter
- VxLAN tunnel mapping
- IGMPv3/MLDv2
- switchdev
- netfilter
- Others
We will introduce all the features in this article. At the end I will also show the limits of bridge and when OVS should be used.
Basic Usage
All the cmds used in this article are via iproute2, which is using netlink message to config the bridge.
There are 2 iproute2 cmds for bridge setting/configuring. ip link
and bridge
.
ip link
cmd used for add/set/remove bridge. bridge
is used to
show/manipulate bridge fdb/mdb/vlan. I will introduce the
details of these 2 cmds in the following sections.
Show ip link bridge help info
|
|
Create a bridge with name br0
|
|
Show bridge details
|
|
Show bridge details with pretty JSON format (which is a good way to get bridge key/values)
|
|
Add interfaces to bridge
|
|
STP
Linux bridge has supported STP(Spanning Tree Protocol) since 2.4 and 2.6 kernel series. STP is used to prevent a networking loop.
Note:
Linux bridge does not support RSTP.
To enable bridge STP
|
|
Here is an example that a user creating a loop in the network. If STP is not enabled, there will be a traffic storm in the network.
With STP enabled, the bridges will send BPDUs and elect a root bridge and block an interface to make the network topology loop-free.
Show bridge STP blocking state
|
|
We can see the interface veth2 is in blocking state after enabling STP
To change the STP hello time,
|
|
You can use the same way to change other STP parameters like: max age, forward delay, ageing time, etc.
VLAN filter
VLAN filter was introduced in Linux kernel 3.8. Before this feature, the user needs to create multi bridge/VLAN interfaces when separating VLAN traffic on the bridge.
e.g. with the following topo, users need to add multi bridges and VLANs to make sure VLAN traffic goes to correspond netns.
But with VLAN filter, only one bridge device is enough to set all the VLAN configs.
we can do like
|
|
Command ip link set br0 type bridge vlan_filtering 1
enables VLAN filter on
bridge br0.
Command bridge vlan add dev veth1 vid 2
makes bridge port veth1 only transmit
VLAN 2 data.
Command bridge vlan add dev veth2 vid 2 pvid untagged
make bridge port
veth2 transmit VLAN 2 data. With parameter pvid
, any untagged frames will be
assigned to this VLAN at ingress (veth2 to bridge). With parameter untagged
,
the VLAN 2 tag will be untagged on egress (bridge to veth2).
Command bridge vlan add dev veth3 vid 3 pvid untagged master
do the same thing
as previous one, the parameter master
is a default value, which means the link
setting is configured on the software bridge. You can omit this as the previous cmd.
Command bridge vlan add dev eth1 vid 2-3
enalbes VLAN 2 and VLAN 3 traffic on eth1.
To show the VLAN traffic state, we can enable VLAN stats (added in kernel 4.7) by
ip link set br0 type bridge vlan_stats_enabled 1
. But this is a global
VLAN stats on the bridge and we could not find each VLAN’s state. To enalbe
per-vlan stats, we also need to enable vlan_stats_per_port
(added in kernel 4.20)
when there is no
port vlans in the bridge. With ip link set br0 type bridge vlan_stats_per_port 1
,
and show the stats with
|
|
We can see each VLAN’s stats now.
VLAN to Tunnel mapping
As we know VxLAN is used to build layer 2 virtual networks across the underlay layer3 infrastructure. A VxLAN tunnel endpoint (VTEP) originates and terminates VxLAN tunnels. VxLAN bridging is the function provided by VTEPs to terminate VxLAN tunnels and map the VxLAN vni to traditional end host VLAN.
To achieve this, previously we need to add local ports and VxLAN netdevs into a VLAN filtering bridge. The local ports are configured as trunk ports carrying all VLANs. A VxLAN netdev per vni is added to the bridge. VLAN mapping to vni is achieved by configuring the VLAN as pvid on the corresponding VxLAN netdev. like:
Since kernel 4.11, kernel has provided a native way to support Vxlan bridging. First, add related VIDs to the interfaces (Note: vxlan0 was added with lwt to handle multi vni)
|
|
Now enable VLAN tunnel mapping on a bridge port:
|
|
Then add vlan tunnel mapping
|
|
Multicast
Linux bridge had IGMPv2/MLDv1 support since 2.6. And IGMPv3/MLDv2 support has been added in kernel 5.10.
Enable bridge multicast snooping, querier, and stats
|
|
By default, the bridge use IGMPv2/MLDv1 when snooping enabled. You can change the version like
|
|
After a port join a group, we can show the mdb(multicast data base) by
|
|
Bridge also supports setting mcast snooping and querier on a single VLAN. e.g.
|
|
Show bridge xstats (mcast RX/TX info)
|
|
There are also some other multicast parameters you can config like mcast_router, mcast_query_interval, mcast_hash_max, etc.
Bridge switchdev
Linux bridge is always used for connecting Virtual Machines with physical networks, by using tap/virtio driver. We can also attach a SRIOV VF in VM guest to get better performance. But the way SR-IOV embedded switches were dealt with in Linux was limited in its expressiveness and flexibility. And the kernel model for controlling the SR-IOV e-switch did not allow the configuration of anything beyond MAC/VLAN based forwarding.
To make VFs also get dynamic FDB, VLAN filter benefits while still keeping the best performance. Since kernel v4.9 the Linux bridge added switchdev support, which could supply L2 forwarding offloading via switchdev to HW switch like Mellanox Spectrum, DSA based switches, MLX5 CX6 Dx cards etc.
If we are in switchdev mode(bridge is up and related config enabled, e.g. MLX5_BRIDGE for MLX5 SRIOV E-Switch), we can connect the VF’s representors to the bridge, and the frames that suppose to be transmitted by the bridge, will be transmitted by hardware only. Their routing will be done in the eswitch of the NIC.
Once a frame passes through the VF to its representor, the bridge learns that the source MAC of the VF is behind a particular port, and it adds an entry with the MAC & port to its FDB. Immediately afterward it will send a message to the mlx5 driver, and the driver will add a relevant rule/line to 2 tables located in the eswitch (on the NIC). Later, if such frame (with the same destination MAC address) comes from the VF, it will not go through the kernel - it will go directly through the NIC to the appropriate port.
More info: Switchdev support for embedded switches in NICs is simple but for full featured switches like Mellanox Spectrum the offloading capabilities are much rich and supports things like LAG (team, bonding), tunneling (VxLAN etc.), routing and TC offloading. The last 2 go out of scope of bridge but LAGs can be attached to bridge as well as VxLAN tunnels and offloading of such scenario is fully supported.
bridge netfilter
By default, the traffic forwarded by bridge does not go through iptables. If users want the L2 traffic also be filtered by iptables forward rules. You can enable this option by
|
|
The same with ip6tables and arptables
Other bridge configs
bridge aging time the number of seconds a MAC address will be kept in the FDB after a packet has been received from that address. after this time has passed, entries are cleaned up. To change the timer, you can try
|
|
Bridge vs OVS
Linux bridge is very useful and popular in the past years as it supplies L2 forwarding, connects VMs and networks with VLAN/Multicast support. It’s stable, reliable, and easy for setup and config. On the other hand, it also has some limitations, like fewer tunnel supports. If you want to get easier network management, more tunnel supports (GRE, VXLAN, etc), L3 forwarding and incorporated with SDN, you can try OVS.
Author Hangbin Liu
LastMod 2022-07-25 (8b46cd9)