Bonding & Bridges
- One tool per host: NetworkManager keyfiles or legacy ifcfg scripts, not both. Mixing is a source of silent config loss on reboot.
- LACP (mode=4) is the default for switch-attached servers. Use active-backup (mode=1) only when the two uplinks go to switches that cannot form an MLAG/LAG.
- Bridges for virtualisation; VLANs carry tagged traffic into them. Do not put host IP addressing on a raw physical interface that is also enslaved to a bridge.
- Always pin the bond's MAC — failover to a slave with a different MAC surprises ARP caches, especially on hypervisors.
- Never test failover in production by yanking cables. Use ip link set <slave> down; it is reversible from the remote end only if the other slave is working.
NetworkManager keyfiles vs ifcfg
Modern EL systems default to NetworkManager with keyfile storage under /etc/NetworkManager/system-connections/. Older boxes used /etc/sysconfig/network-scripts/ifcfg-* via network-scripts or NM's ifcfg-rh plugin.
| File | Format | Mode |
|---|---|---|
| /etc/NetworkManager/system-connections/<name>.nmconnection | keyfile (INI-like) | 0600, owned by root |
| /etc/sysconfig/network-scripts/ifcfg-<name> | shell-style KEY=VALUE | 0644 |
| /etc/NetworkManager/conf.d/*.conf | NM daemon config | 0644 |
Do not hand-edit a .nmconnection file without locking down its permissions. NetworkManager refuses keyfiles that are not 0600. Use nmcli connection modify, or edit the file and then run chmod 600 followed by nmcli connection reload.
nmcli connection show
nmcli device status
nmcli connection reload
nmcli connection up bond0
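As a quick audit for the permission rule above, the following sketch lists keyfiles whose mode is not exactly 0600. The function name insecure_keyfiles is made up for illustration, and it assumes a find that supports -maxdepth (GNU and BSD find both do).

```shell
# Hypothetical helper: print keyfiles whose mode is not exactly 0600.
# The directory defaults to the keyfile path named in this section.
insecure_keyfiles() {
    dir="${1:-/etc/NetworkManager/system-connections}"
    find "$dir" -maxdepth 1 -type f -name '*.nmconnection' ! -perm 600 | sort
}
```

Run it as root with no arguments on a real host. Any path it prints needs chmod 600 followed by nmcli connection reload.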
Bond modes
| Mode | Name | When to use | Switch requirement |
|---|---|---|---|
| 0 | balance-rr | Raw throughput between two hosts; uncommon in prod because of reordering | Static LAG, both links to same switch |
| 1 | active-backup | Switch pair without MLAG; simple HA | No LAG needed; both switches independent |
| 2 | balance-xor | Static hashing when LACP is not available | Static LAG (etherchannel on) |
| 3 | broadcast | Niche HA with duplicated traffic | Depends; rarely used |
| 4 | 802.3ad (LACP) | Default for servers on LACP-capable switch/MLAG | LACP on switch side |
| 5 | balance-tlb | No switch config needed; outbound load-balance, inbound on one slave | None |
| 6 | balance-alb | Like tlb but ARP-negotiates inbound balancing | None; breaks with some switches |
Three practical choices cover 95% of servers:
- mode=4 (LACP): standard for modern DC switches with MLAG/VPC/EVPN. Use xmit_hash_policy=layer3+4 for sensible per-flow balance.
- mode=1 (active-backup): two independent ToRs, no stacking, no LAG. Fewer moving parts; ~half the theoretical bandwidth, always.
- mode=6 (balance-alb): last resort when you cannot touch the switch side but need more than one link of bandwidth.
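The three choices above can be written down as a small lookup, which is handy in provisioning scripts. This is only a sketch: the function name and the keywords lacp, independent, and untouchable are invented labels, and miimon=100 is carried over from the LACP example in this document rather than being a requirement of each mode.

```shell
# Hypothetical helper: map a switch-side situation to a bond.options string.
# The keywords (lacp, independent, untouchable) are invented labels.
bond_options_for() {
    case "$1" in
        lacp)        echo "mode=802.3ad,miimon=100,xmit_hash_policy=layer3+4" ;;
        independent) echo "mode=active-backup,miimon=100" ;;
        untouchable) echo "mode=balance-alb,miimon=100" ;;
        *)           echo "unknown switch setup: $1" >&2; return 1 ;;
    esac
}
```

The output is meant to be passed to nmcli as the value of bond.options, for example bond.options "$(bond_options_for lacp)".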
LACP tuning
[connection]
id=bond0
type=bond
interface-name=bond0
[bond]
mode=802.3ad
miimon=100
lacp_rate=fast
xmit_hash_policy=layer3+4
updelay=200
downdelay=200
ad_actor_sys_prio=65535
ad_actor_system=00:00:00:00:00:01
- miimon=100: check link every 100 ms via MII. The alternative is arp_interval=; don't mix the two.
- lacp_rate=fast: sends LACPDUs every second; detects partner loss within ~3s. Must match the switch config or you get churn.
- xmit_hash_policy=layer3+4: hash on src/dst IP + port, which gives the best spread across flows. layer2 is the default but hashes only MACs, so you get all-to-one when traffic goes to a single gateway.
- updelay/downdelay: hold-off on add/remove of a slave to ride out brief link flaps.
When LACP misbehaves, check cat /proc/net/bonding/bond0 for Actor Churn State: churned as a smoking gun.
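To pull the churn state out of the bonding status file without reading the whole thing, a small awk filter is enough. The sketch below assumes the usual layout in which each slave block starts with a "Slave Interface:" line and contains an "Actor Churn State:" line; bond_churn is a made-up name.

```shell
# Hypothetical helper: print "<slave> <actor churn state>" for each slave.
# Reads /proc/net/bonding/bond0 unless another file is given.
bond_churn() {
    awk -F': ' '
        /^Slave Interface:/   { slave = $2 }
        /^Actor Churn State:/ { print slave, $2 }
    ' "${1:-/proc/net/bonding/bond0}"
}
```

Any slave reported as churned is worth comparing against the LACP settings of the switch port it is patched to.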
Creating a bond with nmcli
nmcli connection add type bond ifname bond0 con-name bond0 \
bond.options "mode=802.3ad,miimon=100,lacp_rate=fast,xmit_hash_policy=layer3+4"
nmcli connection add type ethernet ifname enp1s0f0 con-name bond0-port0 \
master bond0 slave-type bond
nmcli connection add type ethernet ifname enp1s0f1 con-name bond0-port1 \
master bond0 slave-type bond
nmcli connection modify bond0 \
ipv4.method manual ipv4.addresses 10.0.10.20/24 \
ipv4.gateway 10.0.10.1 ipv4.dns "10.0.0.2 10.0.0.3" \
ipv6.method auto \
connection.autoconnect-slaves 1
nmcli connection up bond0-port0
nmcli connection up bond0-port1
nmcli connection up bond0
cat /proc/net/bonding/bond0
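When the same bond has to be built on many hosts, it can help to generate the commands first and review them before running anything. The sketch below only prints nmcli commands modelled on the ones above; bond_plan is a hypothetical name and nothing is executed against NetworkManager.

```shell
# Hypothetical dry-run helper: print (do not run) the nmcli commands for
# an LACP bond. Usage: bond_plan <bond> <nic> [<nic> ...]
bond_plan() {
    bond="$1"; shift
    printf 'nmcli connection add type bond ifname %s con-name %s bond.options "%s"\n' \
        "$bond" "$bond" "mode=802.3ad,miimon=100,lacp_rate=fast,xmit_hash_policy=layer3+4"
    i=0
    for nic in "$@"; do
        printf 'nmcli connection add type ethernet ifname %s con-name %s-port%d master %s slave-type bond\n' \
            "$nic" "$bond" "$i" "$bond"
        i=$((i + 1))
    done
    printf 'nmcli connection up %s\n' "$bond"
}
```

For example, bond_plan bond0 enp1s0f0 enp1s0f1 prints four lines that can be reviewed and then piped to sh. IP addressing is left out on purpose and would still be set with nmcli connection modify.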
Bridges for VMs and containers
A bridge is a software switch. VMs and container veth pairs attach as bridge ports; the bridge becomes the L3 endpoint (it carries the host IP). Never put host IP on both the bridge and the physical NIC.
nmcli connection add type bridge ifname br0 con-name br0
nmcli connection modify br0 bridge.stp yes bridge.priority 32768 bridge.forward-delay 4
# Enslave a bond (or a raw interface) to the bridge
nmcli connection modify bond0 master br0 slave-type bridge
# Or a raw NIC:
# nmcli connection add type ethernet ifname enp2s0 con-name br0-port0 master br0 slave-type bridge
# IP lives on the bridge, not the uplink
nmcli connection modify br0 \
ipv4.method manual ipv4.addresses 10.0.20.10/24 \
ipv4.gateway 10.0.20.1 ipv6.method auto
nmcli connection up br0
bridge link show
bridge fdb show br br0
Leaving STP on delays port-up by a few seconds (less with forward-delay 2) but prevents you from meltdown-looping the switch if a VM misbehaves.
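The rule that the IP lives on the bridge and not on its ports can be checked mechanically. The sketch below reads ip -br addr output on stdin and prints any of the named ports that still carry an address; ports_with_ip is a made-up name, and it assumes the brief format of one interface per line with addresses starting in the third column.

```shell
# Hypothetical check: stdin is the output of `ip -br addr`,
# arguments are the bridge port names to inspect.
ports_with_ip() {
    awk -v ports="$*" '
        BEGIN { n = split(ports, p, " "); for (i = 1; i <= n; i++) want[p[i]] = 1 }
        ($1 in want) && NF >= 3 { print $1 }
    '
}
```

A typical call would be ip -br addr | ports_with_ip bond0 enp2s0. Empty output means no listed port has an address of its own.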
Legacy utilities: brctl and ip link
ip link add name br0 type bridge
ip link set br0 up
ip link set enp2s0 master br0
ip addr add 10.0.20.10/24 dev br0
brctl show # deprecated, still works
bridge link show
bridge vlan show
bridge fdb show
VLAN tagging
Most access-tier servers take tagged traffic on the uplink and terminate VLANs either directly (bond0.10) or via a bridge per VLAN.
nmcli connection add type vlan ifname bond0.10 con-name bond0.10 \
dev bond0 id 10 ipv4.method manual ipv4.addresses 10.10.10.20/24
nmcli connection add type vlan ifname bond0.20 con-name bond0.20 \
dev bond0 id 20 ipv4.method manual ipv4.addresses 10.20.20.20/24
For VM-facing bridges per VLAN:
nmcli connection add type bridge ifname br-vlan30 con-name br-vlan30
nmcli connection add type vlan ifname bond0.30 con-name br-vlan30-port \
dev bond0 id 30 master br-vlan30 slave-type bridge
nmcli connection up br-vlan30
nmcli connection up br-vlan30-port
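Two easy mistakes with VLAN subinterfaces are an ID outside 1-4094 and a name longer than the 15 characters Linux allows for an interface. The sketch below validates both and then prints the matching nmcli command without running it; vlan_plan is a hypothetical name.

```shell
# Hypothetical dry-run helper: validate a VLAN ID and interface name,
# then print the nmcli command. Usage: vlan_plan <parent> <vlan-id>
vlan_plan() {
    parent="$1"; id="$2"
    case "$id" in
        ''|*[!0-9]*) echo "VLAN ID must be a number: $id" >&2; return 1 ;;
    esac
    if [ "$id" -lt 1 ] || [ "$id" -gt 4094 ]; then
        echo "VLAN ID out of range (1-4094): $id" >&2; return 1
    fi
    name="$parent.$id"
    if [ "${#name}" -gt 15 ]; then
        echo "interface name too long (max 15 chars): $name" >&2; return 1
    fi
    echo "nmcli connection add type vlan ifname $name con-name $name dev $parent id $id"
}
```

IP settings are omitted here; append ipv4.method and ipv4.addresses as in the commands above, or attach the subinterface to a bridge.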
MTU and path MTU
An MTU mismatch manifests as "small requests work, big responses hang" — the classic PMTU blackhole. Set MTU consistently across the bond, bridges, VLANs, and the switch trunk.
nmcli connection modify bond0 802-3-ethernet.mtu 9000
nmcli connection modify bond0-port0 802-3-ethernet.mtu 9000
nmcli connection modify bond0-port1 802-3-ethernet.mtu 9000
nmcli connection modify br0 802-3-ethernet.mtu 9000
nmcli connection up bond0
ip link show bond0 | awk '/mtu/ {print $5}'
tracepath 10.0.10.1 # reports path MTU changes
ping -M do -s 8972 10.0.10.1 # 9000 - 28 header
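The "9000 - 28" in the ping command is the MTU minus the 20-byte IPv4 header and the 8-byte ICMP header. For IPv6 the fixed header is 40 bytes, so the overhead is 48. A minimal sketch of that arithmetic, with ping_payload as a made-up name:

```shell
# Hypothetical helper: largest ICMP payload that fits in one packet.
# Usage: ping_payload <mtu> [4|6]   (address family defaults to 4)
ping_payload() {
    mtu="$1"; family="${2:-4}"
    case "$family" in
        4) echo $((mtu - 28)) ;;   # 20-byte IPv4 header + 8-byte ICMP header
        6) echo $((mtu - 48)) ;;   # 40-byte IPv6 header + 8-byte ICMPv6 header
        *) echo "family must be 4 or 6" >&2; return 1 ;;
    esac
}
```

It slots into the test above as ping -M do -s "$(ping_payload 9000)" 10.0.10.1. The figures assume no IP options or extension headers.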
MAC address stability
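By default (fail_over_mac=none) a bond takes its MAC from the first slave enslaved and programs that MAC onto every slave, so the address can change between boots if the enslave order changes. With fail_over_mac=active the bond's MAC follows the active slave on failover. Either way an unpinned MAC can change under you, and the upstream switches and hypervisors may need a second to refresh their ARP/FDB tables.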
nmcli connection modify bond0 802-3-ethernet.cloned-mac-address 02:42:5a:00:00:10
# Or with bond option fail_over_mac:
# nmcli connection modify bond0 +bond.options "fail_over_mac=active"
- fail_over_mac=none: bond always uses the configured MAC; slave MACs are overwritten. Safest for most setups.
- fail_over_mac=active: bond MAC follows the active slave. Needed for some SR-IOV/hw configurations that refuse to emit frames with a MAC other than their own.
- fail_over_mac=follow: the bond keeps its MAC; backup slaves keep their own MACs, and the new active slave is programmed with the bond's MAC at failover time.
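When picking an address to pin, a locally administered unicast MAC avoids clashing with vendor-assigned addresses: in the first octet, bit 0x02 is set and bit 0x01 is clear, as in the 02:42:5a:00:00:10 example above. The check below is a sketch with a made-up function name, and it inspects only the first octet.

```shell
# Hypothetical check: succeed if the MAC is locally administered and unicast.
# Returns 2 if the first octet is not two hex digits.
is_local_unicast_mac() {
    first="${1%%:*}"
    case "$first" in
        [0-9A-Fa-f][0-9A-Fa-f]) ;;
        *) return 2 ;;
    esac
    val=$((0x$first))
    # bit 0x02 set = locally administered, bit 0x01 clear = unicast
    [ $((val & 2)) -ne 0 ] && [ $((val & 1)) -eq 0 ]
}
```

For example, is_local_unicast_mac 02:42:5a:00:00:10 succeeds, while an address starting with 01 (multicast) fails.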
Diagnostics
ip -br link
ip -br addr
ip -d link show bond0
ip -d link show br0
# Bond status (mode, slaves, partner, churn, LACP timers)
cat /proc/net/bonding/bond0
# Bridge
bridge link show
bridge fdb show br br0
bridge vlan show
# Physical layer
ethtool enp1s0f0 | grep -E 'Speed|Duplex|Link detected'
ethtool -S enp1s0f0 | grep -Ei 'errors|drops|discards'
# LLDP — what switch/port am I actually on?
lldpcli show neighbors
# or: tshark -i enp1s0f0 -c 10 ether proto 0x88cc
# Current kernel routing
ip route
ip -6 route
Failover testing procedure
A bond that has "never failed over" has never been proven to work. Test during a change window with out-of-band access (console, iDRAC, iLO) available.
- Baseline. Record cat /proc/net/bonding/bond0, ping from a peer, check LACP state on the switch.
- Fail slave 1. ip link set enp1s0f0 down. Confirm the bond reports MII Status: up on the remaining slave; pings should drop fewer than 5 packets.
- Recover slave 1. ip link set enp1s0f0 up. Confirm it re-aggregates; updelay controls how long before it becomes active.
- Fail slave 2. Same procedure. This is the important one because the host's primary MAC now moves to the other port.
- Recover slave 2. Return to baseline.
- Check that ARP entries on peers (ip neigh) still point to the host; if not, tune num_grat_arp/num_unsol_na.
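The "fewer than 5 packets" criterion in the procedure above can be checked from the summary line of a ping run on the peer. The sketch below assumes the iputils summary format ("N packets transmitted, M received, ..."); both function names are made up.

```shell
# Hypothetical helpers: read ping output on stdin.
# ping_lost prints the number of packets that got no reply.
ping_lost() {
    awk '/packets transmitted/ { print $1 - $4 }'
}

# failover_ok succeeds if fewer than <limit> packets were lost (default 5).
failover_ok() {
    lost=$(ping_lost)
    [ -n "$lost" ] && [ "$lost" -lt "${1:-5}" ]
}
```

One way to use it: start ping -i 0.2 -c 100 <host> on the peer, save the output to a file, fail the slave while it runs, then feed the file to failover_ok.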
Common gotchas
- STP storm when enslaving a raw NIC to a bridge that is also the default gateway path. STP blocks the interface briefly on up; arrange for a separate management interface or use bridge.stp=no on host-only bridges.
- Promiscuous mode: bridges force bridge ports into promiscuous mode. Don't be alarmed by PROMISC in ip link; do be alarmed by unexpected broadcast counters.
- net.bridge.bridge-nf-call-iptables: on some kernels this is on by default, subjecting bridged frames to the iptables/nftables FORWARD chain. For pure L2 bridging between VMs this is usually undesired; set it to 0 via /etc/sysctl.d/.
- Bond + VLAN ordering: the VLAN subinterface must be built on top of the bond. Building VLANs on individual slaves then bonding them is an old anti-pattern and does not work with LACP.
- NetworkManager device vs connection: nmcli connection down deactivates the connection but leaves the device eligible for autoconnect, so at the next NM event it may come back on any profile with autoconnect=true. Use nmcli device disconnect to keep the device down until you intervene manually.
- Firewalld zone on the right interface: enslaving an interface changes which zone its traffic is accounted under. If firewalld rules stop taking effect post-enslave, check firewall-cmd --get-active-zones.
- MAC learning on MLAG: a duplicate MAC across MLAG peers (because of fail_over_mac) triggers MAC-move alarms. Coordinate with the network team.
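For the bridge-nf-call-iptables item, the persistent setting is a drop-in under /etc/sysctl.d/. The sketch below writes one; the file name 90-bridge-nf.conf and the function name are arbitrary choices, and the ip6tables key is an addition not mentioned above, included on the assumption that IPv6 frames should be treated the same way.

```shell
# Hypothetical helper: write a sysctl drop-in that keeps bridged frames
# out of the iptables/ip6tables chains. Directory defaults to /etc/sysctl.d.
write_bridge_nf_dropin() {
    dir="${1:-/etc/sysctl.d}"
    cat > "$dir/90-bridge-nf.conf" <<'EOF'
# These keys exist only while the br_netfilter module is loaded.
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-ip6tables = 0
EOF
}
```

After writing the file as root, sysctl --system applies it. If br_netfilter is not loaded the keys are absent and sysctl reports them as unknown.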
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| Bond shows MII Status: up but no traffic | LACP not forming; partner state stuck at defaulted | Switch side missing LACP; mismatch in lacp_rate; wrong VPC/MLAG setup. Check /proc/net/bonding/bond0 "Partner Mac Address" |
| Throughput plateaus at one link's speed with LACP | Hash ends up on one slave (single-flow) | Set xmit_hash_policy=layer3+4; remember a single TCP flow always rides one slave |
| After reboot the bond is missing a slave | Slave connection has autoconnect=false, or controller= not set | nmcli connection modify bond0-port0 connection.autoconnect yes connection.master bond0 |
| VMs on the bridge can't talk to each other | bridge-nf-call-iptables=1 plus a blocking rule in FORWARD | sysctl net.bridge.bridge-nf-call-iptables=0, or fix the iptables FORWARD rule |
| Large pings fail but small pings work | MTU mismatch along the path | ping -M do -s N to find the edge; align MTU across bond/bridge/VLAN/switch trunk |
| Hosts on different VLANs can see each other unexpectedly | Switchport is access for one VLAN only, but Linux has tagged subif; or a bridge with no VLAN filtering | Confirm trunk config on switch; enable bridge vlan_filtering or use a dedicated bridge per VLAN |
| After failover, peers can't reach the host for ~30s | Switch/hypervisor ARP caches out of date; insufficient gratuitous ARPs | Tune num_grat_arp=3 (or more); verify with tcpdump -i bond0 arp during failover |
| NetworkManager reverts hand-edited keyfile | File perms not 0600, or NM reloaded mid-edit | chmod 600; edit with nmcli or systemctl stop NetworkManager during the swap |
| LLDP shows unexpected switch/port | Cable mis-patch, or wrong NIC in the bond | Use ethtool -p enp1s0f0 30 to blink the port LED; confirm cabling |
| Bridge MAC keeps changing | Bridge MAC defaults to lowest port MAC; ports flap | ip link set br0 address aa:bb:cc:dd:ee:01 — pin a stable MAC |
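For the "large pings fail" row, finding the edge with ping -M do -s N is a binary search over the payload size. The sketch below takes the probe as a function name so the search logic can be exercised without a network; find_max_payload and ping_probe are made-up names, and the TARGET variable and its default 10.0.10.1 are assumptions borrowed from the MTU example.

```shell
# Hypothetical helper: binary search for the largest payload that passes.
# Usage: find_max_payload <probe-fn> <known-good> <known-bad>
find_max_payload() {
    probe="$1"; lo="$2"; hi="$3"
    # Invariant: payload $lo passes, payload $hi fails.
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if "$probe" "$mid"; then lo="$mid"; else hi="$mid"; fi
    done
    echo "$lo"
}

# Example probe: one unfragmentable ping of the given payload size.
ping_probe() {
    ping -M do -c 1 -W 1 -s "$1" "${TARGET:-10.0.10.1}" >/dev/null 2>&1
}
```

A call such as find_max_payload ping_probe 0 9001 prints the largest payload that got a reply; add 28 for the IPv4 path MTU. A single lost packet can mislead the search, so repeat the run before acting on the result.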
Cross-reference
- Linux networking: interface model, routing, and ip basics.
- sysctl tuning: net/bridge and net/ipv4 knobs relevant to bonds.
- firewalld and firewalld rich rules: zone bindings shift when you enslave an interface.
- LVM: commonly paired with bonds on storage-heavy boxes.
- Containers 101: container bridges (cni0, docker0) use the same primitives.
- Wireshark: capture on the bond and the slaves to see where a packet actually egresses.