Solved

Static VXLAN Tunnels with MLAG


On May 23, 2018, a user named @sergej.shmalko posted the following comment to our old community while we were transitioning to this new platform:

Hi all. I didn't find any manual about how to configure Static VXLAN Tunnels with MLAG, but I want to test such topology:


Links DC1-1 - DC2-2, DC1-2 - DC3-1, DC2-1 - DC3-2 are L3
Links DC1-1 - DC1-2, DC2-2 - DC2-1, DC3-1 - DC3-2 are MLAG peer-links with bonds to S-1, S-2 and S-3 respectively
OSPF for full IP connectivity

To make MLAG work with static VXLAN I have clagd-vxlan-anycast-ip under loopback (You can see it in a picture as anycast)

Below full config of DC3-1:

root@DC3-1:~# net sh configuration int

interface lo
# The primary network interface
address 172.26.1.31/32
clagd-vxlan-anycast-ip 172.26.9.9

interface eth0
address dhcp

interface swp1

interface swp2
address 172.26.0.6/30

interface swp3

interface swp6

interface bond-leaf-3
bond-slaves swp6
bridge-access 100
clag-id 1
mtu 9216

interface bridge
bridge-ports bond-leaf-3 peerlink vni-100
bridge-vids 1 100 666
bridge-vlan-aware yes

interface peerlink
bond-slaves swp1 swp3
mtu 9216

interface peerlink.4094
address 169.254.1.9/30
clagd-backup-ip 1.1.0.32
clagd-peer-ip 169.254.1.10
clagd-priority 1000
clagd-sys-mac 44:38:39:FF:00:03

interface vlan100
address 1.1.0.31/22
vlan-id 100
vlan-raw-device bridge

interface vlan666
address 172.26.0.33/30
vlan-id 666
vlan-raw-device bridge

interface vni-100
bridge-access 100
mstpctl-bpduguard yes
mstpctl-portbpdufilter yes
vxlan-id 100
vxlan-local-tunnelip 172.26.1.31
vxlan-remoteip 172.26.7.7
vxlan-remoteip 172.26.8.8

--------------------

Below full config of DC3-2:

root@DC3-2:~# net sh conf int

interface lo
# The primary network interface
address 172.26.1.32/32
clagd-vxlan-anycast-ip 172.26.9.9

interface eth0
address dhcp

interface swp1

interface swp2
address 172.26.0.10/30

interface swp3

interface swp6

interface bond-leaf-3
bond-slaves swp6
bridge-access 100
clag-id 1
mtu 9216

interface bridge
bridge-ports bond-leaf-3 peerlink vni-100
bridge-vids 1 100 666
bridge-vlan-aware yes

interface peerlink
bond-slaves swp1 swp3
mtu 9216

interface peerlink.4094
address 169.254.1.10/30
clagd-backup-ip 1.1.0.31
clagd-peer-ip 169.254.1.9
clagd-priority 2000
clagd-sys-mac 44:38:39:FF:00:03

interface vlan100
address 1.1.0.32/22
vlan-id 100
vlan-raw-device bridge

interface vlan666
address 172.26.0.34/30
vlan-id 666
vlan-raw-device bridge

interface vni-100
bridge-access 100
mstpctl-bpduguard yes
mstpctl-portbpdufilter yes
vxlan-id 100
vxlan-local-tunnelip 172.26.1.32
vxlan-remoteip 172.26.7.7
vxlan-remoteip 172.26.8.8
-------------------------

When traffic goes through the VXLAN tunnel, the routers always use the clagd-vxlan-anycast-ip as the source IP and ignore vxlan-local-tunnelip.

For example, an IP packet is created at DC3-1 with source IP 172.26.9.9 and destination IP 172.26.8.8. According to the routing table, it should go via DC3-2 and then reach the destination 172.26.8.8.
But in this case DC3-2 drops such packets, because the source IP is the same address that DC3-2 has on its own loopback.
I found a workaround: set "1" in /proc/sys/net/ipv4/conf/all/accept_local.

I wonder whether this is a reliable workaround, and whether this is a robust topology at all, because there is no such scheme in the official docs.
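
The drop described above can be illustrated with a small model. This is only a sketch of the rule behind the accept_local setting (a receiver rejects packets whose source is one of its own addresses unless accept_local is enabled), not the kernel's actual implementation. The function name `accepts_source` and the list `DC3_2_LOCAL` are hypothetical; the addresses are the loopback and anycast addresses from the DC3-2 config in the post.

```python
from ipaddress import ip_address


def accepts_source(src, local_addresses, accept_local=False):
    """Simplified model of the check behind accept_local.

    A packet whose source address is one of the receiver's own
    addresses is rejected unless accept_local is enabled.
    """
    src_is_local = ip_address(src) in {ip_address(a) for a in local_addresses}
    return accept_local or not src_is_local


# Loopback addresses of DC3-2, taken from the config in the post
# (unicast loopback plus the shared clagd-vxlan-anycast-ip).
DC3_2_LOCAL = ["172.26.1.32", "172.26.9.9"]

# VXLAN packet from DC3-1 sourced from the shared anycast address.
print(accepts_source("172.26.9.9", DC3_2_LOCAL))                     # False
print(accepts_source("172.26.9.9", DC3_2_LOCAL, accept_local=True))  # True
# The same packet sourced from DC3-1's own loopback would not collide.
print(accepts_source("172.26.1.31", DC3_2_LOCAL))                    # True
```

In this model the collision only arises because the packet has to transit the MLAG peer, which owns the same anycast address.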

Best answer by Jeff Nielson 24 May 2018, 20:35

On May 24, 2018, @Jason Guy replied on our old community while we were transitioning to this new platform:

I see a few potential issues with this.
- First, think of the MLAG pair as a single switch. It is not a good idea to peer a routing protocol across the peerlink. If you must, be sure that the prefixes learned from the MLAG peer switch are less preferred. Generally, if there is an uplink failure, you really do not want the traffic routing laterally. The better way to do this is to dual-connect each MLAG router to the core switches and remove the peering between the MLAG switches. Anyhow, this should correct the routing plane, and you would not need the kernel hack.
- Secondly, it is best to create a VTEP per remote anycast pair. I have never statically configured multiple remote IPs on a VTEP, and I am not entirely sure how that would work.
- Finally, after all of these things are corrected, hopefully things will work. With your configurations, I think the VXLAN tunnel should be sourced from the configured local-ip and destined to the remote anycast-ip.
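
The first point can be checked against the topology in the question with a rough shortest-path sketch. It models only the six links listed in the original post, and it rests on several assumptions that are not confirmed by the thread: every routed link has cost 1, the routing adjacency between MLAG peers gets a single configurable cost, and the DC2 pair is the destination. The names `build_graph` and `best_path` are hypothetical helpers.

```python
import heapq

# Links as listed in the question.
L3_LINKS = [("DC1-1", "DC2-2"), ("DC1-2", "DC3-1"), ("DC2-1", "DC3-2")]
PEERLINKS = [("DC1-1", "DC1-2"), ("DC2-2", "DC2-1"), ("DC3-1", "DC3-2")]


def build_graph(l3_cost, peerlink_cost):
    """Build an undirected weighted graph from the two link lists."""
    graph = {}
    for links, cost in ((L3_LINKS, l3_cost), (PEERLINKS, peerlink_cost)):
        for a, b in links:
            graph.setdefault(a, {})[b] = cost
            graph.setdefault(b, {})[a] = cost
    return graph


def best_path(graph, source, targets):
    """Dijkstra: return (cost, path) to the nearest node in targets."""
    queue = [(0, [source])]
    visited = set()
    while queue:
        cost, path = heapq.heappop(queue)
        node = path[-1]
        if node in targets:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, link_cost in graph[node].items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + link_cost, path + [neighbour]))
    return None


# Compare an unpenalised peerlink (cost 1) with a heavily penalised one (cost 100).
for peerlink_cost in (1, 100):
    graph = build_graph(l3_cost=1, peerlink_cost=peerlink_cost)
    print(peerlink_cost, best_path(graph, "DC3-1", {"DC2-1", "DC2-2"}))
```

Under these assumptions the best path from DC3-1 toward the DC2 pair is DC3-1, DC3-2, DC2-1 in both cases, because the only alternative also crosses a peerlink (DC1-2 to DC1-1). In this model, making peer-learned prefixes less preferred does not remove the lateral hop in this ring, which is consistent with the advice to dual-connect each MLAG switch instead.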

2 replies

On May 24, 2018, @sergej.shmalko posted this reply:

Hi Jason,
Many thanks for the answer.
