Traceroute & pings to IPs defined for use in mgmt VRF always use mgmt VRF routing table even when sourcing from non mgmt VRF interfaces


I've found an odd behavior and am wondering whether others are seeing it too. I've assigned NTP, DNS, SNMP, etc. to use eth0 via the mgmt VRF. When I trace or ping those IPs while sourcing from a non-mgmt VRF interface or IP, the traffic still exits via eth0 in the mgmt VRF:

cumulus@switch1:~$ cat /etc/resolv.conf
nameserver 205.206.214.249 # vrf mgmt
nameserver 209.20.8.249 # vrf mgmt

cumulus@switch1:~$ net show route vrf mgmt

show ip route vrf mgmt
=======================
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel,
> - selected route, * - FIB route

K * 0.0.0.0/0 [255/8192] unreachable (ICMP unreachable), 02w0d20h
K>* 0.0.0.0/0 [0/0] via 198.162.145.254, eth0, 02w0d20h
C>* 198.162.145.0/24 is directly connected, eth0, 02w0d20h

Mgmt VRF:
traceroute 205.206.214.249
traceroute to 205.206.214.249 (205.206.214.249), 30 hops max, 60 byte packets
1 198.162.145.254 (198.162.145.254) 0.570 ms 0.631 ms 0.731 ms
2 172.25.201.33 (172.25.201.33) 0.329 ms 0.340 ms 0.329 ms
3 172.26.31.2 (172.26.31.2) 16.363 ms 16.376 ms 16.365 ms
4 172.26.31.1 (172.26.31.1) 15.779 ms 15.768 ms 15.724 ms

cumulus@switch1:~$ net show route

show ip route
=============
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel,
> - selected route, * - FIB route

B>* 0.0.0.0/0 [200/0] via 172.26.40.66, vlan3700, 02w0d20h
C * 172.26.39.64/27 is directly connected, vlan219-v0, 02w0d20h
C>* 172.26.39.64/27 is directly connected, vlan219, 02w0d20h

Global table: wrong exit interface for the IP defined as a DNS server (expected the global default route)
traceroute -s 172.26.39.67 205.206.214.249
traceroute to 205.206.214.249 (205.206.214.249), 30 hops max, 60 byte packets
1 198.162.145.254 (198.162.145.254) 17.857 ms 17.975 ms 17.983 ms
2 172.25.201.33 (172.25.201.33) 17.644 ms 17.709 ms 17.793 ms
3 172.26.31.2 (172.26.31.2) 17.744 ms 17.780 ms 17.792 ms
4 172.26.31.1 (172.26.31.1) 17.861 ms 17.895 ms 17.850 ms

Global table: correct exit interface for an IP in the same subnet as the DNS server, using the global default route
traceroute -s 172.26.39.67 205.206.214.250
traceroute to 205.206.214.250 (205.206.214.250), 30 hops max, 60 byte packets
1 172.26.40.66 (172.26.40.66) 0.783 ms 0.754 ms 0.756 ms
2 172.26.40.60 (172.26.40.60) 18.067 ms 18.070 ms 18.060 ms
3 172.26.40.20 (172.26.40.20) 16.028 ms 16.051 ms 16.051 ms
4 172.26.31.1 (172.26.31.1) 16.053 ms 16.274 ms 16.059 ms

3 replies

Jeff, this is due to a little-documented behavior that is specific to the VRF implementation on Cumulus. Check the output of the "ip rule ls" command on your system and you'll see an override for the IP addresses of your DNS servers. These rules are created when the VRF interface corresponding to the mgmt VRF is brought up. An IP rule works somewhat like policy-based routing (PBR): it short-circuits the lookup for these destination addresses and sends it to a specific table. This can get tedious when you want DNS to use one VRF and NTP to use another but both share the same IP address. Note that this behavior exists only in the control plane and won't affect traffic moving through the switch in the dataplane. To disable it, add the following post-up lines under your mgmt VRF interface:

auto mgmt
iface mgmt
vrf-table auto
post-up ip rule del from all to 205.206.214.249 lookup mgmt
post-up ip rule del from all to 209.20.8.249 lookup mgmt
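
If there are more than a couple of DNS servers, the post-up lines above could be generated rather than typed by hand. The following is only a minimal sketch: gen_rule_dels is a hypothetical helper name (not a Cumulus tool), and it assumes each tagged nameserver line has exactly the "nameserver ADDRESS # vrf NAME" spacing shown in the resolv.conf output earlier in this thread.

```shell
#!/bin/sh
# Sketch: print one "post-up ip rule del" line for every nameserver in a
# resolv.conf-style file that is tagged with "# vrf <name>".
# gen_rule_dels is a hypothetical helper name, not part of Cumulus Linux.
gen_rule_dels() {
    file=${1:-/etc/resolv.conf}   # file to scan
    vrf=${2:-mgmt}                # VRF tag to match
    awk -v vrf="$vrf" '
        # Assumed line layout: nameserver ADDRESS # vrf NAME
        $1 == "nameserver" && $3 == "#" && $4 == "vrf" && $5 == vrf {
            printf "post-up ip rule del from all to %s lookup %s\n", $2, vrf
        }
    ' "$file"
}
# Usage: gen_rule_dels /etc/resolv.conf mgmt
```

The output is meant to be reviewed and pasted under the "iface mgmt" stanza; the sketch does not touch the interfaces file or the rule table itself.
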
Eric Pulvino wrote:

Jeff this is due to a little documented behavior of our VRF implementation on Cumulus specificall...

Thanks Eric, I'll be sure to document this in our BCP. On double-checking, this behaviour only seems to apply to the DNS IPs in the mgmt VRF. I've configured SNMP and NTP to use the mgmt VRF as well, but their addresses do not show up in the "ip rule ls" output.

That is correct. When the interface:
auto mgmt
iface mgmt
vrf-table auto
is brought up, it looks at the DNS servers configured in /etc/resolv.conf and adds the:
ip rule add from all to 205.206.214.249 lookup mgmt 
ip rule add from all to 209.20.8.249 lookup mgmt
rules to the configuration.
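
To check which destinations currently have such an override, the "ip rule ls" output can be filtered by table. This is a rough sketch only: vrf_dest_rules is a hypothetical helper name, and it assumes the rules print in the "from all to ADDRESS lookup TABLE" form used in the commands above.

```shell
#!/bin/sh
# Sketch: read "ip rule ls" style text on stdin and print the destination
# address of every rule that both matches a destination ("to ADDRESS") and
# points at the given table ("lookup TABLE").
# vrf_dest_rules is a hypothetical helper name.
vrf_dest_rules() {
    table=${1:-mgmt}
    awk -v table="$table" '
        {
            dest = ""; tbl = ""
            # Walk the fields and pick up the word after "to" and "lookup"
            for (i = 1; i < NF; i++) {
                if ($i == "to") dest = $(i + 1)
                if ($i == "lookup") tbl = $(i + 1)
            }
            if (dest != "" && tbl == table) print dest
        }
    '
}
# Usage on the switch: ip rule ls | vrf_dest_rules mgmt
```

Rules without a "to" match, such as the default lookups for the local and main tables, are skipped, so only the per-destination overrides are listed.
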
