tcp adjust-mss


Userlevel 1
Does cumulus linux have a command to adjust mss in the tcp syn/ack packets?

9 replies

Userlevel 3
Hi Vikram,

The problem is that none of the ASICs support this functionality: if the MSS were rewritten, the TCP checksum would have to be recalculated in the ASIC.

Typically through the network core, it is best practice to ensure the transit MTU is sufficient to handle the maximum end-user packet size plus any overhead. End hosts should set their MTU so it does not exceed the intended transit (host-to-host) packet size. For example, regardless of the connected end hosts, set all core-facing uplinks to 9216 and all edge-facing interfaces to 9000. The hosts can still be set to 1500, and then the network-added overhead will not matter.
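
As a quick sketch of that layout with NCLU (the swp names here are placeholders for your actual ports):

# Core-facing uplink
net add interface swp49 mtu 9216
# Edge/host-facing port
net add interface swp1 mtu 9000
net commit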
Userlevel 1
Hello Jason,
We have run into this issue: SSH connections are not working because of MTU issues, with the inter-DC tunnel MTU auto-set to 1422. Going back to your iptables example, can I reference the L3VNI interface in the FORWARD rule instead of eth0?

ex: "-A FORWARD -o l3vni_customer1 -p tcp -m tcp --tcp-flags SYN,RST SYN -c 0 0 -j TCPMSS --set-mss 1380 "

Userlevel 3
Hi Vikram,

Back in the mid-2000s, I worked on a lot of IPsec VPNs, and as I recall, the device originating the GRE tunnel would typically configure the MSS adjust to handle the smallest MTU expected in transit. Once the encrypted packet enters the EVPN network, the TCP headers will be encrypted, so adjusting the MSS there is not useful if you can't see the TCP packets. 🙂
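
On a Linux box originating the tunnel, the equivalent is a TCPMSS rule on the tunnel interface; as a sketch (gre1 is an assumed interface name), clamping to the path MTU avoids hard-coding a value:

# gre1 is an assumed tunnel interface name; clamp forwarded SYNs to the discovered path MTU
iptables -t mangle -A FORWARD -o gre1 -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu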

I still stand by the idea that with 9216 as a max MTU, there is no reason to have constraints in the network core. I like to set 9216 on all core links and 9000 on edge interfaces. If the link to the firewall will have an extra 50+ bytes of overhead, make the firewall interfaces 9100, and the VTEP interfaces can be 9150. Just plan it out and you should be fine.
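
A quick sanity check on that budget (overhead figures are approximate): a 9000-byte edge packet plus ~50 bytes of overhead at the firewall is ~9050, which fits the 9100 firewall links; add the 50-byte VXLAN encap and you are at ~9100, which fits the 9150 VTEP interfaces, and everything stays under the 9216 core.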
Userlevel 1
Hello Jason,

We are in the process of setting up GRE over IPsec between a pair of firewalls and enabling the EVPN address family (for VXLAN routing) between the Cumulus nodes behind the firewalls. We are trying to figure out the best way to adjust the MSS when the traffic gets routed through the L3VNI.
Userlevel 3
It all depends on the packet rate. The bigger issue is that each packet punted from the ASIC must transit the PCI bus twice (to the CPU, then back to the ASIC), which limits the data stream to less than 400 Mbps. 😞 It is better to handle this at the network edge and use PMTUD, or statically set the host MTU to 1400.
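
If you go the static route, it is a one-liner on the host (eth0 is a placeholder for the host NIC; make it persistent in the host's network config):

# Pin the host NIC MTU below the smallest transit MTU
ip link set dev eth0 mtu 1400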
Userlevel 1
Thanks Jason. How much of an impact does fragmentation have on Cumulus Linux, with VXLAN encap adding an additional 50 bytes?
Userlevel 3
I know Cisco routers can do the TCP MSS adjust on the constrained interface. Generally this is used in tunneling solutions (GRE, IPsec, etc.). As I understand it, this just modifies the MSS value advertised at the start of the TCP connection. We cannot do this currently in Cumulus Linux, but I am not sure whether that is due to an ASIC limitation or a lack of customers requesting it.

For fun, I looked up how this is done in Linux. Using a mangle table rule to modify the TCP handshake is precisely how you would do it on a Linux host:
iptables -t mangle -A FORWARD -o eth0 -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1360
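
You can verify the rule is matching new connections by watching its packet counters:

iptables -t mangle -L FORWARD -v -n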
I will look to see if there is a feature request in the system for this, and file one if it does not yet exist.
Userlevel 1
Hello Eric, we want to avoid fragmentation of packets/flows between DCs. Assume the data center interconnect is through a pair of non-EVPN devices (e.g., Cisco ASR 1k), with the interconnect supporting an MTU of 1500.
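
Rough math, since the exact overhead varies with the IPsec cipher and mode: 1500 (interconnect) - 24 (GRE: outer IP + GRE header) - roughly 50-60 (ESP) leaves a tunnel MTU of about 1420, and the 50-byte VXLAN encap plus the inner headers all have to fit inside that.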
Userlevel 5
Not as far as I'm aware. Could you tell me a bit more about the use case? Thanks!
