How to verify that ECN is working

  • 24 August 2017
  • 4 replies

Userlevel 1
I'm currently trying to find a way to prove that the ECN mechanism (https://docs.cumulusnetworks.com/display/DOCS/Buffer+and+Queue+Management#BufferandQueueManagement-e...) is actually working.

I have 4 servers, all connected to one switch with 10GE ports. I have enabled ECN on all servers and, for the relevant ports, on the switch itself. Now I want to make sure that ECN is actually doing something. So I emulated an interface overload: I started 3 iperf TCP sessions between one server and the 3 other servers, so that 3 servers are sending towards one server.

I expect that the switch port facing the receiving server will eventually become congested and the switch will start marking IP packets with the '11 - Congestion Encountered' bits. But I don't see any such packets on the server (running tcpdump like this: tcpdump -i ens4f0 '(ip[1] & 3 == 3)' ).
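For reference, the filter above looks at the two low-order bits of the second byte of the IPv4 header, which is the ECN field defined in RFC 3168. A minimal sketch of that decoding in shell follows; the function name ecn_codepoint is made up for illustration and is not part of tcpdump or Cumulus Linux.

```shell
# Hypothetical helper: map a TOS byte (decimal) to its ECN codepoint name.
# The ECN field is the two low-order bits of the byte, per RFC 3168.
ecn_codepoint() {
    case $(( $1 & 3 )) in
        0) echo "Not-ECT" ;;   # 00: sender does not support ECN
        1) echo "ECT(1)" ;;    # 01: ECN-capable transport
        2) echo "ECT(0)" ;;    # 10: ECN-capable transport
        3) echo "CE" ;;        # 11: Congestion Encountered
    esac
}

ecn_codepoint 3    # prints: CE
ecn_codepoint 2    # prints: ECT(0)
```

So a match on '== 3' means a device on the path marked the packet, while a match on '== 2' only means the sender declared itself ECN-capable.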

I even reduced the ECN threshold on the switch to 1000 bytes (ecn.ecn_port_group.max_threshold_bytes = 1000), with no visible result.

Could anybody advise how to actually emulate congestion and see the result of ECN working?

Thank you in advance

Sergei.

4 replies

Userlevel 5
If I'm not mistaken, there are also interface counters available via ethtool for ECN.
Userlevel 1
The last thing to check is whether switchd was restarted. The command is
systemctl restart switchd.service
You can then check the log file:
grep -R ECN /var/log/switchd.log
If it has been enabled in hardware, then I'd recommend you open a case and ask for it to be assigned to me, since I am familiar with your troubleshooting so far.
Userlevel 1
Mark, thank you for the comments.

The config looks to be aligned with the guide:
ecn.port_group_list = [ecn_port_group]
ecn.ecn_port_group.cos_list = [3]
ecn.ecn_port_group.port_set = swp35-swp45
ecn.ecn_port_group.min_threshold_bytes = 1000
ecn.ecn_port_group.max_threshold_bytes = 1000
ecn.ecn_port_group.probability = 100

We use a T3048-LY8, which I believe is based on Trident II, so we should be OK here as well.

With tcpdump I have also tried matching on '10' (tcpdump -i ens4f0 '(ip[1] & 3 == 2)'), and I see many packets arriving. That is just the servers signalling that they support ECN.

Please comment on my assumptions here. If they are correct, then I will go ahead and open the case.
Userlevel 1
Sergei,

Since you are already aware of the config guide, I will assume the config is correct here. Next, I would verify that switchd was restarted and check that your platform supports ECN with Cumulus Linux (ECN is supported on Broadcom Tomahawk, Trident II+ and Trident II, and Mellanox Spectrum switches only).

Your tcpdump filter looks OK to me, but could you also check whether it matches on each of the values 0-3? We know you checked 3, but let's see whether it matches a different value, which would further confirm that ECN is not being set by the switch.
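One way to run that check, as a rough sketch: the loop below only prints one tcpdump command per ECN value so each can be run in turn while the iperf traffic is flowing. The interface name ens4f0 comes from the original post, the helper name ecn_filter is made up for illustration, and the packet count of 20 is an arbitrary choice.

```shell
# Interface from the original post; override with IFACE=... if yours differs.
IFACE="${IFACE:-ens4f0}"

# Hypothetical helper: build the capture filter for one ECN value (0-3).
ecn_filter() {
    printf 'ip and (ip[1] & 3 == %d)' "$1"
}

# Print one tcpdump command per ECN value (dry run, nothing is captured here).
for value in 0 1 2 3; do
    printf "tcpdump -n -c 20 -i %s '%s'\n" "$IFACE" "$(ecn_filter "$value")"
done
```

If only the commands for 0 and 2 ever capture packets under load, the switch is not setting the Congestion Encountered value.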

If all of this has already been verified, I'd ask you to open a Support case with us. Please provide a cl-support file captured while the congestion is present.
