Noisy Fans on Edgecore


Any idea how I can turn down the fans on this switch?



root@cumulus:~# net show ver
NCLU_VERSION=1.0
DISTRIB_ID="Cumulus Linux"
DISTRIB_RELEASE=3.6.0
DISTRIB_DESCRIPTION="Cumulus Linux 3.6.0"
root@cumulus:~# net show sys
Accton AS7712
Cumulus Linux 3.6.0
Build: Cumulus Linux 3.6.0

Chipset: Broadcom Tomahawk BCM56960
Port Config: 32 x 100G-QSFP28
CPU: (x86_64) Intel Atom C2558 2.40GHz
Uptime: 0:51:43.080000
root@cumulus:~#
usage: smonctl [-h] [-j] [-s SENSOR] [-v]
smonctl: error: argument -s/--sensor: expected one argument
root@cumulus:~# smonctl
Fan1 (Fan Tray 1, Fan A ): OK
Fan2 (Fan Tray 1, Fan B ): OK
Fan3 (Fan Tray 2, Fan A ): OK
Fan4 (Fan Tray 2, Fan B ): OK
Fan5 (Fan Tray 3, Fan A ): OK
Fan6 (Fan Tray 3, Fan B ): OK
Fan7 (Fan Tray 4, Fan A ): OK
Fan8 (Fan Tray 4, Fan B ): OK
Fan9 (Fan Tray 5, Fan A ): OK
Fan10 (Fan Tray 5, Fan B ): OK
Fan11 (Fan Tray 6, Fan A ): OK
Fan12 (Fan Tray 6, Fan B ): OK


Hello Rory,

In general we don't allow you to turn down the fans on a switch. They run at that speed because that is what's needed to keep the switch cool and avoid permanent heat-related damage. You can imagine the havoc if customers wanted a silent box and simply turned the fans off.

You can see the fan speeds by including the "-v" option on smonctl. Here are the fan speeds on a system in our lab. Of course, your speeds will be different depending on the temperature readings on your system, but this should give you an idea of what "normal" fan speeds are:

$ smonctl -v
Fan1(Fan Tray 1, Fan A): OK
fan:14900 RPM (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)
Fan2(Fan Tray 1, Fan B): OK
fan:12500 RPM (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)
Fan3(Fan Tray 2, Fan A): OK
fan:14900 RPM (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)
Fan4(Fan Tray 2, Fan B): OK
fan:12300 RPM (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)
Fan5(Fan Tray 3, Fan A): OK
fan:14800 RPM (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)
Fan6(Fan Tray 3, Fan B): OK
fan:12400 RPM (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)
Fan7(Fan Tray 4, Fan A): OK
fan:14700 RPM (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)
Fan8(Fan Tray 4, Fan B): OK
fan:12300 RPM (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)
Fan9(Fan Tray 5, Fan A): OK
fan:14800 RPM (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)
Fan10(Fan Tray 5, Fan B): OK
fan:12400 RPM (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)
Fan11(Fan Tray 6, Fan A): OK
fan:14800 RPM (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)
Fan12(Fan Tray 6, Fan B): OK
fan:12400 RPM (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)
PSU1: OK
PSU2: BAD
PSU1Temp1(PSU1 Inlet Temp Sensor): OK
temp:26.0 C (lcrit = 0 C, fan_max = 50 C, fan_min = 25 C, min = 5 C, max = 50 C, crit = 60 C)
PSU1Temp2(PSU1 Max Temp Sensor): OK
temp:34.0 C (lcrit = 0 C, fan_max = 50 C, fan_min = 25 C, min = 5 C, max = 50 C, crit = 60 C)
PSU2Temp1(PSU2 Inlet Temp Sensor): ABSENT
PSU2Temp2(PSU2 Max Temp Sensor): ABSENT
Temp1(Temp sensor behind networking asic): OK
temp:28.0 C (lcrit = 0 C, fan_max = 55 C, fan_min = 41 C, min = 5 C, max = 61 C, crit = 67 C)
Temp2(Temp sensor in front of networking asic): OK
temp:26.5 C (lcrit = 0 C, fan_max = 60 C, fan_min = 46 C, min = 5 C, max = 66 C, crit = 69 C)
Temp3(Temp sensor front left): OK
temp:25.5 C (lcrit = 0 C, fan_max = 50 C, fan_min = 32 C, min = 5 C, max = 56 C, crit = 59 C)
Temp4(Temp Sensor Near CPU): OK
temp:23.5 C (lcrit = 0 C, fan_max = 50 C, fan_min = 33 C, min = 5 C, max = 57 C, crit = 60 C)
Temp5(Intel CPU die sensor): OK
temp:22.0 C (lcrit = 0 C, fan_max = 65 C, fan_min = 58 C, min = 5 C, max = 70 C, crit = 98 C)
Temp6(Intel CPU die sensor): OK
temp:22.0 C (lcrit = 0 C, fan_max = 65 C, fan_min = 58 C, min = 5 C, max = 70 C, crit = 98 C)
Temp7(Intel CPU die sensor): OK
temp:23.0 C (lcrit = 0 C, fan_max = 65 C, fan_min = 58 C, min = 5 C, max = 70 C, crit = 98 C)
Temp8(Intel CPU die sensor): OK
temp:23.0 C (lcrit = 0 C, fan_max = 65 C, fan_min = 58 C, min = 5 C, max = 70 C, crit = 98 C)
Temp9(Networking ASIC Die Temp Sensor): OK
temp:38.9 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
Temp10(Networking ASIC Die Temp Sensor): OK
temp:38.9 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
Temp11(Networking ASIC Die Temp Sensor): OK
temp:39.9 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
Temp12(Networking ASIC Die Temp Sensor): OK
temp:38.4 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
Temp13(Networking ASIC Die Temp Sensor): OK
temp:40.4 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
Temp14(Networking ASIC Die Temp Sensor): OK
temp:42.8 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
Temp15(Networking ASIC Die Temp Sensor): OK
temp:39.4 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
Temp16(Networking ASIC Die Temp Sensor): OK
temp:38.9 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
Messages:
PSU2: status is installed, power_bad

Note that PSU2 is not plugged in on this system, which is why you see the power_bad for that supply.
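
To compare fan readings between switches, it can help to reduce the `smonctl -v` text to one figure per fan. The following is a minimal, unofficial sketch in Python that assumes the text layout shown in this thread; the name `fan_load` and the regular expression are illustrative assumptions, not part of Cumulus Linux, and the sample text is copied from the lab output above.

```python
import re

# Matches a fan entry such as:
#   Fan1(Fan Tray 1, Fan A): OK fan:14900 RPM (max = 21300 RPM, min = 12000 RPM, ...
# Whitespace (including a line break) between the fields is tolerated.
FAN_RE = re.compile(
    r"(Fan\d+)\s*\([^)]*\):\s*(\w+)\s+fan:\s*(\d+)\s*RPM\s*"
    r"\(max\s*=\s*(\d+)\s*RPM,\s*min\s*=\s*(\d+)\s*RPM"
)

def fan_load(text):
    """Return {fan: (rpm, min_rpm, max_rpm, percent_of_max)} from smonctl -v text."""
    result = {}
    for name, _state, rpm, max_rpm, min_rpm in FAN_RE.findall(text):
        rpm, max_rpm, min_rpm = int(rpm), int(max_rpm), int(min_rpm)
        result[name] = (rpm, min_rpm, max_rpm, round(100 * rpm / max_rpm, 1))
    return result

# Two entries copied from the lab output in this thread.
SAMPLE = """
Fan1(Fan Tray 1, Fan A): OK
fan:14900 RPM (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)
Fan2(Fan Tray 1, Fan B): OK
fan:12500 RPM (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)
"""

if __name__ == "__main__":
    for fan, (rpm, lo, hi, pct) in fan_load(SAMPLE).items():
        print(f"{fan}: {rpm} RPM, {pct}% of max (allowed range {lo}-{hi} RPM)")
```

On a live switch the same function could be fed the captured output of `smonctl -v`; whether the layout is identical on other releases is not something this thread confirms.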

Scott
Modifying fan speeds is not recommended and definitely not supported. Fan speeds are generally set according to the optimal temperature values specified by the hardware manufacturer, so if the fans are spinning quickly, it is because they should be.

Is your environment very warm?
What does output from 'smonctl -v' show?
Is there a particular sensor that is reading significantly higher than others?
Is this happening on all switches of this model number?

Happy to work through this with you because there are options here.
Thanks Eric. The environment is normal lab temperature. Here is the output. I realise that playing with fan configs is not normal, but next to the Edgecore I have an Arista 7060, a very similar switch in terms of hardware, and it's silent. I need to take the Edgecore rack to a customer, and I have to shout to be heard next to it.

root@cumulus:~# smonctl -v
Fan1(Fan Tray 1, Fan A): OK
fan:15000 RPM (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)
Fan2(Fan Tray 1, Fan B): OK
fan:13800 RPM (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)
Fan3(Fan Tray 2, Fan A): OK
fan:15200 RPM (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)
Fan4(Fan Tray 2, Fan B): OK
fan:13800 RPM (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)
Fan5(Fan Tray 3, Fan A): OK
fan:14700 RPM (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)
Fan6(Fan Tray 3, Fan B): OK
fan:13400 RPM (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)
Fan7(Fan Tray 4, Fan A): OK
fan:15100 RPM (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)
Fan8(Fan Tray 4, Fan B): OK
fan:13800 RPM (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)
Fan9(Fan Tray 5, Fan A): OK
fan:14600 RPM (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)
Fan10(Fan Tray 5, Fan B): OK
fan:13500 RPM (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)
Fan11(Fan Tray 6, Fan A): OK
fan:15100 RPM (max = 21300 RPM, min = 12000 RPM, limit_variance = 15%)
Fan12(Fan Tray 6, Fan B): OK
fan:13800 RPM (max = 17700 RPM, min = 10100 RPM, limit_variance = 15%)
PSU1: BAD
PSU2: OK
PSU1Temp1(PSU1 Inlet Temp Sensor): ABSENT
PSU1Temp2(PSU1 Max Temp Sensor): ABSENT
PSU2Temp1(PSU2 Inlet Temp Sensor): OK
temp:22.0 C (lcrit = 0 C, fan_max = 50 C, fan_min = 25 C, min = 5 C, max = 50 C, crit = 60 C)
PSU2Temp2(PSU2 Max Temp Sensor): OK
temp:32.0 C (lcrit = 0 C, fan_max = 50 C, fan_min = 25 C, min = 5 C, max = 50 C, crit = 60 C)
Temp1(Temp sensor behind networking asic): OK
temp:26.5 C (lcrit = 0 C, fan_max = 55 C, fan_min = 41 C, min = 5 C, max = 61 C, crit = 67 C)
Temp2(Temp sensor in front of networking asic): OK
temp:25.5 C (lcrit = 0 C, fan_max = 60 C, fan_min = 46 C, min = 5 C, max = 66 C, crit = 69 C)
Temp3(Temp sensor front left): OK
temp:23.5 C (lcrit = 0 C, fan_max = 50 C, fan_min = 32 C, min = 5 C, max = 56 C, crit = 59 C)
Temp4(Temp Sensor Near CPU): OK
temp:24.0 C (lcrit = 0 C, fan_max = 50 C, fan_min = 33 C, min = 5 C, max = 57 C, crit = 60 C)
Temp5(Intel CPU die sensor): OK
temp:23.0 C (lcrit = 0 C, fan_max = 65 C, fan_min = 58 C, min = 5 C, max = 70 C, crit = 98 C)
Temp6(Intel CPU die sensor): OK
temp:23.0 C (lcrit = 0 C, fan_max = 65 C, fan_min = 58 C, min = 5 C, max = 70 C, crit = 98 C)
Temp7(Intel CPU die sensor): OK
temp:23.0 C (lcrit = 0 C, f

Temp9(Networking ASIC Die Temp Sensor): OK
temp:36.0 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
Temp10(Networking ASIC Die Temp Sensor): OK
temp:36.9 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
Temp11(Networking ASIC Die Temp Sensor): OK
temp:37.4 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
Temp12(Networking ASIC Die Temp Sensor): OK
temp:35.5 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
Temp13(Networking ASIC Die Temp Sensor): OK
temp:35.5 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
Temp14(Networking ASIC Die Temp Sensor): OK
temp:37.9 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
Temp15(Networking ASIC Die Temp Sensor): OK
temp:37.9 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
Temp16(Networking ASIC Die Temp Sensor): OK
temp:36.0 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
Messages:
PSU1: status is installed, power_bad
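
One way to answer the question of whether any sensor stands out is to compare each reading with its own `fan_min` value. The sketch below is an unofficial Python helper that assumes the text layout posted above, and it assumes `fan_min` is the temperature at which fan speed begins to ramp up, which this thread does not confirm. The name `headroom` is hypothetical; the sample rows are copied from the output above.

```python
import re

# Matches an OK temperature entry such as:
#   Temp1(Temp sensor behind networking asic): OK
#   temp:26.5 C (lcrit = 0 C, fan_max = 55 C, fan_min = 41 C, ...
TEMP_RE = re.compile(
    r"(\w*Temp\d+)\s*\(([^)]*)\):\s*OK\s+temp:\s*([\d.]+)\s*C\s*"
    r"\(lcrit\s*=\s*[\d.]+\s*C,\s*fan_max\s*=\s*([\d.]+)\s*C,"
    r"\s*fan_min\s*=\s*([\d.]+)\s*C"
)

def headroom(text):
    """Return [(sensor, temp_c, fan_min_c - temp_c)], least headroom first.

    A negative value means the reading is already above fan_min.
    """
    rows = []
    for name, _desc, temp, _fan_max, fan_min in TEMP_RE.findall(text):
        rows.append((name, float(temp), round(float(fan_min) - float(temp), 1)))
    return sorted(rows, key=lambda row: row[2])

# Four entries copied from the output above.
SAMPLE = """
PSU2Temp1(PSU2 Inlet Temp Sensor): OK
temp:22.0 C (lcrit = 0 C, fan_max = 50 C, fan_min = 25 C, min = 5 C, max = 50 C, crit = 60 C)
PSU2Temp2(PSU2 Max Temp Sensor): OK
temp:32.0 C (lcrit = 0 C, fan_max = 50 C, fan_min = 25 C, min = 5 C, max = 50 C, crit = 60 C)
Temp1(Temp sensor behind networking asic): OK
temp:26.5 C (lcrit = 0 C, fan_max = 55 C, fan_min = 41 C, min = 5 C, max = 61 C, crit = 67 C)
Temp9(Networking ASIC Die Temp Sensor): OK
temp:36.0 C (lcrit = 0 C, fan_max = 82 C, fan_min = 67 C, min = 5 C, max = 87 C, crit = 95 C)
"""

if __name__ == "__main__":
    for sensor, temp, margin in headroom(SAMPLE):
        print(f"{sensor}: {temp} C, {margin} C below fan_min")
```

Sensors reported as ABSENT are skipped because they carry no reading. With these sample rows, PSU2Temp2 sorts first with a negative margin, since its 32.0 C reading is above its 25 C `fan_min`; whether that influences the fan speed on this platform is not established here.
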
Rory,
Temperature readings from Temp7 and Temp8 are missing in the output above, but all the rest are quite low. Some manufacturers require that the minimum fan speed be set to 50% of the maximum speed at the lowest temperatures. I'm not sure if that's the case here, but looking at these values and picking on Fan1, the maximum is 21300 RPM and the minimum is 12000 RPM, so even at minimum temperatures that fan is probably still not going to be quiet.
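
The arithmetic behind that estimate can be written out as a short Python sketch. It uses only the min/max limits and the Fan1 reading reported by `smonctl -v` in this thread; the function names are hypothetical, and it says nothing about how speed relates to noise.

```python
# (min RPM, max RPM) limits as reported by smonctl -v in this thread.
FAN_LIMITS = {
    "Fan A": (12000, 21300),  # Fan1, Fan3, Fan5, ...
    "Fan B": (10100, 17700),  # Fan2, Fan4, Fan6, ...
}

def floor_percent(min_rpm, max_rpm):
    """Minimum allowed speed expressed as a percentage of maximum speed."""
    return round(100 * min_rpm / max_rpm, 1)

def possible_reduction_percent(current_rpm, min_rpm):
    """How far the current speed could drop before reaching the minimum."""
    return round(100 * (current_rpm - min_rpm) / current_rpm, 1)

if __name__ == "__main__":
    for name, (lo, hi) in FAN_LIMITS.items():
        print(f"{name}: minimum is {floor_percent(lo, hi)}% of maximum")
    # Fan1 was reading 15000 RPM in the output posted above.
    print(f"Fan1 could slow by at most {possible_reduction_percent(15000, 12000)}%")
```

Both fan types have a floor a little above half of their maximum speed, and Fan1 at 15000 RPM is within 20% of that floor.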

I'll let Scott Emery comment further, as he is one of our platform engineers and is in a better position to provide more guidance. In the meantime, could you post the temp values from sensors 7 and 8?

Thanks!
Thanks Eric
Temp7(Intel CPU die sensor): OK
temp:24.0 C (lcrit = 0 C, fan_max = 65 C, fan_min = 58 C, min = 5 C, max = 70 C, crit = 98 C)
Temp8(Intel CPU die sensor): OK
temp:24.0 C (lcrit = 0 C, fan_max = 65 C, fan_min = 58 C, min = 5 C, max = 70 C, crit = 98 C)
Eric, Scott, any ideas?
Rory Browne wrote:

Eric, Scott, any ideas?

Sending you an e-mail.
