Support for RTL838x based managed switches

I thought I'd have a look at fan control on the HP / HPE 1920-24g-poe-370w (JG926A). The fan speed changes a few times during boot - U-Boot seems to set it high. With the stock firmware, when the kernel loads, the fan speed briefly switches to low before going high again. With no (or light) PoE load at normal room temperature, the fan speed goes low again a few minutes after boot.

My working hypothesis is that fan speed is controlled by some combination of:

  • Has a fan failed (low/zero rpm reading)? If so, command fan speed to max (confirmed by manually stopping one fan - this is consistent with my experience of the fan behaviour on other HP switches in the past).
  • One or more temperature sensors (unconfirmed).
  • PoE power load (because this would lead to the generation of more heat within the power supply - also unconfirmed).
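The hypothesis above can be written down as a small decision function. This is only a sketch of my guess, not anything recovered from the stock firmware: the names `choose_fan_speed`, `temp_limit` and `poe_limit_w` are made up for illustration, and the limits are left to the caller because I have no idea what the real thresholds are. Only the fan-failure branch is confirmed behaviour.

```python
from enum import Enum


class FanSpeed(Enum):
    LOW = "low"
    HIGH = "high"


def choose_fan_speed(fan_failed, temp=None, temp_limit=None,
                     poe_load_w=None, poe_limit_w=None):
    """Hypothetical fan policy; not taken from the stock firmware.

    fan_failed: True if any fan is stalled (the one confirmed input).
    temp / temp_limit: a temperature reading and its limit (unconfirmed input).
    poe_load_w / poe_limit_w: PoE load and its limit in watts (unconfirmed input).
    """
    # Confirmed by stopping a fan by hand: a failed fan forces high speed.
    if fan_failed:
        return FanSpeed.HIGH
    # Unconfirmed: a hot sensor might also force high speed.
    if temp is not None and temp_limit is not None and temp >= temp_limit:
        return FanSpeed.HIGH
    # Unconfirmed: a heavy PoE load might also force high speed.
    if (poe_load_w is not None and poe_limit_w is not None
            and poe_load_w >= poe_limit_w):
        return FanSpeed.HIGH
    return FanSpeed.LOW
```

Note this ignores the timing behaviour (the delay before dropping back to low speed), which would need some state on top.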

Searching for "Fan" in the SNMP output turns up one signal:

hh3cDevMFanStatus.1 ... when all fans are running freely this returns active(1). When one or more has failed it returns deactive(2).
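For anyone wanting to poll this from a script, here is a minimal sketch that pulls the status out of one line of snmpwalk text. The exact line format is an assumption on my part (it depends on whether the MIB is loaded, so the sketch accepts both `active(1)` and a bare `1`), and `parse_fan_status` is just a helper name I made up.

```python
import re

# Value meanings as observed above: active(1) = all fans fine, deactive(2) = failure.
FAN_STATUS = {1: "active", 2: "deactive"}

# Assumed line shapes (not copied from real output):
#   ...hh3cDevMFanStatus.1 = INTEGER: active(1)
#   ...hh3cDevMFanStatus.1 = INTEGER: 1
_PATTERN = re.compile(
    r"hh3cDevMFanStatus\.(\d+)\s*=\s*INTEGER:\s*(?:\w+\()?(\d+)\)?"
)


def parse_fan_status(line):
    """Return (index, status_name) for a fan status line, or None if no match."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    index = int(match.group(1))
    value = int(match.group(2))
    return index, FAN_STATUS.get(value, f"unknown({value})")


if __name__ == "__main__":
    print(parse_fan_status("hh3cDevMFanStatus.1 = INTEGER: active(1)"))
    print(parse_fan_status("hh3cDevMFanStatus.1 = INTEGER: 2"))
```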

As it happens, I already knew this hardware signal existed because I had spotted it in OpenWrt when mechanically disabling one of the fans and running a command to print GPIO values. There is just one GPIO input, which toggles when at least one of the three fans is stalled. I'd guess that there is a fan control IC (or maybe a microcontroller) in there, and it has a 'failed' GPIO output. I haven't opened the case on one of these to check what ICs are present.

With the stock firmware, if there are no fan failures for some time period (half an hour or so), then the fans return to low speed.

I thought I'd search for any onboard temperature sensor in the output of snmpwalk (from the net-snmp package). The firmware returns multiple temperature SNMP data points, but they are all either 0 or 65536. I'd guess that these are all bogus and that the same SNMP server code is shared with higher end 3Com and 3Com-derived H3C/HP/HPE hardware which did have such sensors.

display diagnostic-information is more promising. The section debug poe port-power 1 includes per-port temperature values. On my hardware these look a bit weird. My best guess is that the PoE hardware is based on 3x blocks of 8 ports (based on power to logical port numbering - see previous post). I'd guess that each has an ADC, attached to a set of thermistors or similar. In a switch which has been on for a while (I measured the case temperature at 22 Celsius, room temperature 21 C), the port "Temp" column values are:

ports 01 to 08 -> 12, 12, 12, 12, 12, 12, 12, 12
ports 09 to 16 -> 19, 19, 19, 19, 18, 20, 20, 20
ports 17 to 24 -> 08, 08, 08, 08, 09, 08, 08, 08

No units are stated in the output. These could be uncalibrated absolute temperature values, headroom values, or anything else.
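To make the block pattern easier to see, here is a quick sketch that groups the "Temp" values listed above into runs of 8 ports and prints min, max and mean for each. The block size of 8 is my guess from the port numbering, and `summarise_blocks` is just an illustrative helper.

```python
from statistics import mean

# "Temp" column values for ports 1..24, copied from the listing above.
temps = [
    12, 12, 12, 12, 12, 12, 12, 12,   # ports 01 to 08
    19, 19, 19, 19, 18, 20, 20, 20,   # ports 09 to 16
    8, 8, 8, 8, 9, 8, 8, 8,           # ports 17 to 24
]


def summarise_blocks(values, block_size=8):
    """Split per-port values into consecutive blocks and summarise each one."""
    summary = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        summary.append({
            "ports": (start + 1, start + len(block)),  # 1-based, inclusive
            "min": min(block),
            "max": max(block),
            "mean": mean(block),
        })
    return summary


if __name__ == "__main__":
    for row in summarise_blocks(temps):
        first, last = row["ports"]
        print(f"ports {first:02d}-{last:02d}: "
              f"min={row['min']} max={row['max']} mean={row['mean']}")
```

Within each block the values differ by at most 2, while the blocks differ from each other by much more, which fits the idea of one sensor (or one ADC) per group of 8 ports rather than a true per-port reading.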

I also noticed, in the display diagnostic-information output (and also the SNMP output), an item in the process list called FanT. It might be some sort of fan speed control daemon, but I haven't investigated that further, or checked whether it's absent on switches without fans/PoE.

So, some progress. I was hoping for a temperature sensor which I could easily create a devicetree entry for, and then get the kernel to control fan speed, but it doesn't look like that's (easily) possible...
