Collectd ping standard deviation always zero on some platforms?

On my Netgear R7800 (ipq806x) running 22.03.0, collectd-mod-ping is calculating standard deviations just fine for me. I've also installed collectd-mod-csv to gather the raw data in readable form, and configured it as an output.

Config info:

# uci get luci_statistics.collectd_ping.Hosts
xxx upstream-gateway 172.22.1.1 xxx xxx (redacted)

# uci show luci_statistics.collectd_csv
luci_statistics.collectd_csv=statistics
luci_statistics.collectd_csv.enable='1'
luci_statistics.collectd_csv.DataDir='/tmp/csv'

Sample output in the ping directory:

# cat /tmp/csv/LEDE/ping/ping_stddev-upstream-gateway-2022-09-06 
epoch,value
1662505053.126,6.100917
1662505083.132,1.341382
1662505113.127,6.602256
1662505143.133,3.148039
1662505173.126,3.891209
1662505203.128,2.903380
1662505233.127,2.632559
1662505263.134,1.571898

But on my Linksys MR8300, or my GL.iNet GL-B2200 (both ipq40xx systems):

# uci get luci_statistics.collectd_ping.Hosts
xxx upstream-gateway 192.168.1.1 xxx xxx (redacted)

# uci show luci_statistics.collectd_csv
luci_statistics.collectd_csv=statistics
luci_statistics.collectd_csv.DataDir='/tmp/csv'
luci_statistics.collectd_csv.enable='1'

# cat /tmp/csv/sueleeg/ping/ping-upstream-gateway-2022-09-06
epoch,value
1662505546.907,16.289000
1662505606.910,14.290000
1662505636.915,13.791000
# cat /tmp/csv/sueleeg/ping/ping_stddev-upstream-gateway-2022-09-06
epoch,value
1662505546.907,0.000000
1662505606.910,0.000000
1662505636.915,0.000000

Any idea why one platform calculates the stddev OK but the other one doesn't?
Anybody else have working stddev, or zero stddev, on their systems?

It's not just the host upstream-gateway: every host in the ping list reports a stddev of exactly zero.

Did you find out what's going on? I'm seeing stddev=0 on my Netgear R6230 (MediaTek).

And... how is it calculated, anyway?

Just in case someone else stumbles upon this...

collectd has a global "Data collection interval", 30 sec by default.

Separately, the ping module has an "Interval for Pings" setting, also 30 sec by default.

Now, the ping stddev is estimated from all the pings within a single "Data collection interval". If there is only one ping in that window, the stddev cannot be calculated and gets set to zero. The fix is to set the "Interval for pings" to no more than half of the "Data collection interval". That way there are at least two samples available for estimating the stddev.
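To make the arithmetic concrete, here is a minimal Python sketch of a per-interval sample standard deviation that falls back to zero when only one ping arrives. It is an illustration of the behaviour described above, not collectd's actual source; the function names `interval_stddev` and `pings_per_interval` are made up for this example.

```python
import math


def interval_stddev(latencies_ms):
    """Sample standard deviation of the pings seen in one collection interval.

    Assumption: a single sample is reported as 0.0, matching the zeros
    in the CSV output above.
    """
    n = len(latencies_ms)
    if n == 0:
        return math.nan  # no replies at all: nothing to report
    if n == 1:
        return 0.0       # one sample: no spread can be estimated
    total = sum(latencies_ms)
    squared = sum(x * x for x in latencies_ms)
    # Running-sums form of the sample variance; clamp tiny negative
    # values caused by floating-point rounding.
    variance = max(0.0, (n * squared - total * total) / (n * (n - 1)))
    return math.sqrt(variance)


def pings_per_interval(collect_interval_s, ping_interval_s):
    """Rough number of pings that land in one collection interval."""
    return int(collect_interval_s // ping_interval_s)


# Defaults: 30 s collection, 30 s ping -> one sample, stddev is zero.
print(pings_per_interval(30, 30), interval_stddev([16.289]))
# 10 s ping interval -> three samples, stddev becomes meaningful.
print(pings_per_interval(30, 10), interval_stddev([16.289, 14.290, 13.791]))
```

With both intervals at 30 sec, each window holds a single ping and the result is always zero; dropping the ping interval to 15 sec or less gives at least two samples per window.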

I'd say two things should be done. First, the default "Interval for pings" should be small enough (e.g., 10 sec) to collect enough samples for a meaningful stddev. Second, the help message next to the "Interval for pings" setting should explain that the value should be at most half of the "Data collection interval". Perhaps something like this could work: "Seconds. Should be at most half the Data collection interval to get a meaningful stddev per interval."
