Speedtest: new package to measure network performance

I'm going to do some more testing tonight.

Here are some more results. For these, I stopped forwarding to dnscrypt-proxy and forwarded directly to 1.1.1.1 and 1.0.0.1. Even so, netperf refuses to work.

root@lede:/usr/bin# speedtest.sh
2018-11-11 20:46:19 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are sequential, each with 5 simultaneous streams.
.
 Download:   0.00 Mbps
  Latency: (in msec, 1 pings, 0.00% packet loss)
      Min: 35.123 
    10pct: 0.000 
   Median: 0.000 
      Avg: 35.123 
    90pct: 0.000 
      Max: 35.123
Processor: (in % busy, avg +/- stddev, -1 samples)
 Overhead: (in % total CPU used)
  netperf:  0
.
   Upload:   0.00 Mbps
  Latency: (in msec, 1 pings, 0.00% packet loss)
      Min: 16.721 
    10pct: 0.000 
   Median: 0.000 
      Avg: 16.721 
    90pct: 0.000 
      Max: 16.721
Processor: (in % busy, avg +/- stddev, -1 samples)
 Overhead: (in % total CPU used)
  netperf:  0
root@lede:/usr/bin# speedtest.sh
2018-11-11 20:46:22 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are sequential, each with 5 simultaneous streams.
.
 Download:   0.00 Mbps
  Latency: (in msec, 1 pings, 0.00% packet loss)
      Min: 19.562 
    10pct: 0.000 
   Median: 0.000 
      Avg: 19.562 
    90pct: 0.000 
      Max: 19.562
Processor: (in % busy, avg +/- stddev, -1 samples)
 Overhead: (in % total CPU used)
  netperf:  0
....................................................................................................................................
   Upload:   0.00 Mbps
  Latency: (in msec, 133 pings, 0.00% packet loss)
      Min: 12.133 
    10pct: 12.649 
   Median: 13.677 
      Avg: 15.950 
    90pct: 14.354 
      Max: 66.922
Processor: (in % busy, avg +/- stddev, 130 samples)
     cpu0:  8 +/-  9
     cpu1:  7 +/-  8
 Overhead: (in % total CPU used)
  netperf:  0

nslookup

root@lede:/usr/bin# nslookup netperf.bufferbloat.net
Server:		216.165.129.158
Address:	216.165.129.158#53

Name:      netperf.bufferbloat.net
netperf.bufferbloat.net	canonical name = netperf.richb-hanover.com
Name:      netperf.richb-hanover.com
netperf.richb-hanover.com	canonical name = atl.richb-hanover.com
Name:      atl.richb-hanover.com
Address 1: 23.226.232.80
netperf.bufferbloat.net	canonical name = netperf.richb-hanover.com
netperf.richb-hanover.com	canonical name = atl.richb-hanover.com

and

root@lede:/usr/bin# nslookup gstatic.com
Server:		216.165.129.158
Address:	216.165.129.158#53

Name:      gstatic.com
Address 1: 64.233.185.120
Address 2: 64.233.185.94
Address 3: 2607:f8b0:4002:c09::5e

netperf with debug enabled

root@lede:/usr/bin# netperf -4 -H netperf.bufferbloat.net -t TCP_STREAM -l 10 -d
resolve_host called with host 'netperf.bufferbloat.net' port '(null)' family AF_INET
getaddrinfo returned the following for host 'netperf.bufferbloat.net' port '(null)'  family AF_INET
	cannonical name: 'atl.richb-hanover.com'
	flags: 0 family: AF_INET: socktype: SOCK_STREAM protocol IPPROTO_TCP addrlen 16
	sa_family: AF_INET sadata: 0 0 23 226 232 80 0 0 0 0 0 0 0 0 0 0
scan_omni_args called with the following argument vector
netperf -4 -H netperf.bufferbloat.net -t TCP_STREAM -l 10 -d 
sizeof(omni_request_struct)=200/648
sizeof(omni_response_struct)=204/648
sizeof(omni_results_struct)=284/648
Program name: netperf
Local send alignment: 8
Local recv alignment: 8
Remote send alignment: 8
Remote recv alignment: 8
Local socket priority: -1
Remote socket priority: -1
Local socket TOS: cs0
Remote socket TOS: cs0
Report local CPU 0
Report remote CPU 0
Verbosity: 1
Debug: 1
Port: 12865
Test name: TCP_STREAM
Test bytes: 0 Test time: 10 Test trans: 0
Host name: netperf.bufferbloat.net

installing catcher for all signals
Could not install signal catcher for sig 32, errno 22
Could not install signal catcher for sig 33, errno 22
Could not install signal catcher for sig 34, errno 22
Could not install signal catcher for sig 65, errno 22
remotehost is netperf.bufferbloat.net and port 12865
resolve_host called with host 'netperf.bufferbloat.net' port '12865' family AF_INET
getaddrinfo returned the following for host 'netperf.bufferbloat.net' port '12865'  family AF_INET
	cannonical name: 'atl.richb-hanover.com'
	flags: 0 family: AF_INET: socktype: SOCK_STREAM protocol IPPROTO_TCP addrlen 16
	sa_family: AF_INET sadata: 50 65 23 226 232 80 0 0 0 0 0 0 0 0 0 0
resolve_host called with host '0.0.0.0' port '0' family AF_INET
getaddrinfo returned the following for host '0.0.0.0' port '0'  family AF_INET
	cannonical name: '0.0.0.0'
	flags: 0 family: AF_INET: socktype: SOCK_STREAM protocol IPPROTO_TCP addrlen 16
	sa_family: AF_INET sadata: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
establish_control called with host 'netperf.bufferbloat.net' port '12865' remfam AF_INET
		local '0.0.0.0' port '0' locfam AF_INET
bound control socket to 0.0.0.0 and 0
successful connection to remote netserver at netperf.bufferbloat.net and 12865
complete_addrinfo using hostname netperf.bufferbloat.net port 0 family AF_INET type SOCK_STREAM prot IPPROTO_TCP flags 0x0
getaddrinfo returned the following for host 'netperf.bufferbloat.net' port '0'  family AF_INET
	cannonical name: 'atl.richb-hanover.com'
	flags: 0 family: AF_INET: socktype: SOCK_STREAM protocol IPPROTO_TCP addrlen 16
	sa_family: AF_INET sadata: 0 0 23 226 232 80 0 0 0 0 0 0 0 0 0 0
local_data_address not set, using local_host_name of '0.0.0.0'
complete_addrinfo using hostname 0.0.0.0 port 0 family AF_UNSPEC type SOCK_STREAM prot IPPROTO_TCP flags 0x1
complete_addrinfo: could not resolve '0.0.0.0' port '0' af 0
	getaddrinfo returned -11 System error

Seems it still can't resolve the hostname.


Some results on my gigabit connection, with no SQM, running snapshot r8430:

OpenWrt SNAPSHOT r8430-4d5b0efc09 / LuCI Master (git-18.311.58259-40de466)

I also have the following in /etc/rc.local to help with CPU scaling using the ondemand governor:

echo 35 > /sys/devices/system/cpu/cpufreq/ondemand/up_threshold
echo 10 > /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor
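
To double-check those settings actually stuck after boot, something like this should work (assuming the standard sysfs layout for the ondemand governor):

# confirm the active governor on each core
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
# confirm the tuned ondemand parameters
cat /sys/devices/system/cpu/cpufreq/ondemand/up_threshold
cat /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor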

root@OpenWrt:~# speedtest.sh -H netperf-west.bufferbloat.net -p 1.1.1.1 --concurrent
2018-11-11 18:41:07 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf-west.bufferbloat.net (IPv4) while pinging 1.1.1.1.
Download and upload sessions are concurrent, each with 5 simultaneous streams.
............................................................
 Download:  85.27 Mbps
   Upload: 770.63 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:  13.970
    10pct:  25.491
   Median:  36.041
      Avg:  37.778
    90pct:  48.356
      Max:  79.930
 CPU Load: [in % busy (avg +/- std dev), 55 samples]
     cpu0:  97.4% +/-  0.9%
     cpu1:  97.4% +/-  0.0%
 Overhead: [in % total CPU used]
  netperf: 46.2%

root@OpenWrt:~# speedtest.sh -H netperf-west.bufferbloat.net -p 1.1.1.1
2018-11-11 18:47:50 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf-west.bufferbloat.net (IPv4) while pinging 1.1.1.1.
Download and upload sessions are sequential, each with 5 simultaneous streams.
............................................................
 Download: 906.98 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:  14.153
    10pct:  14.474
   Median:  15.279
      Avg:  16.254
    90pct:  18.377
      Max:  32.466
 CPU Load: [in % busy (avg +/- std dev), 58 samples]
     cpu0:  77.5% +/-  0.0%
     cpu1:  47.9% +/-  3.9%
 Overhead: [in % total CPU used]
  netperf: 49.5%
............................................................
   Upload: 815.07 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:  14.229
    10pct:  15.731
   Median:  29.835
      Avg:  33.368
    90pct:  52.642
      Max:  83.194
 CPU Load: [in % busy (avg +/- std dev), 55 samples]
     cpu0:  92.7% +/-  3.5%
     cpu1:  98.6% +/-  0.0%
 Overhead: [in % total CPU used]
  netperf: 54.4%

root@OpenWrt:~# speedtest.sh -H netperf-east.bufferbloat.net -p 1.1.1.1
2018-11-11 18:56:18 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf-east.bufferbloat.net (IPv4) while pinging 1.1.1.1.
Download and upload sessions are sequential, each with 5 simultaneous streams.
.............................................................
 Download: 493.72 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:  13.705
    10pct:  13.997
   Median:  14.227
      Avg:  14.573
    90pct:  15.825
      Max:  17.088
 CPU Load: [in % busy (avg +/- std dev), 58 samples]
     cpu0:  47.1% +/- 14.3%
     cpu1:  24.0% +/- 10.5%
 Overhead: [in % total CPU used]
  netperf: 32.5%
............................................................
   Upload: 592.40 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:  14.145
    10pct:  17.219
   Median:  36.497
      Avg:  35.704
    90pct:  50.932
      Max:  67.179
 CPU Load: [in % busy (avg +/- std dev), 57 samples]
     cpu0:  73.8% +/-  8.1%
     cpu1:  97.7% +/-  0.0%
 Overhead: [in % total CPU used]
  netperf: 24.6%

Thanks!


In the tests above I used my ISP's assigned DNS servers, but I also tried 1.1.1.1 and 1.0.0.1.

That is why you see the IP of 216.165.129.158.

I checked the box - "Use DNS servers advertised by peer"

Thanks,
David


My thanks to everyone for the continued and thoughtful feedback. After catching up following a brief absence...

That's really great follow-up! I was going to suggest running speedtest.sh off the router to take its on-router load out of the equation. I'm afraid the CAKE CPU usage doesn't come as much of a surprise, as @dtaht has been pointing out for a little while. As additional context, could you provide a ballpark for the "low CPU" and "high CPU" figures?

That's a good suggestion, and it's been bothering me since I realized there's little error checking around netperf. I'll look into making an update that does two things:

  1. Check the netperf return status and warn/abort on failure.
  2. Check elapsed time-to-complete for all netperf processes and warn/abort if too far out of bounds.

These two items should catch the "0.0" speeds as well as the "run extra long" situations.
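
Roughly what I have in mind, as a sketch only (the variable names here are placeholders, not what the script actually uses):

# run one netperf stream, then check its exit status and elapsed time
start=$(date +%s)
netperf -H "$HOST" -t TCP_STREAM -l "$DURATION" >"$TMPFILE" 2>&1
rc=$?
elapsed=$(( $(date +%s) - start ))
[ "$rc" -ne 0 ] && echo "Warning: netperf exited with status $rc" >&2
[ "$elapsed" -gt $(( DURATION + 10 )) ] && echo "Warning: netperf took ${elapsed}s, expected ~${DURATION}s" >&2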

@davidc502 Thanks for sticking with the troubleshooting. Your DNS service does seem to work, both with nslookup and with netperf itself resolving netperf.bufferbloat.net (note the correct IP 23.226.232.80 in the sadata lines of your debug output).

The problem you're seeing seems related instead to the local host name lookup. In my working netperf run, the complete_addrinfo call for the local host '0.0.0.0' resolves without error, whereas the corresponding step in your posted debug log fails:

complete_addrinfo: could not resolve '0.0.0.0' port '0' af 0
	getaddrinfo returned -11 System error

Does any of this ring a bell with respect to your DNS/resolution setup? The error code appears to be EAGAIN from errno.h. @hnyman Something you've seen before perhaps?

One further suggestion to help narrow things down. If you could try what @hnyman did, install netperf on a Linux box in your LAN, and then run the same "debug" netperf command from there:

netperf -4 -H netperf.bufferbloat.net -t TCP_STREAM -l 10 -d

If this does work, then it might point to some discrepancy between your on-router vs. LAN DNS resolution.
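
A couple of additional on-router checks that might help narrow it down (assuming dnsmasq is listening on 127.0.0.1, as in a default OpenWrt setup):

# see which resolver(s) the router itself uses for lookups
cat /etc/resolv.conf
# try the same lookup directly against the local dnsmasq instance
nslookup netperf.bufferbloat.net 127.0.0.1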

@egross Thanks for your results! Those are very interesting. Nice to see consistent speeds in the gigabit range, and also the potential variation among netperf servers (as @richb-hanover-priv also highlighted).

I notice you see aggregate throughput around 800-900 Mbps whether testing sequentially or concurrently. It's not obvious whether that's limited by your network link or by CPU exhaustion. Your CPU usage is high, but you would have "headroom" available with netperf out of the picture (i.e. normal operation). If you could run speedtest.sh from a Linux server on your LAN with netperf installed, that would provide some good additional data.

One other odd thing I noticed is that you are tweaking the CPU frequency scaling on your 2-core router, but speedtest.sh doesn't find any CPU frequency information to display from /proc/cpuinfo. Are you certain scaling is working for you? Could you post the output of your /proc/cpuinfo file for my benefit?

Thanks again everyone for the feedback and testing!

If I remember right, it would be something like
low = 10-15%,
high = 90% on one core, 15% on the other

So, cake caused one core to be rather fully utilised.

Thanks for the clarification. I honestly didn't think the CAKE overhead was that high from the little A/B testing I did in the past, but note that was done by "eyeballing" top and without the benefit of this script. I'm hoping @dtaht could share his observations...

@hnyman @egross Could you please confirm seeing CPU average frequency on your multi-core/freq. scaling boxes, since my last update to the script added this capability?

I've successfully tested on multi-core Ubuntu and single-core (no freq. scaling) OpenWrt, but don't have an OpenWrt router that does frequency scaling to check on. Thanks again!

@guidosarducci I did some testing on the Raspberry Pi 3+, first with the CPU governor set to performance and then to ondemand.

-----------------------------------------------------
 OpenWrt SNAPSHOT, r8467-dd02a19ff5
 -----------------------------------------------------
root@OpenWrt:~# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance
root@OpenWrt:~# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
1400000
root@OpenWrt:~# speedtest.sh -H netperf-east.bufferbloat.net -p 1.1.1.1 --concurrent
2018-11-18 05:10:39 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf-east.bufferbloat.net (IPv4) while pinging 1.1.1.1.
Download and upload sessions are concurrent, each with 5 simultaneous streams.
..................................................................................................................................
 Download:  34.33 Mbps
   Upload:   9.16 Mbps
  Latency: [in msec, 128 pings, 0.00% packet loss]
      Min:  13.693
    10pct:  14.644
   Median:  18.387
      Avg:  41.574
    90pct:  88.621
      Max: 350.122
 CPU Load: [in % busy (avg +/- std dev), 128 samples]
     cpu0:   1.1% +/-  1.4%
     cpu1:   1.2% +/-  1.6%
     cpu2:   1.9% +/-  2.5%
     cpu3:   1.4% +/-  1.7%
 Overhead: [in % total CPU used]
  netperf:  0.8%
root@OpenWrt:~# 
root@OpenWrt:~# speedtest.sh -H netperf-west.bufferbloat.net -p 1.1.1.1 --concurrent
2018-11-18 05:14:34 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf-west.bufferbloat.net (IPv4) while pinging 1.1.1.1.
Download and upload sessions are concurrent, each with 5 simultaneous streams.
.............................................................
 Download:  33.32 Mbps
   Upload:   9.28 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:  15.867
    10pct:  19.486
   Median:  24.562
      Avg:  33.013
    90pct:  38.205
      Max: 278.781
 CPU Load: [in % busy (avg +/- std dev), 59 samples]
     cpu0:   1.8% +/-  1.3%
     cpu1:   2.9% +/-  1.9%
     cpu2:   3.1% +/-  1.8%
     cpu3:   2.6% +/-  1.7%
 Overhead: [in % total CPU used]
  netperf:  1.5%

root@OpenWrt:~# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
ondemand
root@OpenWrt:~# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
600000
root@OpenWrt:~# speedtest.sh -H netperf-east.bufferbloat.net -p 1.1.1.1 --concurrent
2018-11-18 05:38:36 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf-east.bufferbloat.net (IPv4) while pinging 1.1.1.1.
Download and upload sessions are concurrent, each with 5 simultaneous streams.
.............................................................
 Download:  41.61 Mbps
   Upload:   9.16 Mbps
  Latency: [in msec, 60 pings, 0.00% packet loss]
      Min:  18.280
    10pct:  22.199
   Median:  32.515
      Avg:  43.444
    90pct:  79.759
      Max: 221.882
 CPU Load: [in % busy (avg +/- std dev), 58 samples]
     cpu0:   8.9% +/-  3.7%
     cpu1:   8.6% +/-  4.1%
     cpu2:   8.5% +/-  4.3%
     cpu3:   9.8% +/-  3.5%
 Overhead: [in % total CPU used]
  netperf:  5.5%
root@OpenWrt:~# speedtest.sh -H netperf-west.bufferbloat.net -p 1.1.1.1 --concurrent
2018-11-18 05:39:45 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf-west.bufferbloat.net (IPv4) while pinging 1.1.1.1.
Download and upload sessions are concurrent, each with 5 simultaneous streams.
.............................................................
 Download:  29.05 Mbps
   Upload:   9.31 Mbps
  Latency: [in msec, 62 pings, 0.00% packet loss]
      Min:  16.105
    10pct:  17.909
   Median:  24.875
      Avg:  52.382
    90pct: 116.738
      Max: 442.237
 CPU Load: [in % busy (avg +/- std dev), 59 samples]
     cpu0:   7.5% +/-  3.0%
     cpu1:   5.4% +/-  2.3%
     cpu2:   6.2% +/-  3.0%
     cpu3:   6.3% +/-  3.2%
 Overhead: [in % total CPU used]
  netperf:  4.2%

A quick update after a brief absence...

Thanks @m4r for the extra data points. The reason for the missing frequency information wasn't clear until the "raspberry pi3+" hint and a look at sample /proc/cpuinfo files online. Apparently the contents of /proc/cpuinfo aren't standardized and vary between Linux platforms, which is a common point of complaint.

In particular, Linux/arm doesn't include the CPU frequency in /proc/cpuinfo while Linux/x86_64 does. I've updated speedtest to use the sysfs interface for CPU frequency monitoring instead, which should be more robust.
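
For reference, the per-CPU sampling amounts to something like this (a simplified sketch, not the exact code in the script):

# read the current frequency of each CPU from sysfs (values are reported in kHz)
for cpu in /sys/devices/system/cpu/cpu[0-9]*; do
    f=$(cat "$cpu/cpufreq/scaling_cur_freq" 2>/dev/null) || continue
    echo "$(basename "$cpu"): $((f / 1000)) MHz"
done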

The OP has been edited with the new package link: speedtest_0.9-6_all.ipk

I'd be interested to hear if those users with multi-core (non-x86) routers now see CPU frequency information. Thanks everyone!


Sorry for the delayed response.

I do see frequency in the output now. You'll notice that I also started irqbalance first, which seems to lower the CPU load slightly (previously, in the upload test, one CPU would get maxed out):

root@OpenWrt:~# irqbalance
root@OpenWrt:~# speedtest.sh -H netperf-west.bufferbloat.net -p 1.1.1.1
2018-12-18 11:45:21 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf-west.bufferbloat.net (IPv4) while pinging 1.1.1.1.
Download and upload sessions are sequential, each with 5 simultaneous streams.
............................................................
 Download: 910.64 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:   4.924
    10pct:   5.323
   Median:   6.689
      Avg:   6.769
    90pct:   7.466
      Max:  18.762
 CPU Load: [in % busy (avg +/- std dev), 57 samples]
     cpu0:  77.7% +/-  9.6%  @ 1725 MHz
     cpu1:  59.7% +/- 10.0%  @ 1725 MHz
 Overhead: [in % used of total CPU available]
  netperf: 49.2%
...........................................................
   Upload: 815.01 Mbps
  Latency: [in msec, 60 pings, 0.00% packet loss]
      Min:   5.199
    10pct:  10.478
   Median:  21.578
      Avg:  27.282
    90pct:  51.101
      Max:  75.928
 CPU Load: [in % busy (avg +/- std dev), 52 samples]
     cpu0:  91.3% +/-  0.0%  @ 1725 MHz
     cpu1:  95.0% +/-  0.0%  @ 1725 MHz
 Overhead: [in % used of total CPU available]
  netperf: 39.2%

Concurrent test:

root@OpenWrt:~# speedtest.sh -H netperf-west.bufferbloat.net -p 1.1.1.1 --concurrent
2018-12-18 11:49:30 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf-west.bufferbloat.net (IPv4) while pinging 1.1.1.1.
Download and upload sessions are concurrent, each with 5 simultaneous streams.
............................................................
 Download: 136.61 Mbps
   Upload: 672.10 Mbps
  Latency: [in msec, 60 pings, 0.00% packet loss]
      Min:   5.317
    10pct:  12.724
   Median:  22.947
      Avg:  24.738
    90pct:  39.668
      Max:  54.965
 CPU Load: [in % busy (avg +/- std dev), 55 samples]
     cpu0:  89.0% +/-  0.0%  @ 1725 MHz
     cpu1:  97.1% +/-  1.9%  @ 1725 MHz
 Overhead: [in % used of total CPU available]
  netperf: 42.8%

It shows up this way on an ipq806x R7800:

root@router1:/tmp# ./speedtest.sh
2018-12-18 21:56:55 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are sequential, each with 5 simultaneous streams.
.............................................................
 Download:  80.41 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:  11.478
    10pct:  11.572
   Median:  11.838
      Avg:  11.901
    90pct:  12.298
      Max:  12.686
 CPU Load: [in % busy (avg +/- std dev), 57 samples]
     cpu0:  35.3% +/- 17.8%  @ 1283 MHz
     cpu1:  35.9% +/- 19.3%  @ 1260 MHz
 Overhead: [in % used of total CPU available]
  netperf: 21.0%
.............................................................
   Upload:   9.19 Mbps
  Latency: [in msec, 62 pings, 0.00% packet loss]
      Min:  11.489
    10pct:  11.617
   Median:  12.308
      Avg:  12.382
    90pct:  13.118
      Max:  14.770
 CPU Load: [in % busy (avg +/- std dev), 58 samples]
     cpu0:  16.1% +/- 13.4%  @  967 MHz
     cpu1:  14.9% +/- 14.9%  @  866 MHz
 Overhead: [in % used of total CPU available]
  netperf:  2.5%
root@router1:/tmp# ./speedtest.sh -c
2018-12-18 21:59:09 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are concurrent, each with 5 simultaneous streams.
.............................................................
 Download:  80.40 Mbps
   Upload:   8.18 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:  11.409
    10pct:  11.758
   Median:  12.505
      Avg:  12.708
    90pct:  13.526
      Max:  17.203
 CPU Load: [in % busy (avg +/- std dev), 57 samples]
     cpu0:  38.4% +/- 19.2%  @ 1294 MHz
     cpu1:  41.2% +/- 15.6%  @ 1297 MHz
 Overhead: [in % used of total CPU available]
  netperf: 20.8%

Hi! I've done some very basic benchmarks to figure out whether an Asus AC68U (ARM dual core @ 800 MHz) has enough CPU to handle SQM on a 100/100 Mbit connection. The aim is to use VLANs on the router as well, but I forgot to try that during the tests.

The SQM settings in OpenWrt (18.06.1) were set to 100 000 for both upload and download, with the queueing discipline set to cake and the piece_of_cake setup script.

The test setup consists of a local netperf server on my LAN, which the Asus AC68U was connected to via its WAN port. I'm not sure whether limiting the throughput with SQM parameters on the router gives results as realistic as applying the limits further upstream.
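
For reference, the server side is nothing more than netserver listening on that box, roughly as follows (assuming the netperf package is installed there; 12865 is its default control port):

# on the machine acting as the netperf server
netserver -4 -p 12865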

Finally, I installed the speedtest package from: https://github.com/guidosarducci/packages/tree/master-add-speedtest/net/speedtest/files

I then ran this command at different CPU frequencies: speedtest.sh -H -p 1.1.1.1 --concurrent

Below are test results:

CPU: 400 MHz
Memory: 800 MHz

Download: 93.20 Mbps
Upload: 93.96 Mbps
Latency: [in msec, 60 pings, 0.00% packet loss]
Min: 8.280
10pct: 8.966
Median: 10.038
Avg: 10.225
90pct: 11.475
Max: 16.441
CPU Load: [in % busy (avg +/- std dev), 55 samples]
cpu0: 96.2% +/- 2.0%
cpu1: 99.7% +/- 0.2%
Overhead: [in % used of total CPU available]
netperf: 69.5%

CPU: 1200 MHz
Memory: 800 MHz

Download:  93.02 Mbps
   Upload:  93.27 Mbps
  Latency: [in msec, 59 pings, 0.00% packet loss]
      Min:   8.360
    10pct:   8.597
   Median:   9.043
      Avg:   9.281
    90pct:  10.396
      Max:  11.201
 CPU Load: [in % busy (avg +/- std dev), 56 samples]
     cpu0:  72.4% +/-  4.1%
     cpu1:  96.3% +/-  2.0%
 Overhead: [in % used of total CPU available]
  netperf: 35.2%

CPU: 1400 MHz
Memory: 800 MHz

Download:  91.16 Mbps
   Upload:  93.19 Mbps
  Latency: [in msec, 60 pings, 0.00% packet loss]
      Min:   8.197
    10pct:   8.349
   Median:   8.856
      Avg:   9.079
    90pct:   9.600
      Max:  16.938
 CPU Load: [in % busy (avg +/- std dev), 57 samples]
     cpu0:  66.2% +/-  6.5%
     cpu1:  88.3% +/-  3.1%
 Overhead: [in % used of total CPU available]
  netperf: 37.3%

Is the setup of the test correct? If so, what conclusions can I draw from the results?

Thanks! Erik


I would say the tests show that shaping 100/100 works on your router even when clocked down to 400 MHz, assuming the frequency stayed constant during the test (standard deviations would be nice to have for the frequency, as already implemented for the load). But you probably should not run heavy applications like netperf at 400 MHz: with load in the high nineties you basically have no reserve cycles left for, say, the wifi driver.
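
Something along these lines could produce that from periodic sysfs samples (just a sketch using awk, not what the script does today):

# sample cpu0's frequency once per second for 60 s, then print mean and std dev in MHz
i=0
while [ $i -lt 60 ]; do
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
    sleep 1
    i=$((i + 1))
done | awk '{ s += $1; ss += $1 * $1; n++ }
     END { m = s / n; printf "avg %.0f +/- %.0f MHz\n", m / 1000, sqrt(ss / n - m * m) / 1000 }'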

Thanks for the feedback, moeller0! 1200 MHz CPU + 800 MHz memory seems to be a good balance, avoiding additional cooling while staying below 90% load on most cores.

Wifi is not a problem because I don't plan to use it.

I also targeted a netperf server outside of my LAN, and have added two samples to the post.

Local netperf server:

Download: 93.02 Mbps
Upload: 93.27 Mbps
Latency: [in msec, 59 pings, 0.00% packet loss]
Min: 8.360
10pct: 8.597
Median: 9.043
Avg: 9.281
90pct: 10.396
Max: 11.201
CPU Load: [in % busy (avg +/- std dev), 56 samples]
cpu0: 72.4% +/- 4.1%
cpu1: 96.3% +/- 2.0%
Overhead: [in % used of total CPU available]
netperf: 35.2%

netperf-eu.bufferbloat.net sample #1:

Download: 86.95 Mbps
Upload: 93.61 Mbps
Latency: [in msec, 60 pings, 0.00% packet loss]
Min: 8.264
10pct: 8.405
Median: 8.834
Avg: 8.952
90pct: 9.562
Max: 10.668
CPU Load: [in % busy (avg +/- std dev), 56 samples]
cpu0: 80.0% +/- 5.4%
cpu1: 86.4% +/- 5.6%
Overhead: [in % used of total CPU available]
netperf: 49.0%

netperf-eu.bufferbloat.net sample #2:

Download: 89.26 Mbps
Upload: 93.50 Mbps
Latency: [in msec, 60 pings, 0.00% packet loss]
Min: 8.431
10pct: 8.602
Median: 9.231
Avg: 9.554
90pct: 10.538
Max: 14.159
CPU Load: [in % busy (avg +/- std dev), 56 samples]
cpu0: 82.2% +/- 3.9%
cpu1: 89.8% +/- 4.3%
Overhead: [in % used of total CPU available]
netperf: 51.6%

Using the local netperf server, the CPU load is less balanced between the two cores, and netperf used significantly less CPU (35% vs 50%).

The results still suggest it's possible to get a rough indication of where the CPU limit lies for a given router.

EDIT #1: fixed typo


The AC68U runs pretty cool and I see no real difference in temperature in regular use between different frequencies. Did you notice overheating at 1400 MHz? I played around with these too, and my temps always stay around 61 C in regular use. Also, I can see SQM working fine for my 150/10 connection with default clocks. Bumping it up to 1400 MHz hasn't really shown any improvement in anything other than synthetic compute tests.

Reading the SNB forums, some people mentioned issues with high temperatures, but I was surprised to find the temperature of my AC68U rev A2 staying around 65 degrees.

Running it at 1400 MHz I noticed barely any change in temperature at all, but if I remember correctly the router got slightly less stable, rebooting once in a while. Yet, feeling it with my hand, the router didn't get any warmer. I tried adding a 120 mm fan to cool the router; I saw no big change in the reported temperature, but the router did stabilize. So apparently some component got cooler, even though the router never felt hot nor reported high temperatures at all, with or without the fan. That makes me question the accuracy of using /sys/class/thermal/thermal_zone0/temp to get temperatures, or whether the stability was affected by something else.
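
For what it's worth, the value in that file is normally in millidegrees Celsius, so reading it out looks something like this (whether the sensor itself is trustworthy on this SoC is another question):

# read the SoC temperature in degrees C (thermal sysfs normally reports millidegrees)
awk '{ printf "%.1f C\n", $1 / 1000 }' /sys/class/thermal/thermal_zone0/temp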

Then I totally lost my mind and went directly for 1600 MHz. The router still booted but got into a boot loop after some time. I managed to restore settings using the Asus Recovery Utility after a period of cool-down.

I tend to spend way too much time chasing marginal benefits that probably aren't very noticeable in real-world usage. All these rabbit holes are just too tempting! :slight_smile:

Yes, I don't think any frequency above 1400 MHz is supported. I tried increasing it too, and had a similar experience. Another factor is that, since there is no wifi support with LEDE, those parts presumably aren't even powered on, which also helps keep the overall temperature in check. The other AC68U that I'm using as an AP with the Merlin fork shows a higher temperature of 72 C for the CPU at 1200 MHz. Also, this is actually a pretty good cheap setup for under $100 with two AC68Us. I imagine it will handle SQM fine up to 300 Mbps.

Almost. One issue the grand-average CPU load hides is that the CPU might be cyclically overloaded: at 50% average load over a 10-second test, the CPU might be running at 100% for 5 seconds and sitting idle for the other 5, in which case you should expect bad shaping performance. Having the average load is still much better than having no measure of load at all, but it is not temporally fine-grained enough to show "micro-stalls".

Hope everyone enjoyed the holidays and all the best to you in the new year!

The code has been fairly stable and I've only found a couple of minor issues, for which I've updated the package version and link in the top post. I also added some guidance on installing and running speedtest.sh from a LAN-connected Linux server, which can be useful if your router is heavily CPU-bound when running the script.

@gechu Erik, well done posting your detailed test results! I didn't realize the AC68U was so capable (reaching ~180 Mbps aggregate) although it's a shame the wireless chipset is Broadcom. I see them on sale cheap occasionally but never looked too closely. Do you normally run it at fixed frequencies or was that only for testing? Is there a problem with the Linux frequency scaling driver perhaps? According to WikiDevi the default CPU speed is 800MHz so that might be safer (if true) than 1200MHz. You've got me curious about this device now...

Glad to see you tried putting your own upstream netperf server on the WAN side. This definitely gives you more flexibility for testing. If you haven't already, you might also try using a "downstream" server/VM connected to your router LAN port and running both netperf and speedtest.sh (see notes in top post). This would let you test the limits of forwarding/SQM on your router without any on-board test overhead.

From the on-board tests you did however, you are right to think you're within your normal CPU limits, given the netperf overhead.

BTW, on OpenWrt you're safer installing the .ipk package from my repo, since that will take care of dependencies, permissions and upgrades.
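
That is, roughly (using the current package filename from the top post):

# copy speedtest_0.9-6_all.ipk to the router (e.g. with scp to /tmp), then:
opkg update
opkg install /tmp/speedtest_0.9-6_all.ipk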

I was thinking the same, or just use half the budget for a cheap wireless AP with great wifi (e.g. Atheros) but not-so-great NAT/routing/SQM.

Take care!


Anytime! Happy to share!

When setting the CPU frequency, I think it only sets the upper cap, and the frequency is still allowed to scale up and down (hence not fixed).
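
On a generic Linux cpufreq setup that cap would correspond to something like the following, though I'm not sure the AC68U exposes it exactly this way:

# cap the maximum frequency at 1200 MHz; the governor can still scale below it
echo 1200000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq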

800 MHz is the default speed for the revision of the router I own. Asus released a model with a trailing "P", the AC68P, which runs at 1400 MHz by default.

I experienced no issues running the CPU at 1200 MHz, which is a crazy 50% above the default frequency. I even clocked the memory from the default 666 MHz up to 800 MHz.

Thanks for the hint on a better way to install the speedtest package.

@guidosarducci thanks for your contribution and keep up the good work!
