I think I figured out what goes wrong with the CPU utilization stats:
The stats do not take into account the CPU frequency scaling that is available on some targets, like ipq806x for the R7800. Apparently they only report utilisation relative to whatever frequency the CPU is running at when it is sampled.
Below is proof: first a run with the default "ondemand" CPU scaling governor, where the CPUs idle at 384 MHz and can scale up to 1700 MHz under load. Then a run with the "performance" governor, where the CPUs are always at 1700 MHz.
On the first run the measured CPU load is 41% / 43% (cpu0 / cpu1), while on the second run the reading is just 10% / 14%, which better reflects the true utilisation of the CPU's power. The speeds and latency are practically identical on both runs, so the true CPU load should also be identical.
This will be visible at least on ipq806x routers, and likely also on x86 (and on mvebu if you use the scaling driver patch).
root@router1:~# echo ondemand > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
root@router1:~# echo ondemand > /sys/devices/system/cpu/cpufreq/policy1/scaling_governor
root@router1:~# /tmp/speed.sh -c -H netperf-eu.bufferbloat.net
2018-11-07 22:05:16 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf-eu.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are concurrent, each with 5 simultaneous streams.
............................................................
Download: 98.23 Mbps
Upload: 8.34 Mbps
Latency: (in msec, 57 pings, 0.00% packet loss)
Min: 36.952
10pct: 39.048
Median: 46.507
Avg: 45.769
90pct: 49.395
Max: 51.590
Processor: (in % busy, avg +/- stddev, 58 samples)
cpu0: 41 +/- 4
cpu1: 43 +/- 2
Overhead: (in % total CPU used)
netperf: 48
root@router1:~# echo performance > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
root@router1:~# echo performance > /sys/devices/system/cpu/cpufreq/policy1/scaling_governor
root@router1:~# /tmp/speed.sh -c -H netperf-eu.bufferbloat.net
2018-11-07 22:06:28 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf-eu.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are concurrent, each with 5 simultaneous streams.
.............................................................
Download: 98.23 Mbps
Upload: 8.32 Mbps
Latency: (in msec, 60 pings, 0.00% packet loss)
Min: 10.915
10pct: 40.384
Median: 45.772
Avg: 44.483
90pct: 48.308
Max: 49.213
Processor: (in % busy, avg +/- stddev, 59 samples)
cpu0: 10 +/- 3
cpu1: 14 +/- 3
Overhead: (in % total CPU used)
netperf: 33
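To illustrate the point, here is a rough sketch of how a "% busy" reading could be rescaled to full-speed capacity by multiplying it by the ratio of current to maximum frequency. The helper names scale_busy and busy_at_max_for_policy are made up for this example and are not part of the speedtest script. The sysfs files scaling_cur_freq and cpuinfo_max_freq are the standard cpufreq attributes (values in kHz). A single frequency sample is only an approximation, since the governor can change frequency many times within one measurement interval.

```sh
#!/bin/sh
# Sketch only: scale a "% busy" reading by the ratio of current to maximum
# CPU frequency. Both helpers are hypothetical, for illustration.

# scale_busy BUSY_PCT CUR_KHZ MAX_KHZ
# Prints BUSY_PCT expressed as a percentage of full-speed capacity,
# rounded to the nearest integer.
scale_busy() {
    busy=$1 cur=$2 max=$3
    echo $(( (busy * cur + max / 2) / max ))
}

# busy_at_max_for_policy BUSY_PCT POLICY_DIR
# Same calculation, reading the frequencies (in kHz) from a cpufreq policy
# directory such as /sys/devices/system/cpu/cpufreq/policy0.
busy_at_max_for_policy() {
    cur=$(cat "$2/scaling_cur_freq")
    max=$(cat "$2/cpuinfo_max_freq")
    scale_busy "$1" "$cur" "$max"
}

# Made-up numbers: 40% busy while clocked at 800 MHz, maximum 1700 MHz.
scale_busy 40 800000 1700000   # prints 19
```

On the router, something like `busy_at_max_for_policy 41 /sys/devices/system/cpu/cpufreq/policy0` would apply the same scaling using the live frequency, assuming the frequency is sampled at about the same moment as the busy percentage.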
CPU speed stats: