Archer C7 5GHz performance, sirq 99%

Hi guys,

I am trying to wring maximum WiFi performance from my fleet of Archer C7 routers (configured as dumb APs) running OpenWrt 19.07.1 r1091. I am using 80MHz-wide channels at 5GHz, and clients connect at the highest rate: 866.7 Mbit/s, 80MHz, VHT-MCS 9, VHT-NSS 2, Short GI.

However, I am never able to get past 330Mbit in iperf3 (measured against a wired PC; I tried both directions, multiple streams, etc.).
When I run top on the router itself, I can see that sirq hovers around 99% during the transfer. Is my WiFi throughput capped by the CPU, or is there something else I can do to get maximum performance? (I am fully aware that it will be less than the 866Mbit PHY rate.)

P.S. I have just replaced ath10k-firmware-qca988x-ct with ath10k-firmware-qca988x and throughput rose from ~320Mbit to ~370Mbit with sirq still pegged @ 99%.

Are you running iperf on the Archer C7 itself, or is the Archer C7 in between two computers? (The second better represents reality, because in real life the Archer C7 does not generate traffic; it forwards it.)

Wired backbone between your fleet of devices?

What does your bandwidth look like with fast.com and speedtest.net?

This is LAN-only bandwidth measured between an iPhone 8+ running the iperf3 server and an Ethernet-connected PC running the iperf3 client (I tested the other way around as well). There is no NAT/routing involved; the C7 only acts as a dumb AP. speedtest gives roughly the same speed. I have a Gbit connection and a wired x86 router, so it is basically 802.11ac that is the bottleneck here (first-world problem, I know).

P.S.
Looking at this thread, this seems to have been the case even in early LEDE:

Perhaps there is some Archer C7-optimized build that can offload some of the CPU work even in an AP setup?

Are you running the Performance CPU governor or more aggressive ondemand CPU settings?

These are the ondemand Linux defaults (not good):


cat /sys/devices/system/cpu/cpufreq/ondemand/up_threshold
95

cat /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor
1

These are common ondemand settings tweaks:

echo 35 > /sys/devices/system/cpu/cpufreq/ondemand/up_threshold; echo 10 > /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor
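
Since not every device exposes these files, here is a minimal sketch that checks for the tunables before writing to them. The function name set_ondemand and its optional third argument (an alternative directory, handy for a dry run) are made up for illustration; only the two sysfs file names come from the commands above.

```shell
#!/bin/sh
# set_ondemand UP_THRESHOLD SAMPLING_DOWN_FACTOR [DIR]
# Hypothetical helper: writes the two ondemand tunables only if both files
# exist and are writable. DIR defaults to the sysfs path used above.
set_ondemand() {
	dir="${3:-/sys/devices/system/cpu/cpufreq/ondemand}"
	if [ ! -w "$dir/up_threshold" ] || [ ! -w "$dir/sampling_down_factor" ]; then
		# No ondemand governor on this device (or we are not root).
		echo "ondemand tunables not found in $dir" >&2
		return 1
	fi
	echo "$1" > "$dir/up_threshold"
	echo "$2" > "$dir/sampling_down_factor"
}
```

Calling set_ondemand 35 10 as root would apply the same values as the one-liner above, and it returns non-zero instead of creating stray files when the governor is not there.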

I upgraded from ancient netgear r6250’s for APs about 6 months ago (3x3 first gen 802.11ac) to r7800s when they were on sale. Gained on average 100-200mbps max wifi throughput and 200mbps wired line speed for gig wan. On sale it was worth it for me.

Maybe you can try overclocking the CPU and see if it makes a difference?

There are Breed (https://breed.hackpascal.net/) and a modified U-Boot (https://github.com/pepe2k/u-boot_mod/pull/229).

Cheers.

Unfortunately, this is an Archer C7 with a MIPS 74Kc CPU; it does not have those settings in its Linux build.

Wired line speed (using my C7 as a combined edge switch/AP) is close to 1Gbit, so that works for me. It is just 802.11ac that is pegging sirq, so I cannot fully utilize the 80MHz channel. Otherwise, everything works fine.

Interesting. I thought it was a universal Linux setting. I guess it might be in another folder, or it could be custom set.

OK, I will try to overclock one of my C7s to 1GHz and test throughput. For science! :)
BTW, looking at the benchmarks on smallnetbuilder, it seems that the actual useful data rate of most home routers on 802.11ac at 866Mbit PHY tops out around 400Mbit, so I guess it is not bad after all.

First of all, set the performance governor. You should be able to do this. The ondemand one will create latency as the cpu needs to spool up to max. This script will let you set the governor or show you how to set it, whatever you choose. Unless you're using bash the script will not work as it uses a bash-specific regular expression comparison.

#!/bin/bash

CPUS=$(grep -c ^processor /proc/cpuinfo)
AVAILABLE_GOVERNORS=$(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors)

show_governor() 
{
	echo "available governors: $AVAILABLE_GOVERNORS"
	for i in $(seq 0 $(expr $CPUS - 1)); do
		echo "cpu$i: $(cat /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor)"
	done
}


case "$1" in
	"")
		show_governor
		exit 0;;
	*)
		;;
esac

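# Pad both sides with spaces so only whole governor names match;
# a bare substring such as "on" must not pass the check.
if [[ " $AVAILABLE_GOVERNORS " =~ " $1 " ]]; then

	for i in $(seq 0 $(expr $CPUS - 1)); do
		echo "$1" > /sys/devices/system/cpu/cpu$i/cpufreq/scaling_governor
	done

else

	echo "$(basename "$0"): unknown governor $1"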
	echo "supply valid governor type or no arguments to show current available and current governors"
	exit 1

fi

Make sure you get the iperf direction right. To simulate download with the iperf server on the lan side and the client on the wlan side, the client needs the -R flag, otherwise the client sends data to the server (upload effectively) and then you're getting contention on the medium with the other clients. With the -R flag the server will send data to the wireless client and it's not contending for the medium with anything else, since the openwrt server itself schedules the download.

The sirqs are almost certainly being generated by network TX or RX. You can see by

cat /proc/softirqs

Do it before and after an iperf run, copy/paste the two outputs into an Excel sheet using the text import paste function, and check the difference to see where they're being generated. Depending on the direction of your iperf transfer, most of the softirqs will likely be in either NET_TX or NET_RX.
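
If you'd rather not leave the shell, the same before/after comparison can be sketched with awk. The function name softirq_delta and the /tmp file names in the usage note are made up for illustration; it assumes the usual /proc/softirqs layout of one "NAME:" row per softirq type with one count per CPU.

```shell
#!/bin/sh
# softirq_delta BEFORE AFTER
# Hypothetical helper: for each softirq type, prints how much its count
# (summed over all CPUs) grew between two saved copies of /proc/softirqs.
softirq_delta() {
	awk '
		# The header line ("CPU0 CPU1 ...") has no trailing colon in field 1; skip it.
		$1 !~ /:$/ { next }
		{
			total = 0
			for (i = 2; i <= NF; i++) total += $i
		}
		# First file: remember the starting count for this softirq type.
		NR == FNR { before[$1] = total; next }
		# Second file: print the increase since the first snapshot.
		{ printf "%s %d\n", $1, total - before[$1] }
	' "$1" "$2"
}
```

Usage would be something like: cat /proc/softirqs > /tmp/before, run iperf, cat /proc/softirqs > /tmp/after, then softirq_delta /tmp/before /tmp/after. The rows with the largest numbers show where the load is coming from.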

Once you've got the iperf direction flow right and you've checked the softirqs, you may get some performance enhancement if you tune the smp_affinity of the irqs on your network adapters. For example, if it is the NET_RX ones doing the interrupts, put half of the network adapter hardware irqs on one core and half on the other core to spread the load a bit - the kernel will likely do the softirq on the same core as the initial hardware irq for that flow.

You can see the hardware irqs in /proc/interrupts and you pin them to a core by echoing a hex mask to /proc/irq/<irq>/smp_affinity, where in a two core system the binary mask 11 or hex 3 means use both cores, 1 means use core 0 and 2 means use core 1. On an untuned system, you'll likely see most network card irqs cluster on core 0.
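
To make the mask arithmetic concrete, here is a small sketch that turns a list of core numbers into the hex mask. The function name cpu_mask is made up for illustration; it only does the bit arithmetic and does not touch /proc itself.

```shell
#!/bin/sh
# cpu_mask CORE...
# Hypothetical helper: sets one bit per listed core number and prints the
# result as the hex mask expected by /proc/irq/<irq>/smp_affinity.
cpu_mask() {
	mask=0
	for core in "$@"; do
		mask=$(( mask | (1 << core) ))
	done
	printf '%x\n' "$mask"
}
```

So cpu_mask 0 gives 1, cpu_mask 1 gives 2 and cpu_mask 0 1 gives 3, matching the values above. You would then echo the result into /proc/irq/<irq>/smp_affinity for whichever irq you picked from /proc/interrupts.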

Are you using sqm scripts or some kind of qos? If so, this will definitely generate a whole lot more softirqs and you should turn it off.

EDIT: you're unlikely, even in a CPU-unbounded state, to get wireless performance much above 500Mbps on a 2x2 MIMO client, no matter whether you use 80MHz channels or not....

Hi,

Most of my sirqs are NET_RX, so I am likely fully CPU-bound. No SQM is used; this is just a dumb access point.

Regarding tuning of core affinity: this is an Archer C7. It is powered by a single-core QCA9558 @ 720 MHz, so there is not much to tune. There is no "scaling_governor" in /sys/devices/system/cpu/cpu0/cpufreq/ as there is only one core.

BR

Ah, I was under the impression when looking at a spec of the product that the cpu is a dual core...

So then yes, you look to be cpu bound.

OK, I have just overclocked it to 1GHz and it hit 511 Mbit/sec in iperf3 at roughly 90% sirq. So it seems that an overclock of around 920MHz is the sweet spot where the CPU is no longer the limiting factor (at least for 2x2 866 PHY ac). Anything above that will not make WiFi go faster.

That is a 62MB/sec true wireless transfer rate on a 7-year-old router costing $40. Not bad. :)

Wow, how did you overclock it?

It was quite easy. Look here:

Also, make sure to set DDR CAS to 5, or the router refuses to boot. I had it up to 1GHz without any issues but took it down to 920MHz to protect the CPU.

Next step is to find a true 3x3 802.11ac client so I can check whether the 1300Mbit PHY is also capped by the CPU. If not, I would not need WiFi 6 for a long, long time :D

Very interesting... I used to have a C7 as my main router/WiFi; now it's doing AP duty behind an x86 router box. Back then my issue was that the CPU ran out of idle time under SQM load at around 100-140Mbit on a 300Mbit link. I thought I could do better than that without SQM, but was limited by my basic ISP speed, which would cap out at 300-350. The router easily handled that over the wire, but seemed to run into a wall at 270-280Mbit on 5GHz. I'm now seeing that again: idle 0%, sirq 95% over the 5GHz radio. BUT... I didn't see it at first since I was running at 40MHz, not 80MHz! Set to 40MHz bandwidth, I see a speed of 240-250Mbit, but still 20% idle and 80% sirq or better. Hmmm...

The interesting note is that others and I have been seeing the WiFi stall under high traffic, usually on the 2.4GHz radio. It has been hard to figure out: it seems to depend on heavy traffic, and also somewhat on the ath10k (5GHz) firmware/driver, though that didn't seem to make sense. I'm wondering if it's some kind of driver lockup provoked by running out of system resources. I'll have to pass this idea along...

Never did try the overclocking...

Yes, I experienced it too. I tried everything, and my current conclusion is that 2.4GHz on the Archer C7 is broken in OpenWrt (due to a bug in the driver blob?) and should not be used. I tried swapping out the ath10k drivers and it did not help (quite plausible, as ath10k is the 5GHz driver).

Yes, at stock clock you are CPU-bound at ~330Mbit with an 80MHz channel on 5GHz. The actual maximum seems to be ~520Mbit (using WPA2 PSK CCMP, maybe more with weaker encryption).