Why the switch to unstable ath10k-ct?

Uptime 11 days, new crash (not resulting in a reboot):

http://sprunge.us/DOlZ7s

Happy New Year, Ben!

So, I took the time to test the ath10k-ct firmware and driver against their upstream counterparts. The platform I used is qca988x-based (TP-Link Archer C7 v2).

First things first, the gap that I saw in throughput measurements before (last time I did this was more than a year ago) is gone. The performance is very similar now. Both drivers/firmwares allow for more than 400 Mbit/s throughput between the access point and a 2x2 ac wireless client (a 2014 or 2015 MacBook Air). I tested throughput using iperf with repeated measurements and averaged the results. The server was connected by wire to the same switch as the Archer C7 v2 (which runs in access point mode).

I did notice a few things though:

  1. With the upstream firmware and driver, there's not much difference between the client being the sender or receiver of data. The results are all between about 410-430 Mbit/s (with UDP transfers more towards the upper boundary and TCP more towards the lower). This is not the case with the ct driver and firmware. Here performance is slightly lower when the client is receiving data (380-400 Mbit/s). This is nothing I'd be worried about at all, but the difference seems to be consistent in both TCP and UDP measurements.

  2. I did test the combinations ath10k-ct driver with upstream firmware and ath10k-upstream driver with ct firmware as well, albeit not as long. While the combination ath10k-ct driver and upstream firmware worked fine, the upstream driver didn't seem to like the ct firmware. In fact, it was horrible. Packet loss was extremely high and the throughput only 8-10 Mbit/s!

A few more notes on the testing I did: Testing was done at night, so there wouldn't be as much wifi traffic in the neighborhood. The MacBook was placed about 2-2.5 m away from the access point (without obstructions) and was not moved during testing. The commands used for testing were:

iperf3 -c <server-ip>   # TCP client to server
iperf3 -R -c <server-ip>   # TCP server to client
iperf3 -u -b 867M -c <server-ip>   # UDP client to server
iperf3 -R -u -b 867M -c <server-ip>   # UDP server to client

I'm not an iperf expert, but I chose 867 Mbit/s here as it is the theoretical maximum bandwidth of the 2x2 client and then looked at the actual throughput taking the packet loss into account.
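
For anyone who wants to repeat the averaging and the "throughput minus packet loss" arithmetic, here is a minimal sketch. The function names avg_mbps and effective_mbps are made up for illustration, and multiplying the rate by (1 - loss) is only my assumption of how the loss was factored in; iperf3 also reports a receiver-side bitrate that can be read directly.

```shell
# Average a list of throughput readings (Mbit/s) passed as arguments.
avg_mbps() {
    printf '%s\n' "$@" |
        LC_ALL=C awk '{ sum += $1; n++ } END { if (n) printf "%.1f\n", sum / n }'
}

# Rough goodput estimate for a UDP run (assumed formula, not iperf3's own):
# $1 = sender-side rate in Mbit/s, $2 = reported packet loss in percent.
effective_mbps() {
    LC_ALL=C awk -v rate="$1" -v loss="$2" \
        'BEGIN { printf "%.1f\n", rate * (1 - loss / 100) }'
}
```

Usage would be something like avg_mbps followed by the per-run numbers copied from the iperf3 summaries, one argument per run.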

I used recent builds from the master branch (ar71xx) for testing.

escalade: I found your crash, but in general, please open a bug at ath10k-ct instead of just posting to forums. I don't reliably read forums.

silentcreek: For your performance results, do you mean that the download speed is less than upload when using ct firmware?

For now, I plan to ignore the issue of stock driver not liking the ct firmware. I can debug it later in case there is ever a real need to run that test case.

@hnyman

How do you avoid them being selected by default again by a defconfig? Which then results in this:

Collected errors:
 * check_data_file_clashes: Package ath10k-firmware-qca99x0-ct wants to install file /media/MyBook/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_glibc_eabi/linux-ipq806x/target-dir-f7e20ea3/lib/firmware/ath10k/QCA99X0/hw2.0/board-2.bin
	But that file is already provided by package  * ath10k-firmware-qca99x0

Yes, downloading (iperf with the -R switch on the client) was slightly slower than uploading on the client. Not as much as I would care about it, but I found it interesting as it showed a pattern both with TCP and UDP. However, it didn't occur when I coupled the upstream firmware with the ct driver, so I assume it's related to the firmware itself.

If you run make menuconfig, deselect the ct driver and firmware package and select the original ath10k driver and firmware package, you will end up with these lines somewhere in your configuration:

CONFIG_PACKAGE_ath10k-firmware-qca988x=y
# CONFIG_PACKAGE_ath10k-firmware-qca988x-ct is not set
[...]
CONFIG_PACKAGE_kmod-ath10k=y
# CONFIG_PACKAGE_kmod-ath10k-ct is not set
# CONFIG_PACKAGE_kmod-hwmon-core is not set

If you run ./scripts/diffconfig.sh > my_diffconfig, these will be preserved.
Note: At least on my platform the ath10k-ct driver depends on the kmod-hwmon-core package, while the default driver does not (nor does any of my other packages), so it's deselected as well. Depending on your hardware/configuration, you may still need the kmod-hwmon-core package.
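
As a quick sanity check before building, something like the following sketch can flag a configuration where both the stock and the ct firmware package are built in (=y), which is the situation that produces the check_data_file_clashes error. The function name find_firmware_clashes is hypothetical; it only looks at the CONFIG_PACKAGE_ath10k-firmware-* lines in the file you pass it and deliberately ignores packages built as modules (=m).

```shell
# Report ath10k firmware packages selected in both stock and -ct variants.
# $1: path to an OpenWrt .config or diffconfig file.
find_firmware_clashes() {
    cfg="$1"
    # Collect every "-ct" firmware package that is built in (=y) and
    # strip the "-ct" suffix to get the name of its stock counterpart.
    sed -n 's/^CONFIG_PACKAGE_\(ath10k-firmware-[a-z0-9]*\)-ct=y$/\1/p' "$cfg" |
    while read -r stock; do
        # A clash exists when the stock counterpart is built in as well.
        if grep -q "^CONFIG_PACKAGE_${stock}=y\$" "$cfg"; then
            echo "clash: ${stock} and ${stock}-ct are both selected"
        fi
    done
}
```

Run it as find_firmware_clashes .config from the top of the build tree; no output means no built-in pair was found.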

In addition, you can also choose to build one of the firmware packages as a module that does not get included in the firmware image. That should avoid the error as well and gives you the option to manually install the other firmware package (using the opkg --force-overwrite option), so you can test the two firmwares against each other. In this case I would make sure to make a backup of the file /lib/firmware/ath10k/QCA99X0/hw2.0/board-2.bin though, as it will get deleted once you uninstall either of the two firmware packages.
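
The backup step could look roughly like this. It is only a sketch: backup_board_file is a made-up helper, and the default backup directory /root/fw-backup is an assumption, so pick whatever location survives on your device.

```shell
# Copy a firmware board file aside before swapping firmware packages.
# $1: file to protect, $2: backup directory (assumed default: /root/fw-backup).
backup_board_file() {
    src="$1"
    dest_dir="${2:-/root/fw-backup}"
    # Refuse to continue if the file is not there, so a typo is noticed early.
    [ -f "$src" ] || { echo "missing: $src" >&2; return 1; }
    mkdir -p "$dest_dir" && cp -p "$src" "$dest_dir/"
}
```

On the router you would call it with /lib/firmware/ath10k/QCA99X0/hw2.0/board-2.bin as the first argument before installing or removing either firmware package.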

Escalade: I opened a bug for your crash file, see link below. It has a binary you can test:

silentcreek: I opened a bug to track your 988x slowdown report.

At this point, I don't plan to work on it, but good to have the bug open in case others see something similar or I get interested in debugging it more in the future.

Hello everyone, I have a UBNT AP AC Lite with OpenWrt 18.06.1.
I want to use the ath10k-ct driver, but as a newbie I don't know how to switch to it. The package description for the CT firmware says: "This firmware will NOT be used unless the standard ath10k-firmware-qca988x is un-selected"

How does one "un-select" a firmware?

Do I just install the "ath10k-firmware-qca988x-ct" package and uninstall the "ath10k-firmware-qca988x" package and reboot in order to "un-select"?

Is there a guide for this? I tried all the google-fu I could to find answers.

Deselecting primarily refers to building OpenWrt from source (there, make menuconfig (kconfig) gives you an ncurses text GUI to select/deselect options). If you use a prebuilt image, like 18.06.1, you do indeed use opkg to remove the ath10k kmods and firmware and then install their ath10k-ct variants instead.

@slh

Thanks for the help! To confirm, I ran the opkg commands below. It seems to work, but it is unstable in dumb AP mode: https://openwrt.org/docs/guide-user/network/wifi/dumbap

Unstable as in: once I start a speed test on my phone, it suddenly disconnects and the web GUI freezes. After waiting a while, the web GUI comes back up.

However when my UBNT AP AC Lite is in router mode with the eth0 set as WAN and the WiFi chips on LAN it's stable.

opkg update
opkg install ath10k-firmware-qca988x-ct
opkg install ath10k-firmware-qca988x-ct-htt
reboot
opkg remove kmod-ath10k
opkg remove ath10k-firmware-qca988x
reboot

I'd do the removals first, and you can only install either ath10k-firmware-qca988x-ct XOR ath10k-firmware-qca988x-ct-htt. Furthermore, ath10k-ct has seen significant improvements past 18.06.1, which means the 18.06.1 versions are not quite representative of the current ones (master/snapshots).
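
Put together, the order would be roughly as in the sketch below. The wrapper functions step and switch_to_ct are hypothetical and print the commands by default (set RUN=1 to actually execute them); the package names are the ones mentioned in this thread, and I'm assuming you do this over a wired connection since the wifi driver is gone between the remove and the install.

```shell
# Print (or, with RUN=1, execute) the opkg steps to move to the ct variants.
RUN="${RUN:-0}"
step() {
    if [ "$RUN" = "1" ]; then "$@"; else echo "$*"; fi
}

# $1: ct firmware flavour to install, either ath10k-firmware-qca988x-ct
#     (default) or ath10k-firmware-qca988x-ct-htt, never both.
switch_to_ct() {
    step opkg update
    # Remove the stock driver and firmware first...
    step opkg remove kmod-ath10k ath10k-firmware-qca988x
    # ...then install the ct driver plus exactly one ct firmware flavour.
    step opkg install kmod-ath10k-ct "${1:-ath10k-firmware-qca988x-ct}"
    step reboot
}
```

Calling switch_to_ct with no argument prints the four commands so you can review them before running anything on the device.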

I see. How do I get the master / snapshots on OpenWrt?

https://openwrt.org/downloads

https://downloads.openwrt.org/snapshots/targets/

For me the ct driver+firmware is unstable in the TCP/UDP connection sense, with no crashing. I use a TP-Link Archer C7 v4 router. I saw a lot of repeated "authenticated" lines from hostapd and "disassociated" due to inactivity (during an active Skype call...), and the Skype call had problems sharing the screen (it disappeared three times during 40 minutes of the call), all seen via the logread command. No error from the driver in logread, though (I haven't checked dmesg, sorry). I recompiled the firmware with the non-ct driver+firmware and repeated the Skype call (~90 minutes) with screen sharing, and there was not a single line in logread, it just works™.

That the ath10k-ct drivers still do not support 802.11s, at least as far as I can tell from the documentation and the output of iw, is a show stopper.

On what model?
On IPQ40xx, 802.11s works on ath10k-ct on both radios, while on ath10k it only works on 2.4 GHz.

On ar71xx / ath79, Archer C7 v2 and GL.iNet AR750 both support mesh on 5 GHz with the ath10k driver (non-CT). I have been successfully running 802.11s on 5 GHz on the Archer C7 units for close to a year now.

"mesh point" is absent from the output of iw phy with the ath10k-ct driver and firmware for at least the 5 GHz band.
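
If others want to check their own hardware the same way, a small filter like this sketch can do it. supports_mesh is a made-up name; it reads iw phy style output on stdin and only looks inside the "Supported interface modes:" list, so a mention of mesh point elsewhere in the output does not count.

```shell
# Succeeds (exit 0) if "mesh point" is listed under "Supported interface modes".
# Reads `iw phy` style output on stdin.
supports_mesh() {
    awk '
        /Supported interface modes:/ { in_modes = 1; next }
        # Bullet lines ("* AP", "* mesh point", ...) belong to the list.
        in_modes && /^[[:space:]]*\*/ { if ($0 ~ /mesh point/) found = 1; next }
        # The first non-bullet line ends the list.
        in_modes { in_modes = 0 }
        END { exit (found ? 0 : 1) }
    '
}
```

For example: iw phy phy0 info | supports_mesh && echo "mesh point supported" (phy0 is just a placeholder for whichever radio you want to check).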

On my Archer A7 v5, with an image built from git, ath10k-ct is noticeably slower than ath10k.

My 802.11ac device (a Pixelbook running a web-based speed test) reports a download speed of ~140Mbps using ath10k-ct, and ~250Mbps after reverting to plain ath10k.