Why the switch to unstable ath10k-ct?


#32

I don't think anyone really wants to do reverse engineering unless they want to fork out a lot of cash, and even then it would be quite a task. Just look at the Allwinner video acceleration campaign, for instance.

That said, I'm a bit surprised by the persistence in sticking with -ct, since it seems to cause more issues than it fixes overall(?), and OpenWrt is usually very quick at backing out commits that break things. Ben is trying to fix issues, but I would be a bit concerned about how long he can keep up, since it's a one-man-band project to my knowledge.

Regarding the bugs, realistically I'm not sure how much time they want to spend on troubleshooting a specific device that's been discontinued for more than 2 years (as of writing). I'm sure there are underlying issues, but I would guess that they're trying to cover as much ground as possible with as little effort as possible. I'm sure you'd get much more attention if, let's say, the iPhone 8 crashed the firmware, as it has a much larger userbase. In that regard I'd guess the same goes for mwlwifi: there are edge cases etc., and it's overly naive to demand that they reproduce all reported scenarios.

Worth keeping in mind is that 11ax is starting to pop up so focus is most likely shifting on a corp level as sales > *.


#33

As I am the author of the linked kernel bug report, I'd like to chime in. The issue has been gone for me for quite a while now. I am not exactly sure what solved it, but I suspect it might have been the ieee80211w setting (management frame protection), which I disabled at some point. At least that was the only configuration change I made between the bug report and when the issue stopped occurring. I did however also update my OpenWrt builds from time to time, so who knows. Another reason I didn't try to track this down further is that one of the two Nexus 5X devices died in the meantime, and with only one device it took more time to trigger the bug.

And one word of fairness regarding QCA: I did take this bug to the ath10k mailing list (if I recall correctly) and I did get a response from a QCA employee asking for further info. I tried to collect as much as I could, but after that I never heard back anymore. So, I can't say nobody cared at all. But maybe they didn't care enough or they could simply not reproduce it, I don't know.


#34

Uptime 11 days, new crash (not resulting in a reboot):

http://sprunge.us/DOlZ7s


#35

Happy New Year, Ben!

So, I took the time to test the ath10k-ct firmware and driver against their upstream counterparts. The platform I used is qca988x-based (TP-Link Archer C7 v2).

First things first, the gap that I saw in throughput measurements before (last time I did this was more than a year ago) is gone. The performance is very similar now. Both drivers/firmwares allow for more than 400 Mbit/s throughput between the access point and a 2x2 ac wireless client (a 2014 or 2015 MacBook Air). I tested throughput using iperf with repeated measurements and averaged the results. The server was connected by wire to the same switch as the Archer C7 v2 (which runs in access point mode).

I did notice a few things though:

  1. With the upstream firmware and driver, there's not much difference between the client being the sender or receiver of data. The results are all between about 410-430 Mbit/s (with UDP transfers more towards the upper boundary and TCP more towards the lower). This is not the case with the ct driver and firmware. Here performance is slightly lower when the client is receiving data (380-400 Mbit/s). This is nothing I'd be worried about at all, but the difference seems to be consistent in both TCP and UDP measurements.

  2. I did test the combinations ath10k-ct driver with upstream firmware and ath10k-upstream driver with ct firmware as well, albeit not as long. While the combination ath10k-ct driver and upstream firmware worked fine, the upstream driver didn't seem to like the ct firmware. In fact, it was horrible. Packet loss was extremely high and the throughput only 8-10 Mbit/s!

A few more notes on the testing I did: Testing was done at night, so there wouldn't be as much wifi traffic in the neighborhood. The MacBook was placed about 2-2.5 m away from the access point (without obstructions) and was not moved during testing. The commands used for testing were:

iperf3 -c <server-ip>                 # TCP client to server
iperf3 -R -c <server-ip>              # TCP server to client
iperf3 -u -b 867M -c <server-ip>      # UDP client to server
iperf3 -R -u -b 867M -c <server-ip>   # UDP server to client

I'm not an iperf expert, but I chose 867 Mbit/s here as it is the theoretical maximum bandwidth of the 2x2 client and then looked at the actual throughput taking the packet loss into account.
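
For anyone wanting to repeat this, here is a minimal sketch of the "repeat and average" step. It is not the script used for the numbers above; the average helper, the run count and the use of jq are my own assumptions, and the JSON path applies to TCP runs.

```shell
#!/bin/sh
# Average a column of throughput readings (one number per line on stdin).
average() {
    awk '{ sum += $1; n++ } END { if (n > 0) printf "%.1f\n", sum / n }'
}

# Hypothetical usage: five TCP runs, reading the receiver-side rate from
# iperf3's JSON output (-J) with jq, converted to Mbit/s, then averaged.
# SERVER_IP is a placeholder for the wired iperf3 server.
# for i in 1 2 3 4 5; do
#     iperf3 -J -c "$SERVER_IP" | jq '.end.sum_received.bits_per_second / 1e6'
# done | average
```

Feeding it three readings of 410, 420 and 430 prints 420.0.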

I used recent builds from the master branch (ar71xx) for testing.


#36

escalade: I found your crash, but in general, please open a bug at ath10k-ct instead of just posting to forums. I don't reliably read forums.

silentcreek: For your performance results, do you mean that the download speed is less than upload when using ct firmware?

For now, I plan to ignore the issue of stock driver not liking the ct firmware. I can debug it later in case there is ever a real need to run that test case.


#37

@hnyman

How do you avoid them being selected by default again by a defconfig? Which then results in this:

Collected errors:
 * check_data_file_clashes: Package ath10k-firmware-qca99x0-ct wants to install file /media/MyBook/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_glibc_eabi/linux-ipq806x/target-dir-f7e20ea3/lib/firmware/ath10k/QCA99X0/hw2.0/board-2.bin
	But that file is already provided by package  * ath10k-firmware-qca99x0

#38

Yes, downloading (iperf with the -R switch on the client) was slightly slower than uploading on the client. Not as much as I would care about it, but I found it interesting as it showed a pattern both with TCP and UDP. However, it didn't occur when I coupled the upstream firmware with the ct driver, so I assume it's related to the firmware itself.


#39

If you run make menuconfig, deselect the ct driver and firmware packages and select the original ath10k driver and firmware packages, you will end up with these lines somewhere in your configuration:

CONFIG_PACKAGE_ath10k-firmware-qca988x=y
# CONFIG_PACKAGE_ath10k-firmware-qca988x-ct is not set
[...]
CONFIG_PACKAGE_kmod-ath10k=y
# CONFIG_PACKAGE_kmod-ath10k-ct is not set
# CONFIG_PACKAGE_kmod-hwmon-core is not set

If you run ./scripts/diffconfig.sh > my_diffconfig, these will be preserved.
Note: At least on my platform the ath10k-ct driver depends on the kmod-hwmon-core package, while the default driver does not (nor do any of my other packages), so it's deselected as well. Depending on your hardware/configuration, you may still need the kmod-hwmon-core package.
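
If you want to double-check a .config or diffconfig before building, a minimal sketch like the following can flag the clash. The has_ct_conflict helper is hypothetical (my own name), and it only looks at =y selections, since only built-in packages end up in the image together.

```shell
#!/bin/sh
# Succeed (exit 0) if a config selects both the stock and the -ct variant
# of a package as built-in (=y). Either one alone is fine.
# $1 = config file, $2 = stock package name, e.g. ath10k-firmware-qca99x0
has_ct_conflict() {
    grep -q "^CONFIG_PACKAGE_$2=y\$" "$1" &&
    grep -q "^CONFIG_PACKAGE_$2-ct=y\$" "$1"
}

# Hypothetical usage from the top of the build tree:
# has_ct_conflict .config ath10k-firmware-qca99x0 && echo "both variants selected"
```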

In addition, you can also choose to build one of the firmware packages as a module that does not get included in the firmware image. That should avoid the error as well and gives you the option to manually install the other firmware package (using the opkg --force-overwrite option), so you can test the two firmwares against each other. In this case I would make sure to make a backup of the file /lib/firmware/ath10k/QCA99X0/hw2.0/board-2.bin though, as it will get deleted once you uninstall either of the two firmware packages.
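
A minimal sketch of the backup step, assuming you run it on the router before swapping packages. The backup_board_file helper and the /root/fw-backup directory are my own placeholders, not anything OpenWrt provides.

```shell
#!/bin/sh
# Copy a firmware file into a backup directory, keeping its base name.
# $1 = file to back up, $2 = backup directory (created if missing)
backup_board_file() {
    [ -f "$1" ] || { echo "missing: $1" >&2; return 1; }
    mkdir -p "$2" && cp -p "$1" "$2/"
}

# Hypothetical usage on the router:
# backup_board_file /lib/firmware/ath10k/QCA99X0/hw2.0/board-2.bin /root/fw-backup
# opkg install --force-overwrite ath10k-firmware-qca99x0-ct
```

If the file goes missing after uninstalling one of the packages, copying it back from the backup directory restores it.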


#40

Escalade: I opened a bug for your crash file, see link below. It has a binary you can test:


#41

silentcreek: I opened a bug to track your 988x slowdown report.

At this point, I don't plan to work on it, but good to have the bug open in case others see something similar or I get interested in debugging it more in the future.


#42

Hello everyone, I have a UBNT AP AC Lite with OpenWrt 18.06.1.
I want to use the ath10k-ct driver, but as a newbie I don't know how to switch to it. The package description for the CT firmware says: "This firmware will NOT be used unless the standard ath10k-firmware-qca988x is un-selected"

How does one "un-select" a firmware?

Do I just install the "ath10k-firmware-qca988x-ct" package and uninstall the "ath10k-firmware-qca988x" package and reboot in order to "un-select"?

Is there a guide for this? I tried all the google-fu I could to find answers.


#43

Deselecting primarily refers to building OpenWrt from source, where make menuconfig (kconfig) gives you an ncurses (text) GUI to select and deselect options. If you use a prebuilt image such as 18.06.1, you do indeed use opkg to remove the ath10k kmods and firmware and then install their ath10k-ct variants instead.


#44

@slh

Thanks for the help! To confirm, I ran the opkg commands below. It seems to work, but it is unstable in dumb AP mode. https://openwrt.org/docs/guide-user/network/wifi/dumbap

Unstable as in: once I start a speed test on my phone, it suddenly disconnects and the web GUI freezes. After waiting a while, the web GUI comes back up.

However when my UBNT AP AC Lite is in router mode with the eth0 set as WAN and the WiFi chips on LAN it's stable.

opkg update
opkg install ath10k-firmware-qca988x-ct
opkg install ath10k-firmware-qca988x-ct-htt
reboot
opkg remove kmod-ath10k
opkg remove ath10k-firmware-qca988x
reboot


#45

I'd do the removals first, and you can only install one of ath10k-firmware-qca988x-ct or ath10k-firmware-qca988x-ct-htt, not both. Furthermore, ath10k-ct has seen significant improvements since 18.06.1, which means the 18.06.1 versions are not quite representative of the current ones (master/snapshots).
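
A minimal sketch of that ordering, using only the package names mentioned in this thread. The ct_swap_plan helper is hypothetical and only prints the commands, so you can review them before running anything. Do this over a wired connection, since wifi is gone once kmod-ath10k is removed.

```shell
#!/bin/sh
# Print the package swap in the order suggested above: removals first,
# then the -ct driver and exactly one -ct firmware variant.
# $1 = firmware variant: "ct" or "ct-htt"
ct_swap_plan() {
    case "$1" in
        ct|ct-htt) ;;
        *) echo "usage: ct_swap_plan ct|ct-htt" >&2; return 1 ;;
    esac
    echo "opkg update"
    echo "opkg remove kmod-ath10k ath10k-firmware-qca988x"
    echo "opkg install kmod-ath10k-ct ath10k-firmware-qca988x-$1"
    echo "reboot"
}

# Hypothetical usage: review the plan first, then run it.
# ct_swap_plan ct
# ct_swap_plan ct | sh
```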


#46

I see. How do I get the master / snapshot builds of OpenWrt?


#47

https://openwrt.org/downloads

https://downloads.openwrt.org/snapshots/targets/


#48

For me the ct driver+firmware is unstable in the TCP/UDP connection sense, with no crashing. I use a TP-Link Archer C7 v4 router. I saw a lot of repeated "authenticated" lines from hostapd and "disassociated" due to inactivity (during an active Skype call...), and the Skype call had problems sharing the screen (it disappeared three times during 40 minutes of the call). All of this was seen via the logread command. There was no error from the driver in logread, though (I haven't checked dmesg, sorry). I recompiled the firmware with the non-ct driver+firmware and repeated the Skype call (~90 minutes) with screen sharing, and there was not a single line in logread. It just works™.
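
To compare the two builds with numbers instead of eyeballing logread, a minimal sketch like this could count the relevant hostapd lines. The count_events helper is hypothetical, and the message fragments are assumptions based on typical hostapd log wording, so adjust them to what your log actually shows.

```shell
#!/bin/sh
# Count the lines on stdin that contain the given message fragment.
# grep -c exits non-zero when the count is 0, so that case is masked.
count_events() {
    grep -c -- "$1" || true
}

# Hypothetical usage on the router:
# logread | count_events "disassociated due to inactivity"
# logread | count_events "IEEE 802.11: authenticated"
```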


#49

That the ath10k-ct drivers still do not support 802.11s, at least as far as I can tell from the documentation and the output of iw, is a show stopper.
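
For reference, a minimal sketch of the iw check. The supports_mesh helper and the phy0 name are assumptions. It matches only the "* mesh point" entry under the supported interface modes, not mentions inside the interface combination lines.

```shell
#!/bin/sh
# Succeed if the iw output on stdin lists "mesh point" as an interface mode.
supports_mesh() {
    grep -q '^[[:space:]]*\* mesh point[[:space:]]*$'
}

# Hypothetical usage (phy0 is a placeholder for the radio in question):
# iw phy phy0 info | supports_mesh && echo "mesh point supported"
```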


#50

On what model?
On IPQ40xx, 802.11s works with ath10k-ct on both radios, while with ath10k it only works on 2.4 GHz.


#51

On ar71xx / ath79, the Archer C7 v2 and GL.iNet AR750 both support mesh on 5 GHz with the ath10k driver (non-CT). I have been successfully running 802.11s on 5 GHz on the Archer C7 units for close to a year now.

"mesh point" is absent from the output of iw phy with the ath10k-ct driver and firmware for at least the 5 GHz band.