Why the switch to unstable ath10k-ct?

So ever since the switch to ath10k-ct, I've had to roll back to the old driver for my C2600 due to unstable wireless and kernel panics resulting in reboots. Seeing as the driver and firmware recently got upgraded in master, I've tried it again. Here's my dmesg and as you can see it's not working very well:

http://sprunge.us/IXXDs7

I've seen others (with ipq806x at least) say the same thing, just wondering why the switch to an obviously unstable driver as default?

The driver performed better than ath10k on a number of boards, it offers better instrumentation for firmware crashes, can be updated independently of mac80211 and interfaces with the -ct firmware which - in contrast to the official one - has at least a theoretical chance to receive bug fixes.

Before deciding to switch, the driver wasn't obviously unstable on the devices being tested.


I suppose it might be a bit too harsh for me saying that :slight_smile: Are there any tickets regarding this I should add some info to?

For me it crashes too after the recent ath10k-ct firmware update.

They recently updated the -ct drivers and firmware...

https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=4df3c71cd4c59d80374dceb5267ecee5b91931ad
https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=cc5c63f217e6ca29959b5c61ed5de690a42d9fbf

Before the update, wireless on my R7800 was erratic on the -ct driver and trying to use 802.11w was a no-go. Always had to compile with the older QCA driver for things to work properly.

Since the updates 2 days ago, the -ct driver significantly, and consistently, outperforms the QCA driver in iperf testing with my R7800. 11w got fixed too. Until that update tho... yeah... bumpy road.


There is a good bug reporting guide on the candela site here.

This site refers to the ath10k-ct github site which has a few issues here.

I plan to make contributions of my own - 2.4 and 5 GHz crashed on start, 5 GHz crashed several times subsequently but so far the router has recovered - i.e. no reboots just firmware crashes.

I've had an R7500v2 for just under a month now and I'm pretty disappointed with Qualcomm/Atheros. That said, Ben Greear seems like the best hope for turning it into something useful, so I'm happy to test while I have the time.

There are some helpful comments about the ct driver/firmware on the dd-wrt forums (the latest kong builds tried it and went back to stock; I doubt they tried this latest iteration) as well as places like this and comments like this. Pretty technical but interesting, and they all indicate issues for some time now.


Hello,

With the recent switch to the new ath10k-ct firmware, the wave-2 firmware (4019, 9888, 9980, 9984) moved to a new firmware tree that is a rebase of my patches onto newer upstream QCA firmware. There were a few regressions I fixed yesterday, and likely more to come. But since the base is newer and the git tree cleaner, I have a much better chance of fixing regressions and bisecting problems, so I expect this will pay off after a few weeks of ironing out regressions.

Please open bugs on my ath10k-ct github project if you see crashes, or check recent bugs for your platform for new FW attachments to try. I'll try to upload new images for all wave-2 in a day or two, once I get feedback from the current bug reports.


I'm also somewhat surprised by this change. I generally have a tendency to stay as close to upstream as possible unless there are striking reasons to do otherwise. And even if there are, I'd rather try to resolve them upstream first...

But anyway, back to the matter at hand. Last time I tested the ct firmware and kernel module against their upstream counterparts, at least on my devices I found that the performance was inferior and on top of that it seemed to decrease further the longer the access point was running. Back then I only had one client device with 2x2 ac wireless to test with, so that is certainly not representative and things might have changed in the meantime. But still, I don't see a striking reason to switch the default as long as both options seem to come with issues for one group or the other.

I'm quite happy with the official firmware myself. I have no issues with that fw and the R7800 right now. I'm sure the CT driver will be better over time, but I think the switch was a bit premature.

I'm building myself anyways, so it's not really that big of a deal. I guess it's reasonable that master is a little bit experimental. I did however have to modify the target makefile and default profile a bit in order to build with the non-ct driver, then again perhaps there is an easier way?

On my C2600 I had many problems with the CT drivers across different client Wi-Fi devices. The solution was to go back to the non-CT drivers, and the problems were over.

Maybe I'll try the CT drivers again sooner or later, but right now I prefer non-CT.

@greearb I benchmarked both the qca-ct and qca firmware for the 988x Wave 1 chipset; performance is lower on the ct than on the original qca.


For those of you experiencing slower performance (or other bugs) with CT firmware and/or driver, please note that the CT driver can work with the stock firmware, and the CT firmware (non-htt-mgt version) can work with the stock driver. It would be good to know if any slowdown or other bug is related to the firmware or driver.

And, if you are going to report that CT is slower (or faster), please detail your test and actual results.

I would expect that it should be very easy to select CT or stock firmware or driver w/out having to edit code, so if some platform made CT fw mandatory, then I think that is probably a bug that should be fixed.

Thanks,
Ben
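To separate a driver problem from a firmware problem as Ben suggests above, each of the four driver/firmware pairings would need the same test. A minimal Python sketch of that test matrix follows. The labels "stock" and "ct" and the function names are illustrative assumptions, not package names, and the helper only averages whatever throughput figures you measured yourself.

```python
from itertools import product
from statistics import mean

# Illustrative labels only (assumption): "stock" = upstream QCA, "ct" = Candela.
# Per the post above, pairing CT firmware with the stock driver requires the
# non-htt-mgt CT firmware build.
DRIVERS = ("stock", "ct")
FIRMWARES = ("stock", "ct")


def combinations():
    """Return every (driver, firmware) pairing worth benchmarking."""
    return list(product(DRIVERS, FIRMWARES))


def mean_by_component(results):
    """Average measured throughput per driver and per firmware.

    results maps (driver, firmware) -> throughput you measured (e.g. Mbit/s).
    A gap between the two firmware means with similar driver means points at
    the firmware, and vice versa.
    """
    summary = {"driver": {}, "firmware": {}}
    for d in DRIVERS:
        vals = [v for (drv, _), v in results.items() if drv == d]
        if vals:
            summary["driver"][d] = mean(vals)
    for f in FIRMWARES:
        vals = [v for (_, fw), v in results.items() if fw == f]
        if vals:
            summary["firmware"][f] = mean(vals)
    return summary


if __name__ == "__main__":
    for driver, firmware in combinations():
        print(f"driver={driver:5s} firmware={firmware}")
```

Running the same iperf test for each printed pairing and feeding the numbers to mean_by_component gives a rough first indication of which component the slowdown follows.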

I am simply disabling -ct and enabling old normal ath10k in my seed .config file for R7800:

# Normal ath10 wifi firmware and driver instead of -ct
CONFIG_PACKAGE_ath10k-firmware-qca9984=y
# CONFIG_PACKAGE_ath10k-firmware-qca9984-ct is not set
CONFIG_PACKAGE_kmod-ath10k=y
# CONFIG_PACKAGE_kmod-ath10k-ct is not set

Ps. the current -ct seems somewhat better on connection stability than the previous versions.
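For anyone keeping several seed files around, a quick way to double-check which ath10k variant a seed .config like the one above selects is to parse its CONFIG_PACKAGE_ lines. This is only a sketch: the function names are made up for illustration, and it reads the seed text itself rather than anything the build system produces.

```python
import re

# "CONFIG_PACKAGE_<name>=y" (or =m) selects a package;
# "# CONFIG_PACKAGE_<name> is not set" deselects it.
SET_RE = re.compile(r"^CONFIG_PACKAGE_([A-Za-z0-9_.+-]+)=(?:y|m)$")
UNSET_RE = re.compile(r"^# CONFIG_PACKAGE_([A-Za-z0-9_.+-]+) is not set$")


def parse_seed(text):
    """Map package name -> True (selected) or False (explicitly not set)."""
    state = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = SET_RE.match(line)
        if m:
            state[m.group(1)] = True
            continue
        m = UNSET_RE.match(line)
        if m:
            state[m.group(1)] = False
    return state


def ath10k_variant(state):
    """Return 'ct', 'stock', 'both' or 'none' for the ath10k kernel module."""
    ct = state.get("kmod-ath10k-ct", False)
    stock = state.get("kmod-ath10k", False)
    if ct and stock:
        return "both"
    if ct:
        return "ct"
    if stock:
        return "stock"
    return "none"


# The seed fragment quoted in the post above.
SEED = """\
# Normal ath10 wifi firmware and driver instead of -ct
CONFIG_PACKAGE_ath10k-firmware-qca9984=y
# CONFIG_PACKAGE_ath10k-firmware-qca9984-ct is not set
CONFIG_PACKAGE_kmod-ath10k=y
# CONFIG_PACKAGE_kmod-ath10k-ct is not set
"""

if __name__ == "__main__":
    print(ath10k_variant(parse_seed(SEED)))
```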


The non-ct version is also far from perfect on my ZyXEL NBG6617 (for example, changing the wifi channel kills the wifi interface), but at least it doesn't crash multiple times per day...

I already reported the crashes on GitHub, let's see :slight_smile:

The ath10k-ct firmware was updated some hours ago, this version should fix 3 crashes for wave-2 devices, please test if you still have these problems:
https://git.openwrt.org/c0248183a49a9830a4a2458e54e83fa8a3c646c9

I went to using just the ct-driver for now and the 3.9.0.1 official fw. That combination seems ok.
The issue I had with CT fw a while back was low speeds and frequent disconnects, anyone know if that's fixed?

I don't know about r7800.

For r7500v2 (ipq8064/qca99x0) running ath10k-ct driver/ct-htt firmware (10.4b-ct-9980-fH-012-648296d) I've had almost 2 days uptime, 1 firmware crash on phy0 (last night) which it looks like has gracefully recovered.

I have an Android device that previously could not stay connected for more than ~12 hours, usually less than 8 hours (with any firmware/driver on this router); it stayed connected for the entire ~2 days of uptime.

I saw this same Android device successfully go through a GTK rekey event (previously no device seemed to stay connected through these).

This is the most stable this router has been on any driver/firmware I've tried in the limited time I've had it. It's not yet the 100+ days uptime I'm used to getting with other devices but it seems to be moving in the right direction.


For me the recently updated, bug-fixed CT firmware does not crash anymore, but the messages "ath10k_ahb a800000.wifi: Invalid legacy rate 26 peer stats" and "ath10k_ahb a000000.wifi: Invalid VHT mcs 15 peer stats" have completely filled my dmesg.
@greearb Any chance of disabling those warnings, as they occur on any firmware version, from either QCA or CT?
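When reporting this kind of log spam, a count per radio is easier to read than a full dmesg dump. Below is a rough Python sketch that tallies the two warnings quoted above from saved dmesg text. The regular expression is an assumption based only on those two quoted lines, and the sample simply reuses them.

```python
import re
from collections import Counter

# Pattern inferred from the two messages quoted in the post (assumption):
# "<driver> <device>: Invalid legacy rate N peer stats" and
# "<driver> <device>: Invalid VHT mcs N peer stats".
PATTERN = re.compile(
    r"ath10k_\w+ (\S+): (Invalid (?:legacy rate|VHT mcs)) \d+ peer stats"
)


def count_peer_stats_warnings(dmesg_text):
    """Count warnings keyed by (device, warning kind)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        m = PATTERN.search(line)  # search() tolerates a leading timestamp
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


# Sample built from the two quoted messages; the first is repeated
# only to show the counting.
SAMPLE = """\
ath10k_ahb a800000.wifi: Invalid legacy rate 26 peer stats
ath10k_ahb a000000.wifi: Invalid VHT mcs 15 peer stats
ath10k_ahb a800000.wifi: Invalid legacy rate 26 peer stats
"""

if __name__ == "__main__":
    for (device, kind), n in sorted(count_peer_stats_warnings(SAMPLE).items()):
        print(f"{device}: {kind} x{n}")
```

Saving the output of dmesg to a file and passing its contents to count_peer_stats_warnings would give the totals for a real log.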

With the latest firmware update there are no errors in dmesg anymore. Will report back regarding stability.