Hi guys, I've installed 21.02 on a TP-Link Archer C7 v2 and everything works well. Memory usage is higher with samba4, but the router speed looks better now.
But after a short time, a day or two, the router seems to freeze: the internet stops, the PPPoE IP address is still assigned, but I cannot access the internet and LuCI/SSH become very slow.
Any ideas what it is or where to start looking?
P.S.: even with the higher memory usage, the router has plenty of free memory when this happens, and not too many active connections either (e.g. 900/16000).
I have replaced the ct firmware with the non-ct one:
ath10k-board-qca988x 20201118-3 qca988x board firmware
ath10k-firmware-qca988x 20201118-3 ath10k qca988x firmware
kmod-ath10k 5.4.111+5.10.16-1-1
After 1:30 h of testing with iperf3, same problem: the wifi signal froze again.
But the non-ct firmware ran for 1:30 h, while ct froze after just 30 min. With non-ct the wifi speed on 2.4 GHz has increased as well: before it was max. 17 Mbit/s, now 35~45 Mbit/s.
I've replaced the ct firmware with the non-ct firmware and it has been working fine for 2 days now. If I run iperf3 for ~2 h the wifi stops, but in normal use I'm not experiencing any problems anymore.
But, as said before, the wifi speed seems to have increased with the non-ct firmware, which is a lot better for me in gaming, Netflix, YouTube, etc.
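For anyone who wants to compare how long each firmware survives under load, here is a rough sketch of how the stall time could be pulled out of an iperf3 run. It assumes the client was started with JSON output, e.g. `iperf3 -c <server> -t 7200 -J > run.json`, where `<server>` is a placeholder for your own iperf3 server. The function name `first_stall` and the 1 Mbit/s threshold are my own picks, not anything from iperf3 itself.

```python
import json
from typing import Optional


def first_stall(report_json: str, min_bps: float = 1e6) -> Optional[float]:
    """Return the start time (seconds into the run) of the first interval
    whose throughput is below min_bps, or None if no interval stalled.

    report_json is the text written by `iperf3 -J`. The threshold is an
    arbitrary assumption; pick one that fits your link.
    """
    report = json.loads(report_json)
    for interval in report.get("intervals", []):
        summary = interval["sum"]  # per-interval totals across all streams
        if summary["bits_per_second"] < min_bps:
            return summary["start"]
    return None


if __name__ == "__main__":
    # Tiny hand-made report: healthy first second, dead second one.
    sample = json.dumps({
        "intervals": [
            {"sum": {"start": 0.0, "end": 1.0, "bits_per_second": 4.0e7}},
            {"sum": {"start": 1.0, "end": 2.0, "bits_per_second": 0.0}},
        ]
    })
    print(first_stall(sample))
```

If the link dies hard enough that iperf3 aborts, the report may contain only an error and few or no intervals, so treat a `None` result together with a short run as a stall too.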
If wifi stops again I will try and let you know.
Besides, what's the difference in the ct firmware, and is there any good reason to stick with it? Can we propose using the non-ct firmware as the default in the next versions of OpenWrt for the Archer C7, as it looks better?
I'm still trying to understand the ramifications as well and am wondering if the wifi problem might also take out (or at least appear to take out) the LAN ports.
I'm running an Archer C7 V2 as a wireless AP (WDS link to my main router, not a cable) and I was able to run iperf3 for over 5 hours with no problem (2.4 and 5 GHz radios running at the same time). I installed the ath10k-ct-full-htt firmware and wpad-wolfssl for 802.11r/k/v support. No idea if these drivers would help you or not, but it would be an easy test for you to run.
Besides iperf, my installation is working fine now with the non-ct firmware. I'll stick with that for now until there is any further problem or until I understand the differences between the firmwares lol.
@Catfriend1 I saw in this topic that you tried to reach the devel list to talk about this issue. Did you get any reply about it?
Maybe we can open a bug for these ath10k wifi problems? Looks like the ct firmware is an open-source version, so wouldn't it be better to try to find a solution for its problems instead of replacing it with a closed version?
@csantz Some time later I unsubscribed because of all those mails about other things after I reported the problem. So maybe there was a reply and I didn't get it; otherwise no. I feel better staying on the forum and swinging by to check new posts from day to day.
If I knew more about wifi drivers etc., I'd make a bug report.
We are currently making progress in analysing symptoms and possible workarounds in "Archer C7 2.4 GHz wireless dies in 24~48 hours - #173 by TopDog". Maybe someone experienced in wifi driver dev reads it and has an idea what root cause could lead to these recurring problems.
There seem to be notable differences between the 3 firmware options:
ath10k-firmware-qca988x: closed-source firmware. Seems to be stable and supports mesh networks. Does not seem to be maintained all that well.
ath10k-firmware-qca988x-ct: more actively maintained. Enables additional features like IBSS. Seems to be unstable.
ath10k-firmware-qca988x-ct-full-htt: improved stability for busy networks, fixes 802.11r authentication and enables 802.11k/v.
There is also a section at the bottom of this page talking about known bugs. Sounds like there is a way to trigger the firmware to crash, but also a way to enable more verbose messaging.
CT firmware has a WMI message watchdog feature that can be enabled when using the CT patched drivers/kernels. The driver will send no-operation (NOP) message every second to the firmware. After the firmware receives one of these messages, if it ever does NOT receive the message for 5 seconds in a row after that, it will assert and crash. This allows the host to take recovery actions instead of just having the system effectively hang forever.
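To make the watchdog rule above a bit more concrete, here is a small Python model of it. This is only my reading of the quoted description, not the actual firmware code: the function name is made up, and I'm assuming a gap of exactly 5 seconds already counts as a miss.

```python
from typing import Iterable, Optional

WATCHDOG_TIMEOUT_S = 5.0  # "5 seconds in a row", taken from the description above


def watchdog_assert_time(
    nop_times: Iterable[float],
    now: float,
    timeout: float = WATCHDOG_TIMEOUT_S,
) -> Optional[float]:
    """Model of the described WMI watchdog (hypothetical helper).

    nop_times: times (in seconds) at which the firmware received a NOP.
    now: current time in seconds.
    Returns the time at which the firmware would have asserted, or None
    if it is still running at `now`.
    """
    last = None  # the watchdog is only armed after the first NOP arrives
    for t in sorted(nop_times):
        if t > now:
            break  # NOPs in the future have not happened yet
        if last is not None and t - last >= timeout:
            return last + timeout  # the gap was too long, firmware asserted
        last = t
    if last is not None and now - last >= timeout:
        return last + timeout  # no NOP since `last` and the timeout has passed
    return None


if __name__ == "__main__":
    print(watchdog_assert_time([0, 1, 2, 3], now=4))  # host still alive
    print(watchdog_assert_time([0, 1, 2], now=10))    # host went quiet after t=2
```

Note that with no NOP ever received the model never asserts, which matches the "after the firmware receives one of these messages" wording: a stock driver that never sends NOPs would not trip the watchdog.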