Ipq806x NSS build (Netgear R7800 / TP-Link C2600 / Linksys EA8500)

Hi quarky,

Are you using the ath10k driver/firmware or the ath10k-ct driver/firmware on your setup?

Thanks.

Hi D43m0n,

I did the same as described in your original post and so far so good (throughput is also faster than with the default ath10k-ct), but I need to wait longer to see whether the problem is really gone. I had to reboot my router several days ago due to an unrelated issue.

Thanks.

I removed the patches completely. While the AQL workaround solved the occasional hang, I had one crash followed by a reboot, so I decided to remove the patches, and the build is now rock solid. Since there are no important fixes in trunk, I decided to let it run before I build and test-flash a new one.

Could this be related to the Wi-Fi disconnections over several days? Do those commands disable it correctly, then? Or do you have to compile without the patches?

I used the ath10k driver and firmware.

If you are affected by the recent Wi-Fi issue in the latest 21.02 builds and are doing your own builds, would you like to test the patch posted here?

My R7800 is not available to test at the moment, so I can't test it myself.

I'm currently building based off ACwifidude's master branch. Would the patch work there as well?

In theory it should patch cleanly, as I think the ath10k driver has not been touched for a long while.

You may want to place the patch file in the ath10k patch folder, as the last patch file.

Edit: I assume that you are also affected and can reliably reproduce the issue?
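On placing the patch as the last file: a rough sketch of how one might pick a file name, assuming the patches in that folder carry a numeric prefix and are applied in sorted file name order. The helper `next_patch_name` and the example file names are hypothetical and only illustrate the naming idea; they are not taken from the OpenWrt tree.

```python
import re


def next_patch_name(existing, slug):
    """Return a patch file name whose numeric prefix is one higher than
    the highest prefix found in `existing` (hypothetical helper)."""
    prefixes = [
        int(m.group(1))
        for name in existing
        if (m := re.match(r"(\d+)-", name))
    ]
    nxt = max(prefixes, default=0) + 1
    # Zero-pad to three digits so the name also sorts last as a string
    # (this holds as long as the prefix stays below 1000).
    return f"{nxt:03d}-{slug}.patch"


# Hypothetical folder contents, for illustration only.
existing = ["001-first-example.patch", "080-second-example.patch"]
new_name = next_patch_name(existing, "ath10k-txq-fix")
print(new_name)
```

Checking that `sorted(existing + [new_name])[-1] == new_name` is a quick way to confirm the new file really ends up last.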

@quarky
Why don't you try the latest ath10k firmware?
One R7800 in service has at least 6 WLAN clients constantly connected to the 2.4 and 5 GHz WLANs, and most of the time there are around 10 or more. I cannot see any issues with Wi-Fi. It is running an older master build from @ACwifidude.


I keep an eye on 5 other R7800 routers for any issues with Wi-Fi. Touch wood, for now all are OK. They use the default ath10k firmware in a heavily mixed WLAN client environment (AN, AC, and AX clients using WPA2/WPA3).

The firmware used in the openwrt tree seems to work fine for me.

I'm using ath10k because I'm affected by this bug https://github.com/greearb/ath10k-ct/issues/139 with ath10k-ct.

With ath10k I'm experiencing a complete shutdown of the 5 GHz SSIDs once in a while.
I'm assuming that is the recent issue?

Edit: compiling now

If the issue is caused by the new airtime scheduler, it should have started sometime around the end of November 2021.

Edit: the above timeline is for 21.02. For master, it started at the end of October 2021, when master's backports switched to 5.15.

That sounds about right yeah.

It's compiled and flashed now so let's see.

The patch did not shut down transmission completely? :stuck_out_tongue:

Haha.

Do let me know how it goes.

On my R7800 I always use the current master build from @ACwifidude with the ath10k drivers. Since last week I have also been running the latest ath10k firmware. I have up to 10 WLAN clients (WPA2/WPA3) connected to the Wi-Fi during the day.
So far I cannot see any abnormal behaviour.

@sppmaster do your 10 WLAN clients connect and disconnect frequently?

For my R7800, the issue starts to manifest itself after 3-4 days of uptime. The usage pattern is that my router's clients connect and disconnect over the course of the day as I move my devices between locations. I have multiple APs at home. When this happens, as I understand the new airtime scheduler's behaviour, it will insert or remove the clients' transmit queues (txqs) from its RB tree data structure.

Now, the way the new scheduler is coded, when a new transmit round is initiated, the left-most node (txq) of the RB tree will be the first to be selected for transmission, if it satisfies the airtime limit imposed. Otherwise the next node to its right will be selected, and so on until the round is done.
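As a rough illustration of that selection order, here is a toy Python sketch. It is not the mac80211 code: a sorted list stands in for the RB tree, and the `Txq` fields (`key`, `pending_airtime`) and the `pick_next` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Txq:
    name: str
    key: int              # hypothetical sort key; lowest key = left-most node
    pending_airtime: int  # hypothetical: airtime already queued for this txq


def pick_next(txqs, airtime_limit):
    """Walk the txqs from the left-most node to the right and return the
    first one that is still within the airtime limit, or None."""
    for txq in sorted(txqs, key=lambda t: t.key):
        if txq.pending_airtime < airtime_limit:
            return txq
    return None


txqs = [
    Txq("phone", key=30, pending_airtime=0),
    Txq("laptop", key=10, pending_airtime=5000),
    Txq("tv", key=20, pending_airtime=100),
]

# "laptop" is left-most but over the limit, so the walk moves right to "tv".
chosen = pick_next(txqs, airtime_limit=4000)
print(chosen.name)
```

The point of the sketch is only the ordering: every round begins at the left-most node, so whichever txq sits there gets the first chance each time.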

So what I found is that the ath10k driver seems to be doing something funny (hence my suggested patch): it does not schedule the txq that mac80211 asks it to schedule, but instead finds another txq to schedule. This in itself seems to be a bug to me. In addition, the driver syncs with the firmware on transmit txq accounting. I guess the problem probably didn't manifest itself with the old round-robin scheduler because all txqs would get their fair share of transmit time eventually. With the new scheduler, since it always starts from the left-most node of the RB tree, starvation of transmit time may occur, hence the high latency seen by myself and others affected.

I still do not understand why some folks like yourself did not encounter such issues, though.
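The suspected difference between the two schedulers can be shown with a deliberately simplified simulation. Everything here is an assumption made for illustration: the toy driver that sends the neighbour of the requested txq, the one-unit airtime charge, and the names `driver_pick` and `simulate` are all hypothetical and do not reflect the actual ath10k or mac80211 logic.

```python
from collections import Counter


def driver_pick(requested, names, honour_request):
    """Toy driver: either sends the requested txq, or (mimicking the
    suspected behaviour) sends its neighbour in the list instead."""
    if honour_request:
        return requested
    i = names.index(requested)
    return names[(i + 1) % len(names)]


def simulate(names, rounds, scheduler, honour_request):
    airtime = {n: 0 for n in names}  # airtime charged to each txq
    sent = Counter()
    for r in range(rounds):
        if scheduler == "round_robin":
            requested = names[r % len(names)]
        else:
            # "leftmost": lowest charged airtime first, ties by list order
            requested = min(names, key=lambda n: airtime[n])
        actual = driver_pick(requested, names, honour_request)
        airtime[actual] += 1  # only the txq that actually sent is charged
        sent[actual] += 1
    return {n: sent[n] for n in names}


names = ["A", "B", "C"]
rr_bad_driver = simulate(names, 9, "round_robin", honour_request=False)
lm_good_driver = simulate(names, 9, "leftmost", honour_request=True)
lm_bad_driver = simulate(names, 9, "leftmost", honour_request=False)
print(rr_bad_driver, lm_good_driver, lm_bad_driver)
```

In this toy model the round-robin scheduler hides the substitution because the requests rotate anyway, while the left-most-first scheduler keeps requesting the same uncharged txq, so only its neighbour ever transmits.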

Okay, so this morning I turned off airplane mode on my phone. After connecting, Wi-Fi initially worked, and then after about 10 minutes it ground to a halt. Reconnecting didn't fix anything either, so I ran /etc/init.d/network restart on the router and everything is working again. I'll keep an eye on this.

Hmm ... I'll have to dig deeper into the issue then, but I'm quite confident that the patch I proposed should make things better, as the original code does not make much sense to me.

Did you notice anything different this time round?

Yes, usually my 5 GHz Wi-Fi dies completely, which didn't happen this time around, so I'm hoping this was just a one-time occurrence and that the patch actually does help. Fingers crossed :crossed_fingers:

Actually most of the time I have 6-7 clients connected and they only disconnect/connect when leaving/coming home (at least several times a day) or are turned off/on. I have only one AP (R7800 in router mode) with two different SSIDs for 2.4 and 5GHz.
My phone switches very frequently between 2.4 and 5GHz because when I move to a distant room the 2.4GHz signal is better and it automatically switches from 5 to 2.4GHz (with different SSIDs) and vice versa.
Other clients are constantly connected to 2.4 or 5GHz.
For almost a year I was running a WDS connection via 2.4GHz to another OpenWrt client router that supported another VLAN. I never saw problems there either.
And I have one guest WLAN set up on 2.4GHz that I shut down when it's not used.
I use mixed WPA2/WPA3 encryption.
There may be some rarer condition in your Wi-Fi setup that I don't have.

In the last hour I've tried to stress my Wi-Fi as hard as I could. I've run multiple speed tests on six devices, played YouTube videos, downloaded lots of updates from the Play Store, and browsed web pages with Chrome.
All of this was accompanied by over a hundred Wi-Fi disconnects/connects from all six devices (which I initiated on purpose by turning each device's Wi-Fi on and off all the time). I couldn't see any delay, at least with this setup.

Are there tools like Wireshark that can be used for troubleshooting network issues like this?