Not really the right thread for that. Ripping out unnecessary packages, or putting in all the packages you need so they are properly compressed as part of the image, is my first thought.
Thx for showing what location services "looks like" against the cosmic background bufferbloat radiation. It looks to me as though your wifi is not saturated in either direction, but we do see things get pretty fuzzy while that is going on. Open question is what that looks like on ath10k, 9k...
Otherwise things are working pretty well. It's difficult to recall all the different bugs we've encountered along the way, and it's my hope we get more testers in general soon, perhaps by going out and pursuing the filers of so many "wifi is flaky" bugs, or by just waiting for the next release....
I theorize that perhaps the up/down problem might be related to txpower - and perhaps some of the problems those on this other thread experienced are related to the string of bugs on this thread. They also asked what version of openwrt the fixes landed in, and I've lost track...
One potentially useful thing to try is a long duration flent rrul test, stepping down the power every few seconds, to watch what happens.
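A rough sketch of how that test could be scripted. It only builds the commands rather than running them. The interface name `wlan0`, the power range, the step size and the host `netperf.example.org` are all assumptions for illustration; `iw ... set txpower fixed` takes its value in mBm (dBm × 100), and the flent run needs a netperf server on the target host.

```python
import shlex


def txpower_schedule(start_dbm, stop_dbm, step_dbm, interval_s, iface="wlan0"):
    """Return (offset in seconds, shell command) pairs that step txpower down.

    The interface name and the dBm range are illustrative assumptions.
    """
    if step_dbm <= 0 or interval_s <= 0:
        raise ValueError("step_dbm and interval_s must be positive")
    schedule = []
    offset_s = 0
    for dbm in range(start_dbm, stop_dbm - 1, -step_dbm):
        # iw expects a fixed txpower in mBm, i.e. dBm * 100.
        cmd = f"iw dev {shlex.quote(iface)} set txpower fixed {dbm * 100}"
        schedule.append((offset_s, cmd))
        offset_s += interval_s
    return schedule


def flent_command(host, length_s):
    """Build a flent rrul invocation that runs for length_s seconds."""
    return f"flent rrul -l {length_s} -H {shlex.quote(host)}"


if __name__ == "__main__":
    steps = txpower_schedule(start_dbm=20, stop_dbm=5, step_dbm=3, interval_s=10)
    # Make the test long enough to cover every power step.
    total_s = steps[-1][0] + 10
    print(flent_command("netperf.example.org", total_s))  # hypothetical host
    for offset_s, cmd in steps:
        print(f"t+{offset_s:>3}s  {cmd}")
```

The flent command would run on the client while the `iw` commands are issued on the AP at the listed offsets, so the power steps can be lined up against the latency plot afterwards.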
Something is still not as it should be. After applying @nbd's patches (330-336, 337, 338 and 339), speed, ping and network responsiveness are good for many users, but still not what they should be in my opinion. Why, for example, are iperf results always higher than results from websites such as speedtest.net, on average by up to 50%, even though the link (wan, wwan ...) is capable of higher speeds, and regardless of signal strength, noise, distance, obstacles, number of clients ...? Of course, we are talking about wireless connectivity, because everything is fine over cable. But whatever... it must have always been a negative feature of every version of openwrt.
Random errors, which have been filling the log for the last five and a half days, in practice cause a random client to be temporarily disconnected from the station for several tens of seconds; it helps to wait or sometimes to restart the connection on the client (Android).
[153461.316027] ath10k_ahb a000000.wifi: Invalid peer id 139 peer stats buffer
[155486.431578] ath10k_ahb a000000.wifi: Invalid peer id 142 peer stats buffer
[232917.816463] ath10k_ahb a000000.wifi: Invalid peer id 272 peer stats buffer
[233326.526645] ath10k_ahb a000000.wifi: Invalid peer id 273 peer stats buffer
[247746.452125] ath10k_ahb a000000.wifi: Invalid peer id 285 peer stats buffer
[386441.138656] ath10k_ahb a000000.wifi: Invalid peer id 359 peer stats buffer
[432035.903251] ath10k_ahb a000000.wifi: Invalid peer id 394 peer stats buffer
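To see how often these errors occur, the timestamps can be pulled out of the kernel log and turned into gaps between events. This is only a minimal sketch: the regular expression is written against the line format quoted above, and other firmware or driver builds may log differently.

```python
import re

# Matches lines such as:
# [153461.316027] ath10k_ahb a000000.wifi: Invalid peer id 139 peer stats buffer
PEER_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+\S+\s+\S+:\s+Invalid peer id (\d+) peer stats buffer"
)


def parse_invalid_peer(lines):
    """Return (uptime in seconds, peer id) for each matching log line."""
    events = []
    for line in lines:
        match = PEER_RE.match(line.strip())
        if match:
            events.append((float(match.group(1)), int(match.group(2))))
    return events


def gaps_seconds(events):
    """Return the time between consecutive events, in seconds."""
    return [later[0] - earlier[0] for earlier, later in zip(events, events[1:])]


if __name__ == "__main__":
    sample = [
        "[153461.316027] ath10k_ahb a000000.wifi: Invalid peer id 139 peer stats buffer",
        "[155486.431578] ath10k_ahb a000000.wifi: Invalid peer id 142 peer stats buffer",
        "[232917.816463] ath10k_ahb a000000.wifi: Invalid peer id 272 peer stats buffer",
    ]
    events = parse_invalid_peer(sample)
    for gap in gaps_seconds(events):
        print(f"{gap / 3600:.2f} h between errors")
```

Feeding it the full output of `dmesg` and comparing the gaps with the times clients reported dropouts might show whether the two are actually correlated.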
I am seeing the same issue as well. I see a ton of:
ath10k_pci 0000:01:00.0: received unexpected tx_fetch_ind event: in push mode
and
ath10k_pci 0000:01:00.0: Invalid peer id 61 peer stats buffer
and
ath10k_pci 0000:01:00.0: failed to lookup txq for peer_id 351 tid 0
I am seeing clients randomly get disconnected or traffic stopping completely on clients for a few seconds (I am not sure exactly how many seconds). I am testing primarily with macOS clients - macbook pro, m1, 2x2, macOS 12.4
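For comparing setups, it may help to tally how often each of the three ath10k messages quoted above shows up. The sketch below assumes the messages appear verbatim as quoted; the category labels are made up for illustration.

```python
import re
from collections import Counter

# Label -> pattern, based only on the three messages quoted in this thread.
PATTERNS = {
    "tx_fetch_ind": re.compile(r"received unexpected tx_fetch_ind event: in push mode"),
    "invalid_peer_id": re.compile(r"Invalid peer id \d+ peer stats buffer"),
    "txq_lookup_failed": re.compile(r"failed to lookup txq for peer_id \d+ tid \d+"),
}


def tally_ath10k_errors(lines):
    """Count log lines per known ath10k message type."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
                break  # each line belongs to at most one category
    return counts


if __name__ == "__main__":
    sample = [
        "ath10k_pci 0000:01:00.0: received unexpected tx_fetch_ind event: in push mode",
        "ath10k_pci 0000:01:00.0: Invalid peer id 61 peer stats buffer",
        "ath10k_pci 0000:01:00.0: failed to lookup txq for peer_id 351 tid 0",
    ]
    for label, count in tally_ath10k_errors(sample).items():
        print(label, count)
```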
I don't know whether it helps or not; practical tests suggest it is unlikely to make things worse than they already are. The patch is probably from 2015 or 2018, found somewhere on the internet, and I don't think it was ever included in any official branch. The errors still appear in the log, but there are fewer of them.
In kmod-ath10-ct it seems to be structured differently again.
@dtaht
It seems that sjpacket was still using the old mainline QCA9984 firmware v131 (several years old) that is currently available in the 22.03 branch. I had all kinds of weird problems with that old firmware v131, including total loss of WIFI connectivity overnight. The new board-2.bin and mainline firmware v157 have gotten rid of all such problems for me.
@sjpacket: please upgrade to the new board-2.bin and firmware v157.
Or you can just use a recent master snapshot since the master branch has been recently updated with the latest board-2.bin and QCA9984 v157 firmware.
Can someone with some authority within the OpenWrt developer community help convince the OpenWrt developers to commit the latest linux-firmware package to the 22.03 branch?
Also, could the recent ATF/RRS + multicast latency fix be committed to the 21.02 branch, so the next 21.02.4 release will have nice WIFI again? For the 21.02 branch, only the 21.02.1 release has decent WIFI, because 21.02.2 and 21.02.3 had the troublesome virtual time-based airtime scheduler commit.
I'm not a hundred percent sure we are out of the woods yet. A couple of days ago I had to roll back my 21.02-rc5 + multicast patches because my partner was constantly complaining that she couldn't open SSH tunnels to connect to the Citrix farm she works on.
After rolling back to 21.02.1 everything is fine again. I've been thinking about how to test what's going on. I tried rebooting (just in case it was uptime degradation), removing DSCP rules, disabling SQM and disabling the firewall; nothing helped. However, flent tests were showing good latency and nothing weird.
I use multiple different VPN/tunnelling methods (Wireguard, OpenVPN, Strongswan/IPsec, SSLVPN, Websocket/WebTransport tunnelling, Reverse SSH tunnelling) frequently for connectivity to different network environments and I don't have any problem with them when connecting over WIFI. I'm using R7800 master snapshot with NSS acceleration and the latest mainline QCA9984 FW v157.
Yes, @vochong, we did that to rule out a problem with the tunnels. Rolling back is the only thing that worked. To note, we are using the mt76 driver in our APs. I'll try a few more things this weekend to see if I can pin it to the driver or it's something else.