I am using a Zyxel NBG6817 router. Since I started using more than one SSID on the N radio (a single physical AP), I have been having Wi-Fi issues. At least once per day my devices stop being able to obtain an IP address from the router. When I look at dmesg on the router, I see messages such as:
[ 52.263236] ath10k_pci 0001:01:00.0: Invalid peer id 1 or peer stats buffer, peer: 9bdf2760 sta: 00000000
[ 1562.726486] ath10k_pci 0001:01:00.0: Invalid VHT mcs 15 peer stats
[ 1769.840155] ath10k_pci 0000:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[ 1906.610840] ath10k_pci 0000:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[ 2057.762091] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[ 2096.025647] ath10k_pci 0001:01:00.0: htt tx: fixing invalid VHT TX rate code 0xff
[ 2140.823502] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[ 2613.905716] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[ 2934.945796] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[ 3192.876984] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[ 3743.688316] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[ 3754.628504] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[ 3901.208867] ath10k_pci 0000:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[11534.629861] ath10k_pci 0000:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[14538.059089] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[14540.699377] ath10k_pci 0000:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[14723.919747] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[14726.570078] ath10k_pci 0000:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[15458.612892] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[15478.783295] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[15518.643332] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[15538.263370] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[15558.993675] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[15578.503552] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[16855.938180] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[17114.009095] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[18351.454385] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[18651.615884] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[18651.815858] ath10k_pci 0000:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[20547.994469] ath10k_pci 0000:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[23001.918254] ath10k_pci 0001:01:00.0: mac flush vdev 1 drop 0 queues 0x2 ar->paused: 0x0 arvif->paused: 0x0
It recovers when I reboot the router. What could be the issue and how can I diagnose/solve it?
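The log above mixes messages from two PCI devices (0000:01:00.0 and 0001:01:00.0), so a first diagnostic step could be to count the messages per radio and per type. The following is only a minimal sketch, assuming `awk` and `sort` are available on the router (BusyBox normally provides both); the function name `tally_ath10k` and the category labels are made up for illustration and are not part of OpenWrt or the ath10k driver.

```shell
#!/bin/sh
# Hypothetical helper: count ath10k kernel messages per PCI device and type.
# Reads dmesg-style text on stdin, prints "<pci-address> <category> <count>".
tally_ath10k() {
    awk '
        /ath10k_pci/ {
            # The timestamp may split into one or two fields, so locate the
            # driver name and take the field right after it as the device.
            for (i = 1; i <= NF; i++) {
                if ($i == "ath10k_pci") { dev = $(i + 1); break }
            }
            sub(/:$/, "", dev)          # drop the trailing colon

            # Illustrative categories, based only on the messages seen above.
            kind = "other"
            if ($0 ~ /mac flush vdev/) kind = "mac-flush"
            else if ($0 ~ /invalid VHT TX rate code/) kind = "bad-tx-rate"
            else if ($0 ~ /Invalid (peer id|VHT mcs)/) kind = "bad-peer-stats"

            count[dev " " kind]++
        }
        END { for (k in count) print k, count[k] }
    ' | sort
}
```

On the router it could be run as `dmesg | tally_ath10k`. Comparing the counts before and after enabling the second SSID may show whether one radio accounts for most of the messages.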
Let's review your config to understand what you've got going on:
Please connect to your OpenWrt device using ssh and copy the output of the following commands and post it here using the "Preformatted text </> " button:
Remember to redact passwords, MAC addresses and any public IP addresses you may have:
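The redaction step can be partly automated. This is a rough sketch, assuming a `sed` that supports `-E` (BusyBox sed does); the function name `redact_config` and the list of option names treated as secrets are assumptions, so check the result by eye before posting. Public IP addresses are not handled and still need to be removed manually.

```shell
#!/bin/sh
# Hypothetical helper: mask MAC addresses and common secret options in
# UCI-style config text read from stdin.
redact_config() {
    sed -E \
        -e 's/([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}/XX:XX:XX:XX:XX:XX/g' \
        -e "s/(option[[:space:]]+(key|password|psk|private_key|preshared_key)[[:space:]]+).*/\1'REDACTED'/"
}
```

For example, `redact_config < /etc/config/wireless` prints the wireless config with keys and MAC addresses masked.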
The syntax of your wifi radio config does not match the version of OpenWrt that you are running. I think that is the problem.
How did this config get into place? Was it present previously and then you upgraded to the latest OpenWrt? Or, did you restore a backup? Or did you manually restore or edit the files?
Yes, it was there before I updated, and I restored it from a backup if I recall correctly. I had not updated for quite a few years. So what is the proper syntax now?
Yeah, I'm not sure why that is. There could be other factors at play -- including possibly things like power adapters and the like. But start by working with a known good config that is syntactically correct.
I see this issue with all my OpenWrt devices. I have to reboot once a day or so, and then they work fine again. All my devices are old and tired; perhaps that's why.
If you have failing or marginal hardware, especially power supplies or capacitors, this could explain it. Otherwise there is likely a configuration issue (or, less likely but still possible, a bug).
My only ath10k device on 23.05.5 is marginal, with a lot of error messages. However, on mine, when the radio crashes the driver recovers it by itself, so it was more of a drop followed by a recovery.
My suggestion would be to see whether there are fewer error messages/crashes on a snapshot build. I didn't do extensive testing on mine, but there were fewer error messages on snapshot.
I just moved the device to single SSID as I wanted to use it mostly as a router/switch.
You could try replacing the -ct driver and firmware with the mainline ones (without -ct in the package name), or vice versa.
(Keep the board data package; uninstall the two packages and install the two replacements.)
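The package swap could be planned along these lines. This is a hedged sketch, not a tested procedure: it assumes the installed driver and firmware packages have names ending in `-ct` (for example `kmod-ath10k-ct`), that the mainline counterparts carry the same name without that suffix, and that the input has the `name - version` format printed by `opkg list-installed`. The function name `plan_swap` is made up. It only prints the commands so they can be reviewed first, and it covers the -ct to mainline direction only.

```shell
#!/bin/sh
# Hypothetical helper: read `opkg list-installed` output on stdin and print
# (not run) the opkg commands to swap ath10k -ct packages for mainline ones.
plan_swap() {
    awk '
        # Match the driver and firmware packages; board data packages do not
        # end in -ct and are therefore left alone.
        $1 ~ /^(kmod-ath10k|ath10k-firmware-[a-z0-9]+)-ct$/ { ct[n++] = $1 }
        END {
            if (n == 0) { print "# no -ct ath10k packages found"; exit }
            print "opkg update"
            printf "opkg remove"
            for (i = 0; i < n; i++) printf " %s", ct[i]
            printf "\n"
            printf "opkg install"
            for (i = 0; i < n; i++) {
                p = ct[i]
                sub(/-ct$/, "", p)      # assumed mainline name: drop the suffix
                printf " %s", p
            }
            printf "\n"
        }'
}
```

Running `opkg list-installed | plan_swap` on the router prints the proposed commands. Variant packages with extra suffixes after `-ct` are not matched by this pattern, and a reboot after the swap is probably needed for the new driver to load.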
OK, so I deleted the wireless config file, rebooted, and redid the configuration using the web interface. It did not seem to fix the issue. I have ESP32-based devices that keep dropping from my network and are unable to recover. I see these messages on the router:
[ 1124.785579] ath10k_pci 0000:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[ 1607.335891] ath10k_pci 0000:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[ 1780.875812] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[ 1875.203008] ath10k_pci 0001:01:00.0: htt tx: fixing invalid VHT TX rate code 0xff
[ 2545.087212] ath10k_pci 0000:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[ 2780.750566] ath10k_pci 0000:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0 arvif->paused: 0x0
[ 3019.754490] ath10k_pci 0001:01:00.0: mac flush vdev 2 drop 0 queues 0x4 ar->paused: 0x0 arvif->paused: 0x0
[ 3019.844521] ath10k_pci 0001:01:00.0: mac flush vdev 2 drop 0 queues 0x4 ar->paused: 0x0 arvif->paused: 0x0
Somehow it seems that the devices drop less frequently if I ping them continuously. The same type of devices with the same firmware seem to be doing fine on another wireless network, so I think it has something to do with my router. I don't seem to have issues with AC or with wired.
So I let it run overnight with only my lan_only SSID enabled, and it ran perfectly fine. I re-enabled one SSID this morning, and most of my devices on lan_only were disconnected within 15 minutes.