I started using OpenWRT when I bought a WRT1900ACS at the beginning of last year.
There's an annoying problem though: Every now and then (after a few hours to a few days) the WiFi 5 networks become unusable. By unusable I mean that no devices can connect anymore.
For example, an Android phone cycles through "Connecting ...", then about a second later back to "Saved", and immediately back to "Connecting ...".
When I restart the WiFi 5 radio (through LuCI), all devices can connect again. Regular 802.11bgn is perfectly stable, though.
Currently I am running OpenWRT 19.07.0, but the issue has persisted since I started using OpenWRT with 18.06.2.
Any hints on where to start digging?
I am also happy to provide any information that is required.
I set up a cron job that pings a known wifi client every 20 minutes; if the ping fails, the router restarts wifi.
Issue solved.
If you have no permanent wifi client connected, like a wireless bridge, then you can set a cron job to restart wifi every morning at 6, or every 12 hours, for example. Keeps the issues away.
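For anyone who wants to try this, here is a minimal sketch of such a watchdog. It is not the exact script from the post above. The client address 192.168.1.50 and the script path in the crontab comments are made up for illustration, and it assumes that calling "wifi" without arguments brings the radios down and up again on your build. PING_CMD and RESTART_CMD are just helper variables so the logic can be tried without touching the radios.

```sh
#!/bin/sh
# Wi-Fi watchdog sketch: restart Wi-Fi when a known client stops answering pings.
# CLIENT_IP is a hypothetical address; adjust it to a client that is always online.
#
# Example crontab entries (the script path is hypothetical):
#   */20 * * * * /root/wifi-watchdog.sh run     # check every 20 minutes
#   0 6 * * * wifi                              # or: plain restart every morning at 6

CLIENT_IP="${CLIENT_IP:-192.168.1.50}"
PING_CMD="${PING_CMD:-ping -c 3 -W 2}"   # 3 probes, 2 s timeout each
RESTART_CMD="${RESTART_CMD:-wifi}"       # assumed to restart all radios

wifi_watchdog() {
    # $1: IP address of the client to probe
    if $PING_CMD "$1" >/dev/null 2>&1; then
        return 0    # client reachable, nothing to do
    fi
    # Leave a trace in the system log so restarts can be counted later.
    logger -t wifi-watchdog "no reply from $1, restarting Wi-Fi" 2>/dev/null || true
    $RESTART_CMD
}

# Only act when invoked with "run", so sourcing the file has no side effects.
if [ "${1:-}" = "run" ]; then
    wifi_watchdog "$CLIENT_IP"
fi
```

Because every restart is logged, "logread | grep wifi-watchdog" should show how often the workaround actually kicks in.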
Well, that's a way to sail around the issue. I'd prefer to have a "real" solution, but I will keep yours in mind. Thanks!
Also, sometimes the WiFi breaks within the same day, so I'd have to restart it more regularly.
Thanks also for your offer to help me create the cron job! I'm comfortable enough with shell scripting to work it out on my own if I decide to go down that path.
When the issue pops up again, check, copy, or save the log files before restarting the wlan. In LuCI you will find the system and kernel logs under Status; the same can be done on the CLI with the "logread" and "dmesg" commands, whichever you prefer.
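A small sketch of how both logs could be captured in one go before the restart. The function name and the default directory /tmp/wifi-debug are my own choices, not anything OpenWrt provides. Keep in mind that /tmp lives in RAM on OpenWrt, so copy the files off the router if you want to keep them across a reboot.

```sh
#!/bin/sh
# Save the system log and the kernel log before restarting Wi-Fi.
# save_wifi_logs and the default directory are hypothetical names.

save_wifi_logs() {
    # $1: target directory (optional)
    dir="${1:-/tmp/wifi-debug}"
    stamp="$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dir" || return 1
    logread > "$dir/logread-$stamp.txt"   # system log (hostapd/wpad events etc.)
    dmesg   > "$dir/dmesg-$stamp.txt"     # kernel log (radio driver messages)
    # Print the two file names so the caller knows where to look.
    echo "$dir/logread-$stamp.txt $dir/dmesg-$stamp.txt"
}
```

Typical use would be "save_wifi_logs && wifi": save first, then restart.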
Well, it doesn't only happen with android devices. I also have Dell and Lenovo Notebooks here that can't connect anymore.
I didn't quite get the point of your "19.07.0??". I am aware that it's not the latest version; I only just noticed that there are service releases out there. I'll update to 19.07.2 on the weekend. I don't really expect it to resolve the issues, as they have been present since 18.06.2 for me and I suspect some kind of configuration issue.
The kernel log shows anomalies (exceptions, overloads, crashes, etc.) generated by the radio drivers. The system log usually shows events generated by the userspace daemons (e.g. hostapd/wpad), such as STA authenticated/associated/disconnected. What is odd in your case is that wlan1 reports events but, if I understood correctly, wlan0 reports none? I have no experience with Marvell hardware (I use qca/ath9/ath10 hardware). Do you see all configured wlan interfaces when you run "iwinfo" while the problem is occurring? Also try "iw wlan0 scan" to see whether you can still receive other wireless APs during the connectivity problems.
It looks like wlan0 does receive, but the radio driver for phy0 seems to be reporting problems. I have no idea what they mean, but when this event repeats several times per second it does not look OK. When this happens, does the radio keep sending beacons (i.e. is the SSID visible on your clients)? Also notable is that DFS is reporting radars, which means the radio was set to a DFS channel (ch52 or higher) and switched to another channel upon detection. Could it be that the issues started at exactly that moment? You could try setting the radio to a non-DFS channel (ch48 or lower) and see if the issue pops up again.
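To make the "ch48 or lower" versus "ch52 or higher" rule of thumb concrete, here is a tiny helper. It is only a sketch: the function name is made up, the upper bound of 144 is an assumption, and which channels really require DFS depends on your regulatory domain.

```sh
#!/bin/sh
# Rule-of-thumb check whether a 5 GHz channel needs DFS.
# Assumption: channels 52..144 are DFS channels; verify this for your country.

is_dfs_channel() {
    # $1: 5 GHz channel number; exit status 0 means "DFS channel"
    [ "$1" -ge 52 ] && [ "$1" -le 144 ]
}

# Usage example:
#   if is_dfs_channel 100; then echo "channel 100 needs DFS"; fi
```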
EDIT: sorry, I misunderstood the events. The "phy0 change" logs generated a few times per second during the scan are most probably normal. The radio continuously switches channels, looking for activity on other frequencies, when it does not need to receive on its "home" channel.
Thanks a lot for your hints. Now that I've read what DFS is all about and what radar detection has to do with WiFi, I think that you actually might be on to something.
Yes, it was sending beacons on channel 100-112.
I checked my configuration again... My WiFi 5 configuration was set to auto channel with 80 MHz channel width.
I now set the configuration to 40 MHz channel width and fixed the channels to 36-40. I'll give it a try for a few days and see if it's stable now.
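For reference, the same change can probably be made from the command line instead of LuCI. This is a sketch under assumptions: I'm assuming the 5 GHz radio is the UCI section "radio0" (check /etc/config/wireless on your device) and that "VHT40" is the htmode value for 40 MHz on an 802.11ac radio. The function name is made up.

```sh
#!/bin/sh
# Pin the 5 GHz radio to a non-DFS channel with a 40 MHz channel width.
# Assumptions: the radio section is called radio0, htmode VHT40 means 40 MHz.

pin_5ghz_channel() {
    # $1: UCI radio section, $2: primary channel, $3: htmode
    radio="${1:-radio0}"
    channel="${2:-36}"
    htmode="${3:-VHT40}"
    uci set "wireless.$radio.channel=$channel" &&
    uci set "wireless.$radio.htmode=$htmode" &&
    uci commit wireless &&
    wifi    # restart the radios so the new settings take effect
}

# Usage example:
#   pin_5ghz_channel radio0 36 VHT40
```

Going back to auto mode later should just be a matter of setting the channel option to "auto" in the same way.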
Somehow I dislike fixed channels though (although that might be completely absurd). Maybe the problem was that the firmware couldn't find a "good enough" 80 MHz slot to switch to? I think I might try auto mode with 40 or even 20 MHz channel width if the current settings seem to turn out stable.
Indeed, that is my impression as well. When auto-searching for "free" channels, the search is based on the basic settings (such as bandwidth), and far fewer VHT80 channels are available than VHT40 or HT20 channels. In other words, it will not try to reduce the selected bandwidth after an unsuccessful search. Regarding "auto": it is based on an algorithm (built into hostapd) that searches for the least occupied channel, but I never saw evidence that, once the channel has been set, the same algorithm periodically rechecks the entire spectrum for better channels, unless there is a re-initialisation (reboot of the entire system, restart of the radio due to config changes, etc.). The only truly dynamic aspect is the DFS functionality (only on 5 GHz DFS channels, and built into the firmware). When the "auto" algorithm chooses a DFS channel, the DFS algorithm starts: it listens for 60 seconds to make sure no radar is detected before using the channel (DFS-CAC-START to DFS-CAC-COMPLETED) and, once the channel is in production use, keeps listening for radar activity in between the user traffic. Upon detection, this same DFS functionality immediately stops using the channel and starts searching for another one.
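If you want to see when this DFS logic kicks in, one option is to filter the system log for the DFS events. This is a rough sketch: it assumes that all relevant events contain the string "DFS-" (as DFS-CAC-START and DFS-CAC-COMPLETED do), and the function name is made up.

```sh
#!/bin/sh
# Print only the DFS-related lines from a log read on stdin.
# Assumption: every DFS event in the log contains the string "DFS-".

dfs_events() {
    grep 'DFS-'
}

# On the router, for example:
#   logread | dfs_events
```

Comparing the timestamps of those lines with the moment the clients stop connecting should show whether a radar detection and channel switch really coincides with the outage.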
Sooo, after six days my summary is:
It really looks like DFS is the culprit here. Being fixed to channels 36-40 on 40MHz channel bandwidth, the WiFi networks seem to be stable.
Now the problem is that the non-DFS channels 36-48 are kind of crowded already. According to Wikipedia the channels 50-144 are DFS only, while any channels above are SRD channels limited to 25mW...
Would you have any hints on where to start looking to get a working DFS implementation? Is the DFS logic implemented in hostapd too, or is that actually a firmware feature?
For now I'm testing auto with 40MHz channel width. Thanks for clarifying the meaning of auto btw!
I cannot confirm it for sure, but from what I've read in the last few days (and I did quite some research), the behaviour you are describing is exactly what is mandated by regulations.
It doesn't matter if you got to a DFS channel by auto mode (which only selects your best channel on startup) or if you told your AP to start broadcasting on a specific channel. If it's on a DFS channel it has to scan for radars and get out of the way if it detects one.
But yeah, the term "fixed" is kind of misleading here. It is better seen as the initial channel on startup, since the channel only stays fixed if you choose a non-DFS channel.