Frequent client disconnects - where to start?

Hi Guys

Newbie to OpenWrt here. I have searched, but I am somewhat bewildered as to which way to run on this problem.

I bought 5 Aerohive AP121s with stock firmware before Christmas and flashed them with openwrt-22.03.2-ath79-nand-aerohive_hiveap-121-squashfs-factory. They are now running OpenWrt 22.03.3 r20028-43d71ad93e / LuCI openwrt-22.03 branch git-22.361.69894-438c598. These are all set up as dumb APs.

They all flashed ok, with zero problems - so hats off to those that built/maintain the image.

My problem is that from time to time, various clients will fail to route across the network. I say network, as I am not sure how best to find where the issue is when the problem appears (and the APs don't all fail at the same time).

The devices think they are still connected to the AP. For example, Windows does not report a WiFi connection issue... there is just no IP traffic routing. I don't think any clients try to roam to another AP, and I believe it's not a loss of connectivity due to the clients roaming across APs either. I do have 4 SSIDs in use (same on each AP), each on its own VLAN. It appears that all VLANs are affected on the AP when it fails.

The length of time the AP has been running does not appear to make any difference. When the issue appears, it will work/fail over say a 10min period then be ok again for a few hours.

All wired devices on the network work perfectly fine... so neither DNS nor the perimeter router is the cause. I use statping-NG, and it continues to show the LAN interface on the AP responding to pings; likewise, the switch they are connected to doesn't show link-down at any point.

All APs have the same config on them, other than of course IP address, and the only non-image application is ntp.... which I added as the stock build would not sync time. The problem was however there before installing the module.

So - how am I best to approach troubleshooting this?

I have searched around and read the system logs but I cannot see anything suspicious.

Thanks in advance for any guidance.

David

Let’s start with a system diagram. Please also include the channels for each ap.

Hi

Sorry for the late reply.

I have attached a primitive diagram - with notes within.

The routing within our home is robust, and other than the APs being changed to OpenWrt on the AP121 devices no other changes have been implemented. Previous APs were Xirrus devices and all worked perfectly.

Currently on 22.03.3 - public build - issue was present in previous release.

The APs all remain on the LAN and are reachable without error when wifi devices are off-net.

I am just hoping for a guide as to where to start in terms of log digestion - I am sure someone has been here before me so would appreciate any things to check within the logs...

Thanks

So, I would start by creating a physical map of the APs within the building...

Best practice is to ensure that the channels are always different between neighboring APs, and that any re-use of channels (necessary for >3 APs in the 2G band, for example) is done such that they are as physically separated as possible. If you had 1 AP per floor (this may be significantly oversimplified for your floor plan, but follow the idea), you might select:
Floor 1: channel 1, 36
Floor 2: channel 6, 149
Floor 3: channel 11, 157
Floor 4: channel 1, 165
... etc

Note that I'm using channels 1, 6, 11 for 2G. And I'm avoiding DFS channels* on 5G (52-144 are DFS channels in the USA) and also trying to avoid reusing the 5G channels, too.

*DFS channels could be a large part of your problem -- at least for 5G devices. If DFS gets a radar 'hit', it must (by law) shut down the radio for a period of time and then rescan before bringing it back up.... you've chosen DFS channels for 3 of your APs (at least based on the USA DFS channel mapping). See DFS channels.
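If it helps to sanity-check a plan before touching the APs, here is a rough Python sketch of the rules above. The plan is just my example (floor -> 2G channel, 5G channel), the function names are made up, and it assumes consecutive floor numbers are the only "neighbors", which will be too simple for a real floor plan. The DFS range is the USA one (52-144); other regions differ.

```python
# Illustrative plan only: floor -> (2.4 GHz channel, 5 GHz channel).
PLAN = {1: (1, 36), 2: (6, 149), 3: (11, 157), 4: (1, 165)}

# The three non-overlapping 2.4 GHz channels.
NON_OVERLAPPING_2G = {1, 6, 11}


def is_dfs_us(channel):
    """True if a 5 GHz channel is in the USA DFS range (52-144)."""
    return 52 <= channel <= 144


def check_plan(plan):
    """Return a list of warnings for a channel plan (empty list = no findings)."""
    warnings = []
    floors = sorted(plan)

    for floor in floors:
        ch2, ch5 = plan[floor]
        if ch2 not in NON_OVERLAPPING_2G:
            warnings.append(f"floor {floor}: 2G channel {ch2} is not 1, 6 or 11")
        if is_dfs_us(ch5):
            warnings.append(f"floor {floor}: 5G channel {ch5} is a DFS channel (USA)")

    # Assumption: floors with consecutive numbers are physical neighbors.
    for lower, upper in zip(floors, floors[1:]):
        if plan[lower][0] == plan[upper][0]:
            warnings.append(f"floors {lower} and {upper} share 2G channel {plan[lower][0]}")
        if plan[lower][1] == plan[upper][1]:
            warnings.append(f"floors {lower} and {upper} share 5G channel {plan[lower][1]}")

    return warnings


if __name__ == "__main__":
    for warning in check_plan(PLAN):
        print(warning)
```

Swap in your own channels per AP; any line it prints is a spot worth a second look, not proof of a fault.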

Using different channels on neighboring APs will help improve performance in general and will also make client device roaming smoother and more reliable. You should also consider adjusting power levels (usually reducing them) to reduce the overlap as much as possible while still giving adequate coverage.

I like the way that Chris from Crosstalk Solutions describes optimizing wifi in this video.

Next:

  • are you using WPA2, WPA3, or mixed mode WPA2/3?
  • What about 802.11r (fast roaming) and/or k/v?
  • and are you using 802.11s (mesh; given your wired backhaul, this should not be enabled)?

EDIT: Also, make sure you've set the appropriate country code for your region in the wifi settings.
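On the question of what to look for in the logs: one starting point is to count hostapd connect/disconnect and DFS radar events per interface and see whether they line up with the times clients stop passing traffic. Below is a minimal Python sketch of that idea. The event names (AP-STA-CONNECTED, AP-STA-DISCONNECTED, DFS-RADAR-DETECTED) are ones hostapd emits, but the sample lines are made up for illustration and the exact log layout on your build may differ, so treat the regex as an assumption to adjust.

```python
import re
import sys
from collections import Counter

# Matches "hostapd: <iface>: <EVENT>"; the surrounding log layout is an assumption.
EVENT_RE = re.compile(
    r"hostapd: (?P<iface>\S+): "
    r"(?P<event>AP-STA-CONNECTED|AP-STA-DISCONNECTED|DFS-RADAR-DETECTED)"
)


def summarize(lines):
    """Count hostapd events per (interface, event) pair."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            counts[(match.group("iface"), match.group("event"))] += 1
    return counts


# Made-up sample lines, NOT taken from the APs in this thread.
SAMPLE = [
    "Mon Jan  9 10:00:01 2023 daemon.notice hostapd: wlan1: AP-STA-CONNECTED aa:bb:cc:dd:ee:01",
    "Mon Jan  9 10:05:12 2023 daemon.notice hostapd: wlan0: DFS-RADAR-DETECTED freq=5260",
    "Mon Jan  9 10:05:13 2023 daemon.notice hostapd: wlan1: AP-STA-DISCONNECTED aa:bb:cc:dd:ee:01",
    "Mon Jan  9 10:05:14 2023 daemon.info dnsmasq[1]: unrelated line",
]

if __name__ == "__main__":
    # Read a saved log from stdin if one is piped in, otherwise use the sample.
    lines = SAMPLE if sys.stdin.isatty() else sys.stdin
    for (iface, event), n in sorted(summarize(lines).items()):
        print(f"{iface} {event}: {n}")
```

The APs probably won't have Python on them, so save the output of logread from each AP to a file on a PC and pipe it in there. A burst of disconnects on every interface at once, or any radar event, would narrow things down a lot.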