Ok, this has been an issue for months, but I have very little idea how to debug it. Here's what I'm experiencing...
For Apple devices only (iPhones, iPads, HomePods), one of two possible things will happen as frequently as 3 times per day or as infrequently as 3 times per week:
1. WiFi connectivity will completely drop, but the device remains associated with the WiFi network. This means the little wifi symbol still appears in the upper-right of the screen (for iPhones and iPads), and going into Settings still shows a checkmark next to the wifi network name, as well as other relevant info (like DNS, IP addresses, etc.), but all apps are offline and no websites will load on the device. Turning wifi on & off does not fix the issue. Rebooting the device does not fix the issue. Resetting Network Settings on the device does not fix the issue. Rebooting the OpenWRT APs does not fix the issue. Roaming to another OpenWRT AP does not fix the issue. The only thing I've found that does fix the issue is successfully joining a completely different SSID, then re-joining my OpenWRT-powered SSID. I suspect something is being cached somewhere on the device, and that is the only way I've found to "clear" this cache.
2. The device will disassociate from the WiFi network completely (and not rejoin). In this state, the wifi symbol on the screen will disappear and going into Settings will not show a checkmark next to my SSID. Simply toggling wifi on & off will allow the device to rejoin the network with no fuss.
I have only experienced this issue on these Apple devices, and only on the mobile operating systems (iOS, iPadOS). I have never experienced these issues on other wireless clients in my household (Mac Mini, printers, OpenBSD laptops, Android phones, IoT devices, etc.).
I have also only experienced this issue since migrating my home network from UniFi to OpenWRT running on three Zyxel NWA50AX APs across my home. These are all hard-wired to an OpenBSD router. My network is predominantly IPv6 only--you can get a better sense of it from this post.
I've tailed logs on all the OpenWRT devices and on my OpenBSD router and nothing has jumped out at me as suspicious. The issue is infrequent enough that it's been low priority to fix, though it frustratingly seems to affect my kids' iPads or my wife's iPhone at the worst possible times.
I'm open to all suggestions. I'm hoping someone here might know what's up, or maybe has run into a similar issue and has a fix.
Are you sure that iOS isn't using some IPv4-only canary domain to determine the online status? My first approach to debugging this would be to give these devices a full dual-stack network for testing (and to avoid potentially conflicting domain filtering for it).
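To make "dual-stack" concrete for such a test: a client counts as dual-stack when it holds both a routable IPv4 address and a routable IPv6 address at the same time. Below is a minimal Python sketch, assuming the addresses are copied by hand from the device's Settings screen. The function name `classify_stack` and the sample addresses (taken from the documentation ranges) are illustrative and do not come from the network described above.

```python
import ipaddress

def classify_stack(addresses):
    """Classify address strings as 'dual-stack', 'ipv4-only', 'ipv6-only' or 'none'.

    Link-local addresses (169.254.0.0/16, fe80::/10) are ignored because
    they do not give the device a routed path off the local link.
    """
    has_v4 = has_v6 = False
    for text in addresses:
        # Drop an optional prefix length ("/64") or zone id ("%en0").
        addr = ipaddress.ip_address(text.split("/")[0].split("%")[0])
        if addr.is_link_local or addr.is_loopback or addr.is_unspecified:
            continue
        if addr.version == 4:
            has_v4 = True
        else:
            has_v6 = True
    if has_v4 and has_v6:
        return "dual-stack"
    if has_v4:
        return "ipv4-only"
    if has_v6:
        return "ipv6-only"
    return "none"

# Hypothetical example: one IPv4 address plus one global IPv6 address.
print(classify_stack(["192.0.2.10", "fe80::1%en0", "2001:db8::10/64"]))
```

This only tells you which address families the client was given. It does not show whether either family actually reaches the internet, which is the part that would matter for a canary-domain check.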
/DISCLAIMER: no personal experience with Apple devices.
@slh I've no clue about the canary domain, but even if it were, I don't understand why toggling wifi on/off wouldn't fix that (temporarily) in scenario 1 described above.
What do you mean by "full dual-stack?" One of my iPads is dual-stack right now and I believe it still exhibits the issue (though most of my time is spent on another OpenBSD laptop, so hard to say with certainty).
Looks ok.
I don't remember exactly, but ft_over_ds '0' was not working well with some devices, so I changed it to '1'. Here are my wireless settings, which should be up to date. Just remember to change the time zones if you want to try them.
Cool, thanks. Ok, I've updated all three APs, but given the inconsistent nature of the issue, I think I'm going to have to sit back and wait a week to see if this happens again.
Ok, made those updates, but I don't seem to be able to post the output here as the forum is complaining I'm above the character limit and I don't see an "attachment" button anywhere... so here's a gist: https://gist.github.com/neezer/6a333e121a0d47736f2f96fa60355814
Note that I have three APs named musubi, lily, and gus. The logs for each should be in the same gist.
Yes. Roaming is actually a client-side function. You want to set up the APs such that you provide the optimal environment for the client devices to make good roaming decisions. This means:
using the same SSID + encryption type + passphrase on all APs
setting non-overlapping channels for the radios on neighboring APs
optimizing the power levels (often reducing them) such that the overlap area is as small as possible while still providing adequate coverage.
where possible, optimizing the placement of the APs themselves
And it is assumed that all devices except for the main router are configured as bridged APs on the same L2 network. There must only be a single DHCP server on the network, so all but the main router (in most cases) should have their DHCP servers disabled.
I like this video as an explainer about how to optimize per the above. Although it deals with UniFi, the concepts apply to all wifi systems.
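The mechanical parts of that checklist (matching SSID/encryption/passphrase, non-overlapping channels, no extra DHCP servers) can be sanity-checked with a short script. This is a rough Python sketch under several assumptions: the settings are typed in by hand, every AP is treated as a neighbor of every other AP, and channels are 2.4 GHz with 20 MHz width, where channels fewer than 5 apart overlap. The function name `check_aps` and the dictionary keys are made up for illustration, and power levels and placement are not something a script like this can judge.

```python
from itertools import combinations

def check_aps(aps, min_channel_gap=5):
    """Return a list of human-readable problems found in a set of AP settings.

    `aps` maps an AP name to a dict with the keys 'ssid', 'encryption',
    'key', 'channel' (2.4 GHz channel number) and 'dhcp_server' (bool).
    Only the bridged APs belong in `aps`, not the main router.
    """
    problems = []
    # Same SSID + encryption type + passphrase on all APs.
    for field in ("ssid", "encryption", "key"):
        if len({cfg[field] for cfg in aps.values()}) > 1:
            problems.append(f"{field} differs between APs")
    # 2.4 GHz channels overlap unless they are at least 5 apart (1, 6, 11).
    for (a, cfg_a), (b, cfg_b) in combinations(sorted(aps.items()), 2):
        if abs(cfg_a["channel"] - cfg_b["channel"]) < min_channel_gap:
            problems.append(f"{a} and {b} use overlapping channels")
    # Bridged APs should not run their own DHCP server.
    for name, cfg in sorted(aps.items()):
        if cfg["dhcp_server"]:
            problems.append(f"{name} runs a DHCP server")
    return problems

# Hypothetical values for three APs; not anyone's real configuration.
common = {"ssid": "home", "encryption": "sae", "key": "secret", "dhcp_server": False}
aps = {
    "ap1": {**common, "channel": 1},
    "ap2": {**common, "channel": 6},
    "ap3": {**common, "channel": 8},
}
for problem in check_aps(aps):
    print(problem)
```

For 5 GHz radios the overlap rule is different (it depends on the channel width), so the `min_channel_gap` check would need to be replaced rather than reused.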
802.11r (as well as the k and v standards) are optional additions to your 'normal' wifi configuration. They can cause more problems than they solve. That is why I recommend disabling these standards. Enable them only if there is a demonstrated need, and only after optimizing the overall wireless landscape as best as possible. I personally do not use 802.11k/v/r and I have nearly seamless roaming between my APs.
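To see quickly whether any of these standards are still switched on, you could scan the text of each AP's wireless config for the relevant options. The Python sketch below assumes the config is in the usual UCI `option name 'value'` form and that the option names are `ieee80211r`, `ieee80211k`, `ieee80211v` and `bss_transition`. Those names are my understanding of OpenWrt's wireless options and should be checked against the documentation for your release. The function name `enabled_roaming_options` and the sample text are illustrative only.

```python
import re

# Assumed option names; verify against the OpenWrt docs for your release.
ROAMING_OPTIONS = ("ieee80211r", "ieee80211k", "ieee80211v", "bss_transition")

# Matches lines such as:  option ieee80211r '1'
OPTION_LINE = re.compile(r"""\s*option\s+(\w+)\s+['"]?([^'"\s]+)['"]?""")

def enabled_roaming_options(uci_text):
    """Return the roaming-related options that are set to '1' in a UCI config text."""
    enabled = []
    for line in uci_text.splitlines():
        match = OPTION_LINE.match(line)
        if match and match.group(1) in ROAMING_OPTIONS and match.group(2) == "1":
            enabled.append(match.group(1))
    return enabled

# Hypothetical config excerpt, not taken from any real AP.
sample = """
config wifi-iface 'default_radio0'
    option ssid 'example'
    option ieee80211r '1'
    option ft_over_ds '0'
    option ieee80211k '0'
"""
print(enabled_roaming_options(sample))
```

An empty result only means none of the listed options are explicitly set to '1'. It does not account for defaults that may differ between releases.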