Debugging wifi drops on Apple devices

Ok, this has been an issue for months, but I have very little idea how to debug it. Here's what I'm experiencing...

For Apple devices only (iPhones, iPads, HomePods), one of two possible things will happen as frequently as 3 times per day or as infrequently as 3 times per week:

  1. WiFi connectivity will completely drop, but the device remains associated with the WiFi network. This means the little wifi symbol still appears in the upper-right of the screen (for iPhones and iPads), and going into Settings still shows a checkmark next to the wifi network name, as well as other relevant info (like DNS, IP addresses, etc.), but all apps are offline and no websites will load on the device. Turning wifi on & off does not fix the issue. Rebooting the device does not fix the issue. Resetting Network Settings on the device does not fix the issue. Rebooting the OpenWRT APs does not fix the issue. Roaming to another OpenWRT AP does not fix the issue. The only thing I've found that does fix the issue is successfully joining a completely different SSID, then re-joining my OpenWRT-powered SSID. I suspect something is being cached somewhere on the device, and that is the only way I've found to "clear" this cache.
  2. The device will disassociate from the WiFi network completely (and not rejoin). In this state, the wifi symbol on the screen will disappear and going into Settings will not show a checkmark next to my SSID. Simply toggling wifi on & off will allow the device to rejoin the network with no fuss.

I have only experienced this issue on these Apple devices, and only on the mobile operating systems (iOS, iPadOS). I have never experienced these issues on other wireless clients in my household (Mac Mini, printers, OpenBSD laptops, Android phones, IoT devices, etc.).

I have also only experienced this issue since migrating my home network from UniFi to OpenWRT running on three Zyxel NWA50AX APs spread across my home. These are all hard-wired to an OpenBSD router. My network is predominantly IPv6-only; you can get a better sense of it from this post.

I've tailed logs on all the OpenWRT devices and on my OpenBSD router and nothing has jumped out at me as suspicious. The issue is infrequent enough that it's been low priority to fix, though it frustratingly seems to affect my kids' iPads or my wife's iPhone at the worst possible times. :man_facepalming:

I'm open to all suggestions. I'm hoping someone here might know what's up, or maybe has run into a similar issue and has a fix.

Please share the output of cat /etc/config/wireless and opkg list-installed | grep -E 'wolfssl|openssl|mbedtls'

root@musubi:~# cat /etc/config/wireless

config wifi-device 'radio0'
        option type 'mac80211'
        option path '1e140000.pcie/pci0000:00/0000:00:01.0/0000:02:00.0'
        option channel '1'
        option band '2g'
        option htmode 'HE20'
        option cell_density '0'

config wifi-iface 'default_radio0'
        option device 'radio0'
        option network 'lan'
        option mode 'ap'
        option ssid 'There'\''s No Place Like ::1'
        option encryption 'psk2'
        option key '██████████████████████████████'
        option ieee80211r '1'
        option mobility_domain '0fe2'
        option ft_over_ds '0'
        option ft_psk_generate_local '1'

config wifi-device 'radio1'
        option type 'mac80211'
        option path '1e140000.pcie/pci0000:00/0000:00:01.0/0000:02:00.0+1'
        option channel '36'
        option band '5g'
        option htmode 'HE80'
        option cell_density '0'

config wifi-iface 'default_radio1'
        option device 'radio1'
        option network 'lan'
        option mode 'ap'
        option ssid 'There'\''s No Place Like ::1'
        option encryption 'psk2'
        option key '██████████████████████████████'
        option ieee80211r '1'
        option mobility_domain '0fe2'
        option ft_over_ds '0'
        option ft_psk_generate_local '1'
        
root@musubi:~# opkg list-installed | grep -E 'wolfssl|openssl|mbedtls'
libmbedtls12 - 2.28.4-1
libustream-mbedtls20201210 - 2023-02-25-498f6e26-1
px5g-mbedtls - 9

Are you sure that iOS isn't using some IPv4-only canary domain to determine its online status? My first approach to debugging this would be to give these devices a full dual-stack network for testing (and to avoid any potentially conflicting domain filtering).

/DISCLAIMER: no personal experience with Apple devices.

And there is no wpad-mbedtls installed?
Also please add:

cat /var/run/hostapd-*conf | grep -E '80211w|wnm|bss_transitional'

Like @slh, I don't have any Apple devices, but I don't recall any of my friends having wireless issues with theirs.

cat /var/run/hostapd-*conf | grep -E '80211w|wnm|bss_transitional'

Came back empty.

@slh I've no clue about the canary domain, but even if it were, I don't understand why toggling wifi on/off wouldn't fix that (temporarily) in scenario 1 described above.

What do you mean by "full dual-stack?" One of my iPads is dual-stack right now and I believe it still exhibits the issue (though most of my time is spent on another OpenBSD laptop, so hard to say with certainty).

Dual stack -> IPv4 and IPv6. If you have IPv6 enabled, just try turning it off, or disable ULA.

I was thinking that your issue might be that 802.11w is disabled. If it is, try setting it to "required" and see how it goes.

Dual stack -> ipv4 and ipv6.

Yeah, that's what I figured. Guess I got tripped up with the "full" qualifier.

I'll check on 80211w and report back. Just to make sure I'm on the same page, we're talking about this, right?

Yes, that is the option.
NOTE: some devices might not connect when "required" is set, but try it now and see whether it helps.
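
For anyone following along from the command line, the same setting can be sketched with uci. This assumes the interface section names default_radio0 and default_radio1 from the config posted above; set_pmf is only an illustrative helper name, not something OpenWrt ships. In OpenWrt's wireless config, ieee80211w takes 0 (disabled), 1 (optional), or 2 (required).

```shell
#!/bin/sh
# Sketch: set 802.11w (management frame protection) on both AP interfaces.
# Section names are taken from the /etc/config/wireless posted earlier.
# set_pmf is a hypothetical helper name used only for this example.
set_pmf() {
    mode="$1"   # 0 = disabled, 1 = optional, 2 = required
    for iface in default_radio0 default_radio1; do
        uci set "wireless.${iface}.ieee80211w=${mode}"
    done
    uci commit wireless
    wifi reload   # re-apply the wireless config
}
```

Calling set_pmf 2 on each AP matches the "required" suggestion; if a client then refuses to connect, set_pmf 1 falls back to "optional".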

Any recommendations for the timeout values? Or fine to leave at the defaults?

Looks ok.
I don't remember exactly, but ft_over_ds '0' was not working well with some devices, so I changed it to 1.
Here are my wireless settings; they should be up to date. Just remember to change the timezones if you want to try them.

Cool, thanks. Ok, I've updated all three APs, but given the inconsistent nature of the issue, think I'm going to have to sit back and wait a week to see if this happens again. :crossed_fingers:

@danpawlik Ok, the problem is still here. My wife's iPhone wifi dropped last Thursday and my kid's iPad dropped on Saturday. Any other thoughts?

Enable debugging, e.g.:

uci set wireless.default_radio0.log_level=1
uci set wireless.default_radio1.log_level=1
uci commit wireless

and send the output of logread | grep -A5 -B5 hostapd
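
Since the drops only hit a few devices, it may help to narrow the hostapd lines down to one client at a time. A rough sketch, assuming the affected device's MAC address is known; client_events is a made-up helper name and the MAC in the usage comment is a placeholder.

```shell
#!/bin/sh
# Sketch: keep only hostapd log lines that mention one client.
# client_events is a hypothetical helper; it reads log text on stdin.
client_events() {
    mac="$1"
    # -i: MAC case can differ between tools; -F: match the MAC as a literal string
    grep hostapd | grep -iF "$mac"
}

# On an AP (placeholder MAC):
#   logread | client_events aa:bb:cc:dd:ee:ff
```

Note that iOS uses a private (randomized) Wi-Fi address per network by default, so the MAC to search for is the one shown in the device's settings for this SSID.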

Ok, made those updates, but I don't seem to be able to post the output here as the forum is complaining I'm above the character limit and I don't see an "attachment" button anywhere... so here's a gist: https://gist.github.com/neezer/6a333e121a0d47736f2f96fa60355814

Note that I have three APs named musubi, lily, and gus. The logs for each should be in the same gist.

Strange. What version are you using?
You can try to:

  • Disassociate On Low Acknowledgement (unset in luci)
  • Disable Inactivity Polling (set in luci)
  • Change from mbedtls to openssl (though I doubt it will help).

If you are not using a snapshot release, it may be worth trying one.
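
The two LuCI checkboxes above should correspond to the disassoc_low_ack and skip_inactivity_poll options in /etc/config/wireless, so a uci version might look like the sketch below. The section names are assumed from the config posted earlier, and relax_idle_handling is just an illustrative name.

```shell
#!/bin/sh
# Sketch: turn off "Disassociate On Low Acknowledgement" and turn on
# "Disable Inactivity Polling" for both AP interfaces.
# relax_idle_handling is a hypothetical helper name.
relax_idle_handling() {
    for iface in default_radio0 default_radio1; do
        uci set "wireless.${iface}.disassoc_low_ack=0"
        uci set "wireless.${iface}.skip_inactivity_poll=1"
    done
    uci commit wireless
    wifi reload
}
```

Run relax_idle_handling on each of the three APs so they all behave the same way.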

What version are you using?

OpenWrt 23.05.0, r23497-6637af95aa

You should consider upgrading to 23.05.4. But I don't think that will resolve the issues you're having.

Meanwhile, I'd recommend disabling 802.11r entirely. I assume you have multiple APs (please confirm)? Turn it off on all APs.
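
As a sketch of what turning it off could look like with uci, assuming the default_radio0 and default_radio1 sections from the config posted above (disable_ft is a made-up helper name):

```shell
#!/bin/sh
# Sketch: disable 802.11r (fast transition) on both AP interfaces.
# disable_ft is a hypothetical helper name used only for this example.
disable_ft() {
    for iface in default_radio0 default_radio1; do
        uci set "wireless.${iface}.ieee80211r=0"
    done
    uci commit wireless
    wifi reload
}
```

The other 802.11r options (mobility_domain, ft_over_ds, ft_psk_generate_local) can stay in the file, which makes it easy to switch back later. Run disable_ft on every AP.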

Yes, I have three.

Would doing this still allow my devices to roam between each AP? That's behavior I'd rather not lose, if I can avoid it.

Yes. Roaming is actually a client-side function. You want to set up the APs such that you provide the optimal environment for the client devices to make good roaming decisions. This means:

  • using the same SSID + encryption type + passphrase on all APs
  • setting non-overlapping channels for the radios on neighboring APs
  • optimizing the power levels (often reducing them) such that the overlap area is as small as possible while still providing adequate coverage.
  • where possible, optimizing the placement of the APs themselves
  • And it is assumed that all devices except for the main router are configured as bridged APs on the same L2 network. There must only be a single DHCP server on the network, so all but the main router (in most cases) should have their DHCP servers disabled.
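
A rough per-AP sketch of the channel, power, and DHCP points above. It assumes the radio0 (2.4 GHz) and radio1 (5 GHz) section names from the config posted earlier; tune_ap is a made-up helper, and the channel and power numbers in the usage comments are example values only, not recommendations for this particular house.

```shell
#!/bin/sh
# Sketch: give one AP its own channels and transmit power, and make sure
# it is not handing out DHCP leases. tune_ap is a hypothetical helper.
# Arguments: <2.4 GHz channel> <5 GHz channel> <2.4 GHz dBm> <5 GHz dBm>
tune_ap() {
    uci set "wireless.radio0.channel=$1"
    uci set "wireless.radio1.channel=$2"
    uci set "wireless.radio0.txpower=$3"
    uci set "wireless.radio1.txpower=$4"
    # Only touch DHCP if this AP actually has a dhcp.lan section.
    if uci -q get dhcp.lan >/dev/null; then
        uci set dhcp.lan.ignore=1
        uci commit dhcp
    fi
    uci commit wireless
    wifi reload
}

# Example values only: 1/6/11 do not overlap on 2.4 GHz at 20 MHz, and
# 36/100/149 fall in different 80 MHz blocks on 5 GHz (100 is a DFS channel).
#   AP 1: tune_ap 1  36  12 17
#   AP 2: tune_ap 6  149 12 17
#   AP 3: tune_ap 11 100 12 17
```

The DHCP change only takes effect once the DHCP service restarts or the AP reboots.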

I like this video as an explainer about how to optimize per the above. Although it deals with UniFi, the concepts apply to all wifi systems.

802.11r (as well as the k and v standards) are optional additions to your 'normal' wifi configuration. They can cause more problems than they solve. That is why I recommend disabling these standards. Enable them only if there is a demonstrated need, and only after optimizing the overall wireless landscape as best as possible. I personally do not use 802.11k/v/r and I have nearly seamless roaming between my APs.
