Wi-Fi connection sometimes breaks after AP roaming

Hi, maybe someone has a hint for my problem.
My setup: OPNsense firewall acting as DHCP server and router
1 switch (old HP model)
2 APs with OpenWrt 21.02.1 (Linksys WRT3200ACM and Linksys WRT1900ACS)

My issue: my phone sometimes loses its WLAN connection and can't reconnect.
I think it happens (not every time, but often) when it switches from one AP to the other.
I think this started with OpenWrt version 21, but that could be a coincidence.
When the connection is lost, the Wi-Fi indicator still shows full strength, but the phone can't reach the gateway (tested with a simple ping from a terminal emulator on Android).
Just waiting doesn't help.
Workaround: disable and re-enable Wi-Fi, then everything works again.

WLAN config (both APs, one after the other):

config wifi-device 'radio0'
	option type 'mac80211'
	option channel '36'
	option hwmode '11a'
	option path 'soc/soc:pcie/pci0000:00/0000:00:01.0/0000:01:00.0'
	option cell_density '0'
	option htmode 'VHT40'
	option country 'AT'

config wifi-device 'radio1'
	option type 'mac80211'
	option hwmode '11g'
	option path 'soc/soc:pcie/pci0000:00/0000:00:02.0/0000:02:00.0'
	option htmode 'HT20'
	option cell_density '0'
	option channel '1'
	option country 'AT'
	option legacy_rates '1'

config wifi-device 'radio2'
	option type 'mac80211'
	option hwmode '11a'
	option path 'platform/soc/soc:internal-regs/f10d8000.sdhci/mmc_host/mmc0/mmc0:0001/mmc0:0001:1'
	option htmode 'VHT80'
	option country 'AT'
	option cell_density '0'
	option channel '38'
	option txpower '20'
	option disabled '1'

config wifi-iface 'wifinet5'
	option device 'radio0'
	option mode 'ap'
	option ssid 'xxx'
	option encryption 'psk2+ccmp'
	option key 'yyy'
	option network 'INTERNAL'
	option macaddr '30:23:03:de:13:72'
	option disassoc_low_ack '0'

config wifi-iface 'wifinet6'
	option device 'radio0'
	option mode 'ap'
	option ssid 'guest'
	option encryption 'psk2+ccmp'
	option key 'welcome1'
	option network 'GUEST'
	option macaddr '30:23:03:de:13:71'
	option disassoc_low_ack '0'

config wifi-iface 'wifinet2'
	option device 'radio1'
	option mode 'ap'
	option ssid 'guest'
	option encryption 'psk2+ccmp'
	option key 'welcome1'
	option network 'GUEST'
	option disassoc_low_ack '0'

config wifi-iface 'wifinet3'
	option device 'radio1'
	option mode 'ap'
	option ssid 'xxx'
	option encryption 'psk2+ccmp'
	option key 'yyy'
	option network 'INTERNAL'
	option disassoc_low_ack '0'

config wifi-device 'radio0'
	option type 'mac80211'
	option hwmode '11a'
	option path 'soc/soc:pcie/pci0000:00/0000:00:01.0/0000:01:00.0'
	option cell_density '0'
	option htmode 'VHT40'
	option country 'AT'
	option channel '48'

config wifi-iface 'default_radio0'
	option device 'radio0'
	option mode 'ap'
	option macaddr '58:ef:68:0c:f2:45'
	option ssid 'xxx'
	option encryption 'psk2+ccmp'
	option key 'yyy'
	option network 'INTERNAL'
	option disassoc_low_ack '0'

config wifi-device 'radio1'
	option type 'mac80211'
	option hwmode '11g'
	option path 'soc/soc:pcie/pci0000:00/0000:00:02.0/0000:02:00.0'
	option htmode 'HT20'
	option cell_density '0'
	option country 'AT'
	option legacy_rates '1'
	option channel '6'

config wifi-iface 'default_radio1'
	option device 'radio1'
	option mode 'ap'
	option macaddr '58:ef:68:0c:f2:44'
	option ssid 'xxx'
	option encryption 'psk2+ccmp'
	option key 'yyy'
	option network 'INTERNAL'
	option disassoc_low_ack '0'

config wifi-iface 'wifinet2'
	option device 'radio0'
	option mode 'ap'
	option ssid 'guest'
	option encryption 'psk2+ccmp'
	option key 'welcome1'
	option network 'GUEST'
	option disassoc_low_ack '0'

config wifi-iface 'wifinet3'
	option device 'radio1'
	option mode 'ap'
	option ssid 'guest'
	option encryption 'psk2+ccmp'
	option key 'welcome1'
	option network 'GUEST'
	option disassoc_low_ack '0'

I tried setting a fixed IP on the phone; it didn't help.
I can provide additional logs if needed.
Where should I start looking for the issue: firewall/router, switch, or APs?

Are these devices running in a dumb AP mode? Does this happen on one of your APs every time?

It appears you have VLANs running to enable multiple networks. Is it safe to assume that you have set up these VLANs on your main router? Have you verified that the switch is configured properly, with trunk ports to each of the APs and to the router? Does the problem happen on both networks or just one of them?

We'll probably want to see the contents of /etc/config/network for the problematic AP(s).

You also probably have issues with your radios -- channel width and selection.

radio0 VHT40 channel 36
radio1 HT20 channel 1 <--- ok
radio2 VHT80 channel 38

radio0 VHT40 channel 48
radio1 HT20 channel 6 <--- ok

Your 5 GHz radios are trampling each other. It doesn't appear that radio2 is associated with a network, but the channels (36 and 38) will cause major issues. Set everything to VHT40 and spread your channels out properly.
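
As a rough cross-check of a 5 GHz channel plan, the following Python sketch lists which 20 MHz channels a radio covers for a given primary channel and width. It assumes that the 'channel' option is the primary 20 MHz channel and that 40/80 MHz operation uses the standard bonding groups. The function name occupied() is made up for this illustration, and regulatory limits for country 'AT' (DFS, permitted channels) are not modelled.

```python
# Sketch only: standard 5 GHz bonding groups, given as 20 MHz channel numbers.
BLOCKS_40 = [(36, 40), (44, 48), (52, 56), (60, 64),
             (100, 104), (108, 112), (116, 120), (124, 128),
             (132, 136), (140, 144), (149, 153), (157, 161)]
BLOCKS_80 = [(36, 40, 44, 48), (52, 56, 60, 64),
             (100, 104, 108, 112), (116, 120, 124, 128),
             (132, 136, 140, 144), (149, 153, 157, 161)]
# Every 20 MHz channel number that can act as a primary channel.
PRIMARY_20 = {c for block in BLOCKS_40 for c in block} | {165}


def occupied(primary, htmode):
    """Return the set of 20 MHz channels covered by a radio, or None if
    'primary' is not a valid primary channel for that width.

    'primary' is assumed to be the value of the UCI 'channel' option,
    'htmode' a string such as 'HT20', 'VHT40' or 'VHT80'.
    """
    width = int("".join(ch for ch in htmode if ch.isdigit()))
    if width == 20:
        return {primary} if primary in PRIMARY_20 else None
    if width == 40:
        blocks = BLOCKS_40
    elif width == 80:
        blocks = BLOCKS_80
    else:
        raise ValueError(f"unsupported width in this sketch: {htmode}")
    for block in blocks:
        if primary in block:
            return set(block)
    return None


# The 5 GHz settings posted above.
ap1_radio0 = occupied(36, "VHT40")
ap2_radio0 = occupied(48, "VHT40")
ap1_radio2 = occupied(38, "VHT80")

print("AP1 radio0:", ap1_radio0)
print("AP2 radio0:", ap2_radio0)
print("shared channels:", ap1_radio0 & ap2_radio0)
print("AP1 radio2:", ap1_radio2)
```

Under these assumptions, a result of None means the channel number is not a usable primary channel for that width, and a non-empty "shared channels" set means two radios sit in the same part of the band.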

Especially during debugging, it would be better to disable the mwifiex-based third radio (remove its kernel module); it's barely usable anyway (1x1, basically no antenna) and can mess up the region code massively.

There was an error affecting mwlwifi in 21.02.0 and 21.02.1, which will be fixed in the as-yet-unreleased 21.02.2. Ideally, switch to a 21.02 or master snapshot for the time being (until 21.02.2).

Thanks for all your questions; I'll try to answer them.
My APs are running in "dumb" AP mode.
It happens often, but not always. I may have to test further to see whether I can force the error; so far I haven't done extensive field testing.
VLANs: yes, there are VLANs, and they are set up on both the router and the switch. Everything works.
Does it happen on both networks? I can't answer that today; the second WLAN is a guest WLAN with restrictions, and I normally don't use it.
Radios: radio2 has option disabled '1', so it is not in use. That shouldn't be a problem, should it?

Removing the kernel module: any quick hint on how to do that?

My network config from both APs:


config interface 'loopback'
	option device 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fd0f:97c2:5baa::/48'

config device
	option name 'br-lan'
	option type 'bridge'
	list ports 'lan1'
	list ports 'lan2'
	list ports 'lan3'
	list ports 'lan4'

config interface 'lan'
	option device 'br-lan.1'
	option proto 'static'
	option ipaddr '192.168.1.249'
	option netmask '255.255.255.0'
	option gateway '192.168.1.254'
	option broadcast '192.168.1.255'
	list dns '192.168.10.8'
	list dns '192.168.10.9'
	list dns_search 'zzz'

config device
	option name 'wan'
	option macaddr '32:23:03:de:13:70'

config interface 'wan'
	option device 'wan'
	option proto 'dhcp'
	option auto '0'

config interface 'wan6'
	option device 'wan'
	option proto 'dhcpv6'
	option auto '0'
	option reqaddress 'try'
	option reqprefix 'auto'

config interface 'INTERNAL'
	option proto 'none'
	option device 'br-lan.10'
	option force_link '1'

config interface 'GUEST'
	option proto 'none'
	option device 'br-lan.9'
	option force_link '1'

config device
	option type '8021q'
	option ifname 'br-lan'
	option vid '1'
	option name 'br-lan.1'

config device
	option type '8021q'
	option ifname 'br-lan'
	option vid '9'
	option name 'br-lan.9'

config device
	option type '8021q'
	option ifname 'br-lan'
	option vid '10'
	option name 'br-lan.10'

config bridge-vlan
	option device 'br-lan'
	option vlan '1'
	list ports 'lan1:u*'

config bridge-vlan
	option device 'br-lan'
	option vlan '9'
	list ports 'lan1:t'

config bridge-vlan
	option device 'br-lan'
	option vlan '10'
	list ports 'lan1:t'

config interface 'loopback'
	option device 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fd81:e2a4:489f::/48'

config device
	option name 'br-lan'
	option type 'bridge'
	list ports 'lan1'
	list ports 'lan2'
	list ports 'lan3'
	list ports 'lan4'

config interface 'lan'
	option device 'br-lan.1'
	option proto 'static'
	option ipaddr '192.168.1.250'
	option netmask '255.255.255.0'
	option gateway '192.168.1.254'
	option broadcast '192.168.1.255'
	list dns '192.168.10.8'
	list dns '192.168.10.9'
	list dns_search 'zzz'

config device
	option name 'wan'
	option macaddr '5a:ef:68:0c:f2:43'

config interface 'wan'
	option device 'wan'
	option proto 'dhcp'
	option auto '0'

config interface 'wan6'
	option device 'wan'
	option proto 'dhcpv6'
	option auto '0'
	option reqaddress 'try'
	option reqprefix 'auto'

config device
	option type '8021q'
	option ifname 'br-lan'
	option vid '1'
	option name 'br-lan.1'

config bridge-vlan
	option device 'br-lan'
	option vlan '1'
	list ports 'lan1:u*'

config device
	option type '8021q'
	option ifname 'br-lan'
	option vid '9'
	option name 'br-lan.9'

config device
	option type '8021q'
	option ifname 'br-lan'
	option vid '10'
	option name 'br-lan.10'

config bridge-vlan
	option device 'br-lan'
	option vlan '9'
	list ports 'lan1:t'

config bridge-vlan
	option device 'br-lan'
	option vlan '10'
	list ports 'lan1:t'

config interface 'INTERNAL'
	option proto 'none'
	option device 'br-lan.10'
	option force_link '1'

config interface 'GUEST'
	option proto 'none'
	option device 'br-lan.9'
	option force_link '1'

config device
	option name 'lan1'

@psherman: Did you find time to take a look at my config? Would really appreciate it....

Hey... sorry for the delay.

I don't see anything obvious in your configuration that would cause the problem... do the logs reveal anything?

Can you point me in the right direction: which log should I check? According to the phone there is a connection (Wi-Fi symbol), but the terminal emulator shows that I'm unable to ping the gateway. I'm not sure whether it is possible to dig through the logs on a non-rooted Android phone.
AP logs?
Possibly logs in OPNsense?

My networking know-how is basic; I currently don't understand why Wi-Fi is "working" and the device has an IP but is not able to ping anything. As stated before: Wi-Fi off -> on, and everything works again...
I'm currently guessing that the problem is my phone, not the infrastructure, but maybe I'm wrong...

I'd start with the AP logs. You can see them in the GUI under Status > System Logs, or you can get them on the CLI by using logread. Since the logs have timestamps, you can hopefully find entries that correspond to the loss of connectivity you are experiencing.
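
To line up log entries with a connectivity loss, one option is to save the logread output to a file, copy it to a PC, and filter it there. The Python sketch below is only an illustration under a few assumptions: the helper name events_for() is made up, the timestamp format is taken from the log excerpt posted further down in this thread (the sample lines are copied from it), and kernel error lines are kept in addition to lines mentioning the client's MAC address because they may belong to the same incident.

```python
import re
from datetime import datetime

# One logread line: "<weekday> <month> <day> <time> <year> <facility>.<level> <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}) "
    r"(?P<facility>\w+)\.(?P<level>\w+) (?P<msg>.*)$"
)


def events_for(lines, mac):
    """Return (timestamp, level, message) tuples for lines that mention
    'mac' or that are kernel errors; all other lines are skipped."""
    mac = mac.lower()
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not in the expected format
        mentions_client = mac in m["msg"].lower()
        is_kernel_error = m["facility"] == "kern" and m["level"] == "err"
        if mentions_client or is_kernel_error:
            # %a and %b expect English day/month names (default C locale).
            ts = datetime.strptime(m["ts"], "%a %b %d %H:%M:%S %Y")
            events.append((ts, m["level"], m["msg"]))
    return events


# Sample lines copied from the excerpt later in this thread.
sample = """\
Mon Jan 24 19:46:15 2022 daemon.notice hostapd: wlan0: AP-STA-CONNECTED 39:c2:1f:2b:64:44
Mon Jan 24 19:57:21 2022 kern.debug kernel: [52030.685775] ieee80211 phy1: Mac80211 start BA d4:11:a3:d9:08:b7
Mon Jan 24 20:38:28 2022 daemon.notice hostapd: wlan0: AP-STA-DISCONNECTED 39:c2:1f:2b:64:44
Mon Jan 24 20:38:48 2022 kern.err kernel: [54517.841915] ieee80211 phy0: cmd 0x9122=UpdateEncryption timed out
""".splitlines()

for ts, level, msg in events_for(sample, "39:c2:1f:2b:64:44"):
    print(ts.isoformat(), level, msg)
```

With a real log file, the sample list would be replaced by the lines read from that file, and the MAC address by the one of the affected client.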

This is obviously an important thing to figure out. If this phone is the only one experiencing the issue, it could well be something wrong with that device. However, if you can reproduce the problem on other devices (laptops, other phones, tablets, etc.), then it could well be infrastructure.

I'm still trying to reproduce it; currently everything is working without any issue (with no change at all ;-)).
I'll report back if I can grab a log file when the issue happens.

Today it happened again. I tried to leave the bad state "as is", but after I moved back to the other AP it worked again.
I think these are the relevant log entries:

Mon Jan 24 19:46:14 2022 daemon.info hostapd: wlan0: STA 39:c2:1f:2b:64:44 IEEE 802.11: authenticated
Mon Jan 24 19:46:14 2022 daemon.notice hostapd: wlan0: STA-OPMODE-N_SS-CHANGED 39:c2:1f:2b:64:44 2
Mon Jan 24 19:46:14 2022 daemon.info hostapd: wlan0: STA 39:c2:1f:2b:64:44 IEEE 802.11: associated (aid 1)
Mon Jan 24 19:46:15 2022 daemon.notice hostapd: wlan0: AP-STA-CONNECTED 39:c2:1f:2b:64:44
Mon Jan 24 19:46:15 2022 daemon.info hostapd: wlan0: STA 39:c2:1f:2b:64:44 WPA: pairwise key handshake completed (RSN)
Mon Jan 24 19:48:05 2022 kern.debug kernel: [51474.533740] ieee80211 phy0: Mac80211 start BA 39:c2:1f:2b:64:44
Mon Jan 24 19:57:21 2022 kern.debug kernel: [52030.685775] ieee80211 phy1: Mac80211 start BA d4:11:a3:d9:08:b7
Mon Jan 24 20:38:28 2022 daemon.notice hostapd: wlan0: AP-STA-DISCONNECTED 39:c2:1f:2b:64:44
Mon Jan 24 20:38:28 2022 daemon.info hostapd: wlan0: STA 39:c2:1f:2b:64:44 IEEE 802.11: disassociated due to inactivity
Mon Jan 24 20:38:48 2022 kern.err kernel: [54517.841915] ieee80211 phy0: cmd 0x9122=UpdateEncryption timed out
Mon Jan 24 20:38:48 2022 kern.err kernel: [54517.848037] ieee80211 phy0: return code: 0x1122
Mon Jan 24 20:38:48 2022 kern.err kernel: [54517.852598] ieee80211 phy0: timeout: 0x1122
Mon Jan 24 20:38:48 2022 kern.err kernel: [54517.856798] wlan0: failed to remove key (0, 39:c2:1f:2b:64:44) from hardware (-5)
Mon Jan 24 20:38:48 2022 daemon.notice hostapd: nl80211: nl80211_recv_beacons->nl_recvmsgs failed: -5
Mon Jan 24 20:38:48 2022 daemon.info hostapd: wlan0: STA 39:c2:1f:2b:64:44 IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)

39:c2:1f:2b:64:44 is the phone's MAC address.
"wlan0: failed to remove key (0, 39:c2:1f:2b:64:44) from hardware (-5)" seems to be an error message.
The AP is a Linksys WRT1900ACS; this is mentioned here:

any ideas?

On this, I really can't offer any assistance. Sorry. Hopefully someone else can chime in.

Concerning my specific issue with the WRT1900ACS2 5 GHz network losing WLAN access, I did ultimately disable scheduled reboots and switch to a snapshot build. I cannot remember whether switching to the snapshot was necessary to fix this issue or was something I did to fix another one. Either way, I no longer have this specific problem.

@psherman: Thanks anyway for your time; it seems that this is a bug of some kind.
@User34: I have a scheduled reboot of sorts, since I power off my APs during the night ;-), but I don't think that letting them run overnight would solve this issue. Maybe the snapshot build is worth a try...

I would look at this thread, which has many of the answers you may be looking for. Conclusion: update to the latest snapshot build (sometime after Nov 25th 2021) and use channel 36.

@User34 Thank you for pointing me in that direction... I've updated both APs to the latest snapshot (a brief moment of shock, as LuCI isn't included in snapshots ;-)), and both are now running yesterday's snapshot...
I'll try to force the issue, but as mentioned before, it happened randomly and infrequently.

WiFi on these devices is terrible. It has lots of problems like the one you mention, as well as others... Random dropouts, difficulty with IPv6 stuff, just generally buggy. The drivers are abandoned... Your best bet is to move to a different chipset.

Yeah, I've heard about that... At the time I bought them, they were the only ones with good OpenWrt support... Currently I have no money to replace them, so I have to live with it...

I think you'll have good luck with the master snapshot build as far as WiFi performance goes. So far I haven't noticed any issues since my master build update yesterday. It's come a long way in the last 6 months. Sorry I didn't think to warn you that LuCI is a package you'll need to install. I keep a running list of packages that I install via SSH for any new builds/upgrades.
