I have also monitored the wifi events without the workaround, using
iw event -f -t
and noticed that when things go bad you get event numbers 64 and 84:
64 = notify_cqm
84 = probe_client
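A rough sketch for tallying how often each event number shows up in a saved iw event log. It assumes the events are printed as "unknown event N", which is the wording quoted elsewhere in this thread; `tally_unknown_events` is a made-up helper name, not part of iw.

```shell
#!/bin/sh
# Tally "unknown event N" lines read from stdin.
# Assumption: iw prints unrecognised events as "unknown event <number>".
# Output: one "<event number> <count>" pair per line, sorted by event number.
tally_unknown_events() {
    sed -n 's/.*unknown event \([0-9][0-9]*\).*/\1/p' \
        | sort -n \
        | uniq -c \
        | awk '{ print $2, $1 }'
}

# Usage on the router (not run here; the log path is whatever you saved to):
#   tally_unknown_events < /path/to/saved-iw-event.log
```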
I am monitoring the iw event logs by saving the log in /tmp by running
I run into a similar problem on a regular basis. Restarting radio0 for 2.4G helps. Among the interesting things in the kernel log, I see the following entries:
[167014.660438] device wlan0 left promiscuous mode
[167014.665114] br-lan: port 5(wlan0) entered disabled state
[167014.832227] mt7622-wmac 18000000.wmac: Message 000025ed (seq 13) timeout
[167014.839140] mt7622-wmac 18000000.wmac: Message 00002aed (seq 14) timeout
[167015.305356] br-lan: port 5(wlan0) entered blocking state
[167015.310767] br-lan: port 5(wlan0) entered disabled state
[167015.316530] device wlan0 entered promiscuous mode
[167015.321506] br-lan: port 5(wlan0) entered blocking state
[167015.326961] br-lan: port 5(wlan0) entered forwarding state
[167015.721544] br-lan: port 5(wlan0) entered disabled state
[167024.735905] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[167024.742584] br-lan: port 5(wlan0) entered blocking state
[167024.747992] br-lan: port 5(wlan0) entered forwarding state
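To check whether a "no connectivity" episode matches this case, one option is to filter the kernel log for the mt7622-wmac message timeouts. This is only a sketch that assumes the line format in the excerpt above; `wmac_timeouts` is a made-up helper name.

```shell
#!/bin/sh
# Keep only the mt7622-wmac "Message ... timeout" lines from stdin.
# The pattern is based on the dmesg excerpt above and may need adjusting
# if the driver words the message differently on other builds.
wmac_timeouts() {
    grep -E 'mt7622-wmac .*Message [0-9a-f]+ \(seq [0-9]+\) timeout'
}

# Usage on the router (not run here):
#   dmesg | wmac_timeouts
#   dmesg | wmac_timeouts | grep -c timeout   # how many timeouts so far
```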
When you get the "no connectivity" issue, can you check the kernel logs and the iw event logs to see whether it is either of the two cases above?
I have a single client (iPhone 11) that gets dropped. The WiFi connectivity symbol on the phone disappears and the LTE symbol is displayed. There is a corresponding event in the logread output showing these lines:
Sun Feb 27 08:02:45 2022 daemon.notice hostapd: wlan1-1: AP-STA-DISCONNECTED xx:xx:xx
Sun Feb 27 08:02:45 2022 daemon.info hostapd: wlan1-1: STA xx:xx:xx IEEE 802.11: disassociated
Sun Feb 27 08:02:47 2022 daemon.info hostapd: wlan1-1: STA xx:xx:xx IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
I did not reply because I thought this thread was about setups with fast roaming, so I started a new thread at the time. Seeing this from @nkef makes me think my issue and the issues here are related:
[167014.660438] device wlan0 left promiscuous mode
[167014.665114] br-lan: port 5(wlan0) entered disabled state
[167014.832227] mt7622-wmac 18000000.wmac: Message 000025ed (seq 13) timeout
[167014.839140] mt7622-wmac 18000000.wmac: Message 00002aed (seq 14) timeout
[167015.305356] br-lan: port 5(wlan0) entered blocking state
[167015.310767] br-lan: port 5(wlan0) entered disabled state
[167015.316530] device wlan0 entered promiscuous mode
[167015.321506] br-lan: port 5(wlan0) entered blocking state
[167015.326961] br-lan: port 5(wlan0) entered forwarding state
[167015.721544] br-lan: port 5(wlan0) entered disabled state
[167024.735905] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[167024.742584] br-lan: port 5(wlan0) entered blocking state
[167024.747992] br-lan: port 5(wlan0) entered forwarding state
[185339.514787] mt7530 mdio-bus:00 lan3: Link is Down
[185339.519722] br-lan: port 3(lan3) entered disabled state
[185355.117529] mt7530 mdio-bus:00 lan3: Link is Up - 100Mbps/Full - flow control rx/tx
[185355.125377] br-lan: port 3(lan3) entered blocking state
[185355.130700] br-lan: port 3(lan3) entered forwarding state
[185358.234701] mt7530 mdio-bus:00 lan3: Link is Down
[185358.240108] br-lan: port 3(lan3) entered disabled state
[185360.318329] mt7530 mdio-bus:00 lan3: Link is Up - 100Mbps/Full - flow control rx/tx
[185360.326188] br-lan: port 3(lan3) entered blocking state
[185360.331512] br-lan: port 3(lan3) entered forwarding state
[219495.827558] br-lan: port 3(lan3) entered disabled state
[219495.834313] mt7530 mdio-bus:00 lan3: Link is Down
The "no connectivity" issue has not occurred again, but not enough days have passed.
Could you also try applying those to both wifi interfaces to see if you still get disconnected?
I applied them via uci:
uci set wireless.wifinet0.disassoc_low_ack='0'
uci set wireless.wifinet1.disassoc_low_ack='0'
uci set wireless.wifinet0.max_inactivity='900'
uci set wireless.wifinet1.max_inactivity='900'
uci set wireless.wifinet0.skip_inactivity_poll='1'
uci set wireless.wifinet1.skip_inactivity_poll='1'
uci commit wireless
wifi
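The same settings can be generated for any number of interfaces with a small loop. This is a sketch only: `emit_inactivity_settings` is a made-up helper, and wifinet0/wifinet1 are the section names from my config, so yours may differ (check `uci show wireless`). It only prints the commands, so you can review them before applying.

```shell
#!/bin/sh
# Print the uci commands that set the three inactivity-related options
# for every wifi-iface section name passed as an argument.
# Nothing is changed here; the output has to be piped to sh to apply it.
emit_inactivity_settings() {
    for iface in "$@"; do
        printf "uci set wireless.%s.disassoc_low_ack='0'\n" "$iface"
        printf "uci set wireless.%s.max_inactivity='900'\n" "$iface"
        printf "uci set wireless.%s.skip_inactivity_poll='1'\n" "$iface"
    done
    printf 'uci commit wireless\n'
    printf 'wifi\n'
}

# Usage on the router (not run here):
#   emit_inactivity_settings wifinet0 wifinet1        # review the commands
#   emit_inactivity_settings wifinet0 wifinet1 | sh   # apply them
```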
I am on OpenWrt SNAPSHOT r18777-1847382456 and have not stumbled on that issue yet. If I understand correctly, wlan0 went completely down ...
I also got a few "unknown event 139" entries, but wifi did not go down.
Which snapshot are you using?
I have not tested the 5G band extensively; occasionally some of my phones connect to 5G, but not for long.
For both bands I had all sorts of issues until I applied the disconnected-due-to-inactivity fix, as I said before. I have been testing it for a few days now, and my Android S8 and S9 phones have always been able to connect to wifi successfully, mostly on the 2.4G band.
I have narrowed the problem down to support for HE160 mode in the CH36-64 band. When using HE80 everything works fine. I really don't know when it broke, but HE160 mode (160MHz channel) was definitely supported properly in the past: it was tested and validated in SNAPSHOT r18646-3869ccbcc8 but fails in SNAPSHOT r19040-247eaa4416.
Well, I got hit by the "no connectivity" issue once again, after a couple of days this time, similar to @ilshatms.
From dmesg:
[543156.909270] br-lan: port 1(lan1) entered disabled state
[543156.922398] mt7530 mdio-bus:00 lan1: Link is Down
[543179.791306] mt7530 mdio-bus:00 lan1: Link is Up - 100Mbps/Full - flow control off
[543179.798928] br-lan: port 1(lan1) entered blocking state
[543179.804235] br-lan: port 1(lan1) entered forwarding state
[543189.158624] br-lan: port 1(lan1) entered disabled state
[543189.164827] mt7530 mdio-bus:00 lan1: Link is Down
[543191.231516] mt7530 mdio-bus:00 lan1: Link is Up - 100Mbps/Full - flow control rx/tx
[543191.239306] br-lan: port 1(lan1) entered blocking state
[543191.244615] br-lan: port 1(lan1) entered forwarding state
From logread:
Mon Mar 7 05:34:16 2022 daemon.debug hostapd: wlan0: STA 08:c5:e1:61:0d:b0 IEEE 802.11: binding station to interface 'wlan0'
Mon Mar 7 05:34:16 2022 daemon.debug hostapd: wlan0: STA 08:c5:e1:61:0d:b0 IEEE 802.11: authentication OK (FT)
Mon Mar 7 05:34:16 2022 daemon.debug hostapd: wlan0: STA 08:c5:e1:61:0d:b0 MLME: MLME-AUTHENTICATE.indication(08:c5:e1:61:0d:b0, FT)
Mon Mar 7 05:34:16 2022 daemon.debug hostapd: wlan0: STA 08:c5:e1:61:0d:b0 IEEE 802.11: association OK (aid 4)
Mon Mar 7 05:34:16 2022 daemon.info hostapd: wlan0: STA 08:c5:e1:61:0d:b0 IEEE 802.11: associated (aid 4)
Mon Mar 7 05:34:16 2022 daemon.notice hostapd: wlan0: AP-STA-CONNECTED 08:c5:e1:61:0d:b0
Mon Mar 7 05:34:16 2022 daemon.debug hostapd: wlan0: STA 08:c5:e1:61:0d:b0 MLME: MLME-REASSOCIATE.indication(08:c5:e1:61:0d:b0)
Mon Mar 7 05:34:16 2022 daemon.debug hostapd: wlan0: STA 08:c5:e1:61:0d:b0 IEEE 802.11: binding station to interface 'wlan0'
Mon Mar 7 05:34:16 2022 daemon.debug hostapd: wlan0: STA 08:c5:e1:61:0d:b0 WPA: event 6 notification
Mon Mar 7 05:34:16 2022 daemon.notice hostapd: wlan1: Prune association for 08:c5:e1:61:0d:b0
Mon Mar 7 05:34:16 2022 daemon.notice hostapd: wlan1: AP-STA-DISCONNECTED 08:c5:e1:61:0d:b0
Mon Mar 7 05:34:16 2022 daemon.debug hostapd: wlan0: STA 08:c5:e1:61:0d:b0 WPA: FT authentication already completed - do not start 4-way handshake
Mon Mar 7 05:34:18 2022 daemon.debug hostapd: wlan1: STA 08:c5:e1:61:0d:b0 MLME: MLME-DISASSOCIATE.indication(08:c5:e1:61:0d:b0, 1)
Mon Mar 7 05:34:18 2022 daemon.debug hostapd: wlan1: STA 08:c5:e1:61:0d:b0 MLME: MLME-DELETEKEYS.request(08:c5:e1:61:0d:b0)
Mon Mar 7 05:34:46 2022 daemon.info hostapd: wlan1: STA 08:c5:e1:61:0d:b0 IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Mon Mar 7 05:34:46 2022 daemon.debug hostapd: wlan1: STA 08:c5:e1:61:0d:b0 MLME: MLME-DEAUTHENTICATE.indication(08:c5:e1:61:0d:b0, 2)
Mon Mar 7 05:34:46 2022 daemon.debug hostapd: wlan1: STA 08:c5:e1:61:0d:b0 MLME: MLME-DELETEKEYS.request(08:c5:e1:61:0d:b0)
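To see how often this happens per client, the inactivity deauths can be counted from logread. A sketch, assuming the hostapd line layout shown above (interface name right before "STA", MAC address right after it); `inactivity_deauths` is a made-up helper name.

```shell
#!/bin/sh
# Count "deauthenticated due to inactivity" lines per interface and station.
# Reads logread-style lines on stdin and prints "<count> <iface>: <mac>".
# The field before "STA" is taken as the interface, the one after as the MAC.
inactivity_deauths() {
    awk '/deauthenticated due to inactivity/ {
        for (i = 2; i < NF; i++) {
            if ($i == "STA") {
                count[$(i - 1) " " $(i + 1)]++
                break
            }
        }
    }
    END { for (k in count) print count[k], k }'
}

# Usage on the router (not run here):
#   logread | inactivity_deauths | sort -rn
```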
I checked whether the phone was still connected with:
iw dev wlan0 station dump
It was still connected (unfortunately I forgot to copy the output).
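For next time, a short filter can boil the station dump down to each client's inactive time, which is easier to paste here. A sketch, assuming the usual "Station <mac> (on <ifname>)" and "inactive time: <n> ms" lines in the dump; `station_inactive` is a made-up helper name.

```shell
#!/bin/sh
# Reduce `iw dev <ifname> station dump` output (on stdin) to one line per
# station: "<mac> <inactive time> ms".
# Assumes each station block starts with "Station <mac> ..." and contains
# an "inactive time:" line.
station_inactive() {
    awk '$1 == "Station" { mac = $2 }
         $1 == "inactive" && $2 == "time:" { print mac, $3, $4 }'
}

# Usage on the router (not run here):
#   iw dev wlan0 station dump | station_inactive
```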
And I got connectivity back without needing to restart wifi or reboot the router. We may need to apply one of the workaround scripts suggested in those forum posts to trigger the scan every time an event 60 occurs.
I tried the same solution and yes, it works, but because the unknown event 60 occurs every 30 seconds to 3 minutes, the script runs iw dev wlan0 scan trigger freq 2447 flush every 30ish seconds.
I'm not sure what the performance implications of the above are, but I just wanted to highlight that.
BTW this is just FYI, the ath9k-watchdog.sh script only triggers the scan when 2.4GHz encounters event 60, but I have more event 60s on 5GHz & it works w/o a problem.
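If scanning every 30ish seconds turns out to be a concern, one option is to rate-limit the trigger. The sketch below is not the ath9k-watchdog.sh script; it is a minimal stand-in that reads iw event output and runs the scan at most once per interval. SCAN_CMD and MIN_INTERVAL are made-up knobs, and the 300 second default is an arbitrary guess, not a tested value.

```shell
#!/bin/sh
# Run a scan when "unknown event 60" shows up on stdin, but no more often
# than once every MIN_INTERVAL seconds.
# SCAN_CMD and MIN_INTERVAL are hypothetical settings for this sketch.
: "${SCAN_CMD:=iw dev wlan0 scan trigger freq 2447 flush}"
: "${MIN_INTERVAL:=300}"

watch_event60() {
    last=0
    while IFS= read -r line; do
        case "$line" in
            *"unknown event 60"|*"unknown event 60 "*)
                now=$(date +%s)
                # Skip the scan if the previous one was too recent.
                if [ $((now - last)) -ge "$MIN_INTERVAL" ]; then
                    $SCAN_CMD
                    last=$now
                fi
                ;;
        esac
    done
}

# Usage on the router (not run here):
#   iw event | watch_event60
```

The trade-off is that a longer interval means fewer scans but possibly a longer wait before connectivity comes back, so the value would need some experimenting.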
@Lynx - I disabled the cell density coverage setting (option cell_density '0') and she hasn't complained to me about her phone dropping since I did that... but if I look in my log, I see tons of entries indicating that the drops still occur to this day.
It's not just the mac address corresponding to her iPhone X, my 13 is present there as well as an iPad too.
Below, xx:xx:xx:xx:xx:xx is the MAC address of the iPhone X, for which this happened dozens of times. I just show one representative example:
I think there may be separate issues here. I have 3x RT3200s connected via WDS (one guest WiFi WDS AP on 2.4 and another normal WiFi WDS on 5) and I have FT. All works fine for all devices save for the iPhone. So I wonder if there may be something broken with your config?
@darksky are you seeing this only for Apple type devices?