Random disassociation due to inactivity

I believe the recent hostapd may have introduced a bug or has some race in it. I get these random disassociations for clients that are active and just authenticated. What's worse is the client doesn't try to associate again until I toggle its wifi off/on. The log looks a bit strange to me:

...
Mon Mar 15 19:18:08 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.1X: authorizing port
Mon Mar 15 19:18:08 2021 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx RADIUS: starting accounting session ***
Mon Mar 15 19:18:08 2021 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx WPA: pairwise key handshake completed (RSN)

^^^ normal, the client keeps on going for many hours

-- this is the next entry in the log for this client:
Tue Mar 16 09:17:38 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.11: authentication OK (open system)
Tue Mar 16 09:17:38 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx WPA: event 0 notification
Tue Mar 16 09:17:38 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx MLME: MLME-AUTHENTICATE.indication(xx:xx:xx:xx:xx:xx, OPEN_SYSTEM)
Tue Mar 16 09:17:38 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx MLME: MLME-DELETEKEYS.request(xx:xx:xx:xx:xx:xx)
Tue Mar 16 09:17:38 2021 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.11: authenticated
Tue Mar 16 09:17:38 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.11: association OK (aid 2)
Tue Mar 16 09:17:38 2021 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.11: associated (aid 2)
Tue Mar 16 09:17:38 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx MLME: MLME-REASSOCIATE.indication(xx:xx:xx:xx:xx:xx)
Tue Mar 16 09:17:38 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx MLME: MLME-DELETEKEYS.request(xx:xx:xx:xx:xx:xx)
Tue Mar 16 09:17:38 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.11: binding station to interface 'wlan0'
Tue Mar 16 09:17:38 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx WPA: event 1 notification
Tue Mar 16 09:17:38 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx WPA: sending 1/4 msg of 4-Way Handshake
Tue Mar 16 09:17:38 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx WPA: received EAPOL-Key frame (2/4 Pairwise)
Tue Mar 16 09:17:38 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx WPA: sending 3/4 msg of 4-Way Handshake
Tue Mar 16 09:17:38 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx WPA: received EAPOL-Key frame (4/4 Pairwise)
Tue Mar 16 09:17:38 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.1X: authorizing port
Tue Mar 16 09:17:38 2021 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx RADIUS: starting accounting session ***
Tue Mar 16 09:17:38 2021 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx WPA: pairwise key handshake completed (RSN)

-- 2 seconds later, event 2 and disassociated for inactivity. Why?

Tue Mar 16 09:17:40 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx WPA: event 2 notification
Tue Mar 16 09:17:40 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.1X: unauthorizing port
Tue Mar 16 09:17:40 2021 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.11: disassociated
Tue Mar 16 09:17:40 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx MLME: MLME-DISASSOCIATE.indication(xx:xx:xx:xx:xx:xx, 8)
Tue Mar 16 09:17:40 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx MLME: MLME-DELETEKEYS.request(xx:xx:xx:xx:xx:xx)
Tue Mar 16 09:17:41 2021 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Tue Mar 16 09:17:41 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx MLME: MLME-DEAUTHENTICATE.indication(xx:xx:xx:xx:xx:xx, 2)
Tue Mar 16 09:17:41 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx MLME: MLME-DELETEKEYS.request(xx:xx:xx:xx:xx:xx)

What is an event 2?

I wonder whether this will fix it and whether it will be merged to the 21.02 branch.

I set option disassoc_low_ack '0' and I no longer see IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE) but the client still disconnects 2 seconds after it connects:

Sun Mar 21 10:39:02 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.1X: authorizing port
Sun Mar 21 10:39:02 2021 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx RADIUS: starting accounting session ...
Sun Mar 21 10:39:02 2021 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx WPA: pairwise key handshake completed (RSN)
--- 2 seconds later ---
Sun Mar 21 10:39:04 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx WPA: event 2 notification
Sun Mar 21 10:39:04 2021 daemon.debug hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.1X: unauthorizing port
Sun Mar 21 10:39:04 2021 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.11: disassociated

This feels like some timing error/bug somewhere.

What's interesting is that there's only 1 client out of about 15 that gets this WPA: event 2 notification and quick disconnect behavior. It is frustrating because it's my main laptop, a 2018 MacBook Pro.

When you are talking about "recent hostapd", is it on snapshot, or 21.02, or 19.07.7 ?

And btw, I have read that on master there would only be one hostapd for 2.4Ghz and 5Ghz radios. Is that true, and is it also part of 21.02 ?

Master and 21.02, the backports happened before the branching to 21.02

I keep hoping one of these will fix it ...

I have similar issue with upstream ath10k driver and firmware using an iPhone 11.

Sat Apr  3 17:13:35 2021 daemon.notice hostapd: wlan0: AP-STA-DISCONNECTED xx:xx:xx:xx:xx:xx
Sat Apr  3 17:13:35 2021 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.11: disassociated
Sat Apr  3 17:13:36 2021 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Sat Apr  3 17:13:50 2021 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.11: associated (aid 2)
Sat Apr  3 17:13:50 2021 daemon.notice hostapd: wlan0: AP-STA-CONNECTED xx:xx:xx:xx:xx:xx
Sat Apr  3 17:13:50 2021 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx WPA: pairwise key handshake completed (RSN)
Sat Apr  3 17:13:51 2021 daemon.info dnsmasq-dhcp[2676]: DHCPREQUEST(br-lan) 192.168.1.237 xx:xx:xx:xx:xx:xx
Sat Apr  3 17:13:51 2021 daemon.info dnsmasq-dhcp[2676]: DHCPACK(br-lan) 192.168.1.237 xx:xx:xx:xx:xx:xx Castiels-iPhone
Sat Apr  3 17:15:02 2021 daemon.notice hostapd: wlan0: AP-STA-DISCONNECTED xx:xx:xx:xx:xx:xx
Sat Apr  3 17:15:02 2021 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.11: disassociated
Sat Apr  3 17:15:03 2021 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)

can you post your hostapd-ph*.conf from /tmp/run ?

I found an event 2 here:
http://lists.shmoo.com/pipermail/hostap/2015-March/032357.html

Here's a redacted version of the 5GHz setup (/tmp/run/hostapd-phy0.conf). This is the one I really care about, the 2.4GHz is for some IoT devices and don't mind if they get disconnected occasionally as long as they get back on.

I use 3 SSIDs, each on its own VLAN, hence the multiple entries. My R7800 runs in plain AP mode, it doesn't do anything else, there's another wired router that does everything else.

driver=nl80211
logger_syslog=127
logger_syslog_level=2
logger_stdout=127
logger_stdout_level=2
country_code=US
ieee80211d=1
ieee80211h=1
hw_mode=a
beacon_int=100
dtim_period=2
channel=149
chanlist=149
noscan=1
tx_queue_data2_burst=2.0
ieee80211n=1
ht_coex=0
ht_capab=[HT40+][LDPC][SHORT-GI-20][SHORT-GI-40][TX-STBC][RX-STBC1][MAX-AMSDU-7935][DSSS_CCK-40]
ieee80211ac=1
vht_oper_chwidth=1
vht_oper_centr_freq_seg0_idx=155
vht_capab=[RXLDPC][SHORT-GI-80][SHORT-GI-160][TX-STBC-2BY1][SU-BEAMFORMER][SU-BEAMFORMEE][MU-BEAMFORMER][MU-BEAMFORMEE][RX-ANTENNA-PATTERN][TX-ANTENNA-PATTERN][RX-STBC-1][VHT160-80PLUS80][MAX-MPDU-11454][MAX-A-MPDU-LEN-EXP7]

radio_config_id=f01dee1ba2a0f2259080aa3d2cfe6433
interface=wlan0
ctrl_interface=/var/run/hostapd
ap_isolate=1
bss_load_update_period=60
chan_util_avg_period=600
disassoc_low_ack=0
skip_inactivity_poll=0
preamble=1
wmm_enabled=1
ignore_broadcast_ssid=0
uapsd_advertisement_enabled=1
utf8_ssid=1
multi_ap=0
wpa_passphrase=...
wpa_psk_file=/var/run/hostapd-wlan0.psk
auth_algs=1
wpa=2
wpa_pairwise=CCMP
ssid=wifi-5GHz
bridge=br-lan
wpa_disable_eapol_key_retries=0
wpa_key_mgmt=WPA-PSK
okc=0
disable_pmksa_caching=1
dynamic_vlan=0
vlan_naming=1
vlan_file=/var/run/hostapd-wlan0.vlan
config_id=e706b3286e1a4b551594786f96f6b85f
bssid=78:d2:...


bss=wlan0-1
ctrl_interface=/var/run/hostapd
ap_isolate=1
bss_load_update_period=60
chan_util_avg_period=600
disassoc_low_ack=0
skip_inactivity_poll=0
preamble=1
wmm_enabled=1
ignore_broadcast_ssid=0
uapsd_advertisement_enabled=1
utf8_ssid=1
multi_ap=0
wpa_passphrase=...
wpa_psk_file=/var/run/hostapd-wlan0-1.psk
auth_algs=1
wpa=2
wpa_pairwise=CCMP
ssid=wifi-guest
bridge=br-guest
wpa_disable_eapol_key_retries=0
wpa_key_mgmt=WPA-PSK
okc=0
disable_pmksa_caching=1
dynamic_vlan=0
vlan_naming=1
vlan_file=/var/run/hostapd-wlan0-1.vlan
config_id=97d85dc6422ba21b71abdc80352bc851
bssid=7a:d2:...


bss=wlan0-2
ctrl_interface=/var/run/hostapd
ap_isolate=1
bss_load_update_period=60
chan_util_avg_period=600
disassoc_low_ack=0
skip_inactivity_poll=0
preamble=1
wmm_enabled=1
ignore_broadcast_ssid=0
uapsd_advertisement_enabled=1
utf8_ssid=1
multi_ap=0
wpa_passphrase=...
wpa_psk_file=/var/run/hostapd-wlan0-2.psk
auth_algs=1
wpa=2
wpa_pairwise=CCMP
ssid=wifi-work
bridge=br-work
wpa_disable_eapol_key_retries=0
wpa_key_mgmt=WPA-PSK
okc=0
disable_pmksa_caching=1
dynamic_vlan=0
vlan_naming=1
vlan_file=/var/run/hostapd-wlan0-2.vlan
config_id=170b2251502d70556eb1144f644a21a7
bssid=7e:d2:...

I do not find this event 2. If it is defined in https://w1.fi/wpa_supplicant/devel/driver_8h_source.html, then it could be EVENT_MICHAEL_MIC_FAILURE, but this is just guessing, and I do not detect any error message parts that fit to your log. Unfortunately I do not know the correct source files.
What you could do is to increase your hostapd log level to verbose_debug as described in https://serverfault.com/questions/685199/monitor-wifi-association-attempts
and provide the section between authentication and disconnection that belongs to this STA.

1 Like

If event 2 = EVENT_MICHAEL_MIC_FAILURE, then it's kinda strange, as I read it that applies to TKIP, but I'm using psk2+ccmp. Though I do have 802.11r Fast Transition enabled with local PSK, maybe there is a bug somewhere and things get confused. This event 2 error shows up for a Mac client and from what I can gather, macOS doesn't support 802.11r and this client is in the same spot, it should not even try to roam.

Not exactly sure what this means though and how it may confuse things:

macOS supports static PMKID (Pairwise Master Key identifier) caching to help optimize roaming between BSSIDs in the same ESSID. macOS doesn't support Fast BSS Transition, also known as 802.11r.