Debugging Wi-Fi disconnections for a Windows machine

Hello,

I have recently moved to a new place, and gotten some additional hardware for Wi-Fi coverage throughout the house. My setup is relatively straightforward: a Netgear R7800 as the main router & DHCP server, then two TP-Link Archer C6 v3.2s acting as 'dumb' access points/managed switches. These are all on the same subnet, and I do not have any fancy VLAN/VPN/firewall business.

I have flashed the latest OpenWRT versions on these devices, and it seems to be working relatively okay for most of my devices (2 android phones, a television connected by cable, and a laptop running Linux). However, my partner's Windows machine seems to run into WiFi trouble (cable works fine). At seemingly random moments, the machine reports that it is connected to WiFi, but not to the internet. This usually resolves after a couple minutes have passed. I'd like to fix it, but am not sure how to proceed in debugging.

I've looked at the logs of the access point, where I can see the disconnect happening. However, the device successfully reconnects to the WiFi immediately, so I am not sure why it reports no internet connectivity. Running ipconfig shows a local IPv4 address, although its global IPv6 address is missing. pinging does not work, also not for local addresses.

A snippet of such a disconnect happening:

Wed Dec 28 12:00:14 2022 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED [mac redacted]
Wed Dec 28 12:00:14 2022 daemon.info hostapd: phy1-ap0: STA [mac redacted] IEEE 802.11: authenticated
Wed Dec 28 12:00:14 2022 daemon.info hostapd: phy1-ap0: STA [mac redacted] IEEE 802.11: associated (aid 1)
Wed Dec 28 12:00:14 2022 daemon.notice hostapd: phy1-ap0: STA-OPMODE-SMPS-MODE-CHANGED [mac redacted] off
Wed Dec 28 12:00:14 2022 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED [mac redacted] auth_alg=open
Wed Dec 28 12:00:14 2022 daemon.info hostapd: phy1-ap0: STA [mac redacted] RADIUS: starting accounting session 5AB6FC4A7C292C6A
Wed Dec 28 12:00:14 2022 daemon.info hostapd: phy1-ap0: STA [mac redacted] WPA: pairwise key handshake completed (RSN)
Wed Dec 28 12:00:14 2022 daemon.notice hostapd: phy1-ap0: EAPOL-4WAY-HS-COMPLETED [mac redacted]
Wed Dec 28 12:00:20 2022 daemon.notice hostapd: phy1-ap0: BEACON-REQ-TX-STATUS [mac redacted] 185 ack=1

Let me know if more information on the configuration is wanted. I've mostly used LuCi to do the configuration. I have also installed DAWN on all devices, but the disconnects are not preceded by anything dawn related in the logs as far as I can tell.

Is there any activity on the other AP ?
Possibly trying to switch, failing, and returning to the first one?

Also, what’s the debug level set at?

Also, starting hostapd manually gives you a lot more info to trace issues.

Nope, there is no activity on the other switches when this happens -- I guess they are out of range. The only other activity I see in the log is in that of the DHCP server, which looks like a regular renewal after the reconnect occurs.

I haven't touched the debug levels. I'll try increasing them and will post back later if this gets me more information.

Alright, I increased the hostapd debug level from info=2 to debug=1 and waited for another such disconnection. It looks like this in the log:

Fri Jan  6 11:19:24 2023 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED [mac redacted]
Fri Jan  6 11:19:24 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] WPA: event 3 notification
Fri Jan  6 11:19:24 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] IEEE 802.1X: unauthorizing port
Fri Jan  6 11:19:24 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] IEEE 802.11: deauthenticated
Fri Jan  6 11:19:24 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] MLME: MLME-DEAUTHENTICATE.indication([mac redacted], 1)
Fri Jan  6 11:19:24 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] MLME: MLME-DELETEKEYS.request([mac redacted])
Fri Jan  6 11:19:25 2023 daemon.notice hostapd: phy0-ap0: BEACON-REQ-TX-STATUS [unrelated-mac-redacted] 79 ack=1
Fri Jan  6 11:19:28 2023 daemon.notice hostapd: phy0-ap0: BEACON-REQ-TX-STATUS [unrelated-mac-redacted] 80 ack=1
Fri Jan  6 11:19:31 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] IEEE 802.11: authentication OK (open system)
Fri Jan  6 11:19:31 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] MLME: MLME-AUTHENTICATE.indication([mac redacted], OPEN_SYSTEM)
Fri Jan  6 11:19:31 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] MLME: MLME-DELETEKEYS.request([mac redacted])
Fri Jan  6 11:19:31 2023 daemon.info hostapd: phy1-ap0: STA [mac redacted] IEEE 802.11: authenticated
Fri Jan  6 11:19:31 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] IEEE 802.11: association OK (aid 1)
Fri Jan  6 11:19:31 2023 daemon.info hostapd: phy1-ap0: STA [mac redacted] IEEE 802.11: associated (aid 1)
Fri Jan  6 11:19:31 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] MLME: MLME-ASSOCIATE.indication([mac redacted])
Fri Jan  6 11:19:31 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] MLME: MLME-DELETEKEYS.request([mac redacted])
Fri Jan  6 11:19:31 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] IEEE 802.11: binding station to interface 'phy1-ap0'
Fri Jan  6 11:19:31 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] WPA: event 1 notification
Fri Jan  6 11:19:31 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] WPA: start authentication
Fri Jan  6 11:19:31 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] IEEE 802.1X: unauthorizing port
Fri Jan  6 11:19:31 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] WPA: sending 1/4 msg of 4-Way Handshake
Fri Jan  6 11:19:31 2023 daemon.notice hostapd: phy1-ap0: STA-OPMODE-SMPS-MODE-CHANGED [mac redacted] off
Fri Jan  6 11:19:31 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] WPA: received EAPOL-Key frame (2/4 Pairwise)
Fri Jan  6 11:19:31 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] WPA: sending 3/4 msg of 4-Way Handshake
Fri Jan  6 11:19:31 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] WPA: received EAPOL-Key frame (4/4 Pairwise)
Fri Jan  6 11:19:31 2023 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED [mac redacted] auth_alg=open
Fri Jan  6 11:19:31 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] IEEE 802.1X: authorizing port
Fri Jan  6 11:19:31 2023 daemon.info hostapd: phy1-ap0: STA [mac redacted] RADIUS: starting accounting session 2D404439E95CD44B
Fri Jan  6 11:19:31 2023 daemon.info hostapd: phy1-ap0: STA [mac redacted] WPA: pairwise key handshake completed (RSN)
Fri Jan  6 11:19:31 2023 daemon.notice hostapd: phy1-ap0: EAPOL-4WAY-HS-COMPLETED [mac redacted]
Fri Jan  6 11:19:32 2023 daemon.notice hostapd: phy0-ap0: BEACON-REQ-TX-STATUS [unrelated-mac-redacted] 81 ack=1
Fri Jan  6 11:19:35 2023 daemon.notice hostapd: phy1-ap0: BEACON-REQ-TX-STATUS [mac redacted] 159 ack=1

???

Can you provide details?

I've read your thread a few times, I only see RADIUS noted in the log - you never mention this.

You mean the 2 TP's?

Hey, thanks for your response!

I haven't mentioned RADIUS because I haven't touched or configured RADIUS in any way. The wifi network has a regular password, no need to enter credentials or anything.
I'm not sure why it occurs in the log, that is curious.

Yes, I have installed DAWN on both the TP-Links, and on the R7800.

Taking a different angle, I've tried to track down where in the source of hostapd these log entries come from, in the hope of learning more. I found out that this line:

Fri Jan  6 11:19:24 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] MLME: MLME-DEAUTHENTICATE.indication([mac redacted], 1)

indicates that the client sends a 802.11 deauthentication frame to the access point, with reason code 1 = Unspecified reason. This differs from when I manually disconnect from the wifi with e.g. my phone, that will show reason code 3 = station is leaving. Not sure how to do something with this information, though.

…running hostapd manually gives you a lot more insight, especially when navigating it’s source code.

From the log it seems key share is ok (aid 1) but the transition somehow fails…