I have recently moved to a new place, and gotten some additional hardware for Wi-Fi coverage throughout the house. My setup is relatively straightforward: a Netgear R7800 as the main router & DHCP server, then two TP-Link Archer C6 v3.2s acting as 'dumb' access points/managed switches. These are all on the same subnet, and I do not have any fancy VLAN/VPN/firewall business.
I have flashed the latest OpenWRT versions on these devices, and it seems to be working relatively okay for most of my devices (2 android phones, a television connected by cable, and a laptop running Linux). However, my partner's Windows machine seems to run into WiFi trouble (cable works fine). At seemingly random moments, the machine reports that it is connected to WiFi, but not to the internet. This usually resolves after a couple minutes have passed. I'd like to fix it, but am not sure how to proceed in debugging.
I've looked at the logs of the access point, where I can see the disconnect happening. However, the device successfully reconnects to the WiFi immediately, so I am not sure why it reports no internet connectivity. Running ipconfig shows a local IPv4 address, although its global IPv6 address is missing. pinging does not work, also not for local addresses.
A snippet of such a disconnect happening:
Wed Dec 28 12:00:14 2022 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED [mac redacted]
Wed Dec 28 12:00:14 2022 daemon.info hostapd: phy1-ap0: STA [mac redacted] IEEE 802.11: authenticated
Wed Dec 28 12:00:14 2022 daemon.info hostapd: phy1-ap0: STA [mac redacted] IEEE 802.11: associated (aid 1)
Wed Dec 28 12:00:14 2022 daemon.notice hostapd: phy1-ap0: STA-OPMODE-SMPS-MODE-CHANGED [mac redacted] off
Wed Dec 28 12:00:14 2022 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED [mac redacted] auth_alg=open
Wed Dec 28 12:00:14 2022 daemon.info hostapd: phy1-ap0: STA [mac redacted] RADIUS: starting accounting session 5AB6FC4A7C292C6A
Wed Dec 28 12:00:14 2022 daemon.info hostapd: phy1-ap0: STA [mac redacted] WPA: pairwise key handshake completed (RSN)
Wed Dec 28 12:00:14 2022 daemon.notice hostapd: phy1-ap0: EAPOL-4WAY-HS-COMPLETED [mac redacted]
Wed Dec 28 12:00:20 2022 daemon.notice hostapd: phy1-ap0: BEACON-REQ-TX-STATUS [mac redacted] 185 ack=1
Let me know if more information on the configuration is wanted. I've mostly used LuCi to do the configuration. I have also installed DAWN on all devices, but the disconnects are not preceded by anything dawn related in the logs as far as I can tell.
Nope, there is no activity on the other switches when this happens -- I guess they are out of range. The only other activity I see in the log is in that of the DHCP server, which looks like a regular renewal after the reconnect occurs.
I haven't touched the debug levels. I'll try increasing them and will post back later if this gets me more information.
I haven't mentioned RADIUS because I haven't touched or configured RADIUS in any way. The wifi network has a regular password, no need to enter credentials or anything.
I'm not sure why it occurs in the log, that is curious.
Yes, I have installed DAWN on both the TP-Links, and on the R7800.
Taking a different angle, I've tried to track down where in the source of hostapd these log entries come from, in the hope of learning more. I found out that this line:
Fri Jan 6 11:19:24 2023 daemon.debug hostapd: phy1-ap0: STA [mac redacted] MLME: MLME-DEAUTHENTICATE.indication([mac redacted], 1)
indicates that the client sends a 802.11 deauthentication frame to the access point, with reason code 1 = Unspecified reason. This differs from when I manually disconnect from the wifi with e.g. my phone, that will show reason code 3 = station is leaving. Not sure how to do something with this information, though.
Sorry for the radio silence: I had not much to go on, and the problem appeared less frequently. Unfortunately, I also started noticing connection problems on my Linux laptop.
My laptop appeared to disconnect randomly, but never when I was using it. If I left the laptop running unattended, it would disconnect, but once I started using it again, it would reconnect. I initially thought this might be some power saving thing, but I have since ruled that out. I can now reproduce the following reliably: My laptop will lose its connection within 2-6 minutes when it is the only WiFi device connected to the access point. My laptop disconnects when I leave because I take my phone with me, which disconnects from the access point. My laptop reconnects when I return, because my phone reconnects to the access point. If left the phone and laptop unattended in the same room, the connection stays stable.
By losing connection I mean: I am unable to ping any address from my laptop, even local addresses: I get a 'Destination Host Unreachable' error for everything. Using the advice given above, I ran hostapd manually with the -dd flag, although it did not give me more information than before. Curiously, hostapd does not see the disconnect, and reports succes when sending the BEACON- messages to my device. Pinging the access point from my laptop does not show anything interesting, except that it does really lose connection: I get Destination Host Unreachable at some point, which repeats until my phone is back in the room. A couple of ping responses arrive very late. Pinging my laptop from the access point is somewhat more interesting: no errors, but alot of the ping messages somehow clog up until the phone is back, then they all return at once. In the log, this shows as a sudden jump in response time.