OK, I'm in the process of a bit of a headscratcher here. After not updating OpenWRT for a while, I rebuilt it about a week ago and flashed my test device. Since then, it is only able to connect to wifi for 10 seconds or so approximately every 2 minutes. From another machine, here's a snippet of ping results:
64 bytes from <redacted>: icmp_seq=1100 ttl=61 time=6.81 ms
64 bytes from <redacted>: icmp_seq=1101 ttl=61 time=2.46 ms
64 bytes from <redacted>: icmp_seq=1102 ttl=61 time=2.29 ms
64 bytes from <redacted>: icmp_seq=1103 ttl=61 time=2.30 ms
64 bytes from <redacted>: icmp_seq=1104 ttl=61 time=2.53 ms
64 bytes from <redacted>: icmp_seq=1105 ttl=61 time=2.38 ms
64 bytes from <redacted>: icmp_seq=1106 ttl=61 time=2.39 ms
64 bytes from <redacted>: icmp_seq=1107 ttl=61 time=2.87 ms
64 bytes from <redacted>: icmp_seq=1108 ttl=61 time=2.66 ms
64 bytes from <redacted>: icmp_seq=1109 ttl=61 time=2.59 ms
64 bytes from <redacted>: icmp_seq=1231 ttl=61 time=120 ms
64 bytes from <redacted>: icmp_seq=1232 ttl=61 time=2.34 ms
64 bytes from <redacted>: icmp_seq=1233 ttl=61 time=2.85 ms
64 bytes from <redacted>: icmp_seq=1234 ttl=61 time=2.76 ms
64 bytes from <redacted>: icmp_seq=1235 ttl=61 time=2.40 ms
64 bytes from <redacted>: icmp_seq=1236 ttl=61 time=2.39 ms
64 bytes from <redacted>: icmp_seq=1237 ttl=61 time=2.12 ms
64 bytes from <redacted>: icmp_seq=1238 ttl=61 time=2.30 ms
64 bytes from <redacted>: icmp_seq=1239 ttl=61 time=2.43 ms
In system.log, I see repeatedly:
Tue May 16 13:33:01 2023 daemon.notice wpa_supplicant[1409]: phy0-sta0: CTRL-EVENT-SSID-REENABLED id=0 ssid="<REDACTED>"
Tue May 16 13:33:01 2023 daemon.notice wpa_supplicant[1409]: BSSID <REDACTED> ignore list count incremented to 3, ignoring for 60 seconds
Tue May 16 13:33:01 2023 daemon.notice wpa_supplicant[1409]: phy0-sta0: Trying to associate with SSID 'Automation'
Tue May 16 13:33:04 2023 daemon.notice wpa_supplicant[1409]: phy0-sta0: Associated with <REDACTED>
Tue May 16 13:33:04 2023 daemon.notice wpa_supplicant[1409]: phy0-sta0: CTRL-EVENT-SUBNET-STATUS-UPDATE status=0
Tue May 16 13:33:14 2023 daemon.notice wpa_supplicant[1409]: phy0-sta0: Authentication with <REDACTED> timed out.
Tue May 16 13:33:14 2023 daemon.notice wpa_supplicant[1409]: BSSID bc:26:c7:fe:4f:40 ignore list count incremented to 4, ignoring for 120 seconds
Tue May 16 13:33:14 2023 daemon.notice wpa_supplicant[1409]: phy0-sta0: CTRL-EVENT-DISCONNECTED bssid=bc:26:c7:fe:4f:40 reason=3 locally_generated=1
Tue May 16 13:33:14 2023 daemon.notice wpa_supplicant[1409]: BSSID <REDACTED> ignore list count incremented to 5, ignoring for 600 seconds
Tue May 16 13:33:14 2023 daemon.notice wpa_supplicant[1409]: phy0-sta0: CTRL-EVENT-SSID-TEMP-DISABLED id=0 ssid="<REDACTED>" auth_failures=12 duration=120 reason=CONN_FAILED
Tue May 16 13:33:14 2023 daemon.notice wpa_supplicant[1409]: phy0-sta0: CTRL-EVENT-DSCP-POLICY clear_all
So here's what is puzzling me: If authentication is timing out, how in the heck is this thing passing traffic?
The interval between "Associated" and "Authentication timed out" is 10 seconds - which is almost exactly the duration for which the interface actually passed traffic.
Other OpenWRT devices running an older Pi4 or Pi3 build connect to the same network with no issues whatsoever - this appears to be a regression that crept in sometime within the past month or so. (Sadly, I haven't been updating and rebuilding quite as often as I used to. I also made the rather stupid mistake of not backing up the last "known working" factory and sysupgrade images, although worst case I can probably mangle some together by dding from a working device...)
Has anyone seen a scenario where wpa_supplicant disconnects with an "Authentication timed out" error, but the interface passed traffic prior to the timeout, indicating that it was properly authenticated (otherwise it would not have been able to pass traffic)?
The 2 minute duration is consistent with:
Tue May 16 13:33:14 2023 daemon.notice wpa_supplicant[1409]: phy0-sta0: CTRL-EVENT-SSID-TEMP-DISABLED id=0 ssid="<REDACTED>" auth_failures=12 duration=120 reason=CONN_FAILED