Archer C7 2.4 GHz wireless dies in 24~48 hours

@kacheng

I don't know about CA hardware as I just have EU models here. Maybe someone else could answer this.

For the mesh thingy, I'd recommend my writeup from this thread: BATADV (mesh) the best decision?

Personally, I prefer putting the mesh on 5 GHz because I get more bandwidth on the backhaul. For example, my mesh "repeater" bridges the internet from mesh wifi (input) to LAN (output) (input/output not in the strict technical sense), and I use it in the TV area of my living room, so the mesh "pumps" data for the LAN clients there: TV, PlayStation, Chromecast and a guest notebook. (My "base" router is connected to the internet and pushes about 500 Mbit/s over the mesh "bridge".) On 2.4 GHz it would be slower, given the wifi interference from my neighbors.

Basically, your setup looks fine as you described it. For example, I run a lot of bash scripts on one Archer, and for a while it was the most unstable one, until I discovered that heavy "spike" loads could "crash" the wifi drivers, e.g. performing heavy USB mass storage writes and find searches simultaneously. If in doubt, please test the router in dumb access point mode, i.e. without WAN/NAT, routing, firewall, etc., "just bridge LAN to wifi", and see if it gets better.

So, my network has still been terrible with freezes and drop-outs.

I am now back to stock TP-Link firmware, using WDS.
It has been much more stable, which I guess is due to the proprietary drivers?

I DO miss mesh/Wireguard/VPN/ad block/DLNA, though.
A lot.

New Year's greetings,

As I don't need ipv6 for the moment, I removed the following packages and my 2 C7V2 seem to be stable:

opkg remove ip6tables
opkg remove kmod-ip6tables
opkg remove kmod-nf-ipt6
opkg remove kmod-nf-reject6
opkg remove odhcp6c
opkg remove odhcpd-ipv6only

Maybe it is just some magic coincidences.

On OpenWrt 19.07.X, have you tried removing kmod-ath10k-ct and installing kmod-ath10k-ct-smallbuffers? I have an Archer C58 v1 router which crashed every 8-12 h under high 5 GHz network load (several devices using VPN services with MS Teams etc., as there were 3 people in the house working from home). Because the Archer C58 v1 has only 64 MB of RAM, I removed the default kmod-ath10k-ct and replaced it with kmod-ath10k-ct-smallbuffers, and I have had no issues for several days. I carried out the following steps.

opkg update
opkg remove kmod-ath10k-ct
opkg install kmod-ath10k-ct-smallbuffers
reboot

I'm using the non-ct drivers because batman mesh doesn't work with SAE (WPA3) encryption on the ct drivers.

Not too much talk about Ethernet above, which makes me think I actually have the same problem but have been thinking about it wrong.

Is anyone seeing Ethernet go out too, or is that a parallel problem? If anything, it's gotten worse with the recent 19.07 point releases here, and since I can't access the router at all once it happens until it's cold-booted, it's a little difficult to figure out.

It's definitely a router under heavy use, but that's always been true.

Ethernet has always worked flawlessly for me. I thought the same of 2.4 GHz wifi for 5 years, until recently. I used to run the C7 v2 as a dumb AP with two SSIDs backhauled to another router via a VLAN trunk. However, I recently started using the C7 as a Wireguard server for my remote IP camera recording. Although Wireguard traffic runs only over the wired side, 2.4 GHz wifi began freezing every other day or so. It just seems that high CPU usage kills it.

I will try replacing the power supply to rule that one out, at least...

I'm running a GL.iNet AR300M with OpenWrt 21.02rc1, which uses the ath9k driver.

I'm also experiencing the same symptom, where the wifi dies or slows to a crawl under heavy traffic.
I ran iperf over wifi, which causes it to crash within 1-3 hours. I have tried to diagnose the problem and have found a workaround that stops the 2.4 GHz wifi from crashing without needing the wifi down / wifi commands.

For some reason, doing a quick scan stops the wifi from dying and also recovers it should it die. Your clients will stay connected.

Add the following to /etc/rc.local:

while sleep 300; do iw dev $(iwinfo|grep -m 1 -B 2 'Master.*2\.4'|grep ESSID|awk '{print $1}') scan trigger freq 2447 flush >/dev/null 2>&1; done &

I have also monitored the wifi events without the workaround, using

iw event -f -t

and noticed that when things go bad you get event numbers 64 and 84:
64 = notify_cqm
84 = probe_client

1621413328.127835: wlan1 (phy #0): unknown event 60
1621413328.143278: wlan1 (phy #0): unknown event 60
1621413388.122103: wlan1 (phy #0): unknown event 60
1621413388.140276: wlan1 (phy #0): unknown event 60
1621413448.118700: wlan1 (phy #0): unknown event 60
1621413448.137628: wlan1 (phy #0): unknown event 60
1621413508.129351: wlan1 (phy #0): unknown event 60
1621413508.147463: wlan1 (phy #0): unknown event 60
1621413568.114870: wlan1 (phy #0): unknown event 60
1621413568.139285: wlan1 (phy #0): unknown event 60
1621413628.174945: wlan1 (phy #0): unknown event 60
1621413628.182307: wlan1 (phy #0): unknown event 60
1621413688.166942: wlan1 (phy #0): unknown event 60
1621413688.396813: wlan1 (phy #0): unknown event 60
1621413740.243602: wlan0 (phy #0): unknown event 64
1621413749.670027: wlan0 (phy #0): unknown event 64
1621413755.386236: wlan0 (phy #0): unknown event 64
1621413998.245463: wlan1 (phy #0): unknown event 84
1621414001.235986: wlan1 (phy #0): unknown event 60
1621414002.233819: wlan1: del station xxxxxxxxx
1621414002.234499: wlan1 (phy #0): unknown event 60
1621414039.392124: wlan0 (phy #0): unknown event 84
1621414039.845343: wlan1: del station xxxxxxxxxxx
1621414039.845921: wlan1 (phy #0): unknown event 84
1621414039.846183: wlan1 (phy #0): unknown event 60
1621414039.846339: wlan1 (phy #0): unknown event 60
1621414040.113765: wlan1: del station xxxxxxxxxxxxx
1621414040.114644: wlan1 (phy #0): unknown event 84
1621414040.114929: wlan1 (phy #0): unknown event 60
1621414040.115079: wlan1 (phy #0): unknown event 60
1621414040.437906: wlan1: del station xxxxxxxxxxxxxxx
1621414040.438638: wlan1 (phy #0): unknown event 84
1621414040.438917: wlan1 (phy #0): unknown event 60
1621414040.439070: wlan1 (phy #0): unknown event 60
1621414042.378414: wlan0 (phy #0): unknown event 60
1621414043.379161: wlan0: del station xxxxxxxxxxxxxxxx
1621414043.379591: wlan0 (phy #0): unknown event 60
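
To summarize a capture like the one above, the lines reporting events 64 and 84 can be tallied per interface. This is only a minimal sketch: the function name count_bad_events is made up for illustration, and it assumes the log lines have the "unknown event N" shape shown in this post (other iw versions may print decoded event names instead).

```shell
#!/bin/sh
# Hypothetical helper: tally "unknown event 64" and "unknown event 84"
# lines per interface from saved `iw event -f -t` output read on stdin.
# Output lines have the form: <interface> <event number> <count>
count_bad_events() {
    awk '
        NF >= 3 && $(NF-2) == "unknown" && $(NF-1) == "event" &&
        ($NF == 64 || $NF == 84) {
            count[$2 " " $NF]++     # $2 is the interface, e.g. "wlan0"
        }
        END { for (k in count) print k, count[k] }
    ' | sort
}

# Example (assuming the capture was saved to a file of your choosing):
# count_bad_events < /tmp/iw-events.log
```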

Maybe, if this works, the scan could be triggered by these events instead of on a fixed time period.
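
A rough sketch of what such an event-driven trigger could look like follows. It is untested on real hardware, and it rests on the assumption that events 64 and 84 reliably precede a stall, which nobody in this thread has confirmed. The names is_trouble_event, watch_events and MIN_GAP are invented for this example.

```shell
#!/bin/sh
# Sketch only: react to iw events instead of a fixed timer.

# Return 0 if an `iw event` line reports event 64 or 84.
is_trouble_event() {
    case "$1" in
        *"unknown event 64"|*"unknown event 84") return 0 ;;
        *) return 1 ;;
    esac
}

# Read event lines on stdin and run the given command when a trouble
# event is seen, at most once per MIN_GAP seconds (default 30).
watch_events() {
    min_gap=${MIN_GAP:-30}
    last=0
    while IFS= read -r line; do
        is_trouble_event "$line" || continue
        now=$(date +%s)
        [ $((now - last)) -ge "$min_gap" ] || continue
        last=$now
        "$@" </dev/null     # keep the command away from the event stream
    done
}

# Example wiring, using the interface and frequency from this thread:
# iw event -f -t | watch_events iw dev wlan1 scan trigger freq 2447 flush
```

The rate limit is there because one stall produces several events in a row, and firing a scan for each of them would probably not help.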

Just a follow-up note: I saw that your command to trigger the scan depends on iwinfo, which is not available by default and requires "opkg update; opkg install iwinfo".

I've found a command to do the same without requiring iwinfo.

iw dev $(iw dev|grep "Interface\|channel"|grep -B 2 'channel.*24..'|tail -n 2|grep Interface|awk '{print $2}') scan trigger freq 2447 flush

which resolves to:

iw dev wlan1 scan trigger freq 2447 flush

After triggering the scan with the command, I noticed the following event IDs come up:

1621509124.638080: wlan1 (phy #1): unknown event 33
1621509124.713185: wlan1 (phy #1): unknown event 34

ref # State of TP-Link Archer C7v2|v5 in 2021 - #18 by hecz0r

Trying to stress my AP at the moment ... got event 84 (probe_client) from phy0 (5 GHz), but everything still seems to work okay.

I'll watch whether I can trigger event 64 (notify_cqm) ...

iw event -f -t | grep "station\|event 84\|event 64"

Thanks. I think it needs to match the AP-mode interface and flush:

iw dev $(iw dev|grep "Interface\|channel\|type"|grep -B 2 'channel.*24..'|grep -B 1 'AP'|tail -n 2|grep Interface|awk '{print $2}') scan trigger freq 2447 flush >/dev/null 2>&1
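
The grep chain above depends on the exact line order and spacing of the iw dev output. As an alternative sketch, the same selection can be done in one awk pass. The function name find_24ghz_ap is hypothetical, and the code assumes the usual iw dev layout in which each interface lists its "type" line before its "channel" line.

```shell
#!/bin/sh
# Hypothetical helper: read `iw dev` output on stdin and print the first
# interface that is in AP mode on a 2.4 GHz channel.
find_24ghz_ap() {
    awk '
        $1 == "Interface" { iface = $2; type = "" }
        $1 == "type"      { type = $2 }
        $1 == "channel" && type == "AP" {
            f = $3                  # looks like "(2437"
            sub(/^\(/, "", f)       # strip the leading parenthesis
            f += 0                  # force a numeric comparison
            if (f >= 2400 && f < 2500) { print iface; exit }
        }
    '
}

# Example:
# iw dev "$(iw dev | find_24ghz_ap)" scan trigger freq 2447 flush
```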

I noticed it still slowly dying until I tuned the period down to 30 seconds, which is not ideal. It would be great if we could detect the slowness and then trigger the command.

@sammo: If we can detect it reliably, I can update the wrtwatchdog script I'm currently using on my Archers :)

@sammo Which channel are you on? Do I need to adjust freq, e.g. to 2412 for channel 1, which I'm using?

I scan on a channel that is not my AP channel.
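
For reference, 2.4 GHz channels 1 to 13 have center frequencies of 2407 + 5 x channel MHz, so 2412 is channel 1 and the 2447 used earlier in this thread is channel 8. The sketch below turns that into a helper that picks a scan frequency away from the AP channel. Both function names are invented here, and the choice of channel 8 (or channel 1 as a fallback) is only an illustration of "not my AP channel", not a tested recommendation.

```shell
#!/bin/sh
# Hypothetical helper: convert a 2.4 GHz channel number to its center
# frequency in MHz. Channel 14 is a special case at 2484 MHz.
chan_to_freq() {
    ch=$1
    if [ "$ch" -ge 1 ] && [ "$ch" -le 13 ]; then
        echo $((2407 + 5 * ch))
    elif [ "$ch" -eq 14 ]; then
        echo 2484
    else
        return 1
    fi
}

# Pick a scan frequency that is not the AP's own channel:
# channel 8 (2447 MHz) by default, channel 1 if the AP itself is on 8.
pick_scan_freq() {
    ap_ch=$1
    if [ "$ap_ch" -eq 8 ]; then
        chan_to_freq 1
    else
        chan_to_freq 8
    fi
}

# Example for an AP on channel 1:
# iw dev wlan1 scan trigger freq "$(pick_scan_freq 1)" flush
```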

@sammo: Just to understand your findings better: I get event 84 on wlan0 (5 GHz radio), but only rarely. According to your findings, does that mean my AP has already become unstable, or does it mean nothing because the event did not come from the 2.4 GHz radio?

@Catfriend1

I'm not sure, as I don't have 5 GHz on my device. iw event should tell you what events are happening on wifi.
I have the scan command running every 30 seconds and I haven't experienced any drop-outs. It's not a fix, but a workaround.

Would you be so kind as to explain why you use scan trigger freq 2447 instead of simply scan?
Is this due to CPU usage, or are there other reasons?

A one-channel scan is quicker than a full scan. I have noticed iperf 'beep' and the rate drop during a full scan.

I found that rc.local never exits when this background process is added; nohup does not exist with ash.

I'm running background services forked off rc.local like this: Ath10k_pci 0000:01:00.0: SWBA overrun on vdev 0, skipped old beacon - #5 by Catfriend1

I'm also planning to update this watchdog to react to iw event codes and run the iw scan just in time, if this thread and the input by @sammo turn out to be a better way to work around wifi problems on the Archers.
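
Regarding the rc.local hang mentioned above, one possible explanation (an assumption, not something verified on an Archer) is that the backgrounded loop keeps the stdin/stdout/stderr it inherited from rc.local open, so whatever is waiting on those descriptors never sees them close. A minimal sketch that avoids this without nohup, with start_detached as a made-up name:

```shell
#!/bin/sh
# Sketch: start a long-running command in the background with all three
# standard descriptors pointed at /dev/null, so it does not hold on to
# the caller's stdin, stdout or stderr.
start_detached() {
    "$@" </dev/null >/dev/null 2>&1 &
}

# Example for /etc/rc.local; the script path is hypothetical and would
# contain the `while sleep ...; do iw dev ... scan trigger ...; done` loop:
# start_detached /bin/sh /root/scanloop.sh
```

In the original one-liner only the iw command inside the loop is redirected, not the loop itself, which is why the whole command is redirected here.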