Archer C7 2.4 GHz wireless dies in 24~48 hours

Hi

Using the WDR3600 gives the same problems, on both ar71xx and ath79, at least on version 19.0.
I'm posting this because the WDR3600 seems to be used by far fewer people.
Could this be a mysterious bug in the firmware?
Does anybody know if this bug has been reported on bugs.openwrt.org?

Is it possible to manually replace the Ath10k drivers with ones from OpenWrt 18.x? I'm assuming no, but not sure how to determine this. Would mesh be supported?

Could it be a failing power supply?
This was an issue for me at one point with a cable modem.

@kacheng Just use the non ct drivers in 19.x and mesh is supported. I'm running mesh here.

@Catfriend1 Would you mind sharing some details of your current tweaks to achieve stability?

I have configured the following. I still get occasional drop-outs, and iPhones need wifi cycled off/on on 2.4GHz.

Router Archer C7 v2
-19.07.5, replaced ct drivers with non ct drivers
-5GHz SSID WPA2 on channel 36 with 80MHz width
-mesh with WPA3 on 5GHz, replaced wpad-basic with wpad-mesh-openssl
-2.4GHz SSID x 3 WPA2 on channel 1 with 20MHz width
-Network|Firewall|Software Flow Offloading checked
-SQM, configured to cake and piece_of_cake.qos
-Ad Block
-WireGuard VPN
-DDNS
-IPV6 disabled
-no cron nightly reboot yet
-no watchdog script yet

Mesh Node Archer C7 v2
-19.07.4, replaced ct drivers with non ct drivers
-mesh with WPA3 on 5GHz, replaced wpad-basic with wpad-mesh-openssl
-2.4GHz SSID x 1 WPA2 on channel 1 with 20MHz width
-miniDLNA
-samba exFAT usb3 l
-IPV6 disabled
-no cron nightly reboot yet
-no watchdog script yet
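
For reference, the "replaced ct drivers with non ct drivers" and "replaced wpad-basic with wpad-mesh-openssl" steps in both lists might look roughly like the sketch below. The package names are an assumption for a QCA988x-based Archer C7 v2 on 19.07 and should be checked with `opkg list | grep ath10k` first. The `RUN` variable is a hypothetical dry-run switch, not an opkg feature: by default the functions only print what they would do.

```shell
#!/bin/sh
# Sketch: swap ath10k-ct for the non-ct ath10k driver and switch to a mesh-capable wpad.
# Assumption: package names below match a QCA988x device on OpenWrt 19.07.
# RUN defaults to "echo" (dry run). Export RUN="" to really run opkg.
RUN="${RUN-echo}"

swap_ath10k_drivers() {
    $RUN opkg update
    $RUN opkg remove kmod-ath10k-ct ath10k-firmware-qca988x-ct
    $RUN opkg install kmod-ath10k ath10k-firmware-qca988x
}

swap_wpad_for_mesh() {
    $RUN opkg remove wpad-basic
    $RUN opkg install wpad-mesh-openssl
}

# Print the planned commands:
swap_ath10k_drivers
swap_wpad_for_mesh
```

It is safer to do this over Ethernet, since wifi goes away while the driver packages are being exchanged, and a reboot is needed afterwards.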

I am hoping that I can buy more Archer C7s and quickly add more mesh nodes by simply restoring the config. However, the wifi instability is stopping me from building out just yet.
Also, I have always been wondering if I should put the mesh backhaul on 2.4GHz?

Thanks!

@Catfriend1 I just found your post State of Archer C7 v2 in mid-2020 with some details, and your later posts with your EU and US custom builds. So great. Thanks for sharing those. Looks like your setup is similar to what I aspire to. Do you think I can use the US build for Canadian (CA) devices, or will I need to build my own config? Roughly how long does compiling a build take?

Mesh question: Does batman-adv mesh work better than configuring mesh with just 802.11s? I didn't use batman-adv when I configured my mesh, as I thought it was an old protocol or something. After some reading, I now understand it is a routing overlay that runs on top of 802.11s, but I don't quite understand its significance or whether it's needed in my use case, which is several nodes around a house with one node that does DHCP and connects to an external ISP.

Thanks!

@kacheng

I don't know about CA hardware as I just have EU models here. Maybe someone else could answer this.

For the mesh thingy, I'd recommend my writeup from this thread: BATADV (mesh) the best decision?

Personally, I prefer putting the mesh on 5 GHz because I get more bandwidth on the backhaul. For example, my mesh "repeater" bridges the internet from mesh wifi (input) to LAN (output) (input/output not in the strict technical sense), and I use it in the TV area of my living room, so the mesh "pumps" data for the LAN clients: TV, PlayStation, Chromecast and a guest notebook. (My "base" router is connected to the internet and pushes about 500 Mbit/s over the mesh "bridge".) On 2.4 GHz it would be slower, given the wifi interference from my neighbors.

Basically, your setup looks fine as you've described it. For example, I run a lot of bash scripts on one Archer, and a while ago that one was the most unstable, until I discovered that heavy "spike" loads could "crash" the wifi drivers, e.g. performing heavy USB mass storage writes and find / searches simultaneously. If in doubt, please test the router in dumb access point mode, i.e. without WAN/NAT, routing, firewall, etc., "just bridge LAN to wifi", and see if it gets better.
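
A minimal sketch of the service side of such a dumb-AP test follows. It assumes the three init scripts named here exist on the image, and it covers only switching the routing-related services off. Giving the LAN interface an address in the main router's subnet and cabling LAN to LAN still has to be done by hand. `RUN` is a hypothetical dry-run switch, so by default the commands are only printed.

```shell
#!/bin/sh
# Sketch: turn off DHCP/DNS/firewall services for a "dumb AP" stability test.
# Assumption: dnsmasq, odhcpd and firewall init scripts are present on this image.
# RUN defaults to "echo" (dry run). Export RUN="" to really apply the changes.
RUN="${RUN-echo}"

dumb_ap_services_off() {
    for svc in dnsmasq odhcpd firewall; do
        $RUN /etc/init.d/"$svc" disable   # do not start at boot
        $RUN /etc/init.d/"$svc" stop      # stop the running instance
    done
}

dumb_ap_services_off
```

To undo the test, run the same init scripts with `enable` and `start`.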

So, my network has still been terrible with freezes and drop-outs.

I am now back to stock TP-Link firmware, using WDS.
It has been much more stable, which I guess is due to the proprietary drivers?

I DO miss mesh/Wireguard/VPN/ad block/DLNA, though.
A lot.

New Year's greetings,

As I don't need IPv6 for the moment, I removed the following packages and my two C7 v2s seem to be stable:

opkg remove ip6tables
opkg remove kmod-ip6tables
opkg remove kmod-nf-ipt6
opkg remove kmod-nf-reject6
opkg remove odhcp6c
opkg remove odhcpd-ipv6only

Maybe it is just a lucky coincidence.

On OpenWrt 19.07.X, have you tried removing kmod-ath10k-ct and installing kmod-ath10k-ct-smallbuffers? I have an Archer C58 v1 router which crashed every 8-12 h under high 5GHz network load (several devices using VPN services with MS Teams etc., as there were 3 people in the house working from home). The Archer C58 v1 has only 64 MB of RAM, so I removed the default kmod-ath10k-ct and replaced it with kmod-ath10k-ct-smallbuffers, and I have had no issues for several days. I carried out the following steps.

opkg update
opkg remove kmod-ath10k-ct
opkg install kmod-ath10k-ct-smallbuffers
reboot
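
Whether the smallbuffers variant is worth trying depends on how much RAM the device has. The sketch below reads MemTotal and prints a suggestion. The 64 MB cut-off is an assumption taken from the C58 example above, not an official rule, and `suggest_ath10k_pkg` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: suggest an ath10k-ct package variant based on total RAM.
# Assumption: 64 MB (65536 kB) or less counts as "low memory". The threshold is illustrative.
suggest_ath10k_pkg() {
    meminfo="${1:-/proc/meminfo}"
    # MemTotal is reported in kB and is always a bit below the nominal RAM size.
    total_kb=$(awk '/^MemTotal:/ {print $2}' "$meminfo")
    if [ "$total_kb" -le 65536 ]; then
        echo "kmod-ath10k-ct-smallbuffers"
    else
        echo "kmod-ath10k-ct"
    fi
}

suggest_ath10k_pkg
```

The optional argument exists only so the function can be pointed at a saved copy of /proc/meminfo from another device.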

I'm using the non-ct drivers because batman mesh with WPA3 (SAE) encryption doesn't work with the ct drivers.

Not too much talk about Ethernet above, which makes me think I actually have the same problem but have been thinking about it wrong.

Is anyone seeing Ethernet go out too, or is that a parallel problem? If anything, it's gotten worse with the recent 19.07 point releases here, and since I can't access the router at all once it happens until it's cold-booted, it's a little difficult to figure out.

It's definitely a router under heavy use, but that's always been true.

Ethernet has always worked flawlessly for me. I thought the same of 2.4GHz WiFi for 5 years, until recently. I used to run the C7 v2 as a dumb AP with two SSIDs backhauled to another router via a VLAN trunk. However, I have recently started using the C7 as a Wireguard server for my remote IP camera recording. Although Wireguard traffic only runs over the wired link, 2.4 GHz wifi began freezing every other day or so. It just seems that high CPU usage kills it.

I will try replacing the power supply to rule that one out, at least...

I'm running a GL.iNet AR300M with OpenWrt 21.02rc1, which uses the ath9k driver.

I'm also experiencing the same symptom, where the wifi dies or slows to a crawl under heavy traffic.
I ran iperf over wifi, which makes it crash within 1-3 hrs. I have tried to diagnose the problem and have found a workaround that stops the 2.4GHz wifi from crashing without needing the wifi down/wifi commands.

For some reason, doing a quick scan stops the wifi from dying and also recovers it should it die. Your clients will stay connected.

You will need to add this to /etc/rc.local:

while sleep 300; do iw dev $(iwinfo|grep -m 1 -B 2 'Master.*2\.4'|grep ESSID|awk '{print $1}') scan trigger freq 2447 flush >/dev/null 2>&1; done &

I have also monitored the wifi events without the workaround:

iw event -f -t

and noticed that when things go bad you get event numbers 64 and 84:
64 = notify_cqm
84 = probe_client

1621413328.127835: wlan1 (phy #0): unknown event 60
1621413328.143278: wlan1 (phy #0): unknown event 60
1621413388.122103: wlan1 (phy #0): unknown event 60
1621413388.140276: wlan1 (phy #0): unknown event 60
1621413448.118700: wlan1 (phy #0): unknown event 60
1621413448.137628: wlan1 (phy #0): unknown event 60
1621413508.129351: wlan1 (phy #0): unknown event 60
1621413508.147463: wlan1 (phy #0): unknown event 60
1621413568.114870: wlan1 (phy #0): unknown event 60
1621413568.139285: wlan1 (phy #0): unknown event 60
1621413628.174945: wlan1 (phy #0): unknown event 60
1621413628.182307: wlan1 (phy #0): unknown event 60
1621413688.166942: wlan1 (phy #0): unknown event 60
1621413688.396813: wlan1 (phy #0): unknown event 60
1621413740.243602: wlan0 (phy #0): unknown event 64
1621413749.670027: wlan0 (phy #0): unknown event 64
1621413755.386236: wlan0 (phy #0): unknown event 64
1621413998.245463: wlan1 (phy #0): unknown event 84
1621414001.235986: wlan1 (phy #0): unknown event 60
1621414002.233819: wlan1: del station xxxxxxxxx
1621414002.234499: wlan1 (phy #0): unknown event 60
1621414039.392124: wlan0 (phy #0): unknown event 84
1621414039.845343: wlan1: del station xxxxxxxxxxx
1621414039.845921: wlan1 (phy #0): unknown event 84
1621414039.846183: wlan1 (phy #0): unknown event 60
1621414039.846339: wlan1 (phy #0): unknown event 60
1621414040.113765: wlan1: del station xxxxxxxxxxxxx
1621414040.114644: wlan1 (phy #0): unknown event 84
1621414040.114929: wlan1 (phy #0): unknown event 60
1621414040.115079: wlan1 (phy #0): unknown event 60
1621414040.437906: wlan1: del station xxxxxxxxxxxxxxx
1621414040.438638: wlan1 (phy #0): unknown event 84
1621414040.438917: wlan1 (phy #0): unknown event 60
1621414040.439070: wlan1 (phy #0): unknown event 60
1621414042.378414: wlan0 (phy #0): unknown event 60
1621414043.379161: wlan0: del station xxxxxxxxxxxxxxxx
1621414043.379591: wlan0 (phy #0): unknown event 60
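
A log like this is easier to compare between a good and a bad period when the events are counted per interface. The following is a small sketch that assumes the `iw event -f -t` output was saved to a file with lines in exactly the format shown above. `summarize_iw_events` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count "unknown event N" lines per interface in saved `iw event -f -t` output.
# Reads the log on stdin and prints "<iface> <event> <count>", sorted.
# "del station" lines do not match the pattern and are ignored.
summarize_iw_events() {
    awk '
        / unknown event [0-9]+$/ {
            iface = $2                  # e.g. "wlan1"
            count[iface " " $NF]++      # $NF is the event number
        }
        END { for (k in count) print k, count[k] }
    ' | sort
}

# Usage (the log path is an example):
#   iw event -f -t > /tmp/iw-events.log &
#   summarize_iw_events < /tmp/iw-events.log
```

A rising count for event 64 (notify_cqm) or 84 (probe_client) would match the failure pattern described in this post.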

Maybe the scan trigger could be driven by the event instead of a fixed time period, if that works.
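
One possible shape for such an event-driven trigger is sketched below. It assumes that event 64 (notify_cqm) really is a usable early-warning sign, which nobody in this thread has confirmed. `ON_EVENT` is a hypothetical variable holding whatever recovery command should run, and it defaults to a harmless echo.

```shell
#!/bin/sh
# Sketch: run a command whenever event 64 shows up in `iw event` output.
# Assumption: event 64 (notify_cqm) precedes the stall. This is unverified.
# ON_EVENT is the command to run. The default only prints a marker.
ON_EVENT="${ON_EVENT:-echo scan-needed}"

watch_iw_events() {
    while read -r line; do
        case "$line" in
            *"unknown event 64"*) $ON_EVENT ;;
        esac
    done
}

# Usage on the router (wlan1 is an example interface name):
#   ON_EVENT="iw dev wlan1 scan trigger freq 2447 flush"
#   iw event -f -t | watch_iw_events &
```

There is no rate limiting here, so a burst of events would fire the command once per line. A real script would probably want a minimum gap between triggers.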

Just a follow-up note: I saw that your command to trigger the scan depends on iwinfo, which is not available by default and requires "opkg update; opkg install iwinfo".

I've found a command to do the same without requiring iwinfo.

iw dev $(iw dev|grep "Interface\|channel"|grep -B 2 'channel.*24..'|tail -n 2|grep Interface|awk '{print $2}') scan trigger freq 2447 flush

This resolves to:

iw dev wlan1 scan trigger freq 2447 flush

After triggering the scan with the command, I noticed the following event IDs come up:

1621509124.638080: wlan1 (phy #1): unknown event 33
1621509124.713185: wlan1 (phy #1): unknown event 34

ref # State of TP-Link Archer C7v2|v5 in 2021 - #18 by hecz0r

Trying to stress my AP at the moment... I got event 84 (probe_client) on phy0 (5 GHz), but everything still seems to work okay.

Will watch out if I can trigger event 64 (notify_cqm) ....

iw event -f -t | grep "station\|event 84\|event 64"

Thanks, I think it needs to pick the AP interface and flush:

iw dev $(iw dev|grep "Interface\|channel\|type"|grep -B 2 'channel.*24..'|grep -B 1 'AP'|tail -n 2|grep Interface|awk '{print $2}') scan trigger freq 2447 flush >/dev/null 2>&1
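
The grep chains above depend on how many lines sit between "Interface" and "channel" in the `iw dev` output. As an alternative, here is a sketch that tracks the fields with awk instead. It assumes each interface block lists "type" before "channel" and that 2.4 GHz channels show up as "(24xx MHz)". `find_24ghz_ap` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list interfaces that are in AP mode on a 2.4 GHz channel.
# Reads `iw dev` output on stdin.
# Assumption: within each "Interface" block, "type" appears before "channel".
find_24ghz_ap() {
    awk '
        $1 == "Interface" { ifc = $2; type = "" }   # new block: remember name, reset type
        $1 == "type"      { type = $2 }             # e.g. "AP", "mesh", "managed"
        $1 == "channel" && type == "AP" && $3 ~ /^\(24[0-9][0-9]$/ { print ifc }
    '
}

# Usage on the router:
#   iw dev "$(iw dev | find_24ghz_ap | head -n 1)" scan trigger freq 2447 flush
```

With several SSIDs on the 2.4 GHz radio, more than one interface is printed, hence the `head -n 1` in the usage line.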

I noticed it slowly dying until I tuned the period to 30 seconds, which is not ideal. It would be great if we could detect the slowness and then trigger the command.

@sammo: If we manage to detect it reliably, I can update the wrtwatchdog script I'm currently using on my Archers :)

@sammo Which channel are you on? Do I need to adjust freq, e.g. to 2412 for channel 1, which I'm using?

I scan on a channel that is not my AP channel.
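
For picking the freq value: 2.4 GHz channels 1 to 13 are 5 MHz apart, starting at 2412 MHz, so channel 8 is the 2447 used in the commands above. The sketch below converts a channel number and picks a scan frequency that differs from the AP channel. Choosing between channels 8 and 1 is only an illustration, and both function names are hypothetical.

```shell
#!/bin/sh
# Sketch: channel-to-frequency helper for 2.4 GHz channels 1-13.
chan_to_freq() {
    ch="$1"
    if [ "$ch" -ge 1 ] && [ "$ch" -le 13 ]; then
        echo $((2407 + 5 * $ch))
    else
        echo "unsupported channel: $ch" >&2
        return 1
    fi
}

# Pick a scan frequency that is not the AP's own channel.
# Assumption: channel 8 (2447 MHz) by default, channel 1 if the AP itself is on 8.
pick_scan_freq() {
    ap_ch="$1"
    if [ "$ap_ch" -eq 8 ]; then
        chan_to_freq 1
    else
        chan_to_freq 8
    fi
}

# Example: AP on channel 1
pick_scan_freq 1
```

With the AP on channel 1, this keeps the 2447 from the original command rather than switching to 2412.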
