Archer C6 on 22.03.2 crashes every ~3 days

While I am still looking for the actual cause, I have set up a cron job that runs a script to restart the wifi when memory consumption gets excessive. Under (somewhat) regular usage, the device seems to use around 30000K to 35000K according to free. I took the data you posted, where 83236K was used, as an indicator of the memory consumption level just before a crash/wifi restart command.

To be on the safe side for now, and while I am waiting for user reports over the next days and weeks, I have set the threshold to 75000K. Time will tell whether that is sufficient or too low.

Here is the short script I wrote; it runs via cron job every 10 minutes:

#!/bin/ash

#Due to currently unknown issues, MT7621 targets experience infrequent complete memory consumption. 
#This leads to wifi and device crashing. This is possibly related to wifi driver issues under v22.03 releases 0 to 3. 
#Workaround test is implementing a free memory check that issues a wifi down & wifi up command in case of apparent memory consumption close to max. 

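#Take the "used" and "free" columns of the Mem: line by column position,
#which does not depend on how many spaces free prints between the columns.
usedmem=$(free | awk '/^Mem:/ {print $3}')
freemem=$(free | awk '/^Mem:/ {print $4}')

#Archer C6v3 runs with 128MB RAM.
#Standard memory consumption fluctuates around 30MB.

if [ "$usedmem" -gt 75000 ]; then
    wifi
    echo "WARNING: Wifi restart due to excessive memory consumption"
fi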

Hopefully an explanation for the issues turns up soon. Workarounds in cases like these are never satisfying.


How do I install the script? I use Wi-Fi Scheduler with a one-minute radio-off setting every night at 2 am. I managed to get almost 23 days of uptime, but today we had a power outage, so I can't show the router page with the uptime.
Here's a screenshot from my ISP log page: the first says "Connection date", the second "Disconnect date" and the third is "Total Uptime" of the PPPoE session, a total of 550 hours and 9 minutes of PPPoE uptime.
[Screenshot: ISP log page]

I guess you need to do it via the CLI after logging in to the router via ssh. If you are not familiar with that route, you'll find the necessary information here: https://openwrt.org/docs/guide-user/base-system/user.beginner.cli

After logging in, I created a file with vi /usr/sbin/memorycheck.sh and pasted my script into it. Then I made the file executable with chmod +x /usr/sbin/memorycheck.sh. Finally, I created a cron job that executes the script every 10 minutes by running crontab -e and adding a new line with */10 * * * * /usr/sbin/memorycheck.sh
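
The same steps can also be done without an editor. This is only a sketch: it assumes the script is already saved at /usr/sbin/memorycheck.sh and that root's crontab is the file /etc/crontabs/root (the file crontab -e edits on OpenWrt). The helper name add_cron_entry is made up for illustration.

```shell
#!/bin/sh
# add_cron_entry FILE LINE  (hypothetical helper)
# Append LINE to the crontab FILE unless an identical line is already present,
# so running it twice does not schedule the job twice.
add_cron_entry() {
    file=$1
    line=$2
    touch "$file" || return 1
    # -x: match the whole line, -F: fixed string (the '*' must not act as a regex)
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# On the router:
#   chmod +x /usr/sbin/memorycheck.sh
#   add_cron_entry /etc/crontabs/root "*/10 * * * * /usr/sbin/memorycheck.sh"
#   /etc/init.d/cron enable && /etc/init.d/cron restart
```

The cron restart at the end makes sure the cron daemon is running and has picked up the new entry.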


Thank you! I did exactly as you said.
I guess it's working?


To stop the "cron.err" message from spamming the system log every 10 minutes, I changed the value of cronloglevel in /etc/config/system to 9 (the default is 8), in case anyone doesn't know how to get rid of it.
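
For anyone who prefers not to edit /etc/config/system by hand, the same change can be made with uci. A minimal sketch, assuming cronloglevel belongs in the first system section of that file; the function name set_cron_loglevel is made up for illustration.

```shell
#!/bin/sh
# set_cron_loglevel LEVEL  (hypothetical helper)
# Store LEVEL as cronloglevel in the first 'system' section of /etc/config/system.
set_cron_loglevel() {
    uci set "system.@system[0].cronloglevel=$1" && uci commit system
}

# On the router, followed by a cron restart so the new level is used:
#   set_cron_loglevel 9
#   /etc/init.d/cron restart
```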

TP-Link has released patches for the Archer C6 v3.20, which shows that there is a problem.
I noticed that the problem is that autonegotiation on the WAN port does not work properly.
I'll check whether they fixed it in the new patch.
OpenWrt no longer installs at all until this is fixed.
Archer C6(EU)_V3.2_220923 original tplink

Date of publication: 2023-01-10 Language: multilingual File size: 14.87 MB
Modifications and bug fixes:

  1. Fixed some dial-up bugs;
  2. Fixed some bugs related to wireless connectivity;
  3. Fix the bug that the website does not show wireless clients in AP mode;
  4. Fix the error that the computer in the LAN cannot access the device via the DDNS domain name when enabling DynDNS and remote control;
  5. IPTV and QoS mutual restriction removed;
  6. Optimized smart linking function.

Good to know. I will refrain from upgrading the stock firmware before flashing new devices, then. Thank you for the heads-up.

From his ranting all over the place about his routers crashing, it's not clear what he means by "OpenWrt no longer installs".
Yet we have seen no bug report about his crashy routers.


I've been running this script for a while. I am still getting crashes, but not as regularly.


Crashes = reboots?

I am running it with small differences: first of all, the time interval is 1 minute, and I also record the occurrences in a file in /tmp. I am planning to also record the memory and CPU usage over time.
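
Recording memory and CPU usage over time could look roughly like the sketch below. It is not the poster's script: it assumes the 1-minute load average from /proc/loadavg is an acceptable stand-in for CPU usage, and the function name log_sample and the log path in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Paths are overridable so the function can be tried against sample files.
MEMINFO=${MEMINFO:-/proc/meminfo}
LOADAVG=${LOADAVG:-/proc/loadavg}

# log_sample FILE  (hypothetical helper)
# Append one line to FILE: timestamp, MemFree (kB), MemAvailable (kB)
# and the 1-minute load average.
log_sample() {
    memfree=$(awk '$1 == "MemFree:" {print $2}' "$MEMINFO")
    memavail=$(awk '$1 == "MemAvailable:" {print $2}' "$MEMINFO")
    load1=$(cut -d' ' -f1 "$LOADAVG")
    printf '%s free=%s available=%s load1=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$memfree" "$memavail" "$load1" >> "$1"
}

# Called from the cron script, for example:
#   log_sample /tmp/memorycheck.log
```

Keep in mind that /tmp lives in RAM on these devices, so the log is lost on reboot and should not be allowed to grow without limit.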

Later I can give more details.

For me it’s worse than a reboot because the router doesn’t come back up, instead it gets stuck in a loop. I don’t believe that anyone has tried connecting a serial console to see what is reported when this occurs.

As I described, the behavior I have found is:

  1. The RAM fills almost to full
  2. The device gets "frozen" (it is impossible to connect to the device via wifi or ethernet)
  3. After some minutes the device reboots, or stays "frozen" indefinitely

With the script running, I could verify that it stops the issue before point 1 occurs (at 75 MB of usage, the "wifi" command brings RAM usage back to normal values).

It looks like one of the developers is working on some improvements to the MT76 support which may help to properly fix this issue!


Ok, so part of the fix is now in the OpenWrt master branch.

It should be available in the SNAPSHOT builds in the next 24-48 hours, or you could build it yourselves.


Just to keep you all in the loop, I’ve built a patched version of 22.03.3 containing the fix for mac80211. I’m hoping that by basing my test on top of the stable release branch I won’t hit unrelated bugs and can verify whether the problem is fixed.

UPDATE

4 days later and the crash has happened again, so this patch isn’t the silver bullet, although it may be part of the solution.

ANOTHER UPDATE
I recently took the plunge and upgraded to SNAPSHOT. There are some definite improvements to WiFi speeds and range, but the crash remains.


Hello! Any news regarding the crash bug? For the last few weeks I have used the official firmware, but I miss all the OpenWrt features and customizations. Does version 22.03.5 still have this bug?

Honestly - I gave up and got a cheap ZyXel WSM20 for £30. It has lots more RAM and flash space, as well as support for AX1800. It's been working fine without crashes.

In my case, after more than 20 days of uptime with the latest version on several units, the "memory fill" problem did not occur.

In any case, I left the script running that restarts the Wi-Fi radios if RAM usage exceeds 82 MB. But so far it has not been triggered.


I want to clarify this: the last script had an error because it was discounting the "cached" amount of memory, or something like that. I do not remember exactly, but the script in my case was certainly not being triggered, because the measured number never reached 82 MB before the crash (while in reality the memory was filled to the top).

To fix that, it is better to use a "free memory threshold", like this:

#!/bin/ash

#Due to currently unknown issues, MT7621 targets experience infrequent complete memory consumption.
#This leads to wifi and device crashing. This is possibly related to wifi driver issues under v22.03 releases 0 to 3.
#Workaround test is implementing a free memory check that issues a wifi down & wifi up command in case of apparent memory consumption close to max.

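#Take the "used" and "free" columns of the Mem: line by column position,
#which does not depend on how many spaces free prints between the columns.
usedmem=$(free | awk '/^Mem:/ {print $3}')
freemem=$(free | awk '/^Mem:/ {print $4}')
#Archer C6v3 runs with 128MB RAM.
#Standard memory consumption fluctuates around 30MB.

echo "Mem used: $usedmem"
echo "Mem free: $freemem"

if [ "$freemem" -lt 40000 ]; then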
    wifi
    echo "WARNING: Wifi restart due to excessive memory consumption"
    date >> /t

With this, the wifi restart will occur if less than 40 MB is free.

And yes, unfortunately the issue still occurs sometimes. That led me to check why my script was not being triggered.
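
A variant of the same idea reads the kernel's MemAvailable value from /proc/meminfo instead of parsing the output of free. This is only a sketch, not the script used in this thread: whether MemAvailable tracks this particular problem better than the "free" column is an assumption, and the helper names meminfo_kb and is_below as well as the memorycheck log tag are made up for illustration.

```shell
#!/bin/sh
# Sketch: restart wifi when MemAvailable drops below a threshold.
MEMINFO=${MEMINFO:-/proc/meminfo}
THRESHOLD_KB=${THRESHOLD_KB:-40000}   # 40 MB, the value used in this thread

# meminfo_kb FIELD: print the value (in kB) of FIELD from $MEMINFO.
meminfo_kb() {
    awk -v key="$1:" '$1 == key {print $2}' "$MEMINFO" 2>/dev/null
}

# is_below VALUE LIMIT: succeed only if VALUE is a number smaller than LIMIT.
is_below() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric reading: do nothing
    esac
    [ "$1" -lt "$2" ]
}

avail=$(meminfo_kb MemAvailable) || avail=""
if is_below "$avail" "$THRESHOLD_KB"; then
    logger -t memorycheck "MemAvailable=${avail}kB below ${THRESHOLD_KB}kB, restarting wifi"
    wifi
fi
```

The check for an empty reading is there so that a failed read leaves the wifi alone instead of producing a shell error in the comparison.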


I think I'm facing another bug with similar symptoms: my Archer C6 occasionally floods the log with this error message:

dnsmasq[1]: failed to send packet: Resource temporarily unavailable

Then the internet begins to fail: the browser warns about a DNS error, while sites can still be accessed by their IP addresses instead of the domain. If I keep using the router like that, it eventually freezes and the wifi goes down after a while.

After catching that error message, which does not help much, I tried writing this workaround script:

#!/bin/sh

watcher()
{
    echo dnsmasq watcher started
    logger -p notice -t dnsmasq.watcher "dnsmasq watcher service started"
    sig="failed to send packet: Resource temporarily unavailable"
    # read -r keeps backslashes in log lines intact
    logread -f | while read -r line; do
        if [ "${line#*$sig*}" != "$line" ]; then
            top -b -n 1 -o %MEM >/tmp/top
            result=$(wget --quiet -O- --post-file='/tmp/top' 'https://paste.c-net.org/')
            echo "$result" >/root/dnsmasq-crash-htop.txt
            logger -p err -t dnsmasq.watcher "htop at: $result"
            logger -p err -t dnsmasq.watcher "Restarting dnsmasq"
            service dnsmasq restart
        fi
    done

}

logger -p notice -t dnsmasq.watcher "dnsmasq watcher trying startup...."

lockfile=/tmp/dnsmasq.lock
if ( set -o noclobber; echo "$$" > "$lockfile") 2> /dev/null;
then
    # SIGKILL cannot be trapped, so it is not listed here
    trap 'rm -f "$lockfile"; exit $?' INT TERM EXIT
    
    watcher

    rm -f "$lockfile"
    trap - INT TERM EXIT
else
    logger -p notice -t dnsmasq.watcher "Failed to acquire lockfile: $lockfile. Held by $(cat "$lockfile") (this: $$) "
    echo "Failed to acquire lockfile: $lockfile, Held by $(cat "$lockfile"), (this: $$)"
fi

I registered it as a service on the router and will keep it running for a while for testing. Has anyone had a similar issue, or any idea what might be wrong?
I'm using OpenWrt 23.05.0 (r23497-6637af95aa)
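
The post does not show how the watcher was registered as a service. One common way on OpenWrt is a procd init script, sketched below. The file name /etc/init.d/dnsmasq-watcher and the script path /usr/sbin/dnsmasq-watcher.sh are assumptions for illustration, not taken from the post.

```shell
#!/bin/sh /etc/rc.common
# Hypothetical /etc/init.d/dnsmasq-watcher; the script path below is an assumption.

START=99        # start late in the boot sequence
USE_PROCD=1

start_service() {
    procd_open_instance
    procd_set_param command /usr/sbin/dnsmasq-watcher.sh
    procd_set_param respawn       # let procd restart the watcher if it exits
    procd_close_instance
}

# On the router:
#   chmod +x /etc/init.d/dnsmasq-watcher
#   /etc/init.d/dnsmasq-watcher enable
#   /etc/init.d/dnsmasq-watcher start
```

Since procd already keeps a single instance of the service running, the lockfile in the watcher script mostly guards against starting it a second time by hand.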