Unexplained hangs, freezes and reboots with Archer C2600

Some update: I have been running at 800MHz max with the ondemand governor for 55 days now, without a single reboot. So 800MHz seems to be the safe limit.

(I think my previous post was not accurate: I forgot to uncomment the lines that set the min freq to 800MHz, so it ran at the default minimum and therefore still crashed after a few weeks.)

Maybe you should just use performance then. It could actually be bad silicon if an 800MHz min freq doesn't resolve it, or at least make the reboots much less frequent.

Hello,
I have had good luck with these additional commands in /etc/rc.local:

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo 1200000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq
echo 1200000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq
echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/state0/disable
echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/state1/disable
echo 1 > /sys/devices/system/cpu/cpu1/cpuidle/state0/disable
echo 1 > /sys/devices/system/cpu/cpu1/cpuidle/state1/disable

I'm trying these settings; I've only adjusted scaling_min_freq to 800000. Let's see how stable it is.

Hello.
I also have a C2600 v1.1 with OpenWrt 19.07.
In my case the system is stable when scaling_governor is set to performance, but my router still rebooted three times after seven days.
The reboot always happened when I was downloading a big ISO file and copying some files on the LAN at the same time.

My maximum uptime was 83 days; currently it is at 39 days.

I don't think these settings make sense.

The performance governor always sets the CPU clock to the maximum, which here is 1200MHz.
Setting scaling_min_freq only makes sense when scaling_governor is set to ondemand.

I think the settings below should also be applied to the policy1 directory (it exists on my OpenWrt 19.07):

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo 1200000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq
echo 1200000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq

I think the correct settings would be (possibly with a different CPU frequency):

echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo ondemand > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo 1200000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq
echo 800000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq
echo 1200000 > /sys/devices/system/cpu/cpufreq/policy1/scaling_max_freq
echo 800000 > /sys/devices/system/cpu/cpufreq/policy1/scaling_min_freq
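
The point about policy1 can be sketched as a loop, so that no policy directory is missed. This is only a minimal sketch: the function name set_cpufreq_all and the CPUFREQ_ROOT override are made up for illustration, and it assumes the policy directories live under /sys/devices/system/cpu/cpufreq as in the commands above.

```shell
# Apply one governor and one frequency range to every cpufreq policy.
# CPUFREQ_ROOT is a hypothetical override so the function can be tried
# against a scratch directory; on the router it falls back to sysfs.
set_cpufreq_all() {
    governor="$1"
    min_khz="$2"
    max_khz="$3"
    root="${CPUFREQ_ROOT:-/sys/devices/system/cpu/cpufreq}"

    for policy in "$root"/policy*; do
        # Skip the unexpanded pattern if no policy directory exists.
        [ -d "$policy" ] || continue
        echo "$governor" > "$policy/scaling_governor"
        # Write max before min so the requested range is never inverted
        # halfway through the update.
        echo "$max_khz" > "$policy/scaling_max_freq"
        echo "$min_khz" > "$policy/scaling_min_freq"
    done
}
```

With the function defined in /etc/rc.local, "set_cpufreq_all ondemand 800000 1200000" would match the listing above, and "set_cpufreq_all performance 1200000 1200000" would give the fixed-frequency variant.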

The settings avoid any switching between frequencies, that is intentional. It is a bit redundant, however without any side effects.

As mentioned, at least on my routers this is stable and there are no unscheduled reboots:

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo 1200000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq
echo 1200000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq
echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/state0/disable
echo 1 > /sys/devices/system/cpu/cpu0/cpuidle/state1/disable
echo 1 > /sys/devices/system/cpu/cpu1/cpuidle/state0/disable
echo 1 > /sys/devices/system/cpu/cpu1/cpuidle/state1/disable

The routers consume more energy by not clocking down. However, I prefer that they keep running without reboots. The maximum possible speed is not used either, as that might also trigger stability issues. 1.2GHz seems stable from what I can observe.
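
The four cpuidle lines above could also be written as a loop that covers every CPU and every idle state it finds. A rough sketch only: disable_cpuidle_states and the CPU_ROOT override are hypothetical names, and unlike the listing above it disables all states that exist, not just state0 and state1.

```shell
# Disable every cpuidle state on every CPU.
# CPU_ROOT is a hypothetical override for trying this outside the router;
# by default the real sysfs tree is used.
disable_cpuidle_states() {
    root="${CPU_ROOT:-/sys/devices/system/cpu}"

    # cpu[0-9]* matches cpu0, cpu1, ... but not the cpufreq or cpuidle dirs.
    for knob in "$root"/cpu[0-9]*/cpuidle/state[0-9]*/disable; do
        # Skip the unexpanded pattern if nothing matched.
        [ -e "$knob" ] || continue
        echo 1 > "$knob"
    done
}
```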

The ondemand governor always caused issues sooner or later.

I only set scaling_governor; my devices are at 345, 166 and 87 days (the last one was powered down, not rebooted).
I use them as APs though, so there might be a difference. The one with the longest uptime runs 19.07.3, the other two 19.07.6.

Hmm, okay - I follow stable, so I am at 21.02.0 now. Perhaps a regression?

Update 03.12.2021: I moved on to 21.02.1 and the C2600 devices still have not rebooted unscheduled (so far?).

Just bought a used C2600 v1.1 and I am on 21.02.1; I don't see many posts about this version's behavior. I have everything on defaults (scaling_min_freq at 600000, governor ondemand). Let's see how it goes.

I'm using the same combination, so far no issues to report:

Uptime	57d 22h 56m 42s

EDIT: Damn! Unexpected power outage at 68 days.

EDIT2: 68 + 38 = 106 days total, without any spontaneous reboots!

I have now moved on to 21.02.2; let's see how this performs.

Strange indeed, but who knows what is causing it. Let's please update each other if unplanned reboots occur. In the interim I have also added the hints to the wiki, for others who might have the same issue and are looking for ideas on how to work around it.

I'm currently at 20d with no reboots.

I am running 21.02.5 and got multiple reboots within minutes of installing.
With no resolution in sight, in desperation
I tried Ramon's suggestion (meant for x86), shown below, but it is still rebooting.

echo ondemand > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
echo ondemand > /sys/devices/system/cpu/cpufreq/policy1/scaling_governor
echo 800000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq
echo 800000 > /sys/devices/system/cpu/cpufreq/policy1/scaling_min_freq
echo 25 > /sys/devices/system/cpu/cpufreq/ondemand/up_threshold
echo 10 > /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor

echo 800000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq
echo 800000 > /sys/devices/system/cpu/cpu1/cpufreq/scaling_min_freq

Nothing in system log.

I am staying on 21.x releases since I use BanIP.

EDIT: Time permitting, will revert to 21.02.4 to compare.

IIRC, release 21 has packet scheduler issues beyond 21.02.1. After that, algorithms other than round robin were tried, with mixed results, and this stuttering (while still operating) may be that.

Also, the reboots are probably centered around Krait (CPU) clocking and VLAN stats gathering in the kernel.

Much work has gone into those areas for 22.03 which was not back-ported to v21.

If possible, I strongly suggest you try the latest v22 release or even a snapshot.

EDIT: I saw that you're stuck at v21. You might just go back to 21.02.1 and set both the min and max CPU clock rate to 1.2GHz (fixed at that rate).

And if you're using the ath10k-ct firmware and kmod ... switch to non-ct, and vice versa.

Mmmv,
M.

Thanks Mpilon

dibdot is making headway on BanIP, so hopefully, I can start using current releases.

So my router started crashing quite frequently back in April, about once per week. It was not even rebooting; I had to power-cycle it.
I have now upgraded from 21.02.1 to 22.03.5 r20134. It looked better and reached something like 25d of uptime, but yesterday it rebooted (on its own this time, with no hang and no need to power-cycle).
These are my settings. Any suggestions for changing them, given all the above comments?

root@OpenWrt:~# cat /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
ondemand
root@OpenWrt:~# cat /sys/devices/system/cpu/cpufreq/policy1/scaling_governor
ondemand
root@OpenWrt:~# cat /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq
600000
root@OpenWrt:~# cat /sys/devices/system/cpu/cpufreq/policy1/scaling_min_freq
600000
root@OpenWrt:~# cat /sys/devices/system/cpu/cpufreq/ondemand/up_threshold
50
root@OpenWrt:~# cat /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor
10
root@OpenWrt:~# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq
600000
root@OpenWrt:~# cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_min_freq
600000

Also, do I need to reboot after any change to these?

To the best of my knowledge, there is no need to reboot for these to take effect.
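
One way to check this without rebooting is to read a knob back right after writing it. A small sketch, with write_and_verify as a made-up helper name; on real sysfs the kernel may adjust a requested value, in which case the mismatch branch reports what was actually accepted.

```shell
# Write a value to a sysfs knob, then read it back to confirm it stuck.
write_and_verify() {
    knob="$1"
    want="$2"

    echo "$want" > "$knob" || return 1
    got="$(cat "$knob")"

    if [ "$got" = "$want" ]; then
        echo "ok: $knob = $got"
    else
        echo "mismatch: $knob = $got (wanted $want)"
        return 1
    fi
}
```

For example: write_and_verify /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq 800000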

This is all I have in /etc/rc.local. It has been very stable. I am using the default "ondemand" governor too, so I have nothing to override it.

echo 600000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq
echo 600000 > /sys/devices/system/cpu/cpu1/cpufreq/scaling_min_freq
echo 25 > /sys/devices/system/cpu/cpufreq/ondemand/up_threshold
echo 10 > /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor

Only difference is the up_threshold at 25 vs 50.
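
For comparing settings between two routers like this, the knobs discussed in this thread can be dumped in one go and the outputs diffed. A sketch under assumptions: dump_cpufreq and the CPU_ROOT override are hypothetical names, and knobs that do not exist on a given build are silently skipped.

```shell
# Print the cpufreq knobs mentioned in this thread as "path=value" lines.
# CPU_ROOT is a hypothetical override; by default the real sysfs tree is read.
dump_cpufreq() {
    root="${CPU_ROOT:-/sys/devices/system/cpu}"

    for knob in "$root"/cpufreq/policy*/scaling_governor \
                "$root"/cpufreq/policy*/scaling_min_freq \
                "$root"/cpufreq/policy*/scaling_max_freq \
                "$root"/cpufreq/ondemand/up_threshold \
                "$root"/cpufreq/ondemand/sampling_down_factor; do
        # Skip knobs that are missing (or patterns that matched nothing).
        [ -r "$knob" ] || continue
        # Strip the root prefix so outputs from different hosts line up.
        printf '%s=%s\n' "${knob#"$root"/}" "$(cat "$knob")"
    done
}
```

Running "dump_cpufreq > /tmp/cpufreq.txt" on each router and diffing the two files would show differences such as the up_threshold one.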

Early this morning, after ~45 days of uptime, my C2600 froze again. Unfortunately I was not on site and could not even power-cycle it. I was in a bad situation, as I could not access my network and no one was on site to help.
However, somehow magically, after roughly 12 hours (!) the router restarted itself and is now back online.
I am really thinking of buying a relay, hooking it up to my RPi, and writing a small program to power-cycle the router when it detects no network connectivity.
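
The RPi idea could look roughly like the sketch below. Everything specific in it is an assumption: the router address 192.168.1.1, the thresholds, and above all relay_off/relay_on, which are placeholders that only print a message and would need to be replaced with whatever actually drives the relay.

```shell
# Hypothetical defaults: adjust to the local network.
ROUTER_IP="${ROUTER_IP:-192.168.1.1}"
MAX_FAILURES="${MAX_FAILURES:-5}"

# Placeholder relay hooks: replace the bodies with the real relay control.
relay_off() { echo "relay: power off"; }
relay_on()  { echo "relay: power on"; }

# True if the router answers a single ping within 2 seconds.
router_alive() { ping -c 1 -W 2 "$ROUTER_IP" > /dev/null 2>&1; }

# One watchdog step: takes the current failure count and prints the new one.
# After MAX_FAILURES consecutive failures it power-cycles the router and
# resets the count, which also gives the router time to boot again.
watchdog_step() {
    failures="$1"
    if router_alive; then
        echo 0
        return 0
    fi
    failures=$((failures + 1))
    if [ "$failures" -ge "$MAX_FAILURES" ]; then
        # Relay messages go to stderr so stdout carries only the count.
        relay_off >&2
        sleep "${OFF_SECONDS:-10}"
        relay_on >&2
        failures=0
    fi
    echo "$failures"
}

# Endless loop, one check per minute. Defined here but not started.
watchdog_loop() {
    failures=0
    while :; do
        failures="$(watchdog_step "$failures")"
        sleep "${CHECK_INTERVAL:-60}"
    done
}
```

Requiring several consecutive failures before cutting power avoids power-cycling the router over a single lost ping.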

It comes back from limbo after 12h?? That's interesting. Means that even though something dies and the thing turns into a turnip there is still some life in it that eventually makes it trip over.

I have never been able to communicate with the thing once it goes numb and becomes a switch. I have no clue what's going on in there. I thought nothing was... you give me hope. I have not been able to figure out what does it but multicasts seem to be a big part of it.

I can imagine your pain... that feeling of despair when the green light is no longer on and there's nobody around to physically click the button. The RPi idea sounds cool... kind of your proprietary ILO mechanism.

I suggest you set your min AND max scaling frequencies to the same value, and stop using the ondemand CPU governor. The NSS builds for this target have given up on ondemand because of behaviors like this.