So the NanoPi R2S worked fine from July 2021 on. I compiled new firmware from time to time, flashed it, and it worked.
Until I tried v23.05.3 earlier this year. The router started to crash. It became completely unresponsive. No response to ping, no internet, nothing. Turns out it crashed after approx. 24 days. Only a power cycle gets it running again.
I decided to go back to the last working firmware: 22.03.3.
A few days ago I decided to give it another try, with 22.03.7 this time. I compiled a new firmware and flashed it, and it crashed today, after only two days.
What can be the problem? How do I find out why it crashes if it is impossible to even log in?
Can you flash the standard firmware from firmware-selector.openwrt.org? Try 23.05.5 and maybe 24.10-rcX.
I used to have one; it ran extremely hot in the semi-closed locker it had to be in.
That is a bit difficult because it is running as the router/gateway for about 20 machines and devices.
I've had collectd running for years, and the only long-term changes it shows correspond to changes in ambient temperature. The NanoPi is tied to the metal frame of a hallway wall unit that works as a huge heat sink.
ksoftirqd runs the deferred (softirq) work that follows a hardware interrupt: firewall and qdisc processing on a router, disk schedulers and video sync on other devices.
First check cat /proc/interrupts. If the same device appears to have multiple IRQs, start with irqbalance (enable and start it after installing) to spread the interrupts across cores.
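To make the /proc/interrupts check easier to read, here is a minimal sketch that sums the per-CPU counts of every IRQ line matching a device name. The function name irq_per_cpu is made up for illustration, and it assumes the usual layout where the first row of the file has one column header per CPU.

```shell
#!/bin/sh
# irq_per_cpu PATTERN [FILE]
# Hypothetical helper: sums the per-CPU interrupt counts of all lines
# matching PATTERN (e.g. eth0). FILE defaults to /proc/interrupts.
irq_per_cpu() {
    irq_pattern="$1"
    irq_file="${2:-/proc/interrupts}"
    awk -v pat="$irq_pattern" '
        NR == 1 { ncpu = NF; next }      # header row: one column per CPU
        $0 ~ pat {
            # field 1 is the IRQ number, fields 2..ncpu+1 are the counts
            for (i = 1; i <= ncpu; i++) sum[i] += $(i + 1)
        }
        END {
            for (i = 1; i <= ncpu; i++) printf "CPU%d %.0f\n", i - 1, sum[i]
        }
    ' "$irq_file"
}
```

Running irq_per_cpu eth0 on the router prints one total per core. If almost everything lands on a single core, that is the case where irqbalance may help.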
Interrupts are balanced across the CPUs.
cpu1 handles the eth0 card. If ethtool -g/-G is supported you may get multiple interrupts; if not, enable packet steering (and keep irqbalance).
Servers: https://iperf.fr/iperf-servers.php
Run the test from a client on the LAN side. The CPU that handles Ethernet should not hit 100%; the interrupts may re-balance a few times and sooner or later settle on some layout.
Will do later, at a more appropriate moment. If the router crashes and my family finds out it was done deliberately, just to run a test, they will come for me.
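One way to see whether a core gets close to 100% during the iperf run is to compare two snapshots of /proc/stat. The sketch below is only an illustration: the function name cpu_softirq_share is made up, and it assumes the standard /proc/stat column order (user, nice, system, idle, iowait, irq, softirq, steal).

```shell
#!/bin/sh
# cpu_softirq_share BEFORE AFTER
# Hypothetical helper: BEFORE and AFTER are two saved copies of /proc/stat.
# Prints, per core, the share of time spent in softirq and the total busy share.
cpu_softirq_share() {
    awk '
        /^cpu[0-9]+ / {
            total = 0
            # fields 2..9 = user nice system idle iowait irq softirq steal
            for (i = 2; i <= 9; i++) total += $i
            if (NR == FNR) {             # first file: remember the baseline
                t0[$1] = total; s0[$1] = $8; idle0[$1] = $5 + $6
                next
            }
            dt = total - t0[$1]
            if (dt > 0)
                printf "%s softirq=%.0f%% busy=%.0f%%\n", $1,
                    100 * ($8 - s0[$1]) / dt,
                    100 * (dt - ($5 + $6 - idle0[$1])) / dt
        }
    ' "$1" "$2"
}
```

For example: cat /proc/stat > /tmp/a, wait ten seconds while the test runs, cat /proc/stat > /tmp/b, then cpu_softirq_share /tmp/a /tmp/b. A core that shows a softirq share near 100% is the one saturated by network processing.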
BTW, if the core temperature were a problem, limiting its frequency would be a solution. I've done that successfully with a Raspberry Pi 3B+ in the past: I managed to prevent throttling by lowering the CPU speed when I set up a WebRTC session remotely.
Is there anything in OpenWrt for this? It seems there is no cpufrequtils package.
However, reading /sys/devices/system/cpu/cpufreq/policy0/cpuinfo_cur_freq gives encouraging results:
1008000 (most of the time)
816000 (less frequently)
408000 (rare)
Although the numbers are a bit weird, they suggest there is already a policy in place.
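Even without cpufrequtils, the standard Linux cpufreq sysfs files can usually be read and written directly. The sketch below assumes the kernel on this board exposes scaling_governor, scaling_available_frequencies, scaling_min_freq and scaling_max_freq in the policy folder, which is not confirmed here. The function names cpufreq_show and cpufreq_cap are made up for illustration.

```shell
#!/bin/sh
# Default policy folder taken from the path quoted above.
POLICY_DIR="${POLICY_DIR:-/sys/devices/system/cpu/cpufreq/policy0}"

# cpufreq_show [DIR]: print the governor and frequency limits, if readable.
cpufreq_show() {
    show_dir="${1:-$POLICY_DIR}"
    for show_f in scaling_governor scaling_available_frequencies \
                  scaling_min_freq scaling_max_freq; do
        [ -r "$show_dir/$show_f" ] &&
            printf '%s: %s\n' "$show_f" "$(cat "$show_dir/$show_f")"
    done
    return 0
}

# cpufreq_cap KHZ [DIR]: cap the maximum frequency (value in kHz).
# Only values listed in scaling_available_frequencies are accepted.
cpufreq_cap() {
    cap_khz="$1"
    cap_dir="${2:-$POLICY_DIR}"
    for cap_f in $(cat "$cap_dir/scaling_available_frequencies"); do
        if [ "$cap_f" = "$cap_khz" ]; then
            echo "$cap_khz" > "$cap_dir/scaling_max_freq"
            return 0
        fi
    done
    echo "frequency $cap_khz kHz is not in scaling_available_frequencies" >&2
    return 1
}
```

For example, cpufreq_show lists the current settings, and cpufreq_cap 816000 would cap the clock at 816 MHz if that value is offered. A value written this way does not survive a reboot, so it would have to be reapplied at startup.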
This is an overview of what's in the /sys/devices/system/cpu/cpufreq/policy0 folder: