Auto reboot if RAM too low or Luci "dead"

Hello,
I have a Netgear R6220 and I'm having the same problem described in this thread: "Netgear R6220 eating through ram and crashing"

However, this router is pretty old, so this bug will probably stay. I guess it isn't that common and may depend on a specific config, since it hasn't been caught and fixed even though this router is pretty popular, I think. I'm installing this router in a place where a reboot from time to time isn't a problem.

The router stays partially functional from the moment the problems start until everything crashes (2.4 GHz goes down but 5 GHz stays up, and neither LuCI nor SSH is available), so I guess the OS is still running and can reboot itself.

So I need help setting up a cron job that will reboot the router if:

  • free RAM is below 40 MB (kind of "risky", because it can reboot a healthy router)
  • or even better - LuCI reports out of memory (this would be the perfect solution for me)

Rebooting periodically (e.g. every 24 hours during the night) isn't ideal, because the router can go crazy within a few hours, so I'd be stuck with a broken router until the next reboot.

Thanks in advance

Would the built-in hardware watchdog help here?


Have you tried 21.02-snapshot?

I haven't. Is something related to my issue fixed there, or is it just a shot in the dark? I have been using this router for 1.5 years and every version I tried (2 or 3 of them) had this issue. I want to install this router in my parents' house, so I won't be able to play around with different firmware versions.

I don't know. I thought the hardware watchdog is only usable when the system hangs or stops responding?

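# Write the health-check script
cat << "EOF" > /etc/syscheck.sh
# Available memory in bytes, as reported by ubus
MEM_AVAIL="$(ubus call system info \
| jsonfilter -e "@['memory']['available']")"
# Reboot threshold in bytes (about 40 MB)
MEM_LIM="40000000"
# Reboot if memory is below the limit or uhttpd is not running
if [ "${MEM_AVAIL}" -lt "${MEM_LIM}" ] \
|| ! pgrep uhttpd > /dev/null
then reboot
fi
EOF
# Run the check every 5 minutes
cat << "EOF" >> /etc/crontabs/root
*/5 * * * * . /etc/syscheck.sh
EOF
# Log only cron errors, so the 5-minute job does not flood the log
uci set system.@system[0].cronloglevel="9"
uci commit system
/etc/init.d/cron restart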

I am really impressed... What does the
! pgrep uhttpd > /dev/null
line do? pgrep returns the PID of uhttpd, and if that process is not running, what exactly happens?


The "!" means "logical not". A translation of that line might be, "if the available memory is lower than the limit, or the uhttpd process does not exist (does not have a process ID), then reboot".

If you use a different webserver for Luci (e.g. lighttpd) then modify that line accordingly.
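To see the effect of "!" in isolation, here is a minimal sketch that uses the shell builtins true and false as stand-ins for pgrep finding or not finding a process. The helper name check is made up for this example and is not part of the script above.

```shell
# check is a hypothetical helper: it runs the given command and reports
# what an "if ! command" test would decide based on its exit status.
check() {
    if ! "$@" > /dev/null 2>&1; then
        # Non-zero exit status, e.g. pgrep found no matching process
        echo "would reboot"
    else
        # Zero exit status, e.g. pgrep found the process
        echo "ok"
    fi
}

check true
check false
```

The first call prints "ok" and the second prints "would reboot", which mirrors how the script only reboots when pgrep reports that no uhttpd process exists.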


You can download it here

My two MT7621-NAND routers, a Xiaomi R3G and a HiWiFi HC5962, have been running a snapshot version for months and have never had an OOM.

Note that upgrading to 21.02 requires resetting settings because of the swconfig to DSA migration.


@ivgaetera thanks! looks promising

However, I think uhttpd still lives, because, as dbg2950 mentioned in "Netgear R6220 eating through ram and crashing",

the web interface shows the error page "Internal Server Error Failed to create CGI process: Out of Memory"

I did not store the raw HTTP response, though; it would be useful when creating such a script, to avoid mistakes.

Is there a chance of an updated version that instead checks whether the HTTP response from the web panel contains a given substring?
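A rough sketch of such a check, under a few assumptions: the error text quoted above appears verbatim in the response, and the function name luci_unhealthy and the URL in the comment are made up for illustration. Many wget variants discard the body when the server answers with an HTTP error status and simply exit non-zero, so the sketch treats a failed fetch the same as a matching body. I have not verified how the wget on the R6220 behaves for this particular page.

```shell
# luci_unhealthy is a hypothetical helper. It runs the fetch command
# passed as arguments and succeeds (exit 0) when LuCI looks broken:
# either the fetch itself fails or the body contains the OOM text.
luci_unhealthy() {
    # The assignment takes the exit status of the fetch command
    body="$("$@" 2>/dev/null)" || return 0
    case "$body" in
        *"Out of Memory"*) return 0 ;;
    esac
    return 1
}

# Possible use in /etc/syscheck.sh (the URL is an assumption):
#   luci_unhealthy wget -q -O - http://127.0.0.1/cgi-bin/luci && reboot
```

Because a failed fetch counts as unhealthy, anything else that makes the request fail (for example a redirect to HTTPS that wget cannot follow) would also trigger a reboot, so it is worth running the fetch by hand on a healthy router first.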


Which is the purpose of the either/or check. Even if uhttpd still lives, if the available RAM drops below a certain threshold then the reboot will still be triggered.


yeah, but I've never seen uhttpd crashing, so I don't need this check :slight_smile:

Check the output when this happens:

wget --no-check-certificate -q -O - http://localhost/
ubus call luci-rpc getBoardJSON

I can't really check this while the router goes crazy because I cannot access SSH when this happens, but I will set up a cron job or something to keep posting the results to my VPS.

I will post back once I get something new - for other people or even for myself a year later :slight_smile:


Another thought is grepping the system log for the oom-killer (can't remember its exact process name). If it has been invoked, the system is definitely out of memory and you could trigger a reboot.
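A minimal sketch of that idea follows. The OOM killer is part of the kernel rather than a separate process, so it shows up as kernel log lines such as "... invoked oom-killer: ..." and "Out of memory: ...". The helper name log_has_oom is made up for this example, and the exact wording of the messages on a given kernel version is an assumption worth checking against a real log.

```shell
# log_has_oom is a hypothetical helper: it succeeds (exit 0) if the log
# text on stdin contains one of the usual kernel OOM messages.
log_has_oom() {
    grep -q -i -e "oom-killer" -e "Out of memory"
}

# Possible use in /etc/syscheck.sh:
#   logread | log_has_oom && reboot
```

With the default in-RAM log, the buffer starts empty after a reboot, so an old OOM entry should not cause a reboot loop; if the log is written to persistent storage, that assumption no longer holds.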
