Very occasional WAN down unknown error (user_request)

Very occasionally I get this unknown error (user_request) and the WAN goes down. Nothing other than restarting the router seems to restore it; restarting the WAN interface or the ISP fibre modem makes no difference. It will run for weeks and weeks between occurrences, so I cannot be sure what the trigger is. Not a huge issue when I am at home, but sometimes we travel for a few weeks at a time and I foresee I could get locked out.

Pi4B running OpenWrt 21.02.0 r16279-5cc0535800

Any thoughts on chasing this down or, worst case, scripting a method to detect the failure and reboot?

Of course having had to restart to get connection back I now have no logs from the event.

Don't know the cause, but you can set up a script to reboot it when pings to a certain IP fail x times.
Use a public DNS provider as the IP to ping.


Not tested, but I cobbled this together:

#!/bin/sh
# Uses only BusyBox ash features, so bash does not need to be installed.

# add to crontab
# */5 * * * * "/root/Scripts/reboot-on-fail.sh"

# OpenWrt helper functions (network_find_wan, network_get_ipaddr, ...)
. /lib/functions/network.sh

# variables
# The lockdir and logs directories must already exist.
ipregtest='^((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$'
lockfile="/root/Scripts/lockdir/reboot.lock"
logfile="/root/Scripts/logs/reboot.log"
counter=0

# functions
get_wan () {
    network_flush_cache
    network_find_wan NET_IF
    network_find_wan6 NET_IF6
    network_get_ipaddr NET_ADDR "${NET_IF}"
    network_get_ipaddr6 NET_ADDR6 "${NET_IF6}"
}

# Start here
# Exit if a previous run is still checking; otherwise take the lock.
if [ -f "$lockfile" ]; then exit 0; fi
touch "$lockfile"

while [ "$counter" -lt 5 ]
do
    get_wan
    WAN_IP=${NET_ADDR}
    echo "IPv4 is $WAN_IP"
    # A valid IPv4 address on the WAN means we are done.
    if echo "$WAN_IP" | grep -Eq "$ipregtest"; then
        echo "Good IP"
        rm "$lockfile"
        exit 0
    fi
    echo "Bad IP"
    counter=$((counter + 1))
    sleep 180
done
echo "Failed 5 times, assume not recovering"
rm "$lockfile"
echo "At $(date) WAN is down for 5 checks, i.e. 15 minutes. Rebooting to recover" >> "$logfile"
reboot
exit

Of course this will keep rebooting during any planned outage, but admittedly those are very rare.

Would this fail if the WAN IP remains but the internet drops?
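As far as I can tell it would: the script above only checks whether the WAN interface holds an IPv4 address, so a link that keeps its address but passes no traffic would look healthy. One way to cover that case is to probe a public host instead. Below is a minimal, untested sketch, assuming BusyBox `ping` with its `-c` and `-W` options; the function name `internet_up` and the example hosts are my own choices, not something from the original script.

```shell
#!/bin/sh
# Sketch: report the internet as up if at least one host answers a ping.
# internet_up is a hypothetical helper name.

internet_up() {
    for host in "$@"; do
        # -c 1: send a single probe; -W 5: wait up to 5 seconds for the reply
        if ping -c 1 -W 5 "$host" >/dev/null 2>&1; then
            return 0
        fi
    done
    # No host answered
    return 1
}

# Example use in place of the IP-address test in the loop
# (1.1.1.1 and 8.8.8.8 are public DNS resolvers, used here only as examples):
# if internet_up 1.1.1.1 8.8.8.8; then echo "Link OK"; fi
```

Probing more than one host avoids a reboot just because a single public server is briefly unreachable.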

watchcat seems ideal for this scenario maybe.

@IanBlakeley in terms of diagnosing the root cause, it would be helpful if you had logs from when the issue occurs.

Thanks. Looks like Watchcat does exactly what I was trying to implement.
It happens rarely and of course I am isolated when it does. Any idea which logs to grab? And could I add that to a script? It had certainly lost the WAN IP when I checked in LuCI and, as mentioned, resetting the interface did nothing.

In LuCI you can get them from the system logs section. With the logs you'll be able to see whether anything significant happens just before the IP is lost.
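From a script, the same information is available through `logread` (the system log) and `dmesg` (the kernel log). Here is a minimal sketch of saving both before a reboot, assuming you have somewhere persistent to write to; the function name `snapshot_logs` and the example directory are hypothetical.

```shell
#!/bin/sh
# Sketch: copy the in-memory logs to a directory that survives a reboot.
# snapshot_logs is a hypothetical helper name; pass it a directory that is
# not under /tmp, since /tmp lives in RAM on OpenWrt.

snapshot_logs() {
    save_dir="$1"
    stamp="$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$save_dir" || return 1
    # System log
    logread > "$save_dir/logread-$stamp.txt"
    # Kernel ring buffer
    dmesg > "$save_dir/dmesg-$stamp.txt"
}

# Example: call this just before the reboot command in a watchdog script.
# snapshot_logs /root/Scripts/logs
```

Each call writes a new pair of timestamped files, so it is worth pruning old ones now and then to spare the flash or SD card.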

You can send the logs to a remote syslog server if you have one running.
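For reference, remote logging is controlled by the `log_ip`, `log_port` and `log_proto` options in the `system` section of `/etc/config/system`. A minimal sketch of setting them with `uci` follows; the wrapper name `configure_remote_syslog` and the example address are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: point OpenWrt's logd at a remote syslog server over UDP.
# configure_remote_syslog is a hypothetical helper name.

configure_remote_syslog() {
    server_ip="$1"
    server_port="${2:-514}"   # 514 is the standard syslog port
    uci set system.@system[0].log_ip="$server_ip"
    uci set system.@system[0].log_port="$server_port"
    uci set system.@system[0].log_proto='udp'
    uci commit system
}

# Example (192.168.1.10 is a placeholder for the syslog server's address):
# configure_remote_syslog 192.168.1.10
# /etc/init.d/log restart   # restart the log service so the change takes effect
```

With this in place the messages leading up to a failure are already off the router by the time it has to be rebooted.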

Cool thanks. Okay it is logging to my NAS Syslog server.


Also have a look at https://github.com/gSpotx2f/luci-app-internet-detector

This can ping public DNS servers and restart when pings fail for a set amount of time.