That means: try to answer the questions or offer a solution, and don't say things like "you're the admin", because that is not helpful.
But I assume the information about the longer cron period wasn't helpful?
I can't see a suggestion for a longer cron period.
Afterward, add the following entry in System → Scheduled Tasks in LuCI:
*/10 * * * * /root/wg-watchdog.sh
This calls wg-watchdog.sh every 10 minutes. So I would expect that, if there are connection issues, the router reboots after 10 minutes, not after 0 or 1 minute. So making */10 bigger would not solve the problem. The problem is that with this watchdog the router reboots within 60 seconds of powering up. Can you help solve it?
The script pings the private IP 10.64.0.1 to check for a bad connection. It's a Mullvad-specific thing.
Why do you think a watchdog is needed?
To get rid of connectivity issues that occur after some weeks of operation (uptime).
Does this remote [Mullvad endpoint] IP fail often?
Not so often; most Mullvad WireGuard servers are very stable. It would be great to be able to switch automatically to another WireGuard / Mullvad endpoint in the rare case that e.g.
fails. But we are not talking about failing remote endpoints. When one of my WireGuard routers has been running for a few weeks, the WLAN clients connected to it report "no internet connection", and I still have not been able to determine what the problem is (what would you do to debug this?). In such a case I have to reboot the router manually, or disconnect power for a second and plug it back in, to get a working connection back. Thank you.
*/10 in the crontab does not mean "after 10 minutes", it means "every time the clock shows minutes that are evenly divisible by 10".
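To make that concrete, here is a small sketch in plain sh that lists the minutes a `*/10` field matches and works out how long the router waits for the first run if it boots believing it is some minute past the hour. The function names are made up for illustration, and the arguments must be plain decimal numbers (no leading zeros).

```shell
#!/bin/sh
# List the minutes of the hour that a "*/STEP" minute field matches.
cron_step_minutes() {
    step="$1"
    m=0
    out=""
    while [ "$m" -lt 60 ]; do
        out="${out:+$out }$m"
        m=$((m + step))
    done
    echo "$out"
}

# Seconds from MINUTE:SECOND until the next matching minute.
# The minute field restarts at 0 every hour, so the result is capped
# at the top of the hour.
seconds_until_next_run() {
    step="$1"; minute="$2"; second="$3"
    next=$(( (minute / step + 1) * step ))
    if [ "$next" -gt 60 ]; then next=60; fi
    echo $(( (next - minute) * 60 - second ))
}

cron_step_minutes 10             # prints: 0 10 20 30 40 50
seconds_until_next_run 10 9 15   # prints: 45
```

So a router that comes up thinking it is xx:09:15 runs the watchdog 45 seconds later, no matter how long ago the previous run was.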
Another hint is that the router, lacking a realtime clock, will at bootup assume as the current time the modification time of the most recently changed file in /etc (before getting a better idea about the current time later on, usually from a time server).
Which would suggest that, for you, the timestamp of the most recently changed file in /etc has a minute ending in "9", and that those 60 seconds aren't enough for your router to exit the failstate for your homebrew watchdog.
This is a common problem, usually solved by touching /etc/banner before rebooting, so the current time will be picked up at the next reboot. But that may, again, be quite close and maybe too close to the next divisible-by-10 minute marker.
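One way to avoid the "too close to the next marker" case is to have the watchdog ignore failures until the router has been up for a while, in addition to touching /etc/banner before the reboot. What follows is only a rough sketch and not Mullvad's script: `MIN_UPTIME`, the function names and the `run` argument are my own assumptions, and 10.64.0.1 is the address mentioned earlier in the thread.

```shell
#!/bin/sh
MIN_UPTIME=600          # assumed grace period after boot, in seconds
CHECK_IP="10.64.0.1"    # address pinged by the watchdog discussed in this thread

# Return 0 if the given uptime (whole seconds) is past the grace period.
past_grace_period() {
    [ "$1" -ge "$MIN_UPTIME" ]
}

# Whole seconds of uptime: first field of /proc/uptime, e.g. "123.45" -> 123.
uptime_seconds() {
    cut -d. -f1 /proc/uptime
}

watchdog() {
    # Too soon after boot: the tunnel may not be up yet, so do nothing.
    past_grace_period "$(uptime_seconds)" || return 0
    # Tunnel answers: nothing to do.
    ping -c 3 -W 5 "$CHECK_IP" >/dev/null 2>&1 && return 0
    logger -t wg-watchdog "no reply from $CHECK_IP, rebooting"
    # Update the newest file in /etc so the next boot starts near the real time.
    touch /etc/banner
    reboot
}

# Only act when called with "run", so the file can be sourced safely.
if [ "${1:-}" = "run" ]; then
    watchdog
fi
```

With this the cron entry would become `*/10 * * * * /root/wg-watchdog.sh run`. The uptime check does not depend on the wall clock, so it holds even while the router still has the wrong time.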
I must say, though, that rebooting the router within minutes of a failstate is a rather blunt instrument. Especially since more refined ones, like the aforementioned wireguard watchdog, exist.
Remember: I have never had any problem with DNS / name resolution, because Mullvad recommends using hard-coded IP addresses. I don't need to re-resolve anything. So please stop pointing me to solutions for such DNS issues.
No, because at the moment there is no problem. But I still want to know how to change Mullvad's watchdog, or get good suggestions from people who know what they are talking about, so I can do this without getting stuck in a boot loop.
I have temporarily removed this cron job until the issue is solved.
Give me a second to figure out whether this would be helpful (I don't have any resolving issues, because I'm using hard-coded IP addresses instead of hostnames).
We are talking about a freshly installed wdr-3500. Maybe you have better devices that can run for a long time without memory leaks or randomly "not working anymore". In the past I have solved problems on my other OpenWrt WireGuard routers by simply powering them off and on again. How can I debug such problems if they occur in the future? I have experienced connection issues, and syslog doesn't tell me enough about the problem; maybe I'm not enough of an administrator and am still learning. So I am asking for help. It helps me a lot when you point me to solutions. Thank you for that.
As of today I don't want to reboot it manually any more. I am asking for help with what is wrong with the script, why it triggers so early, and what a better script could look like.
Maybe the problem with the other routers, which have been in use for a long time, is that I forgot the persistent keepalive. The new router we are talking about today has a 25-second keepalive. Maybe we don't need any solution because there will never be a problem; I don't know. Because I do this a few times every year (setting up a router with Mullvad WireGuard), I want it to work more reliably (every few weeks the WLAN clients say there is no internet connection, and I have to reboot the wg router).
root@wdr3500:~# wg show
interface: wg_mullvad
public key: snip
private key: (hidden)
listening port: snip
peer: snip
endpoint: 185.209.196.78:51820
allowed ips: 0.0.0.0/0
latest handshake: 1 minute, 37 seconds ago
transfer: 21.02 KiB received, 35.55 KiB sent
persistent keepalive: every 25 seconds
You shouldn't need a listening port on the local side; but I don't think this would cause an issue.
I haven't once mentioned DNS issues - I have no clue what you're talking about. Please re-review my post. If you're asking me to stop assisting you, no worries - no need to be rude.
Yea, we already discussed this. To be clear - given your setup (i.e. with IPs only) I wouldn't think any watchdog is needed.
I believe takimata explained cron and how you can make the watchdog run at longer intervals - but I don't think it's necessary.
I would try restarting Wireguard when the issue occurs and see if that fixes it.
Yes - I understand you'll have to wait until the issue occurs.
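If restarting the interface does turn out to fix it, the same step could later be automated instead of rebooting the whole router. A rough sketch under assumptions: `MAX_AGE`, the function names and the `run` argument are hypothetical, the interface name is the one from the `wg show` output above, and `wg show <interface> latest-handshakes` prints one line per peer with its public key and the Unix time of the last handshake (0 if there has been none).

```shell
#!/bin/sh
IFACE="wg_mullvad"   # interface name as shown by "wg show" in this thread
MAX_AGE=180          # assumed threshold in seconds for a "stale" handshake

# Return 0 if the last handshake is missing (0) or older than MAX_AGE.
handshake_is_stale() {
    now="$1"; last="$2"
    [ "$last" -eq 0 ] && return 0
    [ $((now - last)) -gt "$MAX_AGE" ]
}

check_and_restart() {
    now=$(date +%s)
    wg show "$IFACE" latest-handshakes | while read -r peer last; do
        if handshake_is_stale "$now" "$last"; then
            logger -t wg-restart "handshake with $peer is stale, restarting $IFACE"
            ifdown "$IFACE"
            ifup "$IFACE"
            break
        fi
    done
}

# Only act when called with "run", so the file can be sourced safely.
if [ "${1:-}" = "run" ]; then
    check_and_restart
fi
```

Because it compares against the wall clock, this only makes sense once the router has the correct time.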
I did not know that. It looks like the perfect solution for the cases I have experienced in the past. So I don't even have to reboot the routers manually.
This watchdog script tries to reresolve and reconnect to inactive wireguard peers.
Use it for peers with a frequently changing dynamic IP.
persistent_keepalive must be set, recommended value is 25 seconds.
#This watchdog script tries to re-resolve hostnames for inactive WireGuard peers.
#Use it for peers with a frequently changing dynamic IP.
#persistent_keepalive must be set, recommended value is 25 seconds.
snip
#skip IP addresses
#check taken from packages/net/ddns-scripts/files/dynamic_dns_functions.sh
local IPV4_REGEX="[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}"
local IPV4=$(echo ${endpoint_host} | grep -m 1 -o "$IPV4_REGEX$") #do not detect ip in 0.0.0.0.example.com
Because it checks whether the WireGuard peers are still connected, tries to reconnect if not, and logs every reconnect it triggers to syslog. Sounds like a good solution. I have no idea why mullvad.net gives us instructions that boot-loop the router... (and having to touch some file under /etc to get a proper time sounds buggy, too). I'm sorry for the earlier post where I said this would not be helpful because I don't use hostnames. Your suggestion looks very helpful. Good job @lleachii
I thought you may have specified one in the config, if not no worries. Apologies if that wasn't already obvious and clear. Since you used wg show and didn't show the config, we wouldn't know.
My TP-Link routers with OpenWrt + WireGuard from Mullvad are sometimes powered off for minutes or hours, but after plugging them in they work as expected for days and weeks.
We also have to check whether the required 25-second "keep alive" is set on the wg interface. The routers are connected as DHCP clients behind another router, by Wi-Fi or Ethernet cable (blue port), and this sometimes changes, and I want them to work without any leaks or dropped internet/VPN connections.
In Wireshark we should expect that each router, before it creates the WireGuard tunnel and sends all data through it, makes an NTP connection to get the exact time from a time server, so that the crypto can work. OpenWrt has a built-in, preconfigured and working NTP client. Normally we don't have to do anything to keep the correct time. Maybe the VPN tunnel prevents the built-in NTP client from reaching the server, and after a few weeks problems occur... I need to find out how often the NTP client is expected to ask for the time. I could check this with Wireshark, or look in syslog for "ntp", to figure out whether the wg interface is blocking the NTP client. But for now, this is a good solution for WireGuard users:
You just need to do this one thing:
Thank you @takimata @ncompact @lleachii for the great help, and sorry for my rude tone. All of you did a good job helping me and deserve respect for the time you invest here and the experience you share with us.