Watchdog: check link local connectivity?

We have some IPQ40xx SoC devices (Fritz!Box 7530) running in a mesh network. The network driver seems a bit unreliable. We suspect that it sometimes crashes without restarting the SoC, so the router is no longer reachable in our mesh network. I would like to write a script that checks whether an interface still has connectivity and restarts the router otherwise. Does something like this already exist, or what would you use? Parse the output of ping ff02::1? Any suggestions? Of course we will try to debug the Fritz!Box 7530 issue and fix it upstream, but the failure only happens occasionally, and we need to attach a serial console to a router to see what happens.

I have something like this to restart the radio on downstream APs linked over fixed WDS.

In crontab (Scheduled Tasks):

*/2 * * * * ping -c 1 -w 5 10.0.1.2 || (wifi down radio2; sleep 3; wifi up radio2)

Every 2 minutes, ping that IP address once and give up after 5 seconds. If the ping fails, run the commands in parentheses to bounce the radio. Change those commands to suit: reboot, fail over onto another link, etc.
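
If the action is a full reboot rather than a radio bounce, it may be worth acting only after several failed checks in a row, so that one lost ping does not restart the router. A minimal sketch of that idea, assuming a counter kept in a file; the path, the variable names and the function name are all made up for illustration:

```shell
#!/bin/sh
# Sketch: act only after MAX_FAILS consecutive failed probes.
# STATE_FILE, MAX_FAILS and threshold_reached are hypothetical names.
STATE_FILE="${STATE_FILE:-/tmp/linkwatch.fails}"
MAX_FAILS="${MAX_FAILS:-3}"

# Run the probe command given as arguments and track consecutive failures.
# Returns 0 once the failure threshold is reached, 1 otherwise.
threshold_reached() {
    if "$@" >/dev/null 2>&1; then
        # Probe succeeded: forget earlier failures.
        rm -f "$STATE_FILE"
        return 1
    fi
    fails=$(cat "$STATE_FILE" 2>/dev/null || echo 0)
    fails=$((fails + 1))
    echo "$fails" > "$STATE_FILE"
    [ "$fails" -ge "$MAX_FAILS" ]
}

# Example use from a cron-driven script (not executed here):
#   threshold_reached ping -c 1 -w 5 10.0.1.2 && { rm -f "$STATE_FILE"; reboot; }
```

With the 2-minute schedule above and the default of 3, the action would fire after roughly 6 minutes of continuous failure.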

This was suddenly needed a lot on Archer C7s in 19.xx, where previously the links would run for months without issue. It seems to trigger less often now, either because later 19.xx releases fixed things or because moving back to non-CT firmware helped. (Non-CT firmware certainly has much faster net data rates on ath10k WDS.)

Thanks! I copied the reboot part from watchcat to write my own daemon:

shame you had to fork...

Can I add similar things to watchcat? Like pinging on an interface and looking for local neighbors?

i'd hoped this would be a priority when the request for input from users in the above thread was made several months ago...

seems not so much and most of the changes to date have been around modemmanager rather than making watchcat more extensible...

so yeah... maybe a fork was a better option...

ideally watchcat will be modified to support stuff like;

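for f in /lib/watchcat/handlers/*.sh; do . "$f"; done   # e.g. babeld.sh

if type "handler_${user_handler}" >/dev/null 2>&1; then
  # validate the params first, then call the handler with them
  "handler_${user_handler}" "$@"
fi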

or a similar extensible framework, along the lines of /lib/upgrade/BOARD.sh

makes a lot of sense to leverage existing logic... (i.e. a lot of cool stuff exists in mwan... and some other protos)

That would be awesome. However, I thought about using something like ff02::1%eth0.42 as the ping target, but in that case I need to filter my own interface's address out of the ping replies. :S
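
Filtering out the local address could look roughly like this. It is only a sketch: the function names are made up, it assumes ping prints reply lines of the form "N bytes from ADDR: ..." (as busybox and iputils ping do), and whether every multicast reply is actually printed depends on the ping implementation in use.

```shell
#!/bin/sh
# Sketch: count neighbours that answered an all-nodes ping (ff02::1),
# ignoring replies from our own address(es). All function names are hypothetical.

# Print the unique source addresses found in ping output read from stdin,
# with the trailing ":" and any "%iface" zone suffix removed.
reply_sources() {
    awk '/ bytes from / {
        for (i = 1; i < NF; i++) if ($i == "from") {
            a = $(i + 1); sub(/:$/, "", a); sub(/%.*$/, "", a); print a
        }
    }' | sort -u
}

# Count reply sources on stdin that are not among our own addresses ("$@").
count_foreign() {
    reply_sources | {
        n=0
        while read -r addr; do
            own=0
            for mine in "$@"; do
                [ "$addr" = "$mine" ] && own=1
            done
            [ "$own" -eq 0 ] && n=$((n + 1))
        done
        echo "$n"
    }
}

# Succeed if at least one foreign link-local neighbour replied on interface $1.
has_neighbors() {
    iface="$1"
    own_addrs=$(ip -6 -o addr show dev "$iface" scope link | awk '{print $4}' | cut -d/ -f1)
    # $own_addrs is left unquoted on purpose so each address becomes one argument.
    [ "$(ping -6 -c 3 -w 5 "ff02::1%$iface" 2>/dev/null | count_foreign $own_addrs)" -gt 0 ]
}

# Example use (not executed here):
#   has_neighbors eth0.42 || reboot
```

The parsing is kept separate from the ping call so it can be tried on captured output before trusting it with a reboot.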

ideally... if they are sourced and configured as a list (or instance)... it's possible to drop in and instantiate multiple handlers per 'instance'

i.e.;
config watchcat 'checkbabeld'
	list user_handler 'babeld ping6neigh'
	list user_handler 'luci-command fault c999dd37'

the actual implementation is up for grabs and above is just theory...

the pre-init 'boot_hook_add' handling method for function list also has its advantages... (sqm is another package that also exhibits great extensibility)

for f in /usr/lib/watchcat/methods/*; do . "$f"; done

something like the above allows many sub-watch methods to be dropped in... and called by main handlers...

i.e. if I write a low-level wireguard function tomorrow... it can be dropped in...

just returns 0 or 1 (or similar) but is available to all 'handlers' or future comers...

if i want to improve the 'ping_timeout' logic... i simply need to submit a PR on the 'method' and not the whole core logic...
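
A rough sketch of what such drop-in methods could look like, purely as theory: the directory, the method_ prefix and both function names are hypothetical, and none of this exists in watchcat today.

```shell
#!/bin/sh
# Sketch of drop-in 'methods': each file in METHODS_DIR defines functions
# named method_<name> that return 0 (ok) or non-zero (fault).
# METHODS_DIR, load_methods and run_method are hypothetical names.
METHODS_DIR="${METHODS_DIR:-/usr/lib/watchcat/methods}"

# Source every readable *.sh file in the methods directory.
load_methods() {
    for f in "$METHODS_DIR"/*.sh; do
        [ -r "$f" ] && . "$f"
    done
    return 0
}

# run_method NAME [ARGS...]: call method_NAME if it is defined.
# Returns the method's own status, or 2 if the name is invalid or unknown.
run_method() {
    name="$1"
    shift
    case "$name" in
        '' | *[!A-Za-z0-9_]*) return 2 ;;   # reject anything but plain identifiers
    esac
    type "method_$name" >/dev/null 2>&1 || return 2
    "method_$name" "$@"
}

# Example use inside a handler (not executed here):
#   load_methods
#   run_method ping6neigh eth0.42 || reboot
```

A handler would then only need to know the method name from the config, and a fix to one method would touch just its own file.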

But that is currently not possible?

at the moment... afair...

'handlers' are static 'case' with hardcoded params... so it's possible but not in any extensible way...

('methods' afaik are inline within each handler so even less extensible and these are the core of what watchcat does)

(fwiw... if your implementation gets a more generic name, 'watchcatNG', 'link-actions' or something like that... i'm willing to fork/pr or whatever and add some extensibility samples to it if you are interested, but that's totally up to you... like i said... your fork with the single dedicated task does have its advantages...)

Of course, that would be welcome

Sorry, it's just that I have had limited time to work on some of those ideas. The code is always there online though if you want to send a patch through. It's a good idea, I'd like to see it realised too.
