My problem is that every now and then, the USB dongle stops working and can't be persuaded to cooperate again. The shell script stops running, and on subsequent starts it immediately stops again. All of that is correctly identified and caught by the respawn parameter: after five attempts, the init script gives up and logs the service as irredeemably crashed.
What I would like to do now is to react to that correctly identified crash. In my case, I would simply take the nuclear option and reboot the router.
Is there a way to do that, ideally without accidentally catching a regular service shutdown? I am hoping that, rather than my having to write a small cronjob that checks the service status every few minutes, procd caters for this scenario and calls a subroutine with an error code or something to that effect.
Please connect to your OpenWrt device using ssh and copy the output of the following commands and post it here using the "Preformatted text </> " button:
Remember to redact passwords, MAC addresses and any public IP addresses you may have:
Another service that watches the service is my first thought?
edit: cthulu88 looks like they actually know what to listen for and just beat me to it haha.
Or write another script that is a wrapper to the currently crashing script and make that be what the service runs?
i.e.
service start -> my start then wait until exit with timer -> /bin/sh "/var/myscript.sh"
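A rough sketch of that wrapper idea, in case it helps. The function name, the thresholds and the escalation command are all made up for illustration, and it assumes the wrapped script exits non-zero when the dongle dies. A clean exit (status 0) is treated as a regular stop, and signal handling for a procd-initiated stop is left out.

```shell
#!/bin/sh
# Hypothetical wrapper: re-run a command and escalate once it has died
# quickly too many times in a row. All names here are illustrative.

# run_with_escalation MAX_FAILS MIN_UPTIME ESCALATE_CMD CMD [ARGS...]
run_with_escalation() {
    max_fails=$1
    min_uptime=$2
    escalate=$3
    shift 3

    fails=0
    while [ "$fails" -lt "$max_fails" ]; do
        started=$(date +%s)

        # A clean exit is treated as a regular shutdown: no escalation.
        if "$@"; then
            return 0
        fi

        elapsed=$(( $(date +%s) - started ))
        if [ "$elapsed" -lt "$min_uptime" ]; then
            fails=$((fails + 1))   # died quickly: count it
        else
            fails=0                # ran for a while first: start counting afresh
        fi
    done

    # Too many quick failures in a row: run the escalation command.
    $escalate
}

# Example (not executed here): give up after 5 quick crashes and reboot.
# run_with_escalation 5 10 reboot /bin/sh /var/myscript.sh
```

If the init script runs this wrapper instead of the script itself, procd's own respawn logic would probably need to be relaxed or disabled so the two don't fight each other.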
That sounds very much like what I'm looking for. How do I catch that in a script (within the init script maybe)? I can't seem to find any documentation or examples on how to react to service events.
As a simple workaround, you can keep a small script running permanently and pipe the log messages to it for parsing.
In case of a crash, trigger a reboot.
You can pre-filter the messages passed to your script using the appropriate logread options.
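Something along these lines might do as a starting point. It is only a sketch: the function name is invented, and it assumes procd logs a line of the form "procd: Instance <service>::<instance> ... in a crash loop" when it gives up, so check the exact wording in your own log first.

```shell
#!/bin/sh
# Hypothetical log watcher: read log lines on stdin and run an action on the
# first procd crash-loop message for the given service.

# watch_crash_loop SERVICE ACTION
watch_crash_loop() {
    service=$1
    action=$2

    while IFS= read -r line; do
        case "$line" in
            *"procd: Instance $service::"*"in a crash loop"*)
                $action
                return 0
                ;;
        esac
    done

    # Input ended without a matching message.
    return 1
}

# Example (not executed here): follow the log, pre-filtered by logread,
# and reboot on the first crash-loop message for rtl433_json.
# logread -f -e 'crash loop' | watch_crash_loop rtl433_json reboot
```

Since the match is on procd's crash-loop message rather than on the service state, a regular stop of the service should not trigger the action.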
Thanks, and yes, I know. Even simpler, one can test for service rtl433_json running and parse $? in a cronjob. The workaround is in place, I'm looking for the non-workaround.
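For anyone finding this later, the cronjob variant is roughly the following. The function name, script path and interval are made up, and it assumes the init script supports the running command. Note that it cannot tell a crash from a deliberate stop, which is exactly why I'd prefer something else.

```shell
#!/bin/sh
# Hypothetical cron helper: run a status command and, if it reports failure,
# run a recovery command.

# check_service STATUS_CMD RECOVER_CMD
check_service() {
    # Only the exit code of the status command matters; discard its output.
    if $1 >/dev/null 2>&1; then
        return 0
    fi
    $2
}

# Example crontab entry (hypothetical script path), every five minutes:
#   */5 * * * * /root/check_rtl433.sh
# with the script ending in:
#   check_service "/etc/init.d/rtl433_json running" reboot
```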
That seems very promising, thank you! However, in a first test it does not work. procd recognizes the crash loop:
Thu Oct 24 15:21:24 2024 daemon.info procd: Instance rtl433_json::rtl433_json s in a crash loop 6 crashes, 1 seconds since last crash
But it does not trigger the reboot (or any other script I put into the service trigger's second parameter). Yes, I used the edited version, and I replaced "service_name" with the name of my service/instance, "rtl433_json".
For kicks and giggles, I also tried
procd_add_raw_trigger "instance.fail" 500 reboot
with the same non-effect. That was a bit of a shot in the dark, though.
Yes, it's a generic ubus event you're listening to.
Everything in the procd source code indicates that this should trigger the correct call to trigger_event from ubus.
Try replacing reboot with an actual shell script that has execute permission, and/or give the absolute path to reboot or to the script.
procd_set_param watch instance.fail then? I only ever see the watch param with network.interface, and it's a bit obscure by which mechanism it then actually causes something to happen.