Status of procd service on script termination

Using a service script file like this:

#!/bin/sh /etc/rc.common

START=97
STOP=4
USE_PROCD=1

start_service() {
        procd_open_instance
        procd_set_param command /root/program.sh
        # uncomment if you want procd to restart your script if it terminated for whatever reason
        #procd_set_param respawn
        procd_close_instance
}

If 'program.sh' terminates, e.g. owing to an error, 'service X status' still shows 'running'. I would like such a termination to result in 'service X status' showing 'inactive' instead, since the program is no longer running.

How can I achieve this?

Add a status_service() function to your init script that will be called when you request status.

There are some examples with Adblock, Simple-Adblock, PBR, etc.

Or figure out what your service looks like when it has died:

ubus call service list
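
For a single service you can pass its name; an instance whose command has exited shows up with "running": false. Roughly like this (the service name "test" and the instance contents are just an illustration; the exact fields vary by release and service):

ubus call service list '{ "name": "test" }'
{
        "test": {
                "instances": {
                        "instance1": {
                                "running": false,
                                "command": [ "/bin/sh", "-c", "sleep 10" ],
                                "term_timeout": 5,
                                "exit_code": 0
                        }
                }
        }
}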

I have implemented something similar for normal init.d scripts:

But will that work in a procd service script context?

I expect so. Checking for a predefined custom function is part of /etc/rc.common, and you can code whatever you like in that function to determine your status.

All my advice is theoretical but based on reviewing the procd base scripts.
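
As a minimal illustration of that hook (assuming, as the scripts mentioned above do, that a status_service() defined in the init script takes precedence over the default):

status_service() {
        # whatever this prints (and returns) becomes the result of 'service X status'
        echo "my custom status"
        return 0
}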

I see from the documentation that there is a respawn concept:

The comments seem a little cryptic to me:

# respawn automatically if something died, be careful if you have an alternative process supervisor
# if process exits sooner than respawn_threshold, it is considered crashed and after 5 retries the service is stopped
# if process finishes later than respawn_threshold, it is restarted unconditionally, regardless of error code
# notice that this is literal respawning of the process, not in a respawn-on-failure sense

Respawn if something died, yet it is literal respawning of the process, not in a respawn-on-failure sense?
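
For context, those comments accompany the respawn parameter, which in the commonly documented form takes threshold, timeout and retry values (the numbers below are the usual example defaults, not anything specific to this service):

procd_set_param respawn ${respawn_threshold:-3600} ${respawn_timeout:-5} ${respawn_retry:-5}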

In any case, I'm looking for something a bit different.

I don't want procd to respawn service on termination. I just want the status reported by procd to reflect that the command exited.

If respawn can detect the termination (in order to respawn the process and eventually stop the service after e.g. 5 failed attempts), then it strikes me as a deficiency in the OpenWrt procd service implementation that the status reported by 'service X status' does not reflect that the command has terminated.

I think 'service X status' should not show 'running' if the command terminated and respawn is not set, right? Surely a service is not running if respawn would have had to bring it back to life? That service crashed to a halt and should be stopped, just as it is after the five failed respawn attempts:

if process exits sooner than respawn_threshold, it is considered crashed and after 5 retries the service is stopped

But perhaps I am misunderstanding something or missing something obvious?

It seems to be a quirk / sloppy default status_service() implementation in /lib/functions/procd.sh:

+ instance='*'
+ echo '{ "instance1": { "running": false, "command": [ "\/bin\/sh", "-c", "sleep 10" ], "term_timeout": 5, "exit_code": 0 } }'
+ jsonfilter -e '$[*]'
+ '[' -z '{ "running": false, "command": [ "\/bin\/sh", "-c", "sleep 10" ], "term_timeout": 5, "exit_code": 0 }' ]
+ echo running
running
+ return 0

Note that ubus correctly reports "running": false but the shell script wrappers report any existing, non-unknown instance as "running". It should probably gain a further state "present, not running".


Here's a potential fix:

diff --git a/package/system/procd/files/procd.sh b/package/system/procd/files/procd.sh
index 5148b2f03c..8ee25f4f08 100644
--- a/package/system/procd/files/procd.sh
+++ b/package/system/procd/files/procd.sh
@@ -524,7 +524,10 @@ _procd_send_signal() {
 _procd_status() {
        local service="$1"
        local instance="$2"
-       local data
+       local data state
+       local n_running=0
+       local n_stopped=0
+       local n_total=0
 
        json_init
        [ -n "$service" ] && json_add_string name "$service"
@@ -539,10 +542,29 @@ _procd_status() {
        fi
 
        [ -n "$instance" ] && instance="\"$instance\"" || instance='*'
-       if [ -z "$(echo "$data" | jsonfilter -e '$['"$instance"']')" ]; then
-               echo "unknown instance $instance"; return 4
+
+       for state in $(jsonfilter -s "$data" -e '$['"$instance"'].running'); do
+               n_total=$((n_total + 1))
+               case "$state" in
+               false) n_stopped=$((n_stopped + 1)) ;;
+               true)  n_running=$((n_running + 1)) ;;
+               esac
+       done
+
+       if [ $n_total -gt 0 ]; then
+               if [ $n_running -gt 0 ] && [ $n_stopped -eq 0 ]; then
+                       echo "running"
+                       return 0
+               elif [ $n_running -gt 0 ]; then
+                       echo "running ($n_running/$n_total)"
+                       return 0
+               else
+                       echo "not running"
+                       return 5
+               fi
        else
-               echo "running"; return 0
+               echo "unknown instance $instance"
+               return 4
        fi
 }
 

When testing with this /etc/init.d/test init script:

#!/bin/sh /etc/rc.common

START=90

USE_PROCD=1

start_service() {
        procd_open_instance i1
        procd_set_param command /bin/sh -c "sleep 10"
        procd_close_instance

        procd_open_instance i2
        procd_set_param command /bin/sh -c "sleep 20"
        procd_close_instance

        procd_open_instance i3
        procd_set_param command /bin/sh -c "sleep 30"
        procd_close_instance

}

Then /etc/init.d/test status will report "running" with code 0, "running (2/3)" with code 0, "running (1/3)" with code 0 and "not running" with code 5 at roughly 1 s, 11 s, 21 s and 31 s after service start, respectively.
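
The return codes (0 running, 3 inactive, 4 unknown instance, 5 not running) can then be consumed from a script, e.g. this rough sketch:

/etc/init.d/test status > /dev/null
case "$?" in
        0) echo "running" ;;
        3) echo "inactive" ;;
        5) echo "not running" ;;
        *) echo "unknown instance or other error" ;;
esac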


If the intent is to programmatically query service state, then maybe consider bypassing all the shell fluff entirely and directly checking

ubus call service list '{ "name": "yourservicename" }' | \
    jsonfilter -l 1 -e '@[*].instances[*].running'
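
For example, wrapped in a small shell check ("yourservicename" is a placeholder; jsonfilter prints the literal strings true or false):

# -l 1 limits jsonfilter to the first instance reported
running=$(ubus call service list '{ "name": "yourservicename" }' | \
        jsonfilter -l 1 -e '@[*].instances[*].running')

if [ "$running" = "true" ]; then
        echo "first reported instance is running"
else
        echo "no instance found, or it is not running"
fi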

Fantastic @jow!

The context for me is for cake-autorate, which is launched (thanks to your help some time ago) with this procd service wrapper:

which launches this launcher:

Namely, I would like 'service cake-autorate status' not to show 'running' when cake-autorate.sh has errored out or crashed (e.g. owing to a configuration error or bug). cake-autorate.sh cleans up after itself, and that is caught by cake-autorate_launcher.sh (which may run multiple instances for multiple interfaces).

Given your example showing multiple instances, it looks to me like it might be a good idea to fold cake-autorate_launcher.sh into the procd service script. Any chance you could help with that?

@moeller0 does it make sense to you to keep the cake-autorate instances alive separately, so that if one goes down it doesn't tear the others down? I think this makes sense, but I wonder what you think?

@jow can I test your code above? In my /lib/functions/procd.sh (RT3200 with 22.03.5) I see:

_procd_status() {
        local service="$1"
        local instance="$2"
        local data

        json_init
        [ -n "$service" ] && json_add_string name "$service"

        data=$(_procd_ubus_call list | jsonfilter -e '@["'"$service"'"]')
        [ -z "$data" ] && { echo "inactive"; return 3; }

        data=$(echo "$data" | jsonfilter -e '$.instances')
        if [ -z "$data" ]; then
                [ -z "$instance" ] && { echo "active with no instances"; return 0; }
                data="[]"
        fi

        [ -n "$instance" ] && instance="\"$instance\"" || instance='*'
        if [ -z "$(echo "$data" | jsonfilter -e '$['"$instance"']')" ]; then
                echo "unknown instance $instance"; return 4
        else
                echo "running"; return 0
        fi
}

Given that the default service status implementation is buggy, or at least behaves unintuitively, you would need to override it by providing a custom status_service() procedure in your init script. It would be a copy of _procd_status() from procd.sh with something similar to my suggested fix applied and with the service name hardcoded.
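
Something along these lines in the init script might do it; this is only a sketch (untested), copying the structure of _procd_status() above and folding in the counting logic from the patch, with the service name "cake-autorate" hardcoded:

status_service() {
        local instance="$1"
        local data state
        local n_running=0
        local n_total=0

        # grab the instances object for this service from ubus
        data=$(ubus call service list '{ "name": "cake-autorate" }' | \
                jsonfilter -e '@["cake-autorate"].instances')
        [ -z "$data" ] && { echo "inactive"; return 3; }

        [ -n "$instance" ] && instance="\"$instance\"" || instance='*'

        # count how many instances exist and how many of them are running
        for state in $(echo "$data" | jsonfilter -e '$['"$instance"'].running'); do
                n_total=$((n_total + 1))
                [ "$state" = "true" ] && n_running=$((n_running + 1))
        done

        if [ $n_total -eq 0 ]; then
                echo "unknown instance $instance"; return 4
        elif [ $n_running -eq $n_total ]; then
                echo "running"; return 0
        elif [ $n_running -gt 0 ]; then
                echo "running ($n_running/$n_total)"; return 0
        else
                echo "not running"; return 5
        fi
}

With that in place, 'service cake-autorate status' should report "not running" with a non-zero exit code once all instances have exited.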

BTW, regarding folding the multi-instance launcher into a single procd script, does this look about right? Is it OK to use /bin/bash (since cake-autorate requires bash anyway)?

#!/bin/bash /etc/rc.common

START=97
STOP=4
USE_PROCD=1

start_service() {

        cake_instances=(/root/cake-autorate/cake-autorate_config*sh)

        for cake_instance in "${!cake_instances[@]}"
        do
                procd_open_instance "${cake_instance}"
                procd_set_param command /root/cake-autorate/cake-autorate.sh "${cake_instances[cake_instance]}"
                # uncomment if you want procd to restart your script if it terminated for whatever reason
                #procd_set_param respawn
                procd_close_instance
        done
}

At the moment I presume that if one instance errors out, the others will be kept running (and status will show 'running' regardless of whether one or more instances are actually running)?

Whether it shows running depends on how you implement status_service() in your init script, but yes, procd will keep the other instances running if one crashes.


I would guess these are better handled as individual entities with separate "fates".

Are you trying to abandon the launcher script completely, or just for procd? For testing on the command line the script is quite convenient; you know my crude way of using screen to background running instances without ever actually installing/running them as a service?

I'm keeping and maintaining the launcher script for that exact reason. This is just about modifying the procd script to act in place of the launcher rather than call it, which was a bit cumbersome. I didn't realise until now that procd scripts could run multiple instances.


In that case, excellent idea.
