Procd process handling

I’m writing a script to get cake-autorate data into Home Assistant. So far I have the below as proof of concept. Works, but owing to the way procd handles backgrounded processes and ignores SIGPIPE, not all the tail, awk and mosquitto_pub processes are killed after the service is stopped.

Any suggestions? I’m open to abandoning procd.


#!/bin/sh /etc/rc.common

USE_PROCD=1
START=98
STOP=10

LOG_FILE="/var/log/cake-autorate.primary.log"

MQTT_HOST="192.168.x.x"
MQTT_PORT="1883"
MQTT_USER=""
MQTT_PASS=""
MQTT_TOPIC="cake-autorate"

DISC_PREFIX="homeassistant"
DEVICE_ID="openwrt"
DEVICE_NAME="OpenWrt"

MIN_INTERVAL_S=5

publish_sensor () {
mosquitto_pub -h "$MQTT_HOST" -p "$MQTT_PORT" -u "$MQTT_USER" -P "$MQTT_PASS" -r -q 1 \
-t "$DISC_PREFIX/sensor/$DEVICE_ID/$1/config" -m "$2"
}

publish_discovery() {

publish_sensor "dl_achieved_rate_kbps" \
'{"name":"CAKE DL Achieved Rate","state_topic":"cake-autorate","value_template":"{{ value_json.dl_achieved_rate_kbps }}","unit_of_measurement":"kbps","state_class":"measurement","unique_id":"openwrt_dl_achieved_rate","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

publish_sensor "ul_achieved_rate_kbps" \
'{"name":"CAKE UL Achieved Rate","state_topic":"cake-autorate","value_template":"{{ value_json.ul_achieved_rate_kbps }}","unit_of_measurement":"kbps","state_class":"measurement","unique_id":"openwrt_ul_achieved_rate","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

publish_sensor "cake_dl_rate_kbps" \
'{"name":"CAKE DL Rate","state_topic":"cake-autorate","value_template":"{{ value_json.cake_dl_rate_kbps }}","unit_of_measurement":"kbps","state_class":"measurement","unique_id":"openwrt_cake_dl_rate","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

publish_sensor "cake_ul_rate_kbps" \
'{"name":"CAKE UL Rate","state_topic":"cake-autorate","value_template":"{{ value_json.cake_ul_rate_kbps }}","unit_of_measurement":"kbps","state_class":"measurement","unique_id":"openwrt_cake_ul_rate","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

publish_sensor "dl_sum_delays" \
'{"name":"DL Delay Sum","state_topic":"cake-autorate","value_template":"{{ value_json.dl_sum_delays }}","unit_of_measurement":"us","state_class":"measurement","unique_id":"openwrt_dl_delay","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

publish_sensor "ul_sum_delays" \
'{"name":"UL Delay Sum","state_topic":"cake-autorate","value_template":"{{ value_json.ul_sum_delays }}","unit_of_measurement":"us","state_class":"measurement","unique_id":"openwrt_ul_delay","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

publish_sensor "dl_avg_owd_delta_us" \
'{"name":"DL OWD Delta","state_topic":"cake-autorate","value_template":"{{ value_json.dl_avg_owd_delta_us }}","unit_of_measurement":"us","state_class":"measurement","unique_id":"openwrt_dl_owd","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

publish_sensor "ul_avg_owd_delta_us" \
'{"name":"UL OWD Delta","state_topic":"cake-autorate","value_template":"{{ value_json.ul_avg_owd_delta_us }}","unit_of_measurement":"us","state_class":"measurement","unique_id":"openwrt_ul_owd","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

publish_sensor "dl_load_condition" \
'{"name":"DL Load Condition","state_topic":"cake-autorate","value_template":"{{ value_json.dl_load_condition }}","unique_id":"openwrt_dl_condition","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

publish_sensor "ul_load_condition" \
'{"name":"UL Load Condition","state_topic":"cake-autorate","value_template":"{{ value_json.ul_load_condition }}","unique_id":"openwrt_ul_condition","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

for c in 0 1 2; do
publish_sensor "cpu_core$c" \
"{\"name\":\"CPU Core $c\",\"state_topic\":\"cake-autorate\",\"value_template\":\"{{ value_json.cpu_core$c }}\",\"unit_of_measurement\":\"%\",\"state_class\":\"measurement\",\"unique_id\":\"openwrt_cpu_$c\",\"device\":{\"identifiers\":[\"openwrt\"],\"name\":\"OpenWrt\"}}"
done
}

start_service() {

publish_discovery

procd_open_instance
procd_set_param command /bin/sh -c "\
tail -F '$LOG_FILE' 2>/dev/null | \
awk -F'; ' -v min_int=$MIN_INTERVAL_S '\
BEGIN {
 last_summary = systime() - min_int
 last_cpu     = systime() - min_int
}

\$1==\"SUMMARY\" && NF>=13 {
 n=systime()
 if(n-last_summary>=min_int){
  last_summary=n
  printf \"{\\\"dl_achieved_rate_kbps\\\":%s,\\\"ul_achieved_rate_kbps\\\":%s,\\\"dl_sum_delays\\\":%s,\\\"ul_sum_delays\\\":%s,\\\"dl_avg_owd_delta_us\\\":%s,\\\"ul_avg_owd_delta_us\\\":%s,\\\"dl_load_condition\\\":\\\"%s\\\",\\\"ul_load_condition\\\":\\\"%s\\\",\\\"cake_dl_rate_kbps\\\":%s,\\\"cake_ul_rate_kbps\\\":%s}\\n\",
   (\$4~/^[0-9.]+$/)?\$4:0,
   (\$5~/^[0-9.]+$/)?\$5:0,
   (\$6~/^[0-9.]+$/)?\$6:0,
   (\$7~/^[0-9.]+$/)?\$7:0,
   (\$8~/^[0-9.]+$/)?\$8:0,
   (\$9~/^[0-9.]+$/)?\$9:0,
   (\$10!=\"\"?\$10:\"unknown\"),
   (\$11!=\"\"?\$11:\"unknown\"),
   (\$12~/^[0-9.]+$/)?\$12:0,
   (\$13~/^[0-9.]+$/)?\$13:0
 }
}

\$1==\"CPU\" && NF>=7 {
 n=systime()
 if(n-last_cpu>=min_int){
  last_cpu=n
  printf \"{\\\"cpu_core0\\\":%s,\\\"cpu_core1\\\":%s,\\\"cpu_core2\\\":%s}\\n\",
   (\$5~/^[0-9.]+$/)?\$5:0,
   (\$6~/^[0-9.]+$/)?\$6:0,
   (\$7~/^[0-9.]+$/)?\$7:0
 }
}
' | \
mosquitto_pub -h '$MQTT_HOST' -p '$MQTT_PORT' -u '$MQTT_USER' -P '$MQTT_PASS' -t '$MQTT_TOPIC' -l -q 0 -r
"
procd_set_param respawn 3600 5 5
procd_close_instance
}

stop_service(){ :; }
reload_service(){ stop; start; }

Does it help if you do the parsing of the log file with tail and awk outside the procd instance?

Yeah, I think having a separate script that procd calls makes sense.
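
A minimal sketch of what the procd side might then shrink to, assuming the tail/awk/mosquitto_pub pipeline moves into a wrapper script. The path /usr/bin/cake-autorate-mqtt.sh is hypothetical; the respawn values are the ones from the first post. This only shows the split and does not by itself solve the lingering-process problem.

```shell
#!/bin/sh /etc/rc.common

USE_PROCD=1
START=98
STOP=10

# Hypothetical wrapper script that contains the actual pipeline.
PROG=/usr/bin/cake-autorate-mqtt.sh

start_service() {
    procd_open_instance
    # procd tracks and signals only this one process.
    procd_set_param command "$PROG"
    procd_set_param respawn 3600 5 5
    # Forward the wrapper's stdout/stderr to the system log.
    procd_set_param stdout 1
    procd_set_param stderr 1
    procd_close_instance
}
```

With this layout, whatever cleanup is needed (traps, process groups, recursive kill) has to live in the wrapper, since procd signals only the process it started.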

I haven't read the script (yet), but it sounds like a use case for

Perhaps - procd has some weird handling of processes / SIGPIPE in that if you have procd service run a command with a bunch of piped processes, then when the service is stopped the piped processes tend to linger. One can write out PIDs to files and then read them in on service termination, but this seems messy.

Is the parent process which called those lingering processes alive at that point? If yes then kill_pids_recursive can discover all its children, grandchildren etc and kill them automatically.

If the parent process is rc.common then killing it along with its offspring doesn't make sense, of course, but you could spawn a new subshell which would call all other commands, register its PID, and specify a trap calling kill_pids_recursive.
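
To illustrate the idea of walking the process tree, here is a rough sketch. Note that list_tree and kill_tree are hypothetical helpers written for this example; they are not the kill_pids_recursive implementation from adblock-lean, and they assume pgrep with the -P option is available.

```shell
#!/bin/sh

# Hypothetical helper: print the given PID followed by all of its
# descendants (children, grandchildren, ...), one PID per line.
list_tree() {
    echo "$1"
    for child in $(pgrep -P "$1"); do
        list_tree "$child"
    done
}

# Hypothetical helper: collect the whole tree first, then signal it in
# one go, so a dying parent cannot hide its children by reparenting them.
kill_tree() {
    # Word splitting of the PID list is intentional here.
    kill $(list_tree "$1") 2>/dev/null
}
```

A wrapper could record the PID of the subshell that runs the pipeline and call kill_tree on it from a TERM trap. There is still a small window in which a process forked after the listing is missed.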

It may also be useful to specify set -o pipefail, so that the pipeline returns a non-zero exit status if any of the piped commands fails.
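
A small demonstration of the difference, assuming a shell that supports the pipefail option (bash does; plain POSIX sh is not required to):

```shell
#!/bin/bash

# Default behaviour: the pipeline's status is that of the last command,
# so the failure of 'false' goes unnoticed.
false | true
default_status=$?

# With pipefail, the status is that of the last command in the pipeline
# that failed, or zero if none failed.
set -o pipefail
false | true
pipefail_status=$?

echo "default:  $default_status"
echo "pipefail: $pipefail_status"
```

This only affects the reported exit status. It does not change which processes get signalled when the service is stopped.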

Taking the suggestions from @egc and @dante about calling a wrapper, how about the following separate script to be called by procd? Will this work?

#!/usr/bin/env bash

CPU_CORES=2 # cake-autorate outputs CPU usage in the form:
            # 0: Overall CPU usage; 1: First Core, 2: Second Core, etc.

MQTT_HOST="192.168.x.x"
MQTT_PORT="1883"
MQTT_USER=""
MQTT_PASS=""
MQTT_TOPIC="cake-autorate"

DISC_PREFIX="homeassistant"
DEVICE_ID="openwrt"
DEVICE_NAME="OpenWrt"

MIN_INTERVAL_S=5

cleanup()
{
        trap - INT TERM EXIT
        for pid in "${publish_stats_pids[@]}"
        do
                kill -- -$pid 2>/dev/null || true
        done
        exit 0
}

publish_config ()
{
        mosquitto_pub -h "$MQTT_HOST" -p "$MQTT_PORT" -u "$MQTT_USER" -P "$MQTT_PASS" -r -q 1 -t "$DISC_PREFIX/sensor/$DEVICE_ID/$1/config" -m "$2"
}

publish_discovery()
{
        publish_config "dl_achieved_rate_kbps" \
        '{"name":"CAKE DL Achieved Rate","state_topic":"cake-autorate","value_template":"{{ value_json.dl_achieved_rate_kbps }}","unit_of_measurement":"kbps","state_class":"measurement","unique_id":"openwrt_dl_achieved_rate","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "ul_achieved_rate_kbps" \
        '{"name":"CAKE UL Achieved Rate","state_topic":"cake-autorate","value_template":"{{ value_json.ul_achieved_rate_kbps }}","unit_of_measurement":"kbps","state_class":"measurement","unique_id":"openwrt_ul_achieved_rate","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "cake_dl_rate_kbps" \
        '{"name":"CAKE DL Rate","state_topic":"cake-autorate","value_template":"{{ value_json.cake_dl_rate_kbps }}","unit_of_measurement":"kbps","state_class":"measurement","unique_id":"openwrt_cake_dl_rate","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "cake_ul_rate_kbps" \
        '{"name":"CAKE UL Rate","state_topic":"cake-autorate","value_template":"{{ value_json.cake_ul_rate_kbps }}","unit_of_measurement":"kbps","state_class":"measurement","unique_id":"openwrt_cake_ul_rate","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "dl_sum_delays" \
        '{"name":"DL Delay Sum","state_topic":"cake-autorate","value_template":"{{ value_json.dl_sum_delays }}","unit_of_measurement":"us","state_class":"measurement","unique_id":"openwrt_dl_delay","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "ul_sum_delays" \
        '{"name":"UL Delay Sum","state_topic":"cake-autorate","value_template":"{{ value_json.ul_sum_delays }}","unit_of_measurement":"us","state_class":"measurement","unique_id":"openwrt_ul_delay","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "dl_avg_owd_delta_us" \
        '{"name":"DL OWD Delta","state_topic":"cake-autorate","value_template":"{{ value_json.dl_avg_owd_delta_us }}","unit_of_measurement":"us","state_class":"measurement","unique_id":"openwrt_dl_owd","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "ul_avg_owd_delta_us" \
        '{"name":"UL OWD Delta","state_topic":"cake-autorate","value_template":"{{ value_json.ul_avg_owd_delta_us }}","unit_of_measurement":"us","state_class":"measurement","unique_id":"openwrt_ul_owd","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "dl_load_condition" \
        '{"name":"DL Load Condition","state_topic":"cake-autorate","value_template":"{{ value_json.dl_load_condition }}","unique_id":"openwrt_dl_condition","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "ul_load_condition" \
        '{"name":"UL Load Condition","state_topic":"cake-autorate","value_template":"{{ value_json.ul_load_condition }}","unique_id":"openwrt_ul_condition","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        for c in $(seq 0 ${CPU_CORES})
        do
                publish_config "cpu_core$c" \
                "{\"name\":\"CPU Core $c\",\"state_topic\":\"cake-autorate\",\"value_template\":\"{{ value_json.cpu_core$c }}\",\"unit_of_measurement\":\"%\",\"state_class\":\"measurement\",\"unique_id\":\"openwrt_cpu_$c\",\"device\":{\"identifiers\":[\"openwrt\"],\"name\":\"OpenWrt\"}}"
        done
}

publish_stats()
{
        local log_file_path=${1}

        publish_discovery

        # Follow the log, rate-limit SUMMARY and CPU lines, and stream them to MQTT as JSON.
        tail -F "${log_file_path}" 2>/dev/null | \
        awk -F'; ' -v min_int="$MIN_INTERVAL_S" '
        BEGIN {
         last_summary = systime() - min_int
         last_cpu     = systime() - min_int
        }

        $1=="SUMMARY" && NF>=13 {
         n=systime()
         if(n-last_summary>=min_int){
          last_summary=n
          printf "{\"dl_achieved_rate_kbps\":%s,\"ul_achieved_rate_kbps\":%s,\"dl_sum_delays\":%s,\"ul_sum_delays\":%s,\"dl_avg_owd_delta_us\":%s,\"ul_avg_owd_delta_us\":%s,\"dl_load_condition\":\"%s\",\"ul_load_condition\":\"%s\",\"cake_dl_rate_kbps\":%s,\"cake_ul_rate_kbps\":%s}\n",
           ($4~/^[0-9.]+$/)?$4:0,
           ($5~/^[0-9.]+$/)?$5:0,
           ($6~/^[0-9.]+$/)?$6:0,
           ($7~/^[0-9.]+$/)?$7:0,
           ($8~/^[0-9.]+$/)?$8:0,
           ($9~/^[0-9.]+$/)?$9:0,
           ($10!=""?$10:"unknown"),
           ($11!=""?$11:"unknown"),
           ($12~/^[0-9.]+$/)?$12:0,
           ($13~/^[0-9.]+$/)?$13:0
         }
        }

        $1=="CPU" && NF>=7 {
         n=systime()
         if(n-last_cpu>=min_int){
          last_cpu=n
          printf "{\"cpu_core0\":%s,\"cpu_core1\":%s,\"cpu_core2\":%s}\n",
           ($5~/^[0-9.]+$/)?$5:0,
           ($6~/^[0-9.]+$/)?$6:0,
           ($7~/^[0-9.]+$/)?$7:0
         }
        }
        ' |
        mosquitto_pub -h "$MQTT_HOST" -p "$MQTT_PORT" -u "$MQTT_USER" -P "$MQTT_PASS" -t "$MQTT_TOPIC" -l -q 0 -r &
}

trap cleanup INT TERM EXIT

publish_stats_pids=()

for log_file_path in /var/log/cake-autorate.*.log
do
        publish_stats ${log_file_path} &
        publish_stats_pids+=(${!})
done

wait

Nope, this will only kill the subshell which calls the binaries but not the binaries themselves.

You can verify with this test script:

#!/bin/sh

cleanup()
{
        trap : INT TERM EXIT
        kill -- -${pipe_pid} 2>/dev/null || true
        exit 0
}

stop_process() {
    trap : INT TERM EXIT PIPE
    exit
}

trap_and_catch() {
	sig='' trap_inst="$1" trap_call="${2}"
	shift 2
	for sig in "${@}"; do
		# shellcheck disable=SC2064
		trap "echo \"$trap_inst exiting on receipt of ${sig} signal.\" >&2; ${trap_call}" "${sig}"
	done
}


test1() {
    trap_and_catch test1 'stop_process' INT TERM EXIT PIPE
    sleep 5 &
    wait $!
    cat > /dev/null
}

test2() {
    trap_and_catch test2 'stop_process' INT TERM EXIT PIPE
    sleep 7 &
    wait $!
    cat > /dev/null
}


trap cleanup INT TERM EXIT

echo a | test1 | test2 &
pipe_pid=$!
wait

Call it, press Ctrl+C, watch the subshell exit and later processes report exit when their timer expires.

Got it - how about this:

#!/usr/bin/env bash

CPU_CORES=2 # cake-autorate outputs CPU usage in the form:
            # 0: Overall CPU usage; 1: First Core, 2: Second Core, etc.

MQTT_HOST="192.168.x.x"
MQTT_PORT="1883"
MQTT_USER=""
MQTT_PASS=""
MQTT_TOPIC="cake-autorate"

DISC_PREFIX="homeassistant"
DEVICE_ID="openwrt"
DEVICE_NAME="OpenWrt"

MIN_INTERVAL_S=5

cleanup()
{
        trap - INT TERM EXIT
        for pid in "${publish_stats_pids[@]}"
        do
                kill -- -$pid 2>/dev/null || true
        done
        exit 0
}

publish_config ()
{
        mosquitto_pub -h "$MQTT_HOST" -p "$MQTT_PORT" -u "$MQTT_USER" -P "$MQTT_PASS" -r -q 1 -t "$DISC_PREFIX/sensor/$DEVICE_ID/$1/config" -m "$2"
}

publish_discovery()
{
        publish_config "dl_achieved_rate_kbps" \
        '{"name":"CAKE DL Achieved Rate","state_topic":"cake-autorate","value_template":"{{ value_json.dl_achieved_rate_kbps }}","unit_of_measurement":"kbps","state_class":"measurement","unique_id":"openwrt_dl_achieved_rate","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "ul_achieved_rate_kbps" \
        '{"name":"CAKE UL Achieved Rate","state_topic":"cake-autorate","value_template":"{{ value_json.ul_achieved_rate_kbps }}","unit_of_measurement":"kbps","state_class":"measurement","unique_id":"openwrt_ul_achieved_rate","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "cake_dl_rate_kbps" \
        '{"name":"CAKE DL Rate","state_topic":"cake-autorate","value_template":"{{ value_json.cake_dl_rate_kbps }}","unit_of_measurement":"kbps","state_class":"measurement","unique_id":"openwrt_cake_dl_rate","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "cake_ul_rate_kbps" \
        '{"name":"CAKE UL Rate","state_topic":"cake-autorate","value_template":"{{ value_json.cake_ul_rate_kbps }}","unit_of_measurement":"kbps","state_class":"measurement","unique_id":"openwrt_cake_ul_rate","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "dl_sum_delays" \
        '{"name":"DL Delay Sum","state_topic":"cake-autorate","value_template":"{{ value_json.dl_sum_delays }}","unit_of_measurement":"us","state_class":"measurement","unique_id":"openwrt_dl_delay","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "ul_sum_delays" \
        '{"name":"UL Delay Sum","state_topic":"cake-autorate","value_template":"{{ value_json.ul_sum_delays }}","unit_of_measurement":"us","state_class":"measurement","unique_id":"openwrt_ul_delay","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "dl_avg_owd_delta_us" \
        '{"name":"DL OWD Delta","state_topic":"cake-autorate","value_template":"{{ value_json.dl_avg_owd_delta_us }}","unit_of_measurement":"us","state_class":"measurement","unique_id":"openwrt_dl_owd","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "ul_avg_owd_delta_us" \
        '{"name":"UL OWD Delta","state_topic":"cake-autorate","value_template":"{{ value_json.ul_avg_owd_delta_us }}","unit_of_measurement":"us","state_class":"measurement","unique_id":"openwrt_ul_owd","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "dl_load_condition" \
        '{"name":"DL Load Condition","state_topic":"cake-autorate","value_template":"{{ value_json.dl_load_condition }}","unique_id":"openwrt_dl_condition","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        publish_config "ul_load_condition" \
        '{"name":"UL Load Condition","state_topic":"cake-autorate","value_template":"{{ value_json.ul_load_condition }}","unique_id":"openwrt_ul_condition","device":{"identifiers":["openwrt"],"name":"OpenWrt"}}'

        for c in $(seq 0 ${CPU_CORES})
        do
                publish_config "cpu_core$c" \
                "{\"name\":\"CPU Core $c\",\"state_topic\":\"cake-autorate\",\"value_template\":\"{{ value_json.cpu_core$c }}\",\"unit_of_measurement\":\"%\",\"state_class\":\"measurement\",\"unique_id\":\"openwrt_cpu_$c\",\"device\":{\"identifiers\":[\"openwrt\"],\"name\":\"OpenWrt\"}}"
        done
}

publish_stats()
{
        local log_file_path=${1}

        publish_discovery

        while true
        do
                # Follow the log, rate-limit SUMMARY and CPU lines, and stream them to MQTT as JSON.
                tail -F "${log_file_path}" 2>/dev/null | \
                awk -F'; ' -v min_int="$MIN_INTERVAL_S" '
                BEGIN {
                 last_summary = systime() - min_int
                 last_cpu     = systime() - min_int
                }

                $1=="SUMMARY" && NF>=13 {
                 n=systime()
                 if(n-last_summary>=min_int){
                  last_summary=n
                  printf "{\"dl_achieved_rate_kbps\":%s,\"ul_achieved_rate_kbps\":%s,\"dl_sum_delays\":%s,\"ul_sum_delays\":%s,\"dl_avg_owd_delta_us\":%s,\"ul_avg_owd_delta_us\":%s,\"dl_load_condition\":\"%s\",\"ul_load_condition\":\"%s\",\"cake_dl_rate_kbps\":%s,\"cake_ul_rate_kbps\":%s}\n",
                   ($4~/^[0-9.]+$/)?$4:0,
                   ($5~/^[0-9.]+$/)?$5:0,
                   ($6~/^[0-9.]+$/)?$6:0,
                   ($7~/^[0-9.]+$/)?$7:0,
                   ($8~/^[0-9.]+$/)?$8:0,
                   ($9~/^[0-9.]+$/)?$9:0,
                   ($10!=""?$10:"unknown"),
                   ($11!=""?$11:"unknown"),
                   ($12~/^[0-9.]+$/)?$12:0,
                   ($13~/^[0-9.]+$/)?$13:0
                 }
                }

                $1=="CPU" && NF>=7 {
                 n=systime()
                 if(n-last_cpu>=min_int){
                  last_cpu=n
                  printf "{\"cpu_core0\":%s,\"cpu_core1\":%s,\"cpu_core2\":%s}\n",
                   ($5~/^[0-9.]+$/)?$5:0,
                   ($6~/^[0-9.]+$/)?$6:0,
                   ($7~/^[0-9.]+$/)?$7:0
                 }
                }
                ' |
                mosquitto_pub -h "$MQTT_HOST" -p "$MQTT_PORT" -u "$MQTT_USER" -P "$MQTT_PASS" -t "$MQTT_TOPIC" -l -q 1 -r
                sleep 5
        done
}

trap cleanup INT TERM EXIT

publish_stats_pids=()

for log_file_path in /var/log/cake-autorate.*.log
do
        (publish_stats ${log_file_path}) &
        publish_stats_pids+=(${!})
done

wait

Same, pretty sure it will kill the subshell but not the binaries.

Edit: if you want to use kill_pids_recursive (which IMO is the simplest and most reliable method) then note that I've made some changes to it with today's PR. Those changes fix a (minor) bug and remove some unnecessary code. So best to use the version in the PR.

I found a way that works using your excellent test script:

#!/bin/sh

set -m

cleanup()
{
        trap : INT TERM EXIT
        kill -- -${pipe_pid} 2>/dev/null || true
        exit 0
}

stop_process() {
    trap : INT TERM EXIT PIPE
    exit
}

trap_and_catch() {
        sig='' trap_inst="$1" trap_call="${2}"
        shift 2
        for sig in "${@}"; do
                # shellcheck disable=SC2064
                trap "echo \"$trap_inst exiting on receipt of ${sig} signal.\" >&2; ${trap_call}" "${sig}"
        done
}


test1() {
    trap_and_catch test1 'stop_process' INT TERM EXIT PIPE
    sleep 5 &
    wait $!
    cat > /dev/null
}

test2() {
    trap_and_catch test2 'stop_process' INT TERM EXIT PIPE
    sleep 7 &
    wait $!
    cat > /dev/null
}


trap cleanup INT TERM EXIT

( echo a | test1 | test2 ) &
pipe_pid=$!
wait

So a subshell and job control together seem to work, at least when this script runs on its own.

I don't know for sure if this will work in the procd context though.

This sends the signal to the process group (IIRC). How individual processes handle this signal depends on their implementation. They might obey it or ignore it or do anything else. I think there was another issue with this method, but can't recall what. I was looking into this when implementing the 'sure-stop' mechanism for adblock-lean and I couldn't make it work reliably in all scenarios, which is why I eventually implemented kill_pids_recursive. YMMV.

I was able to get procd to shut down my hacky execute-a-command-based-on-log-lines scripts without having to record PIDs for shutdown, but to do so, I needed to exec the commands and arrange for the start of the pipeline to be the thing that gets the shutdown signal instead of a shell instance.

I don't think this can be done using pipelines constructed with |, but it can be done with named pipes and backgrounding of processes. Instead of:

cmd1 | cmd2

You can do something like:

exec cmd2 <MY_FIFO &
exec cmd1 >MY_FIFO

This will make it so cmd1 is the thing that gets the shutdown signal and cmd2 is a child process of it, instead of both being children of a shell instance. When cmd1 terminates, cmd2's input gets closed, so it also exits.

A more complete example that manages creation and cleanup of the named pipe:

FIFO=/tmp/my_fifo
rm -f ${FIFO}
mkfifo -m 600 ${FIFO}
exec cmd2 <${FIFO} &
{
    rm -f ${FIFO}
    exec cmd1
} >${FIFO}

This can be extended to a 3 command pipe by adding another named pipe, but you would still need to construct it in reverse.

It's still kinda messy, but I find this preferable to keeping PID files or traps or whatnot.
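
For anyone who wants to try the pattern without real services, here is a self-contained sketch with stand-in commands: echo plays the role of cmd1 and tr plays cmd2. The function name run_pipeline and the FIFO path are made up for the example.

```shell
#!/bin/sh

# Stand-ins: 'echo hello' is cmd1 (the producer), 'tr' is cmd2 (the consumer).
run_pipeline() {
    fifo="${TMPDIR:-/tmp}/fifo_demo.$$"
    rm -f "$fifo"
    mkfifo -m 600 "$fifo"

    # Start the downstream command first, in the background, reading the FIFO.
    tr 'a-z' 'A-Z' <"$fifo" &

    # Opening the FIFO for writing blocks until the reader has it open,
    # so the name can be unlinked before the producer replaces the shell.
    {
        rm -f "$fifo"
        exec echo hello
    } >"$fifo"
}

# Command substitution runs run_pipeline in a subshell, so 'exec' replaces
# that subshell rather than this script. It returns once tr has seen EOF
# and closed its output.
result=$(run_pipeline)
echo "$result"
```

In a procd service the wrapper would call the equivalent of run_pipeline directly, so that the process procd signals is the producer itself.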

Interesting. I tried to make something like your idea work as follows:

#!/bin/bash

set -m

stop_process() {
    trap : INT TERM EXIT PIPE
    exit
}

trap_and_catch() {
        sig='' trap_inst="$1" trap_call="${2}"
        shift 2
        for sig in "${@}"; do
                # shellcheck disable=SC2064
                trap "echo \"$trap_inst exiting on receipt of ${sig} signal.\" >&2; ${trap_call}" "${sig}"
        done
}


test1() {
    trap_and_catch test1 'stop_process' INT TERM EXIT PIPE
    sleep 5 &
    wait $!
    cat > /dev/null
}

test2() {
    trap_and_catch test2 'stop_process' INT TERM EXIT PIPE
    sleep 7 &
    wait $!
    cat > /dev/null
}


# echo a | test1 | test2

exec {fd1}<> <(:)
exec {fd2}<> <(:)

echo a >&"${fd1}"
test2<&"${fd2}" &
test1<&"${fd1}" >&"${fd2}"

But I'm not sure what it's missing.

You need to execute the commands in reverse of what you want the pipeline to be so that you can background the commands later in the pipeline:

test2<&"${fd2}" &
test1<&"${fd1}" >&"${fd2}" &
echo a >&"${fd1}"

... or maybe it's sufficient to just make the first command last, since that's the only one that does not get backgrounded. Without backgrounding, you're just executing the commands in sequence.

OK, but then we need to retain pids and kill like so, don't we(?):

exec {fd1}<> <(:)
exec {fd2}<> <(:)

echo a >&"${fd1}"
test2<&"${fd2}" &
pid1=${!}
test1<&"${fd1}" >&"${fd2}" &
pid2=${!}

sleep 2

kill ${pid1} ${pid2}

But then ctrl-c doesn't work to tear everything down.

The rest of the pipeline should shut down once the first command terminates. At least that is my experience. The example you posted has added a bunch more complexity with sleeps and waits and traps, though, so maybe something is holding onto something else.

I haven't tried this with anon pipes, since I don't think ash supports that, I'll see if I can test that out and provide a less abstract example.

OK, I don't think this will work with anonymous pipes because all the sub processes will inherit the pipe FDs, which will keep it open. Here's a 3 command example with named pipes:

mkfifo -m 600 /tmp/fifo1
mkfifo -m 600 /tmp/fifo2

{ sleep 20; cat >/dev/null; } </tmp/fifo2 &
{ sleep 10; cat > /dev/null; } </tmp/fifo1 >/tmp/fifo2 &
echo a >/tmp/fifo1

It is true, though, that the subshells here will continue running until the sleep commands exit and the subsequent cat commands get EOF on the FIFO. I guess I've never cared about that, but if you want to guarantee that all subprocesses exit before service stop returns, then this approach will not be sufficient. I've only ever cared that they don't just stick around forever.

Thanks a lot for your input. I for one find the complexity in simply closing down a pipe launched within procd a source of frustration. It feels to me like it should be easier!