Procd: correctly stop process incl. its children

I’m running an inotifywait-loop-script managed by procd.

My procd init file looks like:

USE_PROCD=1
PROG=/tmp/inotifywait-test

start_service() {
procd_open_instance
procd_set_param command $PROG
procd_close_instance
}

The first version of inotifywait-test, which starts fine but does not stop all of its processes, is:

inotifywait -m -e close_write /tmp |
while read path action file
do
echo "${path}/${file} completed"
done

ps | grep inotify after start:

30738 root      1204 S    {inotifywait-tes} /bin/sh /tmp/inotifywait-test
30744 root       948 S    inotifywait -m -e close_write /tmp
30746 root      1204 S    {inotifywait-tes} /bin/sh /tmp/inotifywait-test
30766 root      1204 S    grep inotifywait

and after stop:

30744 root       948 S    inotifywait -m -e close_write /tmp
30746 root      1204 S    {inotifywait-tes} /bin/sh /tmp/inotifywait-test
30794 root      1204 S    grep inotifywait

The script does not stop correctly because the pipe (|) spawns sub-shells that are not implicitly terminated when the parent is. That caveat and a solution are documented here: https://openwrt.org/docs/guide-developer/procd-init-scripts#a_common_gotcha_with_stopping_a_service
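The effect can be reproduced outside procd. The following is only a sketch under stated assumptions: "sleep 30" stands in for inotifywait, the stop is modelled as a plain SIGTERM to the one PID that was started, and the svc script and its svc.pid file are hypothetical helpers that exist only so the producer can be found afterwards.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Hypothetical stand-in for the service script: a producer piped into a loop.
cat > "$workdir/svc" <<'EOF'
#!/bin/sh
# The producer records its PID, then exec keeps that PID for sleep.
sh -c 'echo "$$" > "$1"; exec sleep 30' sh "$0.pid" |
while read -r line; do echo "$line"; done
EOF

sh "$workdir/svc" > /dev/null &
svc_pid=$!
sleep 1                                  # crude wait for the pipeline to start
producer_pid=$(cat "$workdir/svc.pid")

# Model the stop: signal only the PID that was started.
kill "$svc_pid"
wait "$svc_pid" || true
sleep 1

if kill -0 "$producer_pid" 2>/dev/null; then survivor=yes; else survivor=no; fi
echo "producer still running after stop: $survivor"

# Clean up the orphan; the loop then sees EOF and exits as well.
kill "$producer_pid"
rm -rf "$workdir"
```

With a shell that has no trap installed, the producer should be reported as still running, which matches the ps output above.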

So we adapt the mentioned solution, and inotifywait-test now looks like this:

pid=
_cleanup() {
    kill $pid
    exit
}

trap _cleanup TERM INT

inotifywait -m -e close_write /tmp |
        while read path action file
        do
                echo "${path}/${file} completed"
        done &

pid=$!
wait $pid

ps | grep inotify after start:

30875 root      1204 S    {inotifywait-tes} /bin/sh /tmp/inotifywait-test2
30886 root       948 S    inotifywait -m -e close_write /tmp
30887 root      1204 S    {inotifywait-tes} /bin/sh /tmp/inotifywait-test2
30905 root      1204 S    grep inotify

and after stop:

30886 root       948 S    inotifywait -m -e close_write /tmp
30933 root      1204 S    grep inotify

So apparently more gets stopped now than with v1, but the inotifywait process itself is still running.

What am I missing? Is there another level nested within, which also needs cleanup?

"The internet"(TM) says I should kill the whole process group, e.g. via kill -TERM -- -$$, but so far I have not managed to actually stop the script that way from within procd.

Something similar was discussed recently here (and I suspect it has come up before that, but I found it hard to search for).

I was curious about the topic, so I looked into it a little afterwards and ran across the wiki article you mentioned. The solution it lists as an example doesn't actually work, at least not fully. The problem is that it records the PID of the last command in the pipeline (that is what $! refers to), whereas you generally want to kill the first command in a pipeline; that will (usually) cause the remaining commands in the pipeline to exit as they see EOF on their input.
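A small sketch can make this visible. It rests on assumptions: "sleep 30" stands in for inotifywait and "cat" for the while loop, /proc is available (Linux only), and the pidfile is a hypothetical helper that lets the first command report its own PID.

```shell
#!/bin/sh
pidfile=$(mktemp)

# The wrapper writes its own PID, then exec replaces it with sleep (same PID).
sh -c 'echo "$$" > "$1"; exec sleep 30' sh "$pidfile" | cat &
last_pid=$!                         # PID of the LAST command in the pipeline

sleep 1                             # crude wait for both sides to start
first_pid=$(cat "$pidfile")
rm -f "$pidfile"

# /proc/PID/comm holds the command name (Linux only).
read -r first_name < "/proc/$first_pid/comm"
read -r last_name  < "/proc/$last_pid/comm"
echo "first=$first_name last=$last_name"

kill "$first_pid"   # stop the producer ...
wait                # ... and the consumer exits by itself once it sees EOF
```

On a typical Linux system with standalone sleep and cat binaries, $! should turn out to be cat, not sleep. Killing the first PID is enough to bring the whole pipeline down, whereas killing only $! would leave the producer running.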

I don't know of an easy way to get the PID from the first command of a pipeline constructed using | without having the child write its PID out to a pidfile, but there were a few alternate solutions mentioned in that topic I linked. I think OP got it working using a combination of killing PID group, enabling job control, and adding an extra subshell, which I think would make your example:

set -m

pid=
_cleanup() {
    kill -- -$pid
    exit
}

trap _cleanup TERM INT

( inotifywait -m -e close_write /tmp |
        while read path action file
        do
                echo "${path}/${file} completed"
        done ) &

pid=$!
wait $pid

Personally, I find the extra subshell processes to be annoying, so I don't do it that way and instead construct the pipeline with a named pipe (fifo) instead of with |, then exec the first command and background the remainder of the pipeline. That approach has its limits, though, and does make the resulting script harder to understand:

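# Pick an unused temporary name and create the named pipe there
FIFO=$(mktemp -u)
rm -f "${FIFO}"
mkfifo -m 600 "${FIFO}"

# Consumer: runs in the background and reads events until EOF
while read path action file
do
        echo "${path}/${file} completed"
done <"${FIFO}" &

# Point stdout at the FIFO; the name can go once both ends are open
exec >"${FIFO}"
rm -f "${FIFO}"

# Replace the shell with inotifywait so it keeps the PID procd tracks
exec inotifywait -m -e close_write /tmp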

This makes inotifywait the process that receives the signal from procd. The only remaining subshell is the one running the while loop, which shuts down once its input reaches EOF.
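The same pattern can be tried without inotifywait or procd. This is a sketch under assumptions: a one-line "echo hello" followed by "sleep 30" stands in for inotifywait, the outer subshell plays the role of the script procd starts, and the file names are hypothetical.

```shell
#!/bin/sh
workdir=$(mktemp -d)
fifo="$workdir/events"
out="$workdir/out"
mkfifo -m 600 "$fifo"

(
    # Consumer: backgrounded, reads from the FIFO until EOF.
    while read -r line; do echo "got: $line"; done < "$fifo" > "$out" &

    # Send stdout to the FIFO, then become the producer via exec,
    # so the producer keeps the PID of this subshell.
    exec > "$fifo"
    exec sh -c 'echo hello; exec sleep 30'
) &
service_pid=$!

sleep 1                       # let the producer emit its line
kill "$service_pid"           # the signal lands on the producer itself
wait "$service_pid" || true
sleep 1                       # give the consumer time to see EOF and exit

result=$(cat "$out")
echo "$result"
rm -rf "$workdir"
```

Because the signalled PID is the producer, no trap handler is needed, and the consumer ends as soon as the write end of the FIFO is closed.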

Hey, thanks a lot for shedding some light on that!

However, the first solution is something I had already tried, and it does not work for me:

pid=
_cleanup() {
    kill -- -$pid
    exit
}

trap _cleanup TERM INT

( inotifywait -m -e close_write /tmp |
        while read path action file
        do
                echo "${path}/${file} completed"
        done ) &

pid=$!
wait $pid

after start (which is expected given the additional sub-shell):

~# ps | grep inotifywait
 1665 root      1204 S    {inotifywait-tes} /bin/sh /tmp/inotifywait-test3
 1668 root      1204 S    {inotifywait-tes} /bin/sh /tmp/inotifywait-test3
 1669 root       948 S    inotifywait -m -e close_write /tmp
 1670 root      1204 S    {inotifywait-tes} /bin/sh /tmp/inotifywait-test3
 1694 root      1204 S    grep inotifywait

after stop:

# ps | grep inotifywait
 1668 root      1204 S    {inotifywait-tes} /bin/sh /tmp/inotifywait-test3
 1669 root       948 S    inotifywait -m -e close_write /tmp
 1670 root      1204 S    {inotifywait-tes} /bin/sh /tmp/inotifywait-test3
 1720 root      1204 S    grep inotifywait

The second one -- v4, with the named pipe -- appears to work like a charm, though! Thanks a lot!

EDIT: The first one also works when the interpreter runs with -m, as proposed; I had overlooked that.
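A plausible explanation for why -m makes the difference, assuming ordinary POSIX job-control semantics (not verified here for BusyBox ash specifically): without job control, a background job stays in the script's own process group, so -$pid does not name any group and the group kill has nothing to hit. The sketch below checks this on Linux by reading the process group from /proc; "sleep 30" is only a stand-in for the backgrounded subshell.

```shell
#!/bin/sh
# Deliberately no "set -m" here.
sleep 30 &
pid=$!
sleep 1

# Field 5 of /proc/PID/stat is the process group ID (Linux only).
read -r _pid _comm _state _ppid child_pgid _rest < "/proc/$pid/stat"
echo "background pid=$pid pgid=$child_pgid"

# A group kill addresses the group whose ID equals $pid; no such group exists.
if kill -- "-$pid" 2>/dev/null; then
    group_kill=worked
else
    group_kill=failed
fi
echo "kill -- -\$pid $group_kill"

# Clean up with an ordinary kill of the single PID.
kill "$pid"
wait "$pid" 2>/dev/null || true
```

With job control enabled, each background job normally becomes the leader of its own process group. That would also explain the extra subshell: it makes $! the group leader rather than just the last command of the pipeline.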