Segmentation fault in rpcd

I've run into an issue where rpcd intermittently crashes with a segmentation fault.

To reproduce this, I created a small dummy plugin that can be invoked with `ubus call foo bar`:

# Create our `ubus call foo bar` dummy.
cat << EOF > /usr/libexec/rpcd/foo
#!/usr/bin/env lua

function list ()
    return print('{"bar":{}}')
end

function bar ()
    local json = require 'luci.jsonc'
    return print(json.stringify({
        baz = "qux"
    }))
end

if arg[1] == 'list' then
    return list()
elseif arg[1] == 'call' then
    if arg[2] == 'bar' then
        return bar()
    end
end
EOF
# Set the executable bit.
chmod +x /usr/libexec/rpcd/foo
# Define an ACL to allow calling our dummy via uhttpd's /ubus endpoint.
cat << EOF > /usr/share/rpcd/acl.d/foo.json
{
    "unauthenticated": {
        "description": "Access controls for unauthenticated requests to foo",
        "read": {
            "ubus": {
                "foo": [ "bar" ]
            }
        }
    }
}
EOF
# Reload rpcd so it registers our dummy with ubus and picks up the new ACL.
/etc/init.d/rpcd reload

Next, I wrote a small stress-test program to bombard rpcd with requests. It boils down to running the equivalent of the following command in a loop on every available core:

curl --data '{ "jsonrpc": "2.0", "id": 1, "method": "call", "params": [ "00000000000000000000000000000000", "foo", "bar", {} ] }' http://192.168.27.1/ubus

Which, if everything goes well, should return:

{"jsonrpc":"2.0","id":1,"result":[0,{"baz":"qux"}]}
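
The stress-test program itself is not included here, so the following is only a minimal sketch of what such a tester could look like in plain shell. The helper names (payload, is_ok, worker), the default of 1000 requests per worker, and the "run" argument are all assumptions made for illustration. It sends the equivalent of the curl request above from one background worker per core and stops a worker as soon as a reply differs from the expected one.

```shell
#!/bin/sh
# Hypothetical stress tester: names and defaults below are assumptions,
# not the program that was actually used for this report.
URL="${URL:-http://192.168.27.1/ubus}"
REQUESTS="${REQUESTS:-1000}"                          # requests per worker
WORKERS="${WORKERS:-$(nproc 2>/dev/null || echo 4)}"  # one worker per core

# Build the JSON-RPC body equivalent to `ubus call <object> <method>`,
# using the all-zero session ID from the curl command above.
payload() {
    printf '{"jsonrpc":"2.0","id":1,"method":"call","params":["%s","%s","%s",{}]}' \
        "00000000000000000000000000000000" "$1" "$2"
}

# Succeed only if the reply is exactly the expected response shown above.
is_ok() {
    [ "$1" = '{"jsonrpc":"2.0","id":1,"result":[0,{"baz":"qux"}]}' ]
}

# Send $REQUESTS sequential requests; report the first unexpected reply.
worker() {
    i=0
    while [ "$i" -lt "$REQUESTS" ]; do
        reply=$(curl --silent --data "$(payload foo bar)" "$URL")
        if ! is_ok "$reply"; then
            echo "worker $1: unexpected reply after $i requests: $reply" >&2
            return 1
        fi
        i=$((i + 1))
    done
}

# Only generate load when invoked with the "run" argument.
if [ "${1:-}" = "run" ]; then
    w=0
    while [ "$w" -lt "$WORKERS" ]; do
        worker "$w" &
        w=$((w + 1))
    done
    wait
fi
```

Saved as, say, stress.sh, it would be started with `sh stress.sh run`; URL, REQUESTS and WORKERS can be overridden through the environment. An empty reply from a worker is what one would expect to see once rpcd has gone down.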

Next, I ran gdbserver on the OpenWrt router and attached to it on my own machine by following the gdb instructions:

# On the OpenWrt router.
gdbserver 0.0.0.0:9000 --attach $(pidof rpcd)
# On the local machine.
./scripts/remote-gdb 192.168.27.1:9000 build_dir/target-mipsel_24kc_musl/rpcd-2020-05-26-7be1f171/rpcd

Spawning the stress-tester quickly leads to rpcd terminating with a segmentation fault. I've observed two distinct (but undoubtedly related) stack traces:

Program received signal SIGSEGV, Segmentation fault.
free (p=0x77f55220) at src/malloc/malloc.c:476
476             if (next->psize != self->csize) a_crash();
(gdb) bt
#0  free (p=0x77f55220) at src/malloc/malloc.c:476
#1  0x77e456a9 in json_tokener_free (tok=0x77f551e0) at json_tokener.c:132
#2  0x004057bb in rpc_plugin_call_finish_cb (blob=<optimized out>, stat=<optimized out>, priv=0xbc3070) at /home/alchiadus/development/openwrt/build_dir/target-mipsel_24kc_musl/rpcd-2020-05-26-7be1f171/plugin.c:123
#3  0x0040215b in rpc_exec_reply (c=0x77f554e0, rv=rv@entry=0) at /home/alchiadus/development/openwrt/build_dir/target-mipsel_24kc_musl/rpcd-2020-05-26-7be1f171/exec.c:136
#4  0x00402211 in rpc_exec_opipe_state_cb (s=<optimized out>) at /home/alchiadus/development/openwrt/build_dir/target-mipsel_24kc_musl/rpcd-2020-05-26-7be1f171/exec.c:266
#5  0x77e9fbbd in ustream_state_change_cb (t=0x77f55610) at /home/alchiadus/development/openwrt/build_dir/target-mipsel_24kc_musl/libubox-2018-07-25-c83a84af/ustream.c:109
#6  0x77e9f207 in uloop_process_timeouts (tv=<optimized out>) at /home/alchiadus/development/openwrt/build_dir/target-mipsel_24kc_musl/libubox-2018-07-25-c83a84af/uloop.c:505
#7  uloop_run_timeout (timeout=-1) at /home/alchiadus/development/openwrt/build_dir/target-mipsel_24kc_musl/libubox-2018-07-25-c83a84af/uloop.c:542
#8  0x00401d4f in uloop_run () at /home/alchiadus/development/openwrt/staging_dir/target-mipsel_24kc_musl/usr/include/libubox/uloop.h:111
#9  main (argc=<optimized out>, argv=<optimized out>) at /home/alchiadus/development/openwrt/build_dir/target-mipsel_24kc_musl/rpcd-2020-05-26-7be1f171/main.c:120

Program received signal SIGSEGV, Segmentation fault.
free (p=0xb7f120) at src/malloc/malloc.c:476
476             if (next->psize != self->csize) a_crash();
(gdb) bt
#0  free (p=0xb7f120) at src/malloc/malloc.c:476
#1  0x00402143 in rpc_exec_reply (c=0x77ee2550, rv=5, rv@entry=0) at /home/alchiadus/development/openwrt/build_dir/target-mipsel_24kc_musl/rpcd-2020-05-26-7be1f171/exec.c:153
#2  0x00402211 in rpc_exec_opipe_state_cb (s=<optimized out>) at /home/alchiadus/development/openwrt/build_dir/target-mipsel_24kc_musl/rpcd-2020-05-26-7be1f171/exec.c:266
#3  0x77efdbbd in ustream_state_change_cb (t=0x77ee2680) at /home/alchiadus/development/openwrt/build_dir/target-mipsel_24kc_musl/libubox-2018-07-25-c83a84af/ustream.c:109
#4  0x77efd207 in uloop_process_timeouts (tv=<optimized out>) at /home/alchiadus/development/openwrt/build_dir/target-mipsel_24kc_musl/libubox-2018-07-25-c83a84af/uloop.c:505
#5  uloop_run_timeout (timeout=-1) at /home/alchiadus/development/openwrt/build_dir/target-mipsel_24kc_musl/libubox-2018-07-25-c83a84af/uloop.c:542
#6  0x00401d4f in uloop_run () at /home/alchiadus/development/openwrt/staging_dir/target-mipsel_24kc_musl/usr/include/libubox/uloop.h:111
#7  main (argc=<optimized out>, argv=<optimized out>) at /home/alchiadus/development/openwrt/build_dir/target-mipsel_24kc_musl/rpcd-2020-05-26-7be1f171/main.c:120

To rule out concurrency issues, I also tried setting max_requests to 1 in /etc/config/uhttpd (restarting uhttpd afterwards) as follows:

	# Maximum number of concurrent requests.
	# If this number is exceeded, further requests are
	# queued until the number of running requests drops
	# below the limit again.
	option max_requests 1

This made no difference; I could still reproduce both crashes.

Unfortunately, I am not familiar enough with the codebase to tell which part is responsible for freeing which allocations. Both crashes occur inside free(), on the same consistency check in musl (malloc.c:476), which suggests heap corruption, possibly a double free. Does anyone have insights or pointers that could help debug and fix this?