Snapshot cgroup options break docker

At least on the NanoPi R4S (which is only available in snapshot builds), Docker is broken.
When running a container, it gives something like:

Containers: start dokuwiki...code:400 OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: process_linux.go:508: setting cgroup config for procHooks process caused: open /sys/fs/cgroup/docker/e640133d85a283314d41d88aa1b27556a6899241dbfefa6aaf91e1cae47ccae8/io.weight: no such file or directory: unknown

And it's right: there are no io.weight sysfs entries for cgroups. The kernel option behind cgroup io.weight is CONFIG_BLK_CGROUP_IOCOST, and that's disabled in the base config of kernel 5.4 in 21.02, and also in the snapshot configs for 5.4 and 5.10.
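For anyone who wants to check their own build, here is a rough sketch of how the symbol could be looked up in a kernel config. It is only an illustration: `symbol_enabled` and `read_running_config` are names I made up, and reading `/proc/config.gz` assumes the kernel was built with CONFIG_IKCONFIG_PROC, which may not be the case on your image. On a target without Python you can also paste the build's config text in as a string on another machine.

```python
import gzip
import re


def symbol_enabled(config_text: str, symbol: str) -> bool:
    """Return True if `symbol` is set to y or m in the given kernel config text."""
    pattern = re.compile(rf"^{re.escape(symbol)}=(y|m)$", re.MULTILINE)
    return pattern.search(config_text) is not None


def read_running_config(path: str = "/proc/config.gz") -> str:
    """Read the running kernel's config.

    Assumption: the file exists, which needs CONFIG_IKCONFIG_PROC in the kernel.
    """
    with gzip.open(path, "rt") as fh:
        return fh.read()


# Hypothetical excerpt that mirrors the situation described above.
sample = """\
CONFIG_BLK_CGROUP=y
# CONFIG_BLK_CGROUP_IOCOST is not set
"""

print(symbol_enabled(sample, "CONFIG_BLK_CGROUP_IOCOST"))  # False
print(symbol_enabled(sample, "CONFIG_BLK_CGROUP"))         # True
```

If the first check comes back False on your build, containers that get a block IO weight passed in will most likely fail the same way mine did.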

I can only conclude that Docker must therefore also be broken on every other snapshot target and on every 21.02 target.


was reported in the past too, see
https://github.com/openwrt/packages/issues/12380

someone has to open a PR to add that symbol to kernel config, or containers that require this feature do not work


To be clear, this is not an issue with the container. It's a missing feature in Docker and LuCI, at least in the snapshot.
LuCI defines the "Block IO Weight" setting, which gets passed in when the container starts. The container doesn't need it per se; the LuCI/Docker integration in OpenWrt is turning it on.

I tried removing that option, but one cannot; it must be set between 10 and 1000. Hence every Docker container will be broken, because with that option set every container needs the kernel setting.

The report @bobafetthotmail linked to seems to be for a container that needed that setting. There it was proposed to make it tunable; I think it just always needs to be ON, which is a simple matter of setting it in the defaults. I have tested this and it works. I can throw a PR together with those two line changes to the defaults in about 2 seconds, but it won't be selectable.

Since you said this is not actually required by the container, I had a look at the LuCI source code.

I suspect that if you just delete the number from that field, the config line for it is not generated, and you should not get errors.

see the source of the LuCI app dockerman you are using

it has the o.rmempty = true property, so I think this is a "remove if empty" type of data field

Otherwise you can try deleting that line manually from the container config file. I do not know where it is generated, and it does not seem to use the /etc/config/xxxx config files, but you can probably find it easily enough; OpenWrt does not have a lot of stuff in the root filesystem anyway.

and yes, if this isn't mandatory for the container, it can be "fixed" by disabling that code, or just by adding a warning in LuCI for that field like "DOES NOT WORK IN OFFICIAL BUILDS -- LEAVE EMPTY"


@bobafetthotmail I tried to clear it when I worked out it was causing the issue in the first place, and it defaulted to 10. I couldn't set it to "not defined" like the other settings can be.

I suspect what you say is right and this is actually a LuCI bug; it should be possible to leave it undefined. I could probably edit the config manually as a workaround. My kernel has those flags now though, and the router is in use, so it's difficult for me to check. Enabling them did add noticeable size to the kernel. Not that it worries my target, but it would be an issue for many.

if you or someone else does the test and sees that it works fine without that option, or just wants to see this solved, please report this as a LuCI bug at https://github.com/openwrt/luci/issues
as it should be an easy fix for someone who understands LuCI code. I'm not an expert


thanks to bobafett and strontium; also linking in a related issue from around the same time:

https://www.reddit.com/r/openwrt/comments/q5ove6/help_creating_plex_container/

found and bumped this github issue: https://github.com/openwrt/luci/issues/5327

seems like the upstream dev had already fixed this in his own repo (in the sense that, with his change, if you leave the block IO weight empty the container will not use this feature at all and will therefore work on kernels without that kernel option), so it's just a case of copy-pasting what he did. See the message in the GitHub issue.
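To illustrate the idea behind that fix (this is not the actual dockerman code, which is Lua; it is just a Python sketch with a made-up function name `build_host_config`): the weight only goes into the container create request when the field is non-empty. `BlkioWeight` is the HostConfig field the Docker Engine API uses for this setting, and the 10 to 1000 range is the one mentioned earlier in the thread.

```python
from typing import Optional


def build_host_config(blkio_weight: Optional[str]) -> dict:
    """Build the HostConfig part of a container-create request.

    Hypothetical helper: an empty or missing form value means "do not set
    a block IO weight at all", so the kernel feature is never touched.
    """
    host_config: dict = {}
    value = (blkio_weight or "").strip()
    if value:
        weight = int(value)
        if not 10 <= weight <= 1000:
            raise ValueError("block IO weight must be between 10 and 1000")
        host_config["BlkioWeight"] = weight
    return host_config


print(build_host_config(""))     # {}
print(build_host_config("500"))  # {'BlkioWeight': 500}
```

With the field left empty, nothing about block IO weight is sent to Docker, so the runtime should have no reason to open io.weight.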
