Strategy of setting up OpenWrt on 'large' system for easy upgrade, security & expansion for long haul

kvic · June 17, 2024, 5:04am

I just bought a new router board (Banana Pi R4). It comes with built-in NAND and eMMC.

128 MB NAND (onboard, fairly slow at roughly 2 MB/s)
8 GB eMMC (onboard, fast at 100+ MB/s, though I haven't actually measured it yet)

Banana Pi R4 has DIP switches to select where to boot from, so I plan to use the 128 MB NAND for a minimal OpenWrt installation. This installation will act as a 'recovery/emergency' fallback. It'll get updated from time to time, but much less frequently than the main installation.

I plan to use the 8 GB eMMC for the main installation of OpenWrt. It'll be minimal and use the same image (and config etc.) as the one installed on the 128 MB NAND. This main installation will get updated as soon as OpenWrt stable is refreshed.

The OpenWrt installation will be minimal so that my router performs great as a router. The planned 'rootfs + rootfs_data' will be ~500 MB, which is the current OpenWrt default for my target. I want to deviate as little as possible from OpenWrt defaults so that ongoing customisation and maintenance effort stays minimal over the long haul, say, 10+ years.

Besides using the R4 as a router, I plan to run a couple of LXC containers:

one for Internet facing services
one for LAN facing services (non router functions)
one or more for other stuff

All these containers will be managed by OpenWrt's LXC package. All of them will run one and the same ArchLinux for ARM installation.

So the 8 GB eMMC will be 'partitioned' into 500 MB for OpenWrt and 7.5 GB for ArchLinux for ARM, the containers, application data and limited user data. My original thought was to make use of the 8 GB eMMC through extroot.

But after stumbling on a discussion of extroot, I believe extroot is not a good idea, at least for my intended usage. In place of extroot, my alternative plan is to create a 7.5 GB partition and mount it inside OpenWrt. Everything stored in that partition will then survive OpenWrt upgrades without extra work.

Any criticism? Suggestions for alternative or better practice? I'm soliciting feedback from anyone who has attempted, or is doing, something similar.

This is my ongoing exploration for the coming weeks. Sorry for the long post, and thanks for taking the time to read to the end.

RadioOperator · June 17, 2024, 7:29am

The eMMC version of the RAX3000M comes with 64 GB of eMMC flash and costs only US$25 per set in China.
https://openwrt.org/toh/hwdata/cmcc/cmcc_rax3000m

kvic · June 18, 2024, 1:50am

CMCC is China Mobile, the giant network operator in China. For some unknown reason (perhaps surplus?), the RAX3000M is flooding the grey market. I believe $25 is a heavily 'subsidized' price. People should grab one while it lasts, if it suits their needs and purposes.

OpenWrt One will have the same SoC and radio as the RAX3000M, and comes with 128 MB of SPI NAND. I hope future owners won't see your post.

And we digressed...

RadioOperator · June 18, 2024, 2:21am

Agreed. The RAX3000M factory firmware cannot be used on non-China Mobile networks.
Installing OpenWrt works fine.

I believe there are more than 10k units of this model on the market.

kvic · June 18, 2024, 3:47pm

I used an 8 GB micro-SD card as a test medium and took baby steps towards a 'proof of concept'. I successfully created the "7.5 GB partition" and got it auto-mounted when OpenWrt starts up.

NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
mtdblock0    31:0    0     2M  1 disk
mtdblock1    31:1    0   126M  0 disk
mmcblk0     179:0    0   7.3G  0 disk
├─mmcblk0p1 179:1    0     4M  0 part
├─mmcblk0p2 179:2    0   512K  0 part
├─mmcblk0p3 179:3    0     2M  0 part
├─mmcblk0p4 179:4    0     4M  0 part
├─mmcblk0p5 179:5    0    32M  0 part
├─mmcblk0p6 179:6    0    20M  0 part
├─mmcblk0p7 179:7    0   448M  0 part
└─mmcblk0p8 259:0    0   6.8G  0 part /opt
ubiblock0_1 254:0    0  54.9M  0 disk
fit0        259:1    0   7.8M  1 disk /rom
fitrw       259:2    0 434.6M  0 disk /overlay

mmcblk0 is the 8 GB micro-SD test medium. Partitions 1 to 7 are created by flashing the OpenWrt image. Then I manually created 'mmcblk0p8' to take up the rest of the disk, which is about 6.8 GiB in reality.

Partition p1 holds BL2 boot loader.
Partition p4 holds BL31 + BL33 as part of U-Boot package.
Partition p7 holds kernel + fdt + rootfs + rootfs_data.

All of this is done automatically by the magic of the OpenWrt image. I spell it out so that I won't forget, and because it may be of interest to some readers.

I tested sysupgrade and it handles everything on partition p7 nicely. When necessary, I can manually update p1 and p4.

Partition p8 is configured to be automatically mounted on /opt. Its content will be managed manually by me. In the case of the ArchLinux rootfs, that means 'automagically' managed by me running pacman.

As for the choice of filesystem for p8, I went with f2fs because, for some unknown reason, I can't find UBIFS on my ArchLinux PC (anyone know why?), whereas f2fs is available. In the rare event of trouble, I could investigate a damaged p8 on my ArchLinux PC.

f2fs is built-in and supported by default in OpenWrt, so that's also one less thing to customise. I played with, abused and toasted p8 and the micro-SD card for a few rounds. So far it's pretty solid. Knock on wood.
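
For anyone wanting to reproduce the p8 setup described above, here is a minimal sketch of one way it could be done. The post does not say which tools were used, so sgdisk, the partition label "data" and the uci/block steps are assumptions, as are the names run and setup_data_partition. It also assumes a GPT partition table and that the matching OpenWrt packages (sgdisk, an f2fs mkfs tool, block-mount) are installed. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only. DISK, PART_NUM and TARGET mirror the lsblk output above;
# sgdisk and the uci/block steps are assumptions, not the author's exact commands.
DISK="${DISK:-/dev/mmcblk0}"
PART_NUM="${PART_NUM:-8}"
TARGET="${TARGET:-/opt}"
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = really run them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_data_partition() {
    part="${DISK}p${PART_NUM}"
    # Move the backup GPT header to the real end of the medium.
    run sgdisk -e "$DISK"
    # Create a Linux partition in the remaining free space.
    run sgdisk -n "${PART_NUM}:0:0" -t "${PART_NUM}:8300" "$DISK"
    # Format it as f2fs ("data" is a hypothetical label).
    run mkfs.f2fs -l data "$part"
    # Register the mount in /etc/config/fstab so it is mounted at boot.
    run uci add fstab mount
    run uci set "fstab.@mount[-1].device=$part"
    run uci set "fstab.@mount[-1].target=$TARGET"
    run uci set "fstab.@mount[-1].enabled=1"
    run uci commit fstab
    run block mount
}
```

Calling setup_data_partition prints the plan; DRY_RUN=0 would execute it. The kernel may need a reboot to see the new partition before it can be formatted, so running the steps one at a time is probably safer than running them in one go.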

The sad news is that I just realised multiple LXC containers can't share one rootfs. I need to start some reading and digging.

kvic · June 19, 2024, 11:32am

Things progressed a bit faster than I expected. LXC is up and running off partition p8. For testing, two containers run the default ArchLinux for ARM image from the LXC team.

One was created as a privileged container, the other as an unprivileged container. For production, I plan to run only unprivileged containers for security reasons.

# ll /opt/srv/lxc
drwxr-xr-x    4 root     root          3452 Jun 19 04:58 ./
drwxr-xr-x    3 root     root          3452 Jun 19 02:55 ../
drwxrwx---    3 100000   100000        3452 Jun 19 05:00 ups1/
drwxrwx---    3 root     root          3452 Jun 19 04:42 vps1/

# find /opt/srv/lxc -maxdepth 2
/opt/srv/lxc
/opt/srv/lxc/vps1
/opt/srv/lxc/vps1/rootfs
/opt/srv/lxc/vps1/config
/opt/srv/lxc/ups1
/opt/srv/lxc/ups1/rootfs
/opt/srv/lxc/ups1/config

# du -ksh /opt/srv/lxc/*
811.3M	/opt/srv/lxc/ups1
809.4M	/opt/srv/lxc/vps1

The two containers each have their own copy of the ROOTFS. Although they run the same ArchLinux, they don't share the same base ROOTFS at the moment, so this is not an efficient use of storage space on partition p8.
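
The 100000 owner of ups1 in the listing above is the host ID that the container's root user is mapped to. Below is a minimal sketch of how such a mapping is commonly declared for LXC. The base of 100000 comes from the listing; the range of 65536 and the function name write_idmap are assumptions, and older LXC releases spell the key lxc.id_map instead of lxc.idmap.

```shell
#!/bin/sh
# write_idmap SUBUID_FILE SUBGID_FILE LXC_CONF [BASE] [RANGE]
# Appends the subordinate ID range for root and the matching lxc.idmap
# lines, skipping any line that is already present.
write_idmap() {
    subuid="$1"; subgid="$2"; conf="$3"
    base="${4:-100000}"     # first host ID, as seen in the listing above
    range="${5:-65536}"     # number of IDs mapped (assumption)

    entry="root:$base:$range"
    grep -qxF "$entry" "$subuid" 2>/dev/null || echo "$entry" >> "$subuid"
    grep -qxF "$entry" "$subgid" 2>/dev/null || echo "$entry" >> "$subgid"

    for kind in u g; do
        line="lxc.idmap = $kind 0 $base $range"
        grep -qxF "$line" "$conf" 2>/dev/null || echo "$line" >> "$conf"
    done
}
```

On a host this would typically be called as write_idmap /etc/subuid /etc/subgid /etc/lxc/default.conf, so that containers created afterwards pick up the mapping.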

OpenWrt's Guide is pretty self-sufficient, except that you will not want to install the 'cgroupfs-mount' package: it prevented me from starting containers. Some details in this post:

Also worth noting: on Banana Pi R4, the LAN bridge is named 'br-lan'. I replaced 'lxcbr0' inside /etc/lxc/default.conf with 'br-lan' to hook up my containers to the LAN bridge. Alternatively you may create lxcbr0, but I don't see the benefit. I may look into it at a future time.
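
Swapping lxcbr0 for br-lan in the default config can be scripted. This is a minimal sketch, assuming the bridge name appears literally as lxcbr0 in the file; the function name use_lan_bridge and the .bak backup are illustrative choices, not part of the original post.

```shell
#!/bin/sh
# use_lan_bridge CONF [BRIDGE]
# Replaces every occurrence of lxcbr0 in CONF with BRIDGE (default: br-lan),
# keeping a backup copy next to the original file.
use_lan_bridge() {
    conf="$1"
    bridge="${2:-br-lan}"
    [ -f "$conf" ] || { echo "missing: $conf" >&2; return 1; }
    cp "$conf" "$conf.bak"
    sed -i "s/lxcbr0/$bridge/g" "$conf"
}
```

Typical usage would be use_lan_bridge /etc/lxc/default.conf. Containers created before the change keep whatever bridge was written into their own config file.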

Now I recall two more types of error when running unprivileged containers. Here they are, along with the fixes, so that I won't forget and in case they help someone in the future.

lxc: Operation not permitted - Failed to mount "proc"
lxc: Operation not permitted - Failed to mount "sys"

The workaround to the above errors is to add the following two lines to /etc/rc.local:

mount -o remount,rw,nosuid,nodev,noexec,relatime proc /proc
mount -o remount,rw,nodev,noexec,relatime sysfs /sys
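
Adding those two lines by hand is fine; for repeatable setups, a small helper can insert them idempotently. This is a sketch under the assumption that rc.local ends with an "exit 0" line, as it does on a stock OpenWrt install. The function names are hypothetical.

```shell
#!/bin/sh
# add_line_before_exit FILE LINE
# Inserts LINE before the first "exit 0" (or appends it when there is none),
# unless FILE already contains exactly that line.
add_line_before_exit() {
    rc="$1"; line="$2"
    grep -qxF "$line" "$rc" && return 0
    tmp="$rc.tmp.$$"
    awk -v add="$line" '
        /^exit 0/ && !seen { print add; seen = 1 }
        { print }
        END { if (!seen) print add }
    ' "$rc" > "$tmp" && cat "$tmp" > "$rc"
    status=$?
    rm -f "$tmp"
    return $status
}

# add_remounts [FILE]  (FILE defaults to /etc/rc.local)
add_remounts() {
    rc_file="${1:-/etc/rc.local}"
    add_line_before_exit "$rc_file" \
        'mount -o remount,rw,nosuid,nodev,noexec,relatime proc /proc' &&
    add_line_before_exit "$rc_file" \
        'mount -o remount,rw,nodev,noexec,relatime sysfs /sys'
}
```

Running add_remounts twice leaves the file unchanged the second time, so it is safe to call from a provisioning script.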

Source / Credit:

mount: /sys/kernel/debug: permission denied.
mount: /sys/kernel/config: permission denied.

The above two errors won't stop guest containers from running, but it would be nice to get rid of them anyway. To fix them, run the following inside the guest container:

systemctl mask sys-kernel-debug.mount
systemctl mask sys-kernel-config.mount

Source / Credit:

github.com/lxc/lxc: "sys-kernel-debug.mount and sys-kernel-config.mount fail on Bionic" (issue opened 15 Jul 2021 by joesiewert, closed 21 Feb 2024)


Again I played with, abused and toasted it for a few rounds, and sysupgraded multiple times. So far it's pretty solid and meets my original requirements. More efficient use of storage space would be a bonus.

--

The ArchLinux ROOTFS is about 800 MiB. The LXC team has already trimmed it down a bit from the >1 GiB official images by the ArchLinux for ARM team. With some effort, it could perhaps be trimmed further down to 500 MiB. Stuff like manpages, header files and docs isn't needed.
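
As a rough illustration of that trimming idea, here is a minimal sketch that empties the conventional manpage, documentation and header directories of a container rootfs from the host. The paths are the usual FHS locations and the function name trim_rootfs is hypothetical; how much space this frees on the actual image has not been measured here.

```shell
#!/bin/sh
# trim_rootfs ROOTFS
# Empties usr/share/man, usr/share/doc and usr/include under ROOTFS.
trim_rootfs() {
    rootfs="$1"
    # Refuse to run on something that does not look like a rootfs.
    [ -d "$rootfs/usr" ] || { echo "not a rootfs: $rootfs" >&2; return 1; }
    for d in usr/share/man usr/share/doc usr/include; do
        # ${rootfs:?} aborts if the variable is empty, so this can never hit /usr.
        rm -rf "${rootfs:?}/$d"/*
    done
}
```

Something like trim_rootfs /opt/srv/lxc/ups1/rootfs followed by du -sh on the same path would show the effect. Package upgrades inside the guest will likely bring the files back unless pacman is told to skip them, for example through the NoExtract option in pacman.conf.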

The Alpine ROOTFS is much smaller, starting from ~20 MiB. I might decide to use Alpine. Or perhaps run one big ArchLinux for Internet-facing services, and reuse the OpenWrt host for intranet-facing services.

The idea of sharing a base ROOTFS and stacking an overlayfs on top saves space regardless of the ROOTFS size of a Linux flavour. I might find time to give it a try. But for now consider this thread done.

OpenWrt (the software & the community) is great! It took much less time than I originally anticipated.

As usual, criticism, suggestions and better practices are welcome, especially criticism.

update

added a hyperlink to OpenWrt's Guide to LXC
added two more types of errors (and their solutions) that I met when running unprivileged containers

kvic · June 19, 2024, 10:47pm

This turned out simpler than I thought. I started by creating the layered overlayfs of the ROOTFS at OS level, then passing that rootfs to LXC without LXC realizing it was running on top of overlayfs. Surprisingly it worked. But then I figured out that LXC already has overlayfs support built in. So let me skip over that and present the simpler approach. I can't believe it's that straightforward.
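For the curious, the OS-level variant comes down to a plain kernel overlay mount. The following is only a minimal sketch of generic overlayfs usage, not necessarily the exact commands used here: the helper name `overlay_mount_cmd` is made up for illustration, and it prints the command rather than running it. Note that overlayfs wants the work directory to be empty and on the same filesystem as the upper directory.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the mount command for a two-layer overlay.
#   $1 = read-only lower directory (the shared base ROOTFS)
#   $2 = writable upper directory (per-container changes)
#   $3 = empty work directory on the same filesystem as the upper directory
#   $4 = mount point where the merged view appears
overlay_mount_cmd() {
    printf 'mount -t overlay overlay -o lowerdir=%s,upperdir=%s,workdir=%s %s\n' \
        "$1" "$2" "$3" "$4"
}
```

Calling it with four directories of your choice prints one line that can be reviewed and then run as root; the merged mount point is what would be handed to LXC as its rootfs.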

Take Alpine Linux as the guinea pig. First let's create a 'base' container. 'Base' in the sense of a 'class' in OO programming:

lxc-create -n albase -t download -- --dist alpine --release 3.20 --arch arm64

Then, let's initiate two 'instances' of this base container. The two instances will share the same ROOTFS of the base container:

lxc-create -n alups1 -t none
lxc-create -n alups2 -t none

Now here is the trick. I haven't found a better way to deal with it, but at least the following manual way works for me:

mkdir -p /opt/srv/lxc/alups1/overlay/upper
mkdir -p /opt/srv/lxc/alups2/overlay/upper

Note that 100000 below is the re-mapped UID and GID on my OpenWrt host for running unprivileged containers. Plenty of tutorials show how to set up this mapping when creating unprivileged containers. Let's assume it's already set up here. Then:

chown 100000:100000 -R /opt/srv/lxc/alups1/overlay
chown 100000:100000 -R /opt/srv/lxc/alups2/overlay

For /opt/srv/lxc/alups1/config, copy the content of /opt/srv/lxc/albase/config over it, except for these three lines:

lxc.rootfs.path = overlayfs:/opt/srv/lxc/albase/rootfs:/opt/srv/lxc/alups1/overlay/upper
lxc.uts.name = <keep the original value>
lxc.net.0.hwaddr = <keep the original value>

For /opt/srv/lxc/alups2/config, again copy the content of /opt/srv/lxc/albase/config over it, except for these three lines:

lxc.rootfs.path = overlayfs:/opt/srv/lxc/albase/rootfs:/opt/srv/lxc/alups2/overlay/upper
lxc.uts.name = <keep the original value>
lxc.net.0.hwaddr = <keep the original value>

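As you might have guessed, the very important line is 'lxc.rootfs.path'. Its format is overlayfs:<lower directory>:<upper directory>: the base ROOTFS is the shared lower layer, and each instance's writes land in its own upper directory. For details, look up the lxc.container.conf manpage.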
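The manual config edit could probably be scripted. Here is a minimal sketch, assuming the /opt/srv/lxc layout used in this post and that each key sits on a single `key = value` line. The function name `make_overlay_config` is made up for illustration; it is not part of LXC.

```shell
#!/bin/sh
# Hypothetical helper: rebuild an instance config from the base config.
# Usage: make_overlay_config <lxc_dir> <base_name> <instance_name>
make_overlay_config() {
    lxc_dir=$1
    base=$2
    inst=$3
    inst_cfg="$lxc_dir/$inst/config"

    # Save the instance's own hostname and MAC address lines first.
    keep=$(grep -E '^lxc\.(uts\.name|net\.0\.hwaddr)[[:space:]]*=' "$inst_cfg" || true)

    {
        # Everything from the base config except the three per-instance keys.
        sed -E '/^lxc\.(rootfs\.path|uts\.name|net\.0\.hwaddr)[[:space:]]*=/d' \
            "$lxc_dir/$base/config"
        # Shared lower layer from the base, private upper layer per instance.
        echo "lxc.rootfs.path = overlayfs:$lxc_dir/$base/rootfs:$lxc_dir/$inst/overlay/upper"
        # The instance's original lxc.uts.name and lxc.net.0.hwaddr, if any.
        if [ -n "$keep" ]; then printf '%s\n' "$keep"; fi
    } > "$inst_cfg.new"

    mv "$inst_cfg.new" "$inst_cfg"
}
```

With the directories from this post, `make_overlay_config /opt/srv/lxc albase alups1` would regenerate the config of 'alups1'. It's worth diffing the result against a hand-edited config before trusting it.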

Bravo. We're ready to launch the two unprivileged and 'instantiated' containers. Check disk usage to make sure that 'alups1' and 'alups2' are indeed sharing the ROOTFS of 'albase':

# du -ksh /opt/srv/lxc/*
14.2M	/opt/srv/lxc/albase
59.0K	/opt/srv/lxc/alups1
59.0K	/opt/srv/lxc/alups2

Information seems scarce online. This post is perhaps the first one to lay bare how to use LXC built-in overlayfs. I can understand why LXC isn't promoting the feature. Careless users will shoot themselves in the foot. IMO, it's a matter of IT policy.

For example, system update should only happen in 'albase' the base container. Never in the instantiated containers, 'alups1' and 'alups2'. Will play around for a week or two and see if any other issues.
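One way to police the "update only in the base" rule might be to check whether an instance's upper layer has picked up its own copy of the package database, which happens as soon as packages are installed or upgraded inside that instance. A minimal sketch, assuming the overlay/upper layout from above and Alpine's package database under lib/apk/db (an assumption, adjust for other distros); both function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical check: true if the upper layer ($1) shadows the package DB.
# $2 is the DB path relative to the rootfs; lib/apk/db is assumed for Alpine.
instance_touched_pkgdb() {
    [ -e "$1/${2:-lib/apk/db}" ]
}

# Print the name of every instance under $1 whose upper layer contains
# its own package DB, i.e. that was updated outside the base container.
report_drift() {
    for upper_dir in "$1"/*/overlay/upper; do
        [ -d "$upper_dir" ] || continue
        if instance_touched_pkgdb "$upper_dir"; then
            inst_dir=${upper_dir%/overlay/upper}   # strip the overlay suffix
            echo "${inst_dir##*/}"                 # keep only the container name
        fi
    done
}
```

Running `report_drift /opt/srv/lxc` prints nothing as long as no instance has touched its package database.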

I believe I accomplished all my goals in the OP in a surprisingly short time and as a first-time OpenWrt user.

Criticism, suggestions and better practices are welcome as usual.

RadioOperator · June 21, 2024, 4:08am

Hi, have you run ArchLinux in OpenWrt? Or just a Linux container in OpenWrt?

If I get a RAX3000M, I'll have 64 GB eMMC. I wonder if I could run a full version of Ubuntu server on it.

kvic · June 21, 2024, 4:29am

For the purpose of this discussion, we shall divide a Linux distribution e.g. Ubuntu server into:

boot loader(s)
kernel
user space i.e. everything else. Most users actually see this as full distro

For RAX3000M, you definitely can run full Ubuntu server with the caveats that you still need

boot loaders from OpenWrt
kernel from OpenWrt

and you need to do some surgery to stitch them together.

Unless it's for science, I would suggest sticking with LXC containers. The performance of Ubuntu server will be the same running bare metal vs inside LXC on RAX3000M.

kvic · June 22, 2024, 1:25am

So this is already done above. But I was wondering if we could improve upon it and make the whole system a bit more streamlined and coherent. The 'new' idea in my mind was to run it like an x86-64 PC box. The root partition would be writable persistent storage, and going forward it would just be updated like a Linux PC.

I saw the option to compile a 'ext4 rootfs' and I saw the boot loaders, dts binary blobs already there in the OpenWrt build directory. So why not try to stitch them together for a spin?

I dumped the 'ext4 rootfs' to '/dev/mmcblk0p7' on an existing squashfs-style image disk, created '/boot' and copied the kernel and dts blobs into it. Booted into U-Boot. Poked around various macros and tried to load the dts blobs and the kernel. Everything loaded with the 'ext4load' command, but failed to execute with the 'booti' command, for the obvious reason that I had little idea about U-Boot.

By looking at the existing U-Boot macros, I got the impression that they're very well written to be fail-safe. And then I stumbled on this post (of which I took a screen capture but couldn't quickly find the URL in my browser cache):

That's the moment I stopped digging further because:

it deviates from my original goal of staying as close to OpenWrt's defaults as possible to minimize customisation & maintenance in the long run
I can't quickly hack up a U-boot script to be as fail-safe as OpenWrt's default. Not one that's even close to functional.
I'll lose the ability to use F2FS for ROOTFS without further effort
I'll have to add custom build workflow to package for 'ext4 rootfs'

With that said, the trend of ARM systems growing bigger & bigger is a sure thing. A device like Banana Pi R4 is 'big' enough that it need not function like a traditional consumer device. So it would make a lot of sense for 'ext4 rootfs' to become a standard feature/option in a future OpenWrt release, made as fail-safe as its squashfs counterparts.

daniel · June 28, 2024, 1:22am

Well, in order to achieve that it will have to be read-only. Using ext4 instead of squashfs could still be seen as an advantage because it's much faster and less resource-hungry than reading from squashfs. Android does it that way as well...

Having a single read/write rootfs like traditional desktop or server Linux distributions is not really feasible on headless devices as the user doesn't have the option to "boot the old kernel" in GRUB menu or choose single-user boot, simply due to the typical lack of a local console.

A button and some LEDs is usually as good as it gets for OpenWrt devices (without having to open the case), so that has to be enough. Being able to more or less safely update remote devices, or even devices mounted in hard-to-access places like roofs or towers, is another key feature of our distribution which drives us towards trying to eliminate any possible single point of failure (such as a /boot filesystem, or a single read/write rootfs).

RadioOperator · July 2, 2024, 4:46pm

Hi, I've installed docker / docker-compose on my RAX3000M eMMC, OpenWrt 23.05.3.
Will try to run a database in docker.

I do not need to run a full-version Linux in docker; my target is an IoT database. Thanks.

BlueRaspberry · October 22, 2024, 2:51pm

How did you get this working? I was able to create the extra partition and use it, but after a sysupgrade the partition was removed. I guess the GPT header was overwritten but the partition and the data are still intact, so the GPT header just needs to be edited to be aware of the partition.

I don't know how to do this, but am curious how you performed a sysupgrade so that this problem didn't occur?