I regularly compile images with a fairly large number of custom files, using the "files" subdirectory in the source tree.
On different devices I adjust or extend some of those files once in a while, and sporadically I sync them back into the source/files directory so that the next build contains the newest file set.
But since it's more important to me that a re-flash keeps the version currently on the device, I also add all of these files/directories to sysupgrade.conf and re-flash while keeping the configuration.
If I understand correctly, all preserved files are still copied back, even when the new rom already contains the same version.
At least overlayfs does not detect duplicate files; rewriting a file with identical content still creates a copy in the upper layer:
root@test:/etc# ll /overlay/upper/etc/banner
ls: /overlay/upper/etc/banner: No such file or directory
root@test:/etc# ll /etc/banner
-rw-r--r-- 1 root root 441 Feb 20 18:13 /etc/banner
root@test:/etc# cp -p /etc/banner /tmp/banner
root@test:/etc# cp -p /tmp/banner /etc/banner
root@test:/etc# ll /overlay/upper/etc/banner
-rw-r--r-- 1 root root 441 Feb 20 18:13 /overlay/upper/etc/banner
So here is my question: is there an option or any magic that skips copying config files after re-flashing if the rom already contains the same version?
And if not, wouldn't it make sense to add such a feature, or is there some hook where I could perform the necessary diff checks and thereby filter the configuration files to be written?
The only alternative would be to take a sysupgrade backup.tgz, remove from it the files that the new rom will contain anyway, and then flash with this adjusted config archive.
root@test:~# ls /overlay/upper/etc/banner
ls: /overlay/upper/etc/banner: No such file or directory
root@test:~# cat /etc/banner > /tmp/banner
root@test:~# cat /tmp/banner > /etc/banner
root@test:~# ls /overlay/upper/etc/banner
/overlay/upper/etc/banner
The same will happen here, since the file is touched when it is written, even though the inode remains the same.
# Here is the interesting thing, if you delete file from Overlay:
root@test:~# rm /overlay/upper/etc/banner
# it will actually be deleted
root@test:~# ls /overlay/upper/etc/banner -lath
ls: /overlay/upper/etc/banner: No such file or directory
No matter whether you later create it again or write other contents, the file is no longer linked; OverlayFS ignores it, and I don't know why.
Update: the /overlay was somehow "dirty" while I was doing this; after a reboot (remount) the contents of /etc/banner again match the one in /overlay.
I showed this example because when config files are restored after flashing a new rom that already contains the same files as the config backup, they still end up in overlay/upper. So overlayfs does not turn these writes into no-ops.
Therefore there should be some option/magic in sysupgrade that skips copying identical files.
Given that sysupgrade should have some space to play with, what with some post-upgrade
packages not being installed yet, it could probably get away with doing some sort of dedup
fsck once it has booted... it has to run some fixups on some platforms anyway. Where magic
is needed is in opkg so minor upgrades to base packages in the lower filesystem can prevent
overlays where possible.
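A rough sketch of what the reporting half of such a post-boot dedup check could look like. The function name list_upper_dups is made up for illustration, it assumes 'find' and 'cmp' are available on the device, and it compares content only (not permissions, ownership or mtime). It deliberately only reports and never deletes, since removing files from the upper directory while the overlay is mounted gave odd results earlier in this thread.

```shell
#!/bin/sh
# Sketch (hypothetical helper): list regular files in an overlay upper
# directory whose content is byte-identical to the same path in the
# read-only lower directory. Report only, nothing is removed.
# Assumption: 'find' and 'cmp' exist; LOWER must be an absolute path.
list_upper_dups() {
    upper="$1"   # e.g. /overlay/upper
    lower="$2"   # e.g. /rom
    (
        cd "$upper" || exit 1
        find . -type f | while IFS= read -r f; do
            rel="${f#./}"
            # Files that only exist in the upper layer are not duplicates
            [ -f "$lower/$rel" ] || continue
            # cmp -s is silent and returns 0 only for identical content
            cmp -s "$f" "$lower/$rel" && printf '%s\n' "$rel"
        done
        exit 0
    )
}
```

On a device this would be called as: list_upper_dups /overlay/upper /rom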
Yes, but I'm actually looking for something that happens earlier, so that I don't have to worry about running out of space.
E.g.: 1MB of compressed files needs that 1MB in rom, but extracted into overlay/upper it may take 3MB.
If this 1MB directory is listed in sysupgrade.conf and also compiled into the rom via source/files, then only the initial clean flash may work, while a sysupgrade that restores the configs may run out of space.
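To get a feeling for the numbers before flashing, something like the following could be used. It is only a sketch: unpacked_bytes and free_kib are names made up here, it assumes a gzip-compressed archive plus 'gzip', 'wc', 'df -P' and 'awk' on the device, and the size of the raw tar stream is just an approximation of what the overlay will really use (the flash filesystem may compress data again).

```shell
#!/bin/sh
# Sketch (hypothetical helpers) for a rough space estimate.

# Size in bytes of the uncompressed tar stream inside a .tgz archive.
# This includes tar headers and padding, so it is an approximation.
unpacked_bytes() {
    gzip -dc "$1" | wc -c | tr -d ' '
}

# Free space in KiB on the filesystem that holds the given path.
# -P keeps each filesystem on a single output line.
free_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}
```

For example: compare $(( $(unpacked_bytes /tmp/backup.tgz) / 1024 )) against $(free_kib /overlay), where /tmp/backup.tgz stands for wherever the backup archive was saved.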
Sysupgrade does not know whether a file is actually in the upper overlay or not. When the backup is restored, overlayfs does nothing special: it does not actively inspect or compare files in any way, neither their stats (such as the modification date) nor their contents.
OverlayFS, at least in the OpenWRT / LEDE implementation, does not have any de-duplication mechanism enabled.
It is possible to write a script that checks each file against the read-only rootfs version and adds it to backup.tgz only if it was actually modified. This could also be done while restoring, but it is easier to do at creation time.
It may also be easier than somehow cleaning up existing files in the upper OverlayFS.
That will back up all uci config files from /etc/config.
In usual environments that is more than enough configuration. If you have other files, you can manually create a tar.gz (.tgz) file; if you use the same path layout as the original backup mechanism, it will be compatible too.
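A minimal sketch of the "filter at creation time" idea, under some assumptions: the helper names changed_vs_rom and make_minimal_backup are invented for this example, 'cmp' and a tar with -T support are available, the list file contains one regular file per line with paths relative to / (no directories), and only file content is compared, not permissions or ownership.

```shell
#!/bin/sh
# Sketch (hypothetical helpers): build a backup archive that contains only
# files which are new or differ from the read-only rom version.

# Read relative paths from stdin; print those that are missing in ROM or
# whose content differs from the ROM copy.
changed_vs_rom() {
    root="$1"   # live root, e.g. /
    rom="$2"    # read-only root, e.g. /rom
    while IFS= read -r rel; do
        [ -f "$root/$rel" ] || continue
        if [ -f "$rom/$rel" ] && cmp -s "$root/$rel" "$rom/$rel"; then
            continue    # identical to rom, no need to back it up
        fi
        printf '%s\n' "$rel"
    done
}

# make_minimal_backup OUT ROOT ROM LIST
# Paths in the archive stay relative to ROOT (etc/config/network, ...).
make_minimal_backup() {
    out="$1"; root="$2"; rom="$3"; list="$4"
    tmp="${TMPDIR:-/tmp}/minimal-backup.$$"
    changed_vs_rom "$root" "$rom" < "$list" > "$tmp" || return 1
    tar -czf "$out" -C "$root" -T "$tmp"
    ret=$?
    rm -f "$tmp"
    return $ret
}
```

Example call, with /tmp/files.list being a hand-written list of relative paths: make_minimal_backup /tmp/backup.tgz / /rom /tmp/files.list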
OK. So to recap: there is currently no such feature, and from the responses so far it looks like I'm the only one really interested in it.
Then back to the last question of my original post: where do I have to hook in a "tar" wrapper that performs the necessary checks at the time when the backup.tar gets extracted over the newly flashed rootfs?
A possible tar wrapper:
#!/bin/sh
BASE="$(basename "$0")"
usage() {
cat <<EOF >&2
Usage: $BASE [--ignore-meta] [--checksum=SUM] TAR_FILE DEST_DIR [TAR_OPTS]
extracts given tar file by skipping identical files in DEST_DIR
TAR_OPTS pass further 'tar' options to underlying 'tar' command
"-C \$DEST_DIR" is already added
--ignore-meta base identical file checks only on its content and
ignore permissions, ownership and times.
--checksum=SUM use "SUM" binary for checksum building when comparing contents.
Default is to use "diff" for an exact content comparison.
RETURN 0 on success.
$*
EOF
exit 1
}
STAT="stat"
CHECKSUM=
while [ -n "$1" ]
do
case "$1" in
--ignore-meta) STAT=""; shift ;;
--checksum=*) CHECKSUM="${1#--checksum=}"; shift ;;
*) break ;;
esac
done
[ -z "$1" ] && usage "Missing arguments"
TAR="$1"
shift
[ ! -f "$TAR" ] && usage "Not file: '$TAR'"
[ -z "$1" ] && usage "Missing DEST_DIR"
DEST_DIR="$1"
shift
[ ! -d "$DEST_DIR" ] && usage "Not a directory: '$DEST_DIR'"
[ -n "$STAT" ] && ! type "$STAT" >/dev/null 2>&1 && usage "Requires '$STAT' package without --ignore-meta"
if [ -z "$CHECKSUM" ]
then
type diff >/dev/null 2>&1 || usage "Requires 'diff' package without --checksum"
if [ -n "$STAT" ]
then
cmp_file() {
local s1 s2
s1="$(stat -c "%A %U:%G %s %Y" "$1")" || return 1
s2="$(stat -c "%A %U:%G %s %Y" "$2")" || return 1
[ "$s1" != "$s2" ] && return 1
diff -q "$1" "$2"
return $?
}
else
cmp_file() {
diff -q "$1" "$2"
return $?
}
fi
else
if [ -n "$STAT" ]
then
cmp_file() {
local s1 s2 c1 c2
s1="$(stat -c "%A %U:%G %s %Y" "$1")" || return 1
s2="$(stat -c "%A %U:%G %s %Y" "$2")" || return 1
[ "$s1" != "$s2" ] && return 1
c1=$("$CHECKSUM" "$1") || return 1
c2=$("$CHECKSUM" "$2") || return 1
[ "$c1" = "$c2" ]
return $?
}
else
cmp_file() {
local c1 c2
c1=$("$CHECKSUM" "$1") || return 1
c2=$("$CHECKSUM" "$2") || return 1
[ "$c1" = "$c2" ]
return $?
}
fi
fi
CTAR="$(readlink -f "$TAR")"
DUPS="${TMP:-/tmp}/tmp-$BASE.dups.$$"
touch "$DUPS"
WORK="${TMP:-/tmp}/tmp-$BASE.work.$$"
tar -t "$@" -f "$CTAR" | while read -r FILE
do
[ ! -f "$DEST_DIR/$FILE" ] && continue
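# The loop runs in a pipeline subshell, so leave it with 'exit', not 'return'
mkdir -p "$WORK" || exit 1
tar -x "$@" -f "$CTAR" -C "$WORK" "$FILE" || exit 1
# Call cmp_file directly; wrapping it in [ ... ] would never run it
cmp_file "$WORK/$FILE" "$DEST_DIR/$FILE" && echo "$FILE" >>"$DUPS"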
rm -rf "$WORK"
done
RET="$?"
rm -rf "$WORK" 2>/dev/null
[ "$RET" -ne 0 ] && exit "$RET"
echo "Skipping identical file(s):" >&2
cat "$DUPS" >&2
tar -x "$@" -f "$CTAR" -C "$DEST_DIR" -X "$DUPS"
exit $?
ISTR the tar does not get extracted over the newly flashed rootfs, rather it gets converted into a jffs2 partition containing the initial contents of the upper layer and flashed directly, along with any adjustments needed to the image length/checksums to make the bootloader happy with the image.
Meanwhile I was looking around the base sources a bit.
There I also found that the sysupgrade binary passes sysupgrade.tgz to mtd, so it gets flashed as is (no extraction here).
But looking into /lib/preinit/80_mount_root, it looks like after the reboot it still exists as a plain tar file and gets extracted there. My understanding is that at that point root (/) is already the final root file system and not some overlay/upper.
Here is the final patch.
I tested /lib/upgrade/untar-minimal.sh separately with various setups (with and without diff and stat installed).
But before I screw up my device, I would like some confirmation that I'm not doing complete crap here.
Thanks
--- /rom/lib/preinit/80_mount_root
+++ /lib/preinit/80_mount_root
@@ -8,7 +8,12 @@
[ -f /sysupgrade.tgz ] && {
echo "- config restore -"
cd /
- tar xzf /sysupgrade.tgz
+ if [ -x /lib/upgrade/untar-minimal.sh ]; then
+ echo "- config restore minimal -"
+ /lib/upgrade/untar-minimal.sh /sysupgrade.tgz /
+ else
+ tar xzf /sysupgrade.tgz
+ fi
}
}
--- /dev/null
+++ /lib/upgrade/untar-minimal.sh
@@ -0,0 +1,141 @@
+#!/bin/sh
+
+BASE="$(basename "$0")"
+
+usage() {
+ cat <<EOF >&2
+ Usage: $BASE [-v] [-d] TAR_FILE [DEST_DIR]
+ extracts TAR_FILE to DEST_DIR (default is '/') and tries to avoid
+ extracting identical files with same content, permissions, mtime and ownership.
+ This avoids occupying unnecessary flash space on overlayfs upper file systems.
+ File comparison is done using 'diff', 'sha512sum' or 'sha256sum' depending
+ on what is installed on the system and prioritized in given order.
+ Falls back to standard 'tar' if none of the above is available.
+ -v verbose output
+ -d dry run. Show only, but don't change anything in \$DEST_DIR
+ RETURN 0 on success
+$*
+EOF
+ exit 1
+}
+
+DRYRUN="" # useful for debugging this script
+VERBOSE=""
+[ "$1" = "-v" ] && VERBOSE="v" && shift
+[ "$1" = "-d" ] && DRYRUN="-d" && VERBOSE="v" && shift
+[ "$1" = "-v" ] && VERBOSE="v" && shift
+
+[ -z "$1" ] && usage "Missing arguments"
+TAR="$1"
+shift
+[ ! -f "$TAR" ] && usage "Not file: '$TAR'"
+
+
+# DEST_DIR is optional and defaults to '/', as documented in usage()
+DEST_DIR="${1:-/}"
+[ -n "$1" ] && shift
+[ ! -d "$DEST_DIR" ] && usage "Not a directory: '$DEST_DIR'"
+
+TYPE="z" # default assuming tgz
+[ -z "${TAR%%*.tar}" ] && TYPE=""
+[ -z "${TAR%%*.bz}" ] && TYPE="j"
+[ -z "${TAR%%*.tbz}" ] && TYPE="j"
+[ -z "${TAR%%*.xz}" ] && TYPE="J"
+[ -z "${TAR%%*.txz}" ] && TYPE="J"
+
+if type stat >/dev/null 2>&1
+then
+ [ -n "$VERBOSE" ] && echo "Using 'stat' for file permission, mtime and ownership comparison" >&2
+ cmp_stat() {
+ local s1 s2
+ s1="$(stat -c "%A %U:%G %s %Y" "$1")" || return 1
+ s2="$(stat -c "%A %U:%G %s %Y" "$2")" || return 1
+ [ "$s1" = "$s2" ]
+ return $?
+ }
+else
+ LS="/bin/ls"
+ [ ! -x "$LS" ] && LS="/usr/bin/ls"
+ [ -n "$VERBOSE" ] && echo "Using '$LS' for file permission, mtime and ownership comparison" >&2
+ # compare size mtime permissions and ownership of given two files
+ cmp_stat() {
+ local s1 s2
+ [ ! -f "$1" ] && return 1
+ [ ! -f "$2" ] && return 1
+ # tricky part since 'ls' output may not be stable. Skipping inode and file name
+ # Example "ls -le" output of busybox ls
+ # -rw-r--r-- 1 root root 441 Mon Feb 20 18:13:44 2017 /etc/banner
+ s1="$("$LS" -le "$1" | sed -e 's/[\t ]\+/ /g' | cut -d' ' -f '1,3-10')" || return 1
+ s2="$("$LS" -le "$2" | sed -e 's/[\t ]\+/ /g' | cut -d' ' -f '1,3-10')" || return 1
+ [ "$s1" = "$s2" ]
+ return $?
+ }
+fi
+
+tar_cmp() {
+ local ret
+ local ctar="$(readlink -f "$TAR")"
+ local dups="${TMP:-/tmp}/tmp-$BASE.dups.$$"
+ local work="${TMP:-/tmp}/tmp-$BASE.work.$$"
+ touch "$dups"
+
+ tar "-t$TYPE" -f "$ctar" | while read -r FILE
+ do
+ [ ! -f "$DEST_DIR/$FILE" ] && continue
+
+ mkdir -p "$work" || continue
+ tar "-x$TYPE" -f "$ctar" -C "$work" "$FILE" || continue
+
+ cmp_files "$work/$FILE" "$DEST_DIR/$FILE" && echo "$FILE" >>"$dups" \
+ && [ -n "$VERBOSE" ] && echo "Skipping identical: $FILE" >&2
+
+ rm -rf "$work" >/dev/null 2>&1
+ done
+ rm -rf "$work" >/dev/null 2>&1
+ [ -n "$DRYRUN" ] || tar "-x$VERBOSE$TYPE" -f "$TAR" -C "$DEST_DIR" -X "$dups"
+ ret="$?"
+ rm "$dups"
+ return $ret
+}
+
+if type "diff" >/dev/null 2>&1
+then
+ [ -n "$VERBOSE" ] && echo "Using 'diff' to compare file content" >&2
+ cmp_files() {
+ cmp_stat "$1" "$2" || return 1
+ diff -q "$1" "$2"
+ return $?
+ }
+ tar_cmp
+elif type "sha512sum" >/dev/null 2>&1
+then
+ [ -n "$VERBOSE" ] && echo "Using 'sha512sum' to compare file content" >&2
+ cmp_files() {
+ local s1 s2
+ cmp_stat "$1" "$2" || return 1
+ s1="$(sha512sum "$1")" || return 1
+ s2="$(sha512sum "$2")" || return 1
+ [ "${s1%% *}" = "${s2%% *}" ]
+ return $?
+ }
+ tar_cmp
+elif type "sha256sum" >/dev/null 2>&1
+then
+ [ -n "$VERBOSE" ] && echo "Using 'sha256sum' to compare file content" >&2
+ cmp_files() {
+ local s1 s2
+ cmp_stat "$1" "$2" || return 1
+ s1="$(sha256sum "$1")" || return 1
+ s2="$(sha256sum "$2")" || return 1
+ [ "${s1%% *}" = "${s2%% *}" ]
+ return $?
+ }
+ tar_cmp
+else
+ [ -n "$VERBOSE" ] && echo "No diff, sha512sum nor sha256sum installed. Falling back to standard 'tar'" >&2
+ [ -n "$DRYRUN" ] || tar "-x$VERBOSE$TYPE" -f "$TAR" -C "$DEST_DIR"
+fi
+
+exit $?
+