Bizarre Puzzle: Image Too SMALL To Restore Settings

I've been chasing down an intermittent failure to restore settings on sysupgrade on a device under development (GL-AR300M NOR flash, NAND-enabled ath79 kernel) and thought I had found it a few times. No, it wasn't #!/bin.sh, and it wasn't even some mysterious package making it work after the reboot, even though adding packages seemed to resolve the problem.

After determining that the "magic package" was one that just has my build info in it, no executable content, I looked to size alone. Sure enough, adding "incompressible" data to the ROM was enough to prevent/enable the failure.

It seems as if there is a point at which the upgrade image is too small to successfully preserve/restore settings.

Does this ring a bell for anyone?

Edit: Current hypothesis is that while the data is being written for /sysupgrade.tgz in both cases, it seems to be in the "wrong" place for the smaller rootfs image, "stuck" at 0x400000, rather than 0x3f0000.

mtdsplit is done using mtd_get_squashfs_len() which reads the super-block and is consistent with binwalk.

mtd -j /tmp/sysupgrade.tgz write - looks for deadc0de to determine where to write the JFFS2 data. It finds it at 0x400000 in both cases.
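
For anyone following along, here is a rough Python sketch of that kind of marker search. It is not the mtd utility's actual code. It assumes the deadc0de marker sits at the start of an erase block and that the NOR erase size is 64 KiB (0x10000). The function name and the synthetic image are made up for illustration.

```python
from typing import List

DEADC0DE = bytes.fromhex("deadc0de")


def find_marker_offsets(image: bytes, erase_size: int = 0x10000) -> List[int]:
    """Return every erase-block-aligned offset where the marker begins."""
    return [
        off
        for off in range(0, len(image), erase_size)
        if image[off:off + 4] == DEADC0DE
    ]


# Synthetic stand-in for an image: erased flash (0xff) with one marker.
image = bytearray(b"\xff" * 0x410000)
image[0x400000:0x400004] = DEADC0DE

print([hex(o) for o in find_marker_offsets(bytes(image))])  # ['0x400000']
```

Running the same function over a real sysupgrade.bin (read with open(path, "rb").read()) would show whether the marker offset moves when the squashfs size changes.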

Curiously, the sysupgrade.bin is the same length in both cases, even though the squashfs portion is of different sizes, crossing an erase block boundary.



On a "successful" upgrade, one can see the JFFS2 file system being initialized early, and the config from /sysupgrade.tgz being restored

1966080       0x1E0000        Squashfs filesystem, little endian, version 4.0, compression:xz, 
size: 2198678 bytes, 852 inodes, blocksize: 262144 bytes, created: 2019-06-29 22:09:24

Press the [f] key and hit [enter] to enter failsafe mode
Press the [1], [2], [3] or [4] key and hit [enter] to select the debug level
[    7.393991] eth0: link up (1000Mbps/Full duplex)
[    7.398814] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[    9.588446] jffs2_scan_eraseblock(): End of filesystem marker found at 0x10000
[    9.598227] jffs2_build_filesystem(): unlocking the mtd device... 
[    9.598263] done.
[    9.606662] jffs2_build_filesystem(): erasing all blocks after the end marker... 
[   52.629010] done.
[   52.638785] jffs2: notice: (437) jffs2_build_xattr_subsystem: complete building xattr subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.
[   52.656267] mount_root: overlay filesystem has not been fully initialized yet
[   52.671818] mount_root: switching to jffs2 overlay
[   52.700774] overlayfs: upper fs does not support tmpfile.
- config restore -
[   52.994026] urandom-seed: Seed file not found (/etc/urandom.seed)
[   53.146404] eth0: link down
[   53.161137] procd: - early -
[   53.165061] procd: - watchdog -
[   53.774244] procd: - watchdog -
[   53.777770] procd: - ubus -
[   53.912826] urandom_read: 5 callbacks suppressed
[   53.912835] random: ubusd: uninitialized urandom read (4 bytes read)
[   53.924779] random: ubusd: uninitialized urandom read (4 bytes read)
[   53.932833] procd: - init -
Please press Enter to activate this console.

However, if the image is too small (same kernel config and executable content), the JFFS2 initialization happens much later, and /sysupgrade.tgz isn't found

1966080       0x1E0000        Squashfs filesystem, little endian, version 4.0, compression:xz, 
size: 2152274 bytes, 842 inodes, blocksize: 262144 bytes, created: 2019-06-29 22:09:24

Press the [f] key and hit [enter] to enter failsafe mode
Press the [1], [2], [3] or [4] key and hit [enter] to select the debug level
[    7.393980] eth0: link up (1000Mbps/Full duplex)
[    7.398796] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[    9.580257] mount_root: no usable overlay filesystem found, using tmpfs overlay
[    9.612919] urandom-seed: Seed file not found (/etc/urandom.seed)
[    9.711420] eth0: link down
[    9.727194] procd: - early -
[    9.730286] procd: - watchdog -
[   10.360441] procd: - watchdog -
[   10.364026] procd: - ubus -
[   10.570984] urandom_read: 5 callbacks suppressed
[   10.570992] random: ubusd: uninitialized urandom read (4 bytes read)
[   10.583195] random: ubusd: uninitialized urandom read (4 bytes read)
[   10.591176] procd: - init -
Please press Enter to activate this console.
[   11.267951] kmodloader: loading kernel modules from /etc/modules.d/*
[   11.315830] Loading modules backported from Linux version v4.19.32-0-g3a2156c839c7
[   11.323711] Backport generated by backports.git v4.19.32-1-0-g1c4f7569
[   11.395011] xt_time: kernel timezone is -0000
[   11.483242] urngd: v1.0.0 started.
[   11.523815] PPP generic driver version 2.4.2
[   11.543766] NET: Registered protocol family 24
[   11.674518] ieee80211 phy0: Atheros AR9531 Rev:2 mem=0xb8100000, irq=13
[   11.773150] kmodloader: done loading kernel modules from /etc/modules.d/*
[   11.942003] random: crng init done
[   28.809813] jffs2_scan_eraseblock(): End of filesystem marker found at 0x20000
[   28.822994] jffs2_build_filesystem(): unlocking the mtd device... 
[   28.823057] done.
[   28.831416] jffs2_build_filesystem(): erasing all blocks after the end marker... 
[   31.938885] eth0: link up (1000Mbps/Full duplex)
[   31.954534] br-lan: port 1(eth0) entered blocking state
[   31.959941] br-lan: port 1(eth0) entered disabled state
[   31.965728] device eth0 entered promiscuous mode
[   32.016430] br-lan: port 1(eth0) entered blocking state
[   32.021837] br-lan: port 1(eth0) entered forwarding state
[   32.027694] IPv6: ADDRCONF(NETDEV_UP): br-lan: link is not ready
[   32.148202] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
[   32.992832] IPv6: ADDRCONF(NETDEV_CHANGE): br-lan: link becomes ready
[   73.697396] done.
[   73.699438] jffs2: notice: (1041) jffs2_build_xattr_subsystem: complete building xattr subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.
[   73.872010] overlayfs: upper fs does not support tmpfile.

I've since narrowed it down to between 2,157,546 and 2,162,766 bytes by adding an already-compressed file to the root fs.

>>> print("0x{:x}".format(2157546))
0x20ebea
>>> print("0x{:x}".format(2162766))
0x21004e

>>> print("0x{:x}".format(2157546+0x1E0000))
0x3eebea
>>> print("0x{:x}".format(2162766+0x1E0000))
0x3f004e
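
A sketch of how padding for this kind of size bisection could be produced. The thread does not say how the extra file was generated, so the use of os.urandom and the helper name are assumptions. The point is only that random bytes do not shrink under xz, so the squashfs grows by roughly the amount added.

```python
import lzma
import os


def make_padding(n_bytes: int) -> bytes:
    """Return n_bytes of random data, which is effectively incompressible."""
    return os.urandom(n_bytes)


# Gap between the largest failing and smallest working sizes seen so far.
gap = 2162766 - 2157546
blob = make_padding(gap)

print(gap)                                     # 5220
print(len(lzma.compress(blob)) >= len(blob))   # xz cannot shrink random data
```

Halving the padding size on each build would narrow the threshold further.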

From failsafe on reboot, working:

root@(none):/# cat /proc/mtd 
dev:    size   erasesize  name
mtd0: 00040000 00010000 "u-boot"
mtd1: 00010000 00010000 "u-boot-env"
mtd2: 00fa0000 00010000 "firmware"
mtd3: 001e0000 00010000 "kernel"
mtd4: 00dc0000 00010000 "rootfs"
mtd5: 00ba0000 00010000 "rootfs_data"
mtd6: 00010000 00010000 "art"
mtd7: 00200000 00020000 "nand_kernel"
mtd8: 07e00000 00020000 "nand_ubi"

root@(none):~# hexdump -C -n 1024 /dev/mtd5
00000000  19 85 20 03 00 00 00 0c  f0 60 dc 98 19 85 e0 01  |.. ......`......|
00000010  00 00 00 36 5d 44 48 fe  00 00 00 01 00 00 00 00  |...6]DH.........|
00000020  00 00 00 02 00 00 00 00  0e 08 00 00 97 8f 0a 5b  |...............[|
00000030  31 ff 3d bc 73 79 73 75  70 67 72 61 64 65 2e 74  |1.=.sysupgrade.t|
00000040  67 7a ff ff 19 85 e0 02  00 00 10 44 ee 2d 30 6f  |gz.........D.-0o|

and failing:

root@(none):/# cat /proc/mtd 
dev:    size   erasesize  name
mtd0: 00040000 00010000 "u-boot"
mtd1: 00010000 00010000 "u-boot-env"
mtd2: 00fa0000 00010000 "firmware"
mtd3: 001e0000 00010000 "kernel"
mtd4: 00dc0000 00010000 "rootfs"
mtd5: 00bb0000 00010000 "rootfs_data"
mtd6: 00010000 00010000 "art"
mtd7: 00200000 00020000 "nand_kernel"
mtd8: 07e00000 00020000 "nand_ubi"

root@(none):/# hexdump -C -n 1024 /dev/mtd5
00000000  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00000400

Though I'm not yet convinced whether it is a write problem or an MTD-split / mount_root problem.
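
To make the two dumps easier to compare, here is a small sketch that classifies the first bytes of a partition. It assumes big-endian JFFS2 node headers (which matches the 19 85 byte order in the dumps) and uses the magic 0x1985 plus the cleanmarker/dirent/inode node type values as I understand them from the kernel's jffs2.h. The function name is made up.

```python
import struct

JFFS2_MAGIC = 0x1985
# Node type values as understood from the kernel's jffs2.h (assumption).
NODE_TYPES = {0x2003: "cleanmarker", 0xE001: "dirent", 0xE002: "inode"}


def classify_head(head: bytes) -> str:
    """Describe what the first bytes of an MTD partition look like."""
    if len(head) < 4:
        return "too short"
    if head[:4] == bytes.fromhex("deadc0de"):
        return "deadc0de marker"
    if all(b == 0xFF for b in head):
        return "erased flash"
    magic, node_type = struct.unpack(">HH", head[:4])
    if magic == JFFS2_MAGIC:
        return "JFFS2 " + NODE_TYPES.get(node_type, hex(node_type))
    return "unknown"


working = bytes.fromhex("19 85 20 03 00 00 00 0c f0 60 dc 98 19 85 e0 01")
failing = b"\xff" * 16

print(classify_head(working))  # JFFS2 cleanmarker
print(classify_head(failing))  # erased flash
```

So in the working case mtd5 starts with a JFFS2 cleanmarker followed by the sysupgrade.tgz dirent, while in the failing case the start of mtd5 is simply erased.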

You mean 2 MB + 1 block of 64kB ? 2162688

I would definitely start looking from the MTD split logic.
Pure hunch, but maybe there is optimisation logic to speed up things by assuming that the rootfs is always at least 2 MB and let's skip those first megabytes.

Based on the logs, the end marker should be found from 0x10000 but in the failing run that is found from 0x20000. Likely one of the backup markers put there by padding logic (as I remember seeing surplus markers in the files)


Related possible MTD angle:

Does the EOF marker scanner skip the size of the kernel (or the reserved max size of the kernel, not sure if that info is available), and only start scanning rootfs after the size of the kernel plus the minimum 3 erase blocks?
(your kernel is 0x1e0000, and the critical size of rootfs is that plus 3 blocks, 0x210000)
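
A quick check of the arithmetic in the two guesses so far, nothing more. The constants come from the logs in this thread and the variable names are mine. Both expressions land on 0x210000, and that value falls inside the measured failing/working bracket.

```python
ERASE_BLOCK = 0x10000   # 64 KiB NOR erase block, per /proc/mtd
KERNEL_SIZE = 0x1E0000  # "kernel" partition size, per /proc/mtd

two_mb_plus_block = 2 * 1024 * 1024 + ERASE_BLOCK
kernel_plus_three = KERNEL_SIZE + 3 * ERASE_BLOCK

print(two_mb_plus_block, hex(two_mb_plus_block))   # 2162688 0x210000
print(hex(kernel_plus_three))                      # 0x210000

# Largest failing and smallest working squashfs sizes reported above.
print(2157546 < two_mb_plus_block < 2162766)       # True
```

That the two values coincide may just be an accident of this particular kernel size.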

There could be faulty logic where the size of the kernel is also skipped in the secondary rootfs split (e.g. if the counter is not reset after finding the kernel).

If I remember right, earlier (pre-2015 or so...) the kernel was split manually, and only the rootfs end was detected. So, in those days there was only one split for most routers.


Looking at the jffs2 source code, you might compile with the jffs2 debugging options enabled, so that you can see how the scanning goes.

define CONFIG_JFFS2_FS_DEBUG to be 1 in fs/jffs2/debug.h
as that seems to be widely used in scanning in https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/fs/jffs2/scan.c?h=v4.19.56


Thanks, just getting going here in the US, but for anyone "following along at home", the location of the partitions is below.

(Coffee still not completely hitting yet, but debugging is going to be a challenge, as the "empty" config is within ~10 kB of "big enough")

2111 -- "Failing"

[    0.407200] m25p80 spi0.0: w25q128 (16384 Kbytes)
[    0.412103] 4 fixed-partitions partitions found on MTD device spi0.0
[    0.418693] Creating 4 MTD partitions on "spi0.0":
[    0.423670] 0x000000000000-0x000000040000 : "u-boot"
[    0.429644] 0x000000040000-0x000000050000 : "u-boot-env"
[    0.435987] 0x000000050000-0x000000ff0000 : "firmware"
[    0.444489] 2 uimage-fw partitions found on MTD device firmware
[    0.450611] Creating 2 MTD partitions on "firmware":
[    0.455796] 0x000000000000-0x0000001e0000 : "kernel"
[    0.461768] 0x0000001e0000-0x000000fa0000 : "rootfs"
[    0.467685] mtd: device 4 (rootfs) set to be root filesystem
[    0.475146] 1 squashfs-split partitions found on MTD device rootfs
[    0.481553] 0x0000003f0000-0x000000fa0000 : "rootfs_data"
[    0.488014] 0x000000ff0000-0x000001000000 : "art"
[    0.495762] spi-nand spi0.1: GigaDevice SPI NAND was found.
[    0.501537] spi-nand spi0.1: 128 MiB, block size: 128 KiB, page size: 2048, OOB size: 128
[    0.510143] 2 fixed-partitions partitions found on MTD device spi0.1
[    0.516746] Creating 2 MTD partitions on "spi0.1":
[    0.521701] 0x000000000000-0x000000200000 : "nand_kernel"
[    0.532065] 0x000000200000-0x000008000000 : "nand_ubi"

2058 -- "Working"

[    0.407172] m25p80 spi0.0: w25q128 (16384 Kbytes)
[    0.412078] 4 fixed-partitions partitions found on MTD device spi0.0
[    0.418667] Creating 4 MTD partitions on "spi0.0":
[    0.423646] 0x000000000000-0x000000040000 : "u-boot"
[    0.429620] 0x000000040000-0x000000050000 : "u-boot-env"
[    0.435962] 0x000000050000-0x000000ff0000 : "firmware"
[    0.444452] 2 uimage-fw partitions found on MTD device firmware
[    0.450576] Creating 2 MTD partitions on "firmware":
[    0.455762] 0x000000000000-0x0000001e0000 : "kernel"
[    0.461735] 0x0000001e0000-0x000000fa0000 : "rootfs"
[    0.467653] mtd: device 4 (rootfs) set to be root filesystem
[    0.475115] 1 squashfs-split partitions found on MTD device rootfs
[    0.481519] 0x000000400000-0x000000fa0000 : "rootfs_data"
[    0.487978] 0x000000ff0000-0x000001000000 : "art"
[    0.495728] spi-nand spi0.1: GigaDevice SPI NAND was found.
[    0.501503] spi-nand spi0.1: 128 MiB, block size: 128 KiB, page size: 2048, OOB size: 128
[    0.510108] 2 fixed-partitions partitions found on MTD device spi0.1
[    0.516711] Creating 2 MTD partitions on "spi0.1":
[    0.521667] 0x000000000000-0x000000200000 : "nand_kernel"
[    0.532035] 0x000000200000-0x000008000000 : "nand_ubi"

Looks like a boot-time problem:

			echo "deadc0de in file:"
			hexdump -C "$upgrade_file" | egrep -B 10 'de ad c0 de'
			default_do_upgrade "$upgrade_file"
			echo "deadc0de|sysup in /dev/mtd2:"
			hexdump -C /dev/mtd2 | egrep -B 10 'de ad c0 de|sysup'
FAILING
=======

deadc0de in file:
003eeb70  53 b0 be 2c 86 0d c4 bf  85 8e bf e8 ae ae 7b 3f  |S..,..........{?|
003eeb80  f8 38 63 9d e5 65 56 61  31 84 93 5b 40 d9 67 6f  |.8c..eVa1..[@.go|
003eeb90  bb be 10 71 b1 e8 93 26  72 a5 bc 1e 35 31 c5 13  |...q...&r...51..|
003eeba0  d2 7b d9 57 46 9e 2a 7c  f5 ce e9 cb 5a 5d 01 64  |.{.WF.*|....Z].d|
003eebb0  ae 81 de 18 cc b7 00 00  16 d8 2d 58 00 01 ef 07  |..........-X....|
003eebc0  d8 34 00 00 a0 7d e5 56  3e 30 0d 8b 02 00 00 00  |.4...}.V>0......|
003eebd0  00 01 59 5a be e7 20 00  00 00 00 00 04 80 00 00  |..YZ.. .........|
003eebe0  00 00 dc eb 20 00 00 00  00 00 ff ff ff ff ff ff  |.... ...........|
003eebf0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00400000  de ad c0 de 00 00 00 00  00 00 00 00 7b 20 20 22  |............{  "|
Unlocking firmware ...

Writing from <stdin> to firmware ...     
Appending jffs2 data from /tmp/sysupgrade.tgz to firmware..                                                                                                           
deadc0de|sysup in /dev/mtd2:
003eeba0  d2 7b d9 57 46 9e 2a 7c  f5 ce e9 cb 5a 5d 01 64  |.{.WF.*|....Z].d|
003eebb0  ae 81 de 18 cc b7 00 00  16 d8 2d 58 00 01 ef 07  |..........-X....|
003eebc0  d8 34 00 00 a0 7d e5 56  3e 30 0d 8b 02 00 00 00  |.4...}.V>0......|
003eebd0  00 01 59 5a be e7 20 00  00 00 00 00 04 80 00 00  |..YZ.. .........|
003eebe0  00 00 dc eb 20 00 00 00  00 00 ff ff ff ff ff ff  |.... ...........|
003eebf0  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00400000  19 85 20 03 00 00 00 0c  f0 60 dc 98 19 85 e0 01  |.. ......`......|
00400010  00 00 00 36 5d 44 48 fe  00 00 00 01 00 00 00 00  |...6]DH.........|
00400020  00 00 00 02 00 00 00 00  0e 08 00 00 97 8f 0a 5b  |...............[|
00400030  31 ff 3d bc 73 79 73 75  70 67 72 61 64 65 2e 74  |1.=.sysupgrade.t|
--
00401790  67 86 fb 52 42 8a c8 4d  e0 9f 7b 5e f8 c2 f8 ff  |g..RB..M..{^....|
004017a0  9d ff bf f2 8a fd 9f fe  fe bf a6 30 7d d2 97 42  |...........0}..B|
004017b0  e1 ff c5 ff 7d d8 8d 3c  e9 ff 73 23 f8 2f af ac  |....}..<..s#./..|
004017c0  ff 95 37 a9 ff bf 75 ed  ff 8f d4 2e 5d 19 97 ca  |..7...u.....]...|
004017d0  07 6c 2c 70 87 3f 79 e8  90 68 87 53 cc cb c7 5d  |.l,p.?y..h.S...]|
004017e0  5d db 27 43 40 19 0b d2  88 ee fb b3 1d 7e 3a 8d  |].'C@........~:.|
004017f0  a4 21 0d 69 48 c3 3f 12  7e 02 53 b9 1c a6 00 96  |.!.iH.?.~.S.....|
00401800  00 00 ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
00401810  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00410000  de ad c0 de ff ff ff ff  ff ff ff ff ff ff ff ff  |................|

SUCCEEDING
==========

deadc0de in file:
003effd0  86 51 bb 80 53 b0 be 2c  86 0d c4 bf 85 8e bf e8  |.Q..S..,........|
003effe0  ae ae 7b 3f f8 38 63 9d  e5 65 56 61 31 84 93 5b  |..{?.8c..eVa1..[|
003efff0  40 d9 67 6f bb be 10 71  b1 e8 93 26 72 a5 bc 1e  |@.go...q...&r...|
003f0000  35 31 c5 13 d2 7b d9 57  46 9e 2a 7c f5 ce e9 cb  |51...{.WF.*|....|
003f0010  5a 5d 01 64 ae 81 de 18  cc b7 00 00 16 d8 2d 58  |Z].d..........-X|
003f0020  00 01 ef 07 d8 34 00 00  a0 7d e5 56 3e 30 0d 8b  |.....4...}.V>0..|
003f0030  02 00 00 00 00 01 59 5a  22 fc 20 00 00 00 00 00  |......YZ". .....|
003f0040  04 80 00 00 00 00 40 00  21 00 00 00 00 00 ff ff  |......@.!.......|
003f0050  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00400000  de ad c0 de 00 00 00 00  00 00 00 00 7b 20 20 22  |............{  "|
Unlocking firmware ...

Writing from <stdin> to firmware ...     
Appending jffs2 data from /tmp/sysupgrade.tgz to firmware..                                                                                                           
deadc0de|sysup in /dev/mtd2:
003f0000  35 31 c5 13 d2 7b d9 57  46 9e 2a 7c f5 ce e9 cb  |51...{.WF.*|....|
003f0010  5a 5d 01 64 ae 81 de 18  cc b7 00 00 16 d8 2d 58  |Z].d..........-X|
003f0020  00 01 ef 07 d8 34 00 00  a0 7d e5 56 3e 30 0d 8b  |.....4...}.V>0..|
003f0030  02 00 00 00 00 01 59 5a  22 fc 20 00 00 00 00 00  |......YZ". .....|
003f0040  04 80 00 00 00 00 40 00  21 00 00 00 00 00 ff ff  |......@.!.......|
003f0050  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00400000  19 85 20 03 00 00 00 0c  f0 60 dc 98 19 85 e0 01  |.. ......`......|
00400010  00 00 00 36 5d 44 48 fe  00 00 00 01 00 00 00 00  |...6]DH.........|
00400020  00 00 00 02 00 00 00 00  0e 08 00 00 97 8f 0a 5b  |...............[|
00400030  31 ff 3d bc 73 79 73 75  70 67 72 61 64 65 2e 74  |1.=.sysupgrade.t|
--
00401790  67 86 fb 52 42 8a c8 4d  e0 9f 7b 5e f8 c2 f8 ff  |g..RB..M..{^....|
004017a0  9d ff bf f2 8a fd 9f fe  fe bf a6 30 7d d2 97 42  |...........0}..B|
004017b0  e1 ff c5 ff 7d d8 8d 3c  e9 ff 73 23 f8 2f af ac  |....}..<..s#./..|
004017c0  ff 95 37 a9 ff bf 75 ed  ff 8f d4 2e 5d 19 97 ca  |..7...u.....]...|
004017d0  07 6c 2c 70 87 3f 79 e8  90 68 87 53 cc cb c7 5d  |.l,p.?y..h.S...]|
004017e0  5d db 27 43 40 19 0b d2  88 ee fb b3 1d 7e 3a 8d  |].'C@........~:.|
004017f0  a4 21 0d 69 48 c3 3f 12  7e 02 53 b9 1c a6 00 96  |.!.iH.?.~.S.....|
00401800  00 00 ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
00401810  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00410000  de ad c0 de ff ff ff ff  ff ff ff ff ff ff ff ff  |................|


debug in mtd_get_squashfs_len() looks OK, and 0x210000 definitely looks like the tipping point

Fail

[    0.467242] mtd: device 4 (rootfs) set to be root filesystem
[    0.474683] mtdsplit: squashfs length: 0x20fcea
[    0.479378] 1 squashfs-split partitions found on MTD device rootfs
[    0.485826] 0x0000003f0000-0x000000fa0000 : "rootfs_data"
[    0.492235] 0x000000ff0000-0x000001000000 : "art"

OK

[    0.467252] mtd: device 4 (rootfs) set to be root filesystem
[    0.474691] mtdsplit: squashfs length: 0x2101a6
[    0.479388] 1 squashfs-split partitions found on MTD device rootfs
[    0.485835] 0x000000400000-0x000000fa0000 : "rootfs_data"
[    0.492253] 0x000000ff0000-0x000001000000 : "art"


Edit: Might be where things are being written by default_do_upgrade() and its call to mtd -j sysupgrade.tgz write -- The JFFS2 data begins at 0x400000 in both cases.

Failure with:

003eb6f0  3e 30 0d 8b 02 00 00 00  00 01 59 5a de b2 20 00  |>0........YZ.. .|
003eb700  00 00 00 00 04 80 00 00  00 00 04 b7 20 00 00 00  |............ ...|
003eb710  00 00 ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
003eb720  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00400000  19 85 20 03 00 00 00 0c  f0 60 dc 98 19 85 e0 01  |.. ......`......|
00400010  00 00 00 36 5d 44 48 fe  00 00 00 01 00 00 00 00  |...6]DH.........|
00400020  00 00 00 02 00 00 00 00  0e 08 00 00 97 8f 0a 5b  |...............[|
00400030  31 ff 3d bc 73 79 73 75  70 67 72 61 64 65 2e 74  |1.=.sysupgrade.t|
00400040  67 7a ff ff 19 85 e0 02  00 00 10 44 ee 2d 30 6f  |gz.........D.-0o|

Success with:

003effe0  3e 30 0d 8b 02 00 00 00  00 01 59 5a ce fb 20 00  |>0........YZ.. .|
003efff0  00 00 00 00 04 80 00 00  00 00 f4 ff 20 00 00 00  |............ ...|
003f0000  00 00 ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
003f0010  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00400000  19 85 20 03 00 00 00 0c  f0 60 dc 98 19 85 e0 01  |.. ......`......|
00400010  00 00 00 36 5d 44 48 fe  00 00 00 01 00 00 00 00  |...6]DH.........|
00400020  00 00 00 02 00 00 00 00  0e 08 00 00 97 8f 0a 5b  |...............[|
00400030  31 ff 3d bc 73 79 73 75  70 67 72 61 64 65 2e 74  |1.=.sysupgrade.t|
00400040  67 7a ff ff 19 85 e0 02  00 00 10 44 ee 2d 30 6f  |gz.........D.-0o|

Looking at your edited additional debug output on the forum and also at the debug log lines in the mailing list message, I have some new ideas for you.

The failure may actually be in the sysupgrade image appending logic. The interesting part is that the appended sysupgrade data starts at 0x400000 in both cases, even though the firmware image ends at 0x3e.... in the failing case. In the failing case, there is an extra 64 kB of empty 0xff before the appended sysupgrade.tgz, and that causes the sysupgrade archive detection problem at boot time (as the detection correctly looks at 0x3f0000 and finds nothing).

Possible mismatch in the erase block size detection? Maybe the sysupgrade creation script (or padjffs) thinks that you have a 128 kB erase block instead of 64 kB, and pads up to 0x400000 instead of 0x3f0000.

  • Your secondary NAND flash seems to have a 128 kB block size? Is there any way that it gets into the picture here and creates confusion about the NOR flash block size? Dual flash systems are rare, which could explain why nobody has stumbled into this earlier.
    spi-nand spi0.1: 128 MiB, block size: 128 KiB, page size: 2048, OOB size: 128
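
The 64 kB vs 128 kB idea can be checked on paper with the squashfs lengths from the mtdsplit debug output. This is a sketch of the hypothesis only. It assumes the kernel places rootfs_data at the next 64 KiB boundary after the squashfs, and that the image builder pads to a 128 KiB boundary. Neither is taken from the actual source, and the helper names are mine.

```python
KERNEL_SIZE = 0x1E0000
NOR_ERASE = 0x10000    # 64 KiB, from /proc/mtd
NAND_ERASE = 0x20000   # 128 KiB, from the spi-nand boot message


def align_up(offset: int, block: int) -> int:
    """Round offset up to the next multiple of block."""
    return (offset + block - 1) // block * block


def compare(squashfs_len: int):
    """Return (where rootfs_data is expected, where padding would end)."""
    end = KERNEL_SIZE + squashfs_len
    return align_up(end, NOR_ERASE), align_up(end, NAND_ERASE)


for label, length in (("fail", 0x20FCEA), ("ok", 0x2101A6)):
    expected, padded = compare(length)
    print(f"{label}: expected {expected:#x}, padded to {padded:#x}, "
          f"gap {padded - expected:#x}")
# fail: expected 0x3f0000, padded to 0x400000, gap 0x10000
# ok: expected 0x400000, padded to 0x400000, gap 0x0
```

Under those assumptions the numbers line up with the logs: the two offsets differ only when the squashfs ends in the lower half of a 128 KiB block, which would also explain why both sysupgrade.bin files have the same length.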
    

I'm thinking that is a likely cause.

That turned out to be the cause.

While I have "fixed" it for "my" devices, I'd like to keep others from tripping over it.

I haven't been able to find out where in the make files the "default" generation of the canonical kernel+squashfs sysupgrade.bin is done, but it is calling padjffs2 (which is a little strange with that name, given that it is padding a squashfs image). Do you know where that definition is done?
