How to replicate snapshot builds exactly? (LZMA ERROR 1)

Can anyone give me the steps to replicate a snapshot build locally with exactly the same options? I recently submitted a PR adding support for the Belkin F9K1109, and it has been merged. The snapshot image does not work, though: it fails to uncompress the kernel image during boot with:

Uncompressing Kernel Image ... LZMA ERROR 1 - must RESET board to recover

I pulled the same source down, updated the feeds, and built the image locally with several different option combinations in the image builder, and they all work. All the images I have built also have a smaller kernel than the one in the snapshot, which makes me assume I am missing some image builder options. I saw the "set build defaults for automatic build" option, but with it the kernel was still smaller and still booted.

I am hoping to replicate it exactly to see if I can figure out why the snapshot image will not boot. Any tips or ideas to replicate?

Use config.seed for your build .config from the snapshots download page.

You may also want to compare your toolchain with the one used by the buildbot, if you're going to be pedantic.

It may be an actual error or a random, temporary error in LZMA decompression.
u-boot in some old routers has trouble decompressing images whose LZMA dictionary size (in bits) is too large.

See discussion and links in

https://dev.archive.openwrt.org/ticket/12454.html#comment:12

Your error may not be quite the same, but the discussion may give you hints.

Thanks guys. I will use the config.seed tonight and see if I can replicate locally.

Thanks hnyman. I will go back and look at your linked posts for hints. I ran into the same LZMA error with this router when I was working up the original image. To correct it, I am using a small dictionary size (-d16) in the image make. This results in a dictionary size of 64K, which matches the Belkin firmware image. The weird part is that the snapshot did correctly set the dictionary size, as far as I can tell from the binwalk, so I am assuming it's something else.

In case you are interested here are binwalk examples from snapshot (doesn't work) and a local (does work). Most significant difference is the kernel size. Dictionary seems to be the same.

Snapshot from 3/3/19

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             uImage header, header size: 64 bytes, header CRC: 0xB75B1894, created: 2019-03-03 01:46:05, image size: 1685837 bytes, Data Address: 0x80000000, Entry Point: 0x80000000, data CRC: 0x44230382, OS: Linux, CPU: MIPS, image type: OS Kernel Image, compression type: lzma, image name: "N750F9K1103VB"
64            0x40            LZMA compressed data, properties: 0x6D, dictionary size: 65536 bytes, uncompressed size: 4941756 bytes
1685901       0x19B98D        Squashfs filesystem, little endian, version 4.0, compression:xz, size: 1929830 bytes, 849 inodes, blocksize: 1048576 bytes, created: 2019-03-03 01:46:05

local build with source from 3/3

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             uImage header, header size: 64 bytes, header CRC: 0x5F2B9E49, created: 2019-03-03 01:46:05, image size: 1469187 bytes, Data Address: 0x80000000, Entry Point: 0x80000000, data CRC: 0x199E8E76, OS: Linux, CPU: MIPS, image type: OS Kernel Image, compression type: lzma, image name: "N750F9K1103VB"
64            0x40            LZMA compressed data, properties: 0x6D, dictionary size: 65536 bytes, uncompressed size: 4265108 bytes
1469251       0x166B43        Squashfs filesystem, little endian, version 4.0, compression:xz, size: 2246380 bytes, 1232 inodes, blocksize: 1048576 bytes, created: 2019-03-03 01:46:05
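
The two values being compared in these dumps (the uImage "image size" and the LZMA "uncompressed size") can also be read straight from the file without binwalk. The following is only a minimal sketch, assuming a legacy 64-byte big-endian uImage header followed immediately by an LZMA-alone stream at offset 0x40, as the binwalk output suggests. The function name `read_uimage_lzma` and the dictionary keys are made up for illustration.

```python
import struct

UIMAGE_MAGIC = 0x27051956  # magic number of the legacy U-Boot image format


def read_uimage_lzma(blob: bytes) -> dict:
    """Parse a legacy uImage header and the LZMA-alone header that follows it."""
    if len(blob) < 77:
        raise ValueError("need at least 64 + 13 bytes")
    # 64-byte big-endian header: magic, header CRC, timestamp, data size,
    # load address, entry point, data CRC, then four one-byte fields
    # (os, arch, type, compression) and a 32-byte NUL-padded name.
    (magic, _hcrc, _time, size, load, entry, _dcrc,
     _os, _arch, _type, _comp, name) = struct.unpack(">7I4B32s", blob[:64])
    if magic != UIMAGE_MAGIC:
        raise ValueError("not a legacy uImage")
    # LZMA-alone header: 1 properties byte, 4-byte little-endian dictionary
    # size, 8-byte little-endian uncompressed size.
    props, dict_size, unpacked = struct.unpack("<BIQ", blob[64:77])
    return {
        "name": name.split(b"\0", 1)[0].decode("ascii", "replace"),
        "image_size": size,
        "load_addr": load,
        "entry": entry,
        "lzma_props": props,
        "dict_size": dict_size,
        "uncompressed_size": unpacked,
    }
```

It could be called on the bytes of a firmware file, for example `read_uimage_lzma(open(path, "rb").read())` with `path` pointing at the snapshot or local image.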

So good news. Using the config.seed locally produces the same result, which is the following during boot.

Uncompressing Kernel Image ... LZMA ERROR 1 - must RESET board to recover

Using the default build configuration produces an image that boots successfully. The source is the same of course. The binwalks look similar other than the snapshot config produces a larger kernel.

So here are my next steps. Let me know if anyone has any tips and/or better ideas.

  • Diff the configurations used by the snapshot and the default. I think this will be a relatively small set of differences. Then selectively build with other options.
  • Look into the Belkin u-boot source to see if I can determine possible causes of the LZMA error
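
For the first step, the comparison of the snapshot config.seed against a default .config could be sketched roughly as below. This is only an illustration, assuming the usual kconfig line forms (`CONFIG_X=value` and `# CONFIG_X is not set`); the helper names `parse_config` and `diff_configs` are hypothetical. A symbol that is absent and a symbol that is explicitly "not set" are deliberately treated as the same here.

```python
import re

_SET = re.compile(r"^(CONFIG_[A-Za-z0-9_.+-]+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_[A-Za-z0-9_.+-]+) is not set$")


def parse_config(text):
    """Map each CONFIG_ symbol to its value, or to None when it is 'not set'."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            values[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            values[m.group(1)] = None
    return values


def diff_configs(a_text, b_text):
    """Return {symbol: (value_in_a, value_in_b)} for symbols that differ."""
    a, b = parse_config(a_text), parse_config(b_text)
    changed = {}
    for sym in sorted(set(a) | set(b)):
        # .get() yields None for a missing symbol, so "missing" and
        # "not set" compare as equal on purpose.
        if a.get(sym) != b.get(sym):
            changed[sym] = (a.get(sym), b.get(sym))
    return changed
```

Feeding it the text of config.seed and of the default .config should give the short list of options to toggle one at a time.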

I definitely don't have a good guess on what could cause this. My current guess is it is either 1) some build option causes this or 2) the larger size of the kernel is causing it. Using no science at all I am almost wondering if #2 is somehow involved.

You might try binwalk with extraction, or directly looking at the kernel in your build tree with some of the GNU binary examination tools.

Also check the kernel config, either from the source in target/linux/<target name>/, or in build_dir/target-a-bunch-of-stuff/linux-<target name>/linux-4.x/

Another possible point of failure: Check your u-boot bootcmd. There are two ways to get the kernel into memory, either in a "smart" way (nboot) or with a fixed length (nand read). If your bootcmd (or the environment variable it calls as a "macro") uses the latter and the kernel exceeds the given length, it would obviously not load completely and fail to decompress.

Thanks, guys, for the tips. I didn't have much time to look at it tonight, but I went into the u-boot variables a bit. I re-proved to myself that I don't know how u-boot works. I will dig into this more tomorrow.

In case you guys spot something obvious, here is the u-boot printenv output:

N750 # printenv
bootargs=NoArg
bootcmd=tftp
bootdelay=5
baudrate=57600
ethaddr="00:AA:BB:CC:DD:10"
ipaddr=10.10.10.123
serverip=10.10.10.3
ramargs=setenv bootargs root=/dev/ram rw
addip=setenv bootargs $(bootargs) ip=$(ipaddr):$(serverip):$(gatewayip):$(netmask):$(hostname):$(netdev):off
addmisc=setenv bootargs $(bootargs) console=ttyS0,$(baudrate) ethaddr=$(ethaddr) panic=1
flash_self=run ramargs addip addmisc;bootm $(kernel_addr) $(ramdisk_addr)
kernel_addr=BFC40000
u-boot=u-boot.bin
load=tftp 8A100000 $(u-boot)
u_b=protect off 1:0-1;era 1:0-1;cp.b 8A100000 BC400000 $(filesize)
loadfs=tftp 8A100000 root.cramfs
u_fs=era bc540000 bc83ffff;cp.b 8A100000 BC540000 $(filesize)
test_tftp=tftp 8A100000 root.cramfs;run test_tftp
HW_BOOT_VER=1.7.4
HW_BOOT_DATE=Dec  7 2011 - 09:22:29
HW_WIFI_HIPOWER=0
ethact=Eth0 (10/100-M)
HW_LAN_MAC=08:86:3B:B6:0A:54
HW_WAN_MAC=08:86:3B:B6:0A:55
HW_WIFI_MAC=08:86:3B:B6:0A:56
HW_WIFI_PIN=06362553
HW_SN=121201GG104068
HW_VER=01C
HW_SKU_ID=1
stdin=serial
stdout=serial
stderr=serial

Also here is a failed bootlog from a snapshot image including u-boot

ARC Uboot:1.7.4 (Dec  7 2011 - 09:22:22)

Board: Ralink APSoC DRAM:  64 MB
relocate_code Pointer at: 83fa0000
******************************
Software System Reset Occurred
******************************
spi_wait_nsec: 30
spi device id: c2 20 17 c2 20 (2017c220)
find flash: MX25L6405D
..============================================
Ralink UBoot Version: 3.5.2.0
--------------------------------------------
ASIC 3883_MP (MAC to VITESSE Mode)
DRAM_CONF_FROM: Boot-Strapping
DRAM_TYPE: DDR2
DRAM_SIZE: 512 Mbits
DRAM_WIDTH: 16 bits
DRAM_TOTAL_WIDTH: 16 bits
TOTAL_MEMORY_SIZE: 64 MBytes
Flash component: SPI Flash
Date:Dec  7 2011  Time:09:22:22
============================================
icache: sets:512, ways:4, linesz:32 ,total:65536
dcache: sets:256, ways:4, linesz:32 ,total:32768

 ##### The CPU freq = 500 MHZ ####
 estimate memory size =64 Mbytes

Please choose the operation:
   2: Load system code then write to Flash via TFTP.
   3: Boot system code via Flash (default).
   4: Entr boot command line interface.
   7: Load Boot Loader code then write to Flash via Serial.
   9: Load Boot Loader code then write to Flash via TFTP.                                             4 initializing CHIP_RTL8367R_VB 1010                                                                    0
initializing CHIP_RTL8367R_VB 1010

3: System Boot system code via Flash.
## Booting image at bc050000 ...
.   Image Name:   N750F9K1103VB
   Created:      2019-03-03  15:44:47 UTC
   Image Type:   MIPS Linux Kernel Image (lzma compressed)
   Data Size:    1685852 Bytes =  1.6 MB
   Load Address: 80000000
   Entry Point:  80000000
..........................   Verifying Checksum ... OK
   Uncompressing Kernel Image ... LZMA ERROR 1 - must RESET board to recover

and a successful one

ARC Uboot:1.7.4 (Dec  7 2011 - 09:22:22)

Board: Ralink APSoC DRAM:  64 MB
relocate_code Pointer at: 83fa0000
******************************
Software System Reset Occurred
******************************
spi_wait_nsec: 30
spi device id: c2 20 17 c2 20 (2017c220)
find flash: MX25L6405D
..============================================
Ralink UBoot Version: 3.5.2.0
--------------------------------------------
ASIC 3883_MP (MAC to VITESSE Mode)
DRAM_CONF_FROM: Boot-Strapping
DRAM_TYPE: DDR2
DRAM_SIZE: 512 Mbits
DRAM_WIDTH: 16 bits
DRAM_TOTAL_WIDTH: 16 bits
TOTAL_MEMORY_SIZE: 64 MBytes
Flash component: SPI Flash
Date:Dec  7 2011  Time:09:22:22
============================================
icache: sets:512, ways:4, linesz:32 ,total:65536
dcache: sets:256, ways:4, linesz:32 ,total:32768

 ##### The CPU freq = 500 MHZ ####
 estimate memory size =64 Mbytes

Please choose the operation:
   2: Load system code then write to Flash via TFTP.
   3: Boot system code via Flash (default).
   4: Entr boot command line interface.
   7: Load Boot Loader code then write to Flash via Serial.
   9: Load Boot Loader code then write to Flash via TFTP.                                             4 initializing CHIP_RTL8367R_VB 1010                                                                    0
initializing CHIP_RTL8367R_VB 1010

3: System Boot system code via Flash.
## Booting image at bc050000 ...
.   Image Name:   N750F9K1103VB
   Created:      2019-03-03   1:46:05 UTC
   Image Type:   MIPS Linux Kernel Image (lzma compressed)
   Data Size:    1469187 Bytes =  1.4 MB
   Load Address: 80000000
   Entry Point:  80000000
.......................   Verifying Checksum ... OK
   Uncompressing Kernel Image ... OK
No initrd
## Transferring control to Linux (at address 80000000) ...
## Giving linux memsize in MB, 64

Starting kernel ...

[    0.000000] Linux version 4.14.103 (kip@dev-ub-openwrt) (gcc version 7.4.0 (OpenWrt GCC 7.4.0 r9508-c1a8054114)) #0 Sun Mar 3 01:46:05 2019
[    0.000000] SoC Type: Ralink RT3883 ver:1 eco:5

<snipped>

You hit head-overlaps-tail. This crappy loader (from Ralink) assumes a load address of 0x80400000 and decompresses to 0x80000000, so you have 0x400000 (4194304) bytes for the decompressed kernel.

Compare the non-working image:
uncompressed size: 4941756 bytes
with the working one:
uncompressed size: 4265108 bytes
The decompressed data slightly overwrites the compressed source.
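
The arithmetic behind this can be checked quickly. The sketch below takes the 0x80400000 staging address and the 0x80000000 decompression target from the post above as given (neither has been verified against the bootloader source), and the helper name `overrun` is made up for illustration.

```python
STAGING_ADDR = 0x80400000  # where the loader is said to hold the compressed image (unverified)
LOAD_ADDR = 0x80000000     # where the kernel is decompressed to
WINDOW = STAGING_ADDR - LOAD_ADDR  # 0x400000 bytes between the two


def overrun(uncompressed_size, window=WINDOW):
    """Bytes by which the decompressed kernel extends past the window (0 if it fits)."""
    return max(0, uncompressed_size - window)


# Uncompressed sizes taken from the binwalk output earlier in the thread.
for label, size in [("snapshot (fails)", 4941756), ("local build (boots)", 4265108)]:
    print(f"{label}: {size} bytes, overrun {overrun(size)} bytes")
```

By this arithmetic the snapshot kernel extends 747452 bytes past the 0x400000 window and the booting local build 70804 bytes, so both exceed it and the failing one by roughly ten times more. The thread does not establish where the exact failure threshold lies.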

The second problem is hidden in the OpenWrt package dependencies: the build system generates useless dependencies, builds unused packages, and packs them into the squashfs. For example, select any package that needs libpcre, such as wget.

$ cp .config.clean .config
$ echo CONFIG_PACKAGE_wget=y >>.config
$ yes n | make oldconfig >/dev/null 2>/dev/null
$ diff -u .config.clean .config | grep '^\+'
+++ .config	2019-04-16 00:34:16.332938128 +0300
+CONFIG_PACKAGE_librt=y
+# CONFIG_PACKAGE_libncurses-dev is not set
+# CONFIG_PACKAGE_zlib-dev is not set
+CONFIG_PACKAGE_libpcre=y
+CONFIG_PACKAGE_uclibcxx=y
+CONFIG_PACKAGE_zlib=y
....

uclibcxx??? Why???
libpcre contains the libpcrecpp package, which links against the C++ standard library, so the uclibcxx package is enabled, built, and installed. No package in the feeds or the main tree uses libpcrecpp, but you still get its dependency uclibcxx on your router. Great.
Same shit happens with multiple SSL libraries.

Whoa! Thanks for this tip. It is good to understand the basis for this failure. I don't know that much about it, but it seems like Ralink (and maybe others) forked u-boot and added terrible changes. I am pretty sure this Ralink u-boot variant ignores bootcmd, for example. I tried to find the source but wasn't able to.

The good news is that I have pulled the same source from master but built it using the 18.06.2 config.seed with some minor tweaks. This drops enough of the unneeded stuff that the kernel shrinks considerably, and the stock u-boot loads and boots it fine.

When I get more time, I will come back to this and see if there are any alternatives to allow a bigger image. I am assuming at some point the release builds will grow too.

Patch bootloader.

@hnyman, post 13 of https://forum.openwrt.org/t/solved-uboot-not-enough-buffer-for-decompression-lzma-error-1/15371/13:
################################################################################################################
#My hunch is still, like I wrote above, about the compression parameters used in creating the LZMA image. 
#In addition try removing three non-standard parameters "-lc1 -lp2 -pb2" to see if the compressed image works.

belkin_f9k1109v1-kernel.bin -lc1 -lp2 -pb2 -d16 <

mkimage -A mips -O linux -T kernel -C lzma -a 0x80000000 -e 0x80000000 -n 'N750F9K1103VB'
						                               
                                                           8033e000    <

###################################################################################################################################
WRT 64  0x40  LZMA compressed data, properties:         0x6D,        dictionary size: 65536 bytes, uncompressed size: 4273652 bytes
OEM 64  0x40  LZMA compressed data, properties:         0x5D,        dictionary size: 65536 bytes, uncompressed size: 3526536 bytes
###################################################################################################################################

binwalk -I OEM.bin --dumb | grep LZMA #############################################################################################
64            0x40            LZMA compressed data, properties: 0x5D, dictionary size: 65536 bytes, uncompressed size: 3526536 bytes
7400073       0x70EA89        LZMA compressed data, properties: 0x5C, dictionary size: 0 bytes, uncompressed size: 1544678912 bytes
7400081       0x70EA91        LZMA compressed data, properties: 0x5C, dictionary size: 0 bytes, uncompressed size: 0 bytes
###################################################################################################################################

Also, the fmk-lzma tool I mentioned in the previous thread might help to determine the exact compression.
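
The properties byte that binwalk reports encodes the lc/lp/pb settings, so the 0x6D versus 0x5D difference can be decoded directly. A minimal sketch, assuming the standard LZMA encoding props = (pb * 5 + lp) * 9 + lc; the function name `decode_lzma_props` is made up for illustration.

```python
def decode_lzma_props(props):
    """Split an LZMA properties byte into (lc, lp, pb)."""
    if not 0 <= props < 9 * 5 * 5:
        raise ValueError("invalid LZMA properties byte")
    lc = props % 9        # literal context bits
    rest = props // 9
    lp = rest % 5         # literal position bits
    pb = rest // 5        # position bits
    return lc, lp, pb


for label, props in [("WRT", 0x6D), ("OEM", 0x5D)]:
    lc, lp, pb = decode_lzma_props(props)
    print(f"{label}: 0x{props:02X} -> lc={lc} lp={lp} pb={pb}")
```

0x6D decodes to lc=1, lp=2, pb=2, which matches the "-lc1 -lp2 -pb2" options quoted above, while the OEM image's 0x5D decodes to lc=3, lp=0, pb=2. The 65536-byte dictionary in both corresponds to -d16 (2^16).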

Thanks guys. I have been caught up in other projects. The good news is that a build with the release config is running solid on the router. Wulfy, I had messed with some of the compression params at some point, but I can't remember what I did at this point. I will try some variations and see if it works. If it does, I can see about tweaking the make files in the build.

Tangent question: any ideas where I could find the source for this u-boot variation? It is not in the firmware source from Belkin.