Unifi 6 Lite boot failure with recent master

I've had occasional boot failures with my self-built Unifi 6 images and cannot figure out why.

This happened most recently on a checkout from (the arbitrary) commit 32c683ddceba from Oct 16.

After some bad experiences in the past, I'm now careful to run "make dirclean" whenever I pull from master. So there should not be any residue from previous builds, I believe.

I usually build with the most recent toolchains available, which means binutils 2.39 and GCC 12.2. So GCC 12 issues were one of my first suspicions. Now I have two Unifi 6 Lites, so after the first one failed I simply rebuilt the image (after make dirclean again) using GCC 11.3 instead. But this second image failed too. I guess that pretty much excludes the possibility of both hardware failure and a GCC 12 specific issue.

None of the Unifi 6 Lites had console, unfortunately. So I built an image for the ZyXEL NR7101, which is also MT7621 based, using the exact same repo and toolchain configuration, hoping that this would fail in a similar way with console output. But that image worked just fine.

So it's not related to the SoC platform either.

I finally got around to adding console to one of the failing Unifi 6 Lites. This revealed that it booted normally up to and including the point where the kernel writes

[    0.000000] SoC Type: MediaTek MT7621 ver:1 eco:3
[    0.000000] printk: bootconsole [early0] enabled
[    0.000000] CPU0 revision is: 0001992f (MIPS 1004Kc)

There it just stopped with no more output at all. No panic stack trace or error messages of any kind. Just hanging.

The next expected messages would have been

[    0.000000] MIPS: machine is Ubiquiti UniFi 6 Lite
[    0.000000] Initrd not found or empty - disabling initrd
[    0.000000] VPE topology {2,2} total 4

But without any errors, and a similar SoC booting just fine with a kernel built from the same code and toolchain, I am pretty lost.
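For anyone comparing a hanging boot against a working one, the step above can be automated. This is only a minimal sketch: the helper names `strip_timestamp` and `next_expected` are made up for illustration, and the two log excerpts are copied from this post.

```python
import re

# Matches the "[    0.000000] " printk timestamp prefix.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def strip_timestamp(line):
    """Return a kernel log line without its leading printk timestamp."""
    return TIMESTAMP.sub("", line.rstrip())


def next_expected(failing_log, good_log, count=3):
    """Return up to `count` messages that follow, in the good log,
    the last message printed by the failing boot."""
    failing = [strip_timestamp(l) for l in failing_log.splitlines() if l.strip()]
    good = [strip_timestamp(l) for l in good_log.splitlines() if l.strip()]
    if not failing:
        return good[:count]
    last = failing[-1]
    if last not in good:
        return []  # the logs have nothing in common at the hang point
    idx = good.index(last)
    return good[idx + 1 : idx + 1 + count]


FAILING = """\
[    0.000000] SoC Type: MediaTek MT7621 ver:1 eco:3
[    0.000000] printk: bootconsole [early0] enabled
[    0.000000] CPU0 revision is: 0001992f (MIPS 1004Kc)
"""

GOOD = """\
[    0.000000] SoC Type: MediaTek MT7621 ver:1 eco:3
[    0.000000] printk: bootconsole [early0] enabled
[    0.000000] CPU0 revision is: 0001992f (MIPS 1004Kc)
[    0.000000] MIPS: machine is Ubiquiti UniFi 6 Lite
[    0.000000] Initrd not found or empty - disabling initrd
[    0.000000] VPE topology {2,2} total 4
"""

for message in next_expected(FAILING, GOOD):
    print(message)
```

Timestamps are stripped before matching because they differ between boots even when the messages are identical.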

Does anyone have a suggestion? I would like to figure out why this happens so I can avoid it, but I don't know where to start.

So far, I've ended up working around the issue by using tftp rescue. This is the third time. And although the method is simple, it is a hassle. I have to get physically close enough to push and hold the reset button. And these APs are not mounted to be accessible, unfortunately.

Now that I have console, I might be able to fix that one "remotely". But that also requires being close to the AP, since I have no other device close enough for a permanent console connection. (I wired up a Bluetooth serial console, but that doesn't work over much longer distances than a serial cable anyway. It just avoids making a hole in the case.)

BTW, does anyone know how to possibly control the Ubnt bootloader entirely using Bluetooth? The Unifi 6 Lite has a built-in Bluetooth controller which is supported in the bootloader. Re-installing OpenWrt after such failures would be much easier if I could do that without having to touch the reset button. Can Bluetooth be used instead to trigger recovery? Or am I right in guessing that the BLE functionality also depends on pressing the reset button? The console output kind of suggests that. With button pressed:

U-Boot 2018.03 [UniFi,v1.1.40.71] (Nov 18 2020 - 20:03:50 -0700), Build: jenkins-Bootloaders-BL_mtk_multi-1.1.40-1

MediaTek MT7621AT ver 1, eco 3
Clocks: CPU: 880MHz, DDR: 1200MHz, Bus: 220MHz, XTAL: 40MHz
DRAM:  256 MiB
Loading Environment from SPI Flash... SF: Detected mx25l25635f with page size 256 Bytes, erase size 64 KiB, total 32 MiB
OK
In:    uartlite0@1e000c00
Out:   uartlite0@1e000c00
Err:   uartlite0@1e000c00
Net:   eth0: eth@1e100000
SF: Detected mx25l25635f with page size 256 Bytes, erase size 64 KiB, total 32 MiB
load ubntapp ok
Board: Ubiquiti Networks MT7621 board (a612-15.0000)
UBNT application initialized 
 *WARNING*: Could not parse FW version, please check FW format
is_default true
is_ble_stp = true

~~~ p_device_model:U6-LITE
~~~ is_default:1 ~~~
~~~ p_macaddr:f4:92:bf:ac:83:58 ~~~
~~~ is_ble_stp:1 ~~~
=========================GPIO INIT=====================
GPIO_16, action is output low
GPIO_16, action is output high
GPIO_19, action is output low
GPIO_19, action is output high
=========================UART_3 INIT=====================
uartlite0@1e000c00, 1e000c00
uartlite0@1e000e00 bring up, 1e000e00
=========================FLOW 1=====================
[BT Power On Result] Success

=========================FLOW 2=====================
[HCI RESET Result] Success

=========================Extend FLOW=====================
[HCI LE BT MAC ADDR Result] Success

=========================FLOW 3=====================
[HCI LE SET ADVERTISING PARAMETER Result] Success

=========================FLOW 4=====================
[HCI LE  SET ADVERTISING DATA Result] Success

=========================FLOW 5=====================
[HCI LE  SET SCAN RESPONSE Result] Success

=========================FLOW 6=====================
[HIC LE  SET ADVERISTING ENABLE Result] Success
MT7915 BLE broadcasting successfully
Autobooting in 2 seconds, press "<Esc><Esc>" to stop
ubnt boot ...
SF: Detected mx25l25635f with page size 256 Bytes, erase size 64 KiB, total 32 MiB

(and then it goes on with the recovery "Erasing cfg partition ......" if the button still is held)

Without button pressed:

U-Boot 2018.03 [UniFi,v1.1.44.73] (Dec 11 2020 - 02:09:10 +0000), Build: jenkins-Bootloaders-BL_mtk_multi-1.1.44-1

MediaTek MT7621AT ver 1, eco 3
Clocks: CPU: 880MHz, DDR: 1200MHz, Bus: 220MHz, XTAL: 40MHz
DRAM:  256 MiB
Loading Environment from SPI Flash... SF: Detected mx25l25635f with page size 256 Bytes, erase size 64 KiB, total 32 MiB
OK
In:    uartlite0@1e000c00
Out:   uartlite0@1e000c00
Err:   uartlite0@1e000c00
Net:   eth0: eth@1e100000
SF: Detected mx25l25635f with page size 256 Bytes, erase size 64 KiB, total 32 MiB
load ubntapp ok
Board: Ubiquiti Networks MT7621 board (a612-15.0000)
UBNT application initialized 
 *WARNING*: Could not parse FW version, please check FW format
is_default <NULL>
set is_default true
is_ble_stp <NULL>
is_ble_stp = false or NULL

~~~ p_device_model:U6-LITE
~~~ is_default:1 ~~~
~~~ p_macaddr:f4:92:bf:ac:83:58 ~~~
~~~ is_ble_stp:0 ~~~
=========================GPIO INIT=====================
GPIO_16, action is output low
GPIO_16, action is output high
GPIO_19, action is output low
GPIO_19, action is output high
=========================UART_3 INIT=====================
uartlite0@1e000c00, 1e000c00
uartlite0@1e000e00 bring up, 1e000e00
=========================FLOW 1=====================
len = 8, num = 10
[BT Power On]: Tx Cmd=01 6f fc 06 01 06 02 00 00 01 
[BT Power On]: Rx Msg=00 00 00 00 00 00 00 00 
[BT Power On Result] Fail

=========================FLOW 2=====================

=========================Extend FLOW=====================

=========================FLOW 3=====================

=========================FLOW 4=====================

=========================FLOW 5=====================

=========================FLOW 6=====================
MT7915 BLE broadcasting successfully
Autobooting in 2 seconds, press "<Esc><Esc>" to stop
ubnt boot ...
SF: Detected mx25l25635f with page size 256 Bytes, erase size 64 KiB, total 32 MiB
reading kernel 0 from: 0x1d0000, size: 0x00b7e000

So there's still that "BLE broadcasting" message, but it also says "[BT Power On Result] Fail" and all the LE messages are missing.
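To compare the two bootloader runs without reading the whole log, the "Result" lines can be pulled out mechanically. A rough sketch follows: the function name `ble_results` is invented here, and the only assumption about the log format is the `[<step> Result] Success|Fail` pattern visible in the output above.

```python
import re

# Matches lines such as "[BT Power On Result] Success".
# Lines like "[BT Power On]: Tx Cmd=..." do not match (no "Result]").
RESULT = re.compile(r"^\[(?P<step>.+?)\s+Result\]\s+(?P<status>Success|Fail)\s*$")


def ble_results(log):
    """Return (step, status) pairs for every '[... Result]' line in the log."""
    results = []
    for line in log.splitlines():
        match = RESULT.match(line.strip())
        if match:
            # Collapse the double spaces the bootloader prints in some names.
            step = " ".join(match.group("step").split())
            results.append((step, match.group("status")))
    return results


WITH_BUTTON = """\
[BT Power On Result] Success
[HCI RESET Result] Success
[HCI LE BT MAC ADDR Result] Success
[HCI LE SET ADVERTISING PARAMETER Result] Success
[HCI LE  SET ADVERTISING DATA Result] Success
MT7915 BLE broadcasting successfully
"""

WITHOUT_BUTTON = """\
[BT Power On]: Tx Cmd=01 6f fc 06 01 06 02 00 00 01
[BT Power On]: Rx Msg=00 00 00 00 00 00 00 00
[BT Power On Result] Fail
MT7915 BLE broadcasting successfully
"""

for name, log in (("button held", WITH_BUTTON), ("no button", WITHOUT_BUTTON)):
    steps = ble_results(log)
    ok = bool(steps) and all(status == "Success" for _, status in steps)
    print(name, "-> all steps succeeded:", ok)
```

The final "broadcasting successfully" line is deliberately ignored, since it is printed in both cases and so says nothing about whether the controller actually came up.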

Current master is working on mine. What kernel are you trying?

5.15 has never worked for me on u6-lite. Have you tried 5.10?

If it's that, I never had serial access to see what was going on. These logs might be very helpful.

Thanks for replying. I've been running 5.15 for more than a year on these. Currently this (which obviously works):

root@u6-2:~# cat /proc/version 
Linux version 5.15.69 (bjorn@canardo) (mipsel-openwrt-linux-musl-gcc (OpenWrt GCC 12.2.0 r20813+3-39685292858c) 12.2.0, GNU ld (GNU Binutils) 2.39) #0 SMP Sun Oct 2 06:51:25 2022

So I don't think that can be it. By itself, at least. But it's certainly possible that the problem is related, so I should have mentioned that in the original post.

I am wondering if there could be some size or alignment problem which makes the issue appear and disappear with different git revisions? It looks like the kernel is hanging just about where it is supposed to load the device tree. I remember we had problems on realtek once when a bug caused the raw appended DTB to sometimes end up 4-byte aligned instead of the required 8-byte. I couldn't find any of my notes from back then, but I believe the symptom was similar. But I guess DTB alignment is up to the boot loader with FIT images? So that should be OK in any case?
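One way to test the alignment idea is to read the "Data Start" addresses that U-Boot prints for each FIT subimage and check them against the 8-byte requirement mentioned above. The sketch below does that; the function names are made up, and the sample lines are the ones U-Boot printed for the working image from commit a3da858ab030.

```python
import re

SUBIMAGE = re.compile(r"Trying '([^']+)' \w+ subimage")
DATA_START = re.compile(r"Data Start:\s+0x([0-9a-fA-F]+)")


def fit_data_starts(log):
    """Map each FIT subimage name to the 'Data Start' address U-Boot printed."""
    starts = {}
    current = None
    for line in log.splitlines():
        match = SUBIMAGE.search(line)
        if match:
            current = match.group(1)
            continue
        match = DATA_START.search(line)
        if match and current is not None:
            starts[current] = int(match.group(1), 16)
            current = None
    return starts


def is_aligned(address, alignment=8):
    """True if the address is a multiple of `alignment` bytes."""
    return address % alignment == 0


UBOOT_LOG = """\
   Trying 'kernel-1' kernel subimage
     Description:  MIPS OpenWrt Linux-5.15.74
     Data Start:   0x860000e4
   Trying 'fdt-1' fdt subimage
     Description:  MIPS OpenWrt ubnt_unifi-6-lite device tree blob
     Data Start:   0x862d14f0
"""

for name, address in fit_data_starts(UBOOT_LOG).items():
    print(f"{name}: 0x{address:08x} 8-byte aligned: {is_aligned(address)}")
```

For the working image the fdt starts at 0x862d14f0, which is a multiple of 8. The kernel subimage is only 4-byte aligned, but it is LZMA compressed and gets unpacked to its load address, so only the fdt address matters here: U-Boot reports using the device tree "in place". Running the same check on the U-Boot output of a failing image would show whether the fdt address differs.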

Sorry if I ask many stupid questions. I should probably learn a bit about the platform and FIT and all that before spitting out random ideas.

Is it possible that ccache causes build bugs?

FWIW, I just tried a new build with ccache disabled and it booted just fine. These come-and-go bugs are really annoying, but I guess I have to stop changing settings if I really want to narrow it down.

Attaching the full sysupgrade console log for reference, showing how stuff now works. Hoping that it will help to have this baseline on the next failure.

Unifi 6 Lite console log showing successful sysupgrade and boot to commit a3da858ab030 (2022-10-22)
sysupgrade -v http://owrt.mork.no/r21070+3-a3da858ab030/targets/ramips/mt7621/openwrt-snapshot-ramips-mt7621-ubnt_unifi-6-lite-squashfs-sysupgrade.bin
Downloading 'http://owrt.mork.no/r21070+3-a3da858ab030/targets/ramips/mt7621/openwrt-snapshot-ramips-mt7621-ubnt_unifi-6-lite-squashfs-sysupgrade.bin'
Connecting to 192.168.99.1:80
Writing to '/tmp/sysupgrade.img'
/tmp/sysupgrade.img  100% |*******************************|  8193k  0:00:00 ETA
Download completed (8389705 bytes)
Sun Oct 23 11:27:33 CEST 2022 upgrade: Saving config files...
etc/config/dropbear
etc/config/firewall
etc/config/lldpd
etc/config/luci
etc/config/network
etc/config/network.save
etc/config/rpcd
etc/config/system
etc/config/ubootenv
etc/config/ucitrack
etc/config/uhttpd
etc/config/wireless
etc/crontabs/root
etc/dropbear/authorized_keys
etc/dropbear/dropbear_ed25519_host_key
etc/dropbear/dropbear_rsa_host_key
etc/fw_env.config
etc/group
etc/hosts
etc/inittab
etc/nftables.d/10-custom-filter-chains.nft
etc/nftables.d/README
etc/opkg/keys/99db1e0996685023
etc/opkg/keys/b5043e70f9a75cde
etc/passwd
etc/profile
etc/rc.local
etc/shadow
etc/shells
etc/shinit
etc/sysctl.conf
Sun Oct 23 11:27:33 CEST 2022 upgrade: Commencing upgrade. Closing all shell sessions.
Command failed: Connection faile[134718.694245] device wlan1 left promiscuous mode
[134718.698974] br-lan: port 2(wlan1) entered disabled state
Watchdog handover: fd=3
- watchdog -
Watchdog does not have CARDRESET support
[134719.028952] device wlan0 left promiscuous mode
[134719.033743] br-iot: port 2(wlan0) entered disabled state
Sun Oct 23 11:27:34 CEST 2022 upgrade: Sending TERM to remaining processes ...
Sun Oct 23 11:27:34 CEST 2022 upgrade: Sending signal TERM to hostapd (1070)
Sun Oct 23 11:27:38 CEST 2022 upgrade: Sending KILL to remaining processes ...
[134729.436461] stage2 (4724): drop_caches: 3
Sun Oct 23 11:27:44 CEST 2022 upgrade: Switching to ramdisk...
mount: mounting /dev/mtdblock9 on /overlay failed: Resource busy
[134732.192189] VFS: Busy inodes after unmount of jffs2. Self-destruct in 5 seconds.  Have a nice day...
Sun Oct 23 09:27:47 UTC 2022 upgrade: Performing system upgrade...
[134732.291504] do_stage2 (4724): drop_caches: 3
Unlocking firmware ...

Writing from <stdin> to firmware ...  [ ][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w][e][w]   
Appending jffs2 data from /tmp/sysupgrade.tgz to firmware..
.
Writing from <stdin> to firmware ...  [ ][e][w]    
Sun Oct 23 09:28:34 UTC 2022 upgrade: Upgrade completed
Sun Oct 23 09:28:35 UTC 2022 upgrade: Rebooting system...
umount: can't unmount /dev: Resource busy
umount: can't unmount /tmp: Resource [134780.832169] mt7530 mdio-bus:1f lan: Link is Down
busy
[134780.838413] device eth0 left promiscuous mode
[134780.845080] device lan left promiscuous mode
[134780.851719] br-iot: port 1(lan.15) entered disabled state
[134780.858855] br-lan: port 1(lan.10) entered disabled state
[134780.867807] device lan.10 left promiscuous mode
[134780.872478] br-lan: port 1(lan.10) entered disabled state
[134780.952436] device lan.15 left promiscuous mode
[134780.957066] br-iot: port 1(lan.15) entered disabled state
[134781.121602] reboot: Restarting system
DRAM cfg init type(1): predefined first

U-Boot SPL 2018.03 [UniFi,v1.1.44.73] (Dec 11 2020 - 02:09:10 +0000)
Trying to boot from MTK-MMAP


U-Boot 2018.03 [UniFi,v1.1.44.73] (Dec 11 2020 - 02:09:10 +0000), Build: jenkins-Bootloaders-BL_mtk_multi-1.1.44-1

MediaTek MT7621AT ver 1, eco 3
Clocks: CPU: 880MHz, DDR: 1200MHz, Bus: 220MHz, XTAL: 40MHz
DRAM:  256 MiB
Loading Environment from SPI Flash... SF: Detected mx25l25635f with page size 256 Bytes, erase size 64 KiB, total 32 MiB
OK
In:    uartlite0@1e000c00
Out:   uartlite0@1e000c00
Err:   uartlite0@1e000c00
Net:   eth0: eth@1e100000
SF: Detected mx25l25635f with page size 256 Bytes, erase size 64 KiB, total 32 MiB
load ubntapp ok
Board: Ubiquiti Networks MT7621 board (a612-15.0000)
UBNT application initialized 
 *WARNING*: Could not parse FW version, please check FW format
is_default true
is_ble_stp <NULL>
is_ble_stp = false or NULL

~~~ p_device_model:U6-LITE
~~~ is_default:1 ~~~
~~~ p_macaddr:f4:92:bf:ac:83:58 ~~~
~~~ is_ble_stp:0 ~~~
=========================GPIO INIT=====================
GPIO_16, action is output low
GPIO_16, action is output high
GPIO_19, action is output low
GPIO_19, action is output high
=========================UART_3 INIT=====================
uartlite0@1e000c00, 1e000c00
uartlite0@1e000e00 bring up, 1e000e00
=========================FLOW 1=====================
len = 8, num = 10
[BT Power On]: Tx Cmd=01 6f fc 06 01 06 02 00 00 01 
[BT Power On]: Rx Msg=00 00 00 00 00 00 00 00 
[BT Power On Result] Fail

=========================FLOW 2=====================

=========================Extend FLOW=====================

=========================FLOW 3=====================

=========================FLOW 4=====================

=========================FLOW 5=====================

=========================FLOW 6=====================
MT7915 BLE broadcasting successfully
Autobooting in 2 seconds, press "<Esc><Esc>" to stop
ubnt boot ...
SF: Detected mx25l25635f with page size 256 Bytes, erase size 64 KiB, total 32 MiB
reading kernel 0 from: 0x1d0000, size: 0x002d5000
## Loading kernel from FIT Image at 86000000 ...
   Using 'config@1' configuration
   Verifying Hash Integrity ... OK
   Trying 'kernel-1' kernel subimage
     Description:  MIPS OpenWrt Linux-5.15.74
     Type:         Kernel Image
     Compression:  lzma compressed
     Data Start:   0x860000e4
     Data Size:    2953929 Bytes = 2.8 MiB
     Architecture: MIPS
     OS:           Linux
     Load Address: 0x80001000
     Entry Point:  0x80001000
     Hash algo:    crc32
     Hash value:   1ef60645
     Hash algo:    sha1
     Hash value:   7903580c8e8acc09f667dfe84e6c7db7c475f43a
   Verifying Hash Integrity ... crc32+ sha1+ OK
## Loading fdt from FIT Image at 86000000 ...
   Using 'config@1' configuration
   Trying 'fdt-1' fdt subimage
     Description:  MIPS OpenWrt ubnt_unifi-6-lite device tree blob
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x862d14f0
     Data Size:    11387 Bytes = 11.1 KiB
     Architecture: MIPS
     Hash algo:    crc32
     Hash value:   16bb5a14
     Hash algo:    sha1
     Hash value:   40f26bf28e33bbe661fec716929f2003301f5e4d
   Verifying Hash Integrity ... crc32+ sha1+ OK
   Booting using the fdt blob at 0x862d14f0
   Uncompressing Kernel Image ... OK
   Using Device Tree in place at 862d14f0, end 862d716a
[    0.000000] Linux version 5.15.74 (bjorn@canardo) (mipsel-openwrt-linux-musl-gcc (OpenWrt GCC 12.2.0 r21070+3-a3da858ab030) 12.2.0, GNU ld (GNU Binutils) 2.39) #0 SMP Sun Oct 23 08:52:20 2022
[    0.000000] SoC Type: MediaTek MT7621 ver:1 eco:3
[    0.000000] printk: bootconsole [early0] enabled
[    0.000000] CPU0 revision is: 0001992f (MIPS 1004Kc)
[    0.000000] MIPS: machine is Ubiquiti UniFi 6 Lite
[    0.000000] Initrd not found or empty - disabling initrd
[    0.000000] VPE topology {2,2} total 4
[    0.000000] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    0.000000] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.000000] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000000000000-0x000000000fffffff]
[    0.000000]   HighMem  empty
[    0.000000] Movable zone start for each node
[    0.0h table entries: 16384 (order: 4, 65536 bytes, linear)
[    0.000000] Writing ErrCtl register=0004a000
[    0.000000] Readback ErrCtl register=0004a000
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 248436K/262144K available (7215K kernel code, 660K.000004] sched_clock: 64 bits at 880MHz, resolution 1ns, wraps every 4398046511103ns
[    0.008067] Calibrating delay loop... 586.13 BogoMIPS (lpj=2930688)
[    0.066223] pid_max: default: 32768 minimum: 301
[    0.071045] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.078257] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.089202] rcu: Hierarchical SRCU implementation.
[    0.095132] smp: Bringing up secondary CPUs ...
[    0.100569] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    0.100596] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.100611] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    0.100658] CPU1 revision is: 0001992f (MIPS 1004Kc)
[    0.159942] Synchronize counters for CPU 1: done.
[    0.192478] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    0.192498] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.192509] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    0.192542] CPU2 revision is: 0001992f (MIPS 1004Kc)
[    0.251440] Synchronize counters for CPU 2: done.
[    0.282045] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    0.282066] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.282077] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    0.282112] CPU3 revision is: 0001992f (MIPS 1004Kc)
[    0.336639] Synchronize counters for CPU 3: done.
[    0.366504] smp: Brought up 1 node, 4 CPUs
[    0.374515] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[    0.384317] futex hash table entries: 1024 (order: 3, 32768 bytes, linear)
[    0.391331] pinctrl core: initialized pinctrl subsystem
[    0.397828] NET: Registered PF_NETLINK/PF_ROUTE proto.471969] TCP established hash table entries: 2048 (order: 1, 8192 bytes, linear)
[    0.479611] TCP (2009/01/31) Phillip Lougher
[    0.535910] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME) (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
[    0.549964] mt7621_gpio 1e000600.gpio: registering 32 gpios
[    0.555860] mt7621_gpio 1e000600.gpio: registering 32 gpios
[    0.561772] mt7621_gpio 1e000600.gpio: registering 32 gpios
[    0.567873] mt7621-pci 1e140000.pcie: host bridge /pcie@1e140000 ranges:
[    0.574544] mt7621-pci 1e140000.pcie:   No bus range found for /pcie@1e140000, using [bus 00-ff]
[    0.583330] mt7621-pci 1e140000.pcie:      MEM 0x0060000000..0x006fffffff -> 0x0060000000
[    0.591436] mt7621-pci 1e140000.pcie:       IO 0x001e160000..0x001e16ffff -> 0x0000000000
[    0.837251] mt7621-pci 1e140000.pcie: pcie2 no card, disable it (RST & CLK)
[    0.844146] mt7621-pci 1e140000.pcie: PCIE0 enabled
[    0.849003] mt7621-pci 1e140000.pcie: PCIE1 enabled
[    0.853837] PCI coherence region base: 0x60000000, mask/settings: 0xf0000002
[    0.860994] mt7621-pci 1e140000.pcie: PCI host bridge to bus 0000:00
[    0.867299] pci_bus 0000:00: root bus resource [bus 00-ff]
[    0.872714] pci_bus 0000:00: root bus resource [mem 0x60000000-0x6fffffff]
[    0.879561] pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
[    0.885725] pci 0000:00:00.0: [0e8d:0801] type 01 class 0x060400
[    0.891679] pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x7fffffff]
[    0.897892] pci 0000:00:00.0: reg 0x14: [mem 0x60500000-0x6050ffff]
[    0.904144] pci 0000:00:00.0: supports D1
[    0.908077] pci 0000:00:00.0: PME# supported from D0 D1 D3hot
[    0.914535] pci 0000:00:01.0: [0e8d:0801] type 01 class 0x060400
[    0.920547] pci 0000:00:01.0: reg 0x10: [mem 0x00000000-0x7fffffff]
[    0.926725] pci 0000:00:01.0: reg 0x14: [mem 0x60510000-0x6051ffff]
[    0.933029] pci 0000:00:01.0: supports D1
[    0.936946] pci 0000:00:01.0: PME# supported from D0 D1 D3hot
[    0.944715] pci 0000:01:00.0: [14c3:7603] type 00 class 0x028000
[    0.950724] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x000fffff]
[    0.957043] pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
[    0.964438] pci 0000:00:00.0: PCI bridge to [bus 01-ff]
[    0.969634] pci 0000:00:00.0:   bridge window [io  0x0000-0x0fff]
[    0.975638] pci 0000:00:00.0:   bridge window [mem 0x60000000-0x600fffff]
[    0.982401] pci 0000:00:00.0:   bridge window [mem 0x60100000-0x601fffff pref]
[    0.989572] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[    0.996363] pci 0000:02:00.0: [14c3:7915] type 00 class 0x000280
[    1.002352] pci 0000:02:00.0: reg 0x10: [71] pci 0000:00:01.0: PCI bridge to [bus 02-ff]
[    1.056026] pci 0000:00:01.0:   bridge window [io  0x0000-0x0fff]
[    1.062108] pci 0000:00:01.0:   bridge window [mem 0x60200000-0x602fffff]
[    1.068827] pci 0000:00:01.0:   bridge window [mem 0x60300000-0x604fffff pref]
[    1.075982] pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 02
[    1.082626] pci 0000:00:00.0: BAR 0: no space for [mem size 0x80000000]
[    1.089160] pci 0000:00:00.0: BAR 0: failed to assign [mem size 0x80000000]
[    1.096055] pci 0000:00:01.0: BAR 0: no space for [mem size 0x80000000]
[    1.10264gned [mem 0x60510000-0x6051ffff]
[    1.150854] pci 0000:00:00.0: BAR 7: assigned [io  0x0000-0x0fff]
[    1.156878] pci 0000:00:01.0: BAR 7: assigned [io  0x1000-0x1fff]
[    1.162967] pci 0000:01:00.0: BAR 0: assigned [mem 0x60000000-0x600fffff]
[    1.169697] pci 0000:00:00.0: PCI bridge to [bus 01]
[    1.174592] pci 0000:00:00.0:   bridge window [io  0x0000-0x0fff]
[    1.180661] pci 0000:00:00.0:   bridge window [mem 0x60000000-0x600fffff]
[    1.187403] pci 0000:00:00.0:   bridge window [mem 0x60100000-0x601fffff pref]
[    1.194559] pci 0000:02:00.0: BAR 0: assigned [mem 0x60300000-0x603fffff 64bit pref]
[    1.202292] pci 0000:02:00.0: BAR 2: assigned [mem 0x60400000-0x6040led
[    1.257422] 1e000c00.uartlite: ttyS0 at MMIO 0x1e000c00 (irq = 19, base_baud = 3125000) is a 16550A
[    1.266411] printk: console [ttyS0] enabled
[    1.266411] printk: console [ttyS0] enabled
[    1.274683] printk: bootconsole [early0] disabled
[    1.274683] printk: bootconsole [early0] disabled
[    1.287405] spi-mt7621 1e000b00.spi: sys_freq: 220000000
[    1.294105] spi-nor spi0.0: mx25l25635e (32768 Kbytes)
[    1.299389] 8 fixed-partitions partitions found on MTD device spi0.0
[    1.305759] OF: Bad cell count for /palmbus@1e000000/spi@b00/flash@0/partitions
[    1.313c0000 : "eeprom"
[    1.365990] 0x0000000c0000-0x0000000d0000 : "bs"
[    1.371944] 0x0000000d0000-0x0000001d0000 : "cfg"
[    1.377672] 0x0000001d0000-0x0000010e0000 : "firmware"
[    1.383951] 2 fit-fw partitions found on MTD device firmware
[    1.389654] Creating 2 MTD partitions on "firmware":
[    1.394607] 0x000000000000-0x0000002e0000 : "kernel"
[    1.400625] 0x0000002d46b4-0x000000f10000 : "rootfs"
[    1.405608] mtd: partition "rootfs" doesn't start on an erase/write block boundary -- force read-only
[    1.415712] mtd: device 8 (rootfs) set to be root filesystem
[    1.421505] 1 squashfs-split partitions found on MTD device rootfs
[    1.427696] 0x0000007e0000-0x000000f10000 : "rootfs_data"
[    1.434168] 0x0000010e0000-0x000001ff0000 : "kernel1"
[    1.468803] mt7530 mdio-bus:1f: MT7530 adapts as multi-chip module
[    1.478513] mtk_soc_eth 1e100000.ethernet eth0: mediatek frame engine at 0xbe100000, irq 21
[    1.488473] i2c_dev: i2c /dev entries driver
[    1.496030] NET: Registered PF_INET6 protocol family
[    1.503217] Segment Routing with IPv6
[    1.506959] In-situ OAM (IOAM) with IPv6
[    1.511008] NET: Registered PF_PACKET protocol family
[    1.516493] 8021q: 802.1Q VLAN Support v1.8
[    1.526509] mt7530 mdio-bus:1f: MT7530 adapts as multi-chip module
[    1.549568] mt7530 mdio-bus:1f: configuring for fixed/rgmii link mode
[    1.559797] mt7530 mdio-bus:1f: Link is Up - 1Gbps/Full - flow control rx/tx
[    1.567634] mt7530 mdio-bus:1f lan (uninitialized): PHY [mt7530-0:00] driver [MediaTek MT7530 PHY] (irq=23)
[    1.581770] DSA: tree 0 setup
[    1.592345] VFS: Mounted root (squashfs filesystem) readonly on device 31:8.
[    1.603566] Freeing unused kernel image (initmem) memory: 1268K
[    1.609563] This architecture does not have kernel memory protection.
[    1.616013] Run /sbin/init as init process
[    2.039361] init: Console is alive
[    2.043103] init: - watchdog -
[    3.054618] kmodloader: loading kernel modules from /etc/modules-boot.d/*
[    3.102251] usbcore: registered new interface driver usbfs
[    3.107914] usbcore: registered new interface driver hub
[    3.113377] usbcore: registered new device driver usb
[    3.124956] kmodloader: done loading kernel modules from /etc/modules-boot.d/*
[    3.137707] init: - preinit -
[    3.899819] random: jshn: uninitialized urandom read (4 bytes read)
[    4.006072] random: jshn: uninitialized urandom read (4 bytes read)
[    4.034998] random: jshn: uninitialized urandom read (4 bytes read)
[    4.611135] mtk_soc_eth 1e100000.ethernet eth0: configuring for fixed/rgmii link mode
[    4.619480] mtk_soc_eth 1e100000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[    4.627837] mt7530 mdio-bus:1f lan: configuring for phy/gmii link mode
[    4.634825] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Press the [f] key and hit [enter] to enter failsafe mode
Press the [1], [2], [3] or [4] key and hit [enter] to select the debug level
[    6.825539] jffs2_scan_eraseblock(): End of filesystem marker found at 0x10000
[    6.832829] jffs2_build_filesystem(): unlocking the mtd device... 
[    6.832867] done.
[    6.840947] jffs2_build_filesystem(): erasing all blocks after the end marker... 
[    7.007284] random: crng init done
[    7.018135] random: 7 urandom warning(s) missed due to ratelimiting
[    8.301653] mt7530 mdio-bus:1f lan: Link is Up - 1Gbps/Full - flow control rx/tx
[    8.309184] IPv6: ADDRCONF(NETDEV_CHANGE): lan: link becomes ready
[   40.422421] done.
[   40.424394] jffs2: notice: (392) jffs2_build_xattr_subsystem: complete building xattr subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.
[   40.441675] mount_root: overlay filesystem has not been fully initialized yet
[   40.450152] mount_root: switching to jffs2 overlay
[   40.457343] overlayfs: upper fs does not support tmpfile.
- config restore -
[   40.902405] urandom-seed: Seed file not found (/etc/urandom.seed)
[   40.983010] mt7530 mdio-bus:1f lan: Link is Down
[   40.997417] procd: - early -
[   41.000579] procd: - watchdog -
[   41.567446] procd: - watchdog -
[   41.571075] procd: - ubus -
[   41.650411] procd: - init -
Please press Enter to activate this console.
[   42.228506] kmodloader: loading kernel modules from /etc/modules.d/*
[   42.402111] urngd: v1.0.2 started.
[   42.436237] hid: raw HID events driver (C) Jiri Kosina
[   42.465625] Bluetooth: Core ver 2.22
[   42.469471] NET: Registered PF_BLUETOOTH protocol family
[   42.474796] Bluetooth: HCI device and connection manager initialized
[   42.481228] Bluetooth: HCI socket layer initialized
[   42.486093] Bluetooth: L2CAP socket layer initialized
[   42.491203] Bluetooth: SCO socket layer initialized
[   42.498215] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
[   42.503551] Bluetooth: BNEP filters: protocol multicast
[   42.508801] Bluetooth: BNEP socket layer initialized
[   42.518760] usbcore: registered new interface driver btusb
[   42.525308] Loading modules backported from Linux version v5.15.58-0-g7d8048d4e064
[   42.532922] Backport generated by backports.git v5.15.58-1-0-g42a95ce7
[   42.541202] Bluetooth: HCI UART driver ver 2.3
[   42.545677] Bluetooth: HCI UART protocol H4 registered
[   42.550833] Bluetooth: HCI UART protocol BCSP registered
[   42.558262] Bluetooth: HIDP (Human Interface Emulation) ver 1.2
[   42.564222] Bluetooth: HIDP socket layer initialized
[   42.584671] Bluetooth: RFCOMM TTY layer initialized
[   42.589681] Bluetooth: RFCOMM socket layer initialized
[   42.594855] Bluetooth: RFCOMM ver 1.11
[   42.785449] pci 0000:00:00.0: enabling device (0006 -> 0007)
[   42.791210] mt7603e 0000:01:00.0: enabling device (0000 -> 0002)
[   42.797459] mt7603e 0000:01:00.0: ASIC revision: 76030010
[   43.828620] mt7603e 0000:01:00.0: Firmware Version: ap_pcie
[   43.834215] mt7603e 0000:01:00.0: Build Time: 20160107100755
[   43.877268] mt7603e 0000:01:00.0: firmware init done
[   44.061573] pci 0000:00:01.0: enabling device (0006 -> 0007)
[   44.067345] mt7915e 0000:02:00.0: enabling device (0000 -> 0002)
[   44.348988] mt7915e 0000:02:00.0: HW/SW Version: 0x8a108a10, Build Time: 20211222184017a
[   44.348988] 
[   44.803561] mt7915e 0000:02:00.0: WM Firmware Version: ____000000, Build Time: 20211222184052
[   44.944376] mt7915e 0000:02:00.0: WA Firmware Version: DEV_000000, Build Time: 20211222184111
[   45.150605] kmodloader: done loading kernel modules from /etc/modules.d/*
[   51.704089] mtk_soc_eth 1e100000.ethernet eth0: Link is Down
[   51.720554] mtk_soc_eth 1e100000.ethernet eth0: configuring for fixed/rgmii link mode
[   51.728697] mtk_soc_eth 1e100000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[   51.737162] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[   51.744758] mt7530 mdio-bus:1f lan: configuring for phy/gmii link mode
[   51.757410] br-iot: port 1(lan.15) entered blocking state
[   51.762894] br-iot: port 1(lan.15) entered disabled state
[   51.769069] device lan.15 entered promiscuous mode
[   51.773919] device lan entered promiscuous mode
[   51.778520] device eth0 entered promiscuous mode
[   51.841242] br-lan: port 1(lan.10) entered blocking state
[   51.846711] br-lan: port 1(lan.10) entered disabled state
[   51.852894] device lan.10 entered promiscuous mode
[   54.730176] br-lan: port 2(phy1-ap0) entered blocking state
[   54.735811] br-lan: port 2(phy1-ap0) entered disabled state
[   54.742106] device phy1-ap0 entered promiscuous mode
[   54.823648] br-iot: port 2(phy0-ap0) entered blocking state
[   54.829377] br-iot: port 2(phy0-ap0) entered disabled state
[   54.835621] device phy0-ap0 entered promiscuous mode
[   54.840922] br-iot: port 2(phy0-ap0) entered blocking state
[   54.846547] br-iot: port 2(phy0-ap0) entered forwarding state
[   54.852729] IPv6: ADDRCONF(NETDEV_CHANGE): br-iot: link becomes ready
[   55.157966] IPv6: ADDRCONF(NETDEV_CHANGE): phy0-ap0: link becomes ready
[   56.039710] mt7530 mdio-bus:1f lan: Link is Up - 1Gbps/Full - flow control rx/tx
[   56.047173] IPv6: ADDRCONF(NETDEV_CHANGE): lan: link becomes ready
[   56.055306] br-lan: port 1(lan.10) entered blocking state
[   56.060817] br-lan: port 1(lan.10) entered forwarding state
[   56.068419] br-iot: port 1(lan.15) entered blocking state
[   56.073911] br-iot: port 1(lan.15) entered forwarding state
[   56.081241] IPv6: ADDRCONF(NETDEV_CHANGE): lan.203: link becomes ready
[   56.088558] IPv6: ADDRCONF(NETDEV_CHANGE): br-lan: link becomes ready
[  116.631511] IPv6: ADDRCONF(NETDEV_CHANGE): phy1-ap0: link becomes ready
[  116.638411] br-lan: port 2(phy1-ap0) entered blocking state
[  116.643982] br-lan: port 2(phy1-ap0) entered forwarding state



BusyBox v1.35.0 (2022-10-23 08:52:20 UTC) built-in shell (ash)

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
 OpenWrt SNAPSHOT, r21070+3-a3da858ab030
 -----------------------------------------------------
root@u6-2:/#

Interesting... I test it periodically but it always results in a tftp recovery. I use the same toolchain, I wonder what's different? Would you mind sharing your config.buildinfo?

Sure. This is it. You'll obviously have to change the CONFIG_VERSION_REPO stuff since that's pointing to my internal URLs.

r21070-a3da858ab030 config.buildinfo Unifi 6 Lite
CONFIG_TARGET_ramips_mt7621=y
CONFIG_TARGET_ramips_mt7621_DEVICE_ubnt_unifi-6-lite=y
CONFIG_DEVEL=y
CONFIG_TOOLCHAINOPTS=y
CONFIG_BINARY_FOLDER="$(TOPDIR)/bin/$(REVISION)"
# CONFIG_BINUTILS_USE_VERSION_2_37 is not set
CONFIG_BINUTILS_USE_VERSION_2_39=y
CONFIG_BINUTILS_VERSION="2.39"
CONFIG_BINUTILS_VERSION_2_39=y
# CONFIG_GCC_USE_VERSION_11 is not set
CONFIG_GCC_USE_VERSION_12=y
CONFIG_GCC_VERSION="12.2.0"
CONFIG_GCC_VERSION_12=y
CONFIG_IMAGEOPT=y
# CONFIG_JSON_OVERVIEW_IMAGE_INFO is not set
CONFIG_KERNEL_DYNAMIC_DEBUG=y
CONFIG_LINUX_5_15=y
CONFIG_LLDPD_WITH_CDP=y
CONFIG_LLDPD_WITH_CUSTOM=y
CONFIG_LLDPD_WITH_DOT1=y
CONFIG_LLDPD_WITH_DOT3=y
CONFIG_LLDPD_WITH_EDP=y
CONFIG_LLDPD_WITH_FDP=y
CONFIG_LLDPD_WITH_LLDPMED=y
CONFIG_LLDPD_WITH_PRIVSEP=y
CONFIG_LLDPD_WITH_SONMP=y
CONFIG_LUA_ECO_DEFAULT_OPENSSL=y
CONFIG_LUA_ECO_OPENSSL=y
# CONFIG_MUSL_DISABLE_CRYPT_SIZE_HACK is not set
CONFIG_OPENSSL_ENGINE=y
CONFIG_OPENSSL_PREFER_CHACHA_OVER_GCM=y
CONFIG_OPENSSL_WITH_ASM=y
CONFIG_OPENSSL_WITH_CHACHA_POLY1305=y
CONFIG_OPENSSL_WITH_CMS=y
CONFIG_OPENSSL_WITH_DEPRECATED=y
CONFIG_OPENSSL_WITH_ERROR_MESSAGES=y
CONFIG_OPENSSL_WITH_PSK=y
CONFIG_OPENSSL_WITH_SRP=y
CONFIG_OPENSSL_WITH_TLS13=y
CONFIG_PACKAGE_bluez-daemon=m
CONFIG_PACKAGE_bluez-libs=m
CONFIG_PACKAGE_bluez-utils=m
CONFIG_PACKAGE_bluez-utils-extra=m
CONFIG_PACKAGE_dbus=m
# CONFIG_PACKAGE_dnsmasq is not set
CONFIG_PACKAGE_ethtool=y
CONFIG_PACKAGE_glib2=m
CONFIG_PACKAGE_hostapd-openssl=y
CONFIG_PACKAGE_ip-bridge=y
CONFIG_PACKAGE_ip-full=y
CONFIG_PACKAGE_kmod-bluetooth=y
CONFIG_PACKAGE_kmod-crypto-ecb=y
CONFIG_PACKAGE_kmod-crypto-ecdh=y
CONFIG_PACKAGE_kmod-crypto-kpp=y
CONFIG_PACKAGE_kmod-hid=y
CONFIG_PACKAGE_kmod-input-core=y
CONFIG_PACKAGE_kmod-input-evdev=y
# CONFIG_PACKAGE_kmod-lib-crc-ccitt is not set
CONFIG_PACKAGE_kmod-lib-crc16=y
CONFIG_PACKAGE_kmod-nf-nat6=y
CONFIG_PACKAGE_kmod-nls-base=y
# CONFIG_PACKAGE_kmod-ppp is not set
CONFIG_PACKAGE_kmod-regmap-core=y
CONFIG_PACKAGE_kmod-usb-core=y
CONFIG_PACKAGE_libattr=m
CONFIG_PACKAGE_libbpf=y
CONFIG_PACKAGE_libcap=y
CONFIG_PACKAGE_libdbus=m
CONFIG_PACKAGE_libelf=y
CONFIG_PACKAGE_libevent2=y
CONFIG_PACKAGE_libexpat=m
CONFIG_PACKAGE_libffi=m
CONFIG_PACKAGE_libical=m
CONFIG_PACKAGE_libip4tc=y
CONFIG_PACKAGE_libip6tc=y
CONFIG_PACKAGE_libiptext=y
CONFIG_PACKAGE_libiptext6=y
CONFIG_PACKAGE_libncurses=m
CONFIG_PACKAGE_libopenssl=y
CONFIG_PACKAGE_libpcap=y
CONFIG_PACKAGE_libpcre=m
CONFIG_PACKAGE_libpcre2=m
CONFIG_PACKAGE_libreadline=m
CONFIG_PACKAGE_librt=m
CONFIG_PACKAGE_libustream-openssl=y
# CONFIG_PACKAGE_libustream-wolfssl is not set
# CONFIG_PACKAGE_libwolfssl is not set
CONFIG_PACKAGE_libxtables=y
CONFIG_PACKAGE_lldpd=y
# CONFIG_PACKAGE_odhcp6c is not set
# CONFIG_PACKAGE_odhcpd-ipv6only is not set
# CONFIG_PACKAGE_ppp is not set
# CONFIG_PACKAGE_procd-ujail is not set
CONFIG_PACKAGE_tcpdump=y
CONFIG_PACKAGE_terminfo=m
CONFIG_PACKAGE_uboot-envtools=y
# CONFIG_PACKAGE_wpad-basic-wolfssl is not set
CONFIG_PACKAGE_zlib=y
# CONFIG_PER_FEED_REPO is not set
CONFIG_TESTING_KERNEL=y
CONFIG_VERSIONOPT=y
CONFIG_VERSION_BUG_URL=""
CONFIG_VERSION_CODE=""
CONFIG_VERSION_DIST="OpenWrt"
CONFIG_VERSION_FILENAMES=y
CONFIG_VERSION_HOME_URL=""
CONFIG_VERSION_HWREV=""
CONFIG_VERSION_MANUFACTURER=""
CONFIG_VERSION_MANUFACTURER_URL=""
CONFIG_VERSION_NUMBER=""
CONFIG_VERSION_PRODUCT=""
CONFIG_VERSION_REPO="http://owrt.mork.no/%R"
CONFIG_VERSION_SUPPORT_URL=""
# CONFIG_WPA_WOLFSSL is not set
# CONFIG_VERSION_CODE_FILENAMES is not set

ccache is known to miscompile at times, especially (but not only) after toolchain changes. It should have gotten better recently, but occasional issues are still to be expected.

Thanks. Then I'll assume ccache is the problem, although I don't understand why the NR7101 didn't fail, given that it has the exact same SoC.

Will continue to build without ccache and reopen the issue if I can reproduce the problem without it.

Re-opening this since I managed to reproduce the issue, this time without ccache and using default toolchain versions. But now with console.

And I believe this is an alignment issue. Look at the log below. The device tree blob is placed at 0x862d1d0c, i.e. 4 byte aligned. This is no good, is it? Shouldn't that be 8 byte aligned?

Looks like the device tree is completely ignored by the OpenWrt kernel. There is no "MIPS: machine is Ubiquiti UniFi 6 Lite" printed after the CPU revision line, and as you can see the boot fails due to a missing mediatek,mt7621-sysc node, which obviously is there in the included mt7621.dtsi.

The BIG question now is: How do we fix this? Does anyone know how the FIT image building works?

SF: Detected mx25l25635f with page size 256 Bytes, erase size 64 KiB, total 32 MiB
reading kernel 0 from: 0x1d0000, size: 0x002d5000
## Loading kernel from FIT Image at 86000000 ...
   Using 'config@1' configuration
   Verifying Hash Integrity ... OK
   Trying 'kernel-1' kernel subimage
     Description:  MIPS OpenWrt Linux-5.15.76
     Type:         Kernel Image
     Compression:  lzma compressed
     Data Start:   0x860000e4
     Data Size:    2956005 Bytes = 2.8 MiB
     Architecture: MIPS
     OS:           Linux
     Load Address: 0x80001000
     Entry Point:  0x80001000
     Hash algo:    crc32
     Hash value:   e1bc9460
     Hash algo:    sha1
     Hash value:   6510c4ada31aeea81f2e8e537f78cb367e1c7fab
   Verifying Hash Integrity ... crc32+ sha1+ OK
## Loading fdt from FIT Image at 86000000 ...
   Using 'config@1' configuration
   Trying 'fdt-1' fdt subimage
     Description:  MIPS OpenWrt ubnt_unifi-6-lite device tree blob
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x862d1d0c
     Data Size:    11387 Bytes = 11.1 KiB
     Architecture: MIPS
     Hash algo:    crc32
     Hash value:   16bb5a14
     Hash algo:    sha1
     Hash value:   40f26bf28e33bbe661fec716929f2003301f5e4d
   Verifying Hash Integrity ... crc32+ sha1+ OK
   Booting using the fdt blob at 0x862d1d0c
   Uncompressing Kernel Image ... OK
   Using Device Tree in place at 862d1d0c, end 862d7986
[    0.000000] Linux version 5.15.76 (bjorn@canardo) (mipsel-openwrt-linux-musl-gcc (OpenWrt GCC 11.3.0 r21167-1673b7dca384) 11.3.0, GNU ld (GNU Binutils) 2.37) #0 SMP Wed Nov 2 15:53:34 2022
[    0.000000] SoC Type: MediaTek MT7621 ver:1 eco:3
[    0.000000] printk: bootconsole [early0] enabled
[    0.000000] CPU0 revision is: 0001992f (MIPS 1004Kc)
[    0.000000] Initrd not found or empty - disabling initrd
[    0.000000] VPE topology {2,2} total 4
[    0.000000] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    0.000000] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.000000] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000000000000-0x000000000fffffff]
[    0.000000]   HighMem  empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x000000000fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x000000000fffffff]
[    0.000000] OF: fdt: No valid device tree found, continuing without
[    0.000000] percpu: Embedded 11 pages/cpu s15632 r8192 d21232 u45056
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 64960
[    0.000000] Kernel command line: rootfstype=squashfs,jffs2
[    0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes, linear)
[    0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
[    0.000000] Writing ErrCtl register=0004a000
[    0.000000] Readback ErrCtl register=0004a000
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 248504K/262144K available (7250K kernel code, 660K rwdata, 1536K rodata, 1228K init, 242K bss, 13640K reserved, 0K cma-reserved, 0K highmem)
[    0.000000] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.000000] rcu: Hierarchical RCU implementation.
[    0.000000] 	Tracing variant of Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[    0.000000] NR_IRQS: 256
[    0.000000] Kernel panic - not syncing: Failed to find mediatek,mt7621-sysc node
[    0.000000] Rebooting in 1 seconds..
[    0.000000] Reboot failed -- System halted

AFAIK, bootloader is the one that chooses where to relocate the DTB

Thanks. Then I guess the boot loader on these devices is buggy? The spec is pretty clear:

https://devicetree-specification.readthedocs.io/en/latest/chapter5-flattened-format.html#alignment

Even if the address is selected by the bootloader I wonder if we can affect it? In my experience, this is related to the image. Reinstalling a known good OpenWrt image always works.

EDIT: In fact, it's pretty straightforward:

bjorn@canardo:/usr/local/src/openwrt$ binwalk bin/r21167-1673b7dca384/targets/ramips/mt7621/openwrt-snapshot-ramips-mt7621-ubnt_unifi-6-lite-squashfs-sysupgrade.bin 

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             Flattened device tree, size: 2969296 bytes, version: 17
228           0xE4            LZMA compressed data, properties: 0x6D, dictionary size: 8388608 bytes, uncompressed size: 9882912 bytes
2956556       0x2D1D0C        Flattened device tree, size: 11387 bytes, version: 17
2969296       0x2D4ED0        Squashfs filesystem, little endian, version 4.0, compression:xz, size: 5224142 bytes, 1061 inodes, blocksize: 262144 bytes, created: 2022-11-02 15:53:34

I guess the boot loader takes a shortcut. Loading this FIT image at 86000000 results in the DTB being placed at 862D1D0C.
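To make the arithmetic explicit, here is a small sketch in Python. The helper names are made up for illustration; the numbers are the FIT load address from the boot log and the DTB offset from the binwalk output. It assumes the boot loader uses the blob in place, as the "Using Device Tree in place" log line suggests.

```python
FDT_ALIGN = 8  # alignment the devicetree spec asks for when a blob is loaded


def blob_address(load_base: int, file_offset: int) -> int:
    """Address of an embedded blob when the image is used in place."""
    return load_base + file_offset


def misalignment(address: int, align: int = FDT_ALIGN) -> int:
    """Bytes past the previous aligned boundary (0 means aligned)."""
    return address % align


# Values copied from the boot log and the binwalk output.
LOAD_BASE = 0x86000000
DTB_OFFSET = 0x2D1D0C

addr = blob_address(LOAD_BASE, DTB_OFFSET)
print(f"dtb at {addr:#x}, misaligned by {misalignment(addr)} bytes")
# prints: dtb at 0x862d1d0c, misaligned by 4 bytes
```

Since the load address itself is 8 byte aligned, the alignment of the DTB in memory is decided entirely by its offset inside the image.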

Hm, there is an option -B for mkimage that makes it align nodes to passed byte size

I believe 8 byte alignment of the DTB should be a default in either mkits or mkimage. There are bound to be more devices with a similar behaviour.

Anyway, just to confirm that this in fact is the issue, I took the stupid route and simply rebuilt the same image with a hard coded 4 byte padding of the kernel (test only - this obviously only helps when the DTB happens to be off by exactly 4 bytes, and will break an image where it was already 8 byte aligned):

bjorn@canardo:/usr/local/src/openwrt$ git diff
diff --git a/target/linux/ramips/image/mt7621.mk b/target/linux/ramips/image/mt7621.mk
index 3ef4cf4efb8f..228c792e9162 100644
--- a/target/linux/ramips/image/mt7621.mk
+++ b/target/linux/ramips/image/mt7621.mk
@@ -1960,7 +1960,7 @@ define Device/ubnt_unifi-6-lite
   DEVICE_MODEL := UniFi 6 Lite
   DEVICE_DTS_CONFIG := config@1
   DEVICE_PACKAGES += kmod-mt7603 kmod-mt7915e
-  KERNEL := kernel-bin | lzma | fit lzma $$(KDIR)/image-$$(firstword $$(DEVICE_DTS)).dtb
+  KERNEL := kernel-bin | lzma | pad-extra 4 | fit lzma $$(KDIR)/image-$$(firstword $$(DEVICE_DTS)).dtb
   IMAGE_SIZE := 15424k
 endef
 TARGET_DEVICES += ubnt_unifi-6-lite

Resulting image looks like

bjorn@canardo:/usr/local/src/openwrt$ binwalk bin/r21167-1673b7dca384/targets/ramips/mt7621/openwrt-snapshot-ramips-mt7621-ubnt_unifi-6-lite-squashfs-sysupgrade.bin 

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             Flattened device tree, size: 2969300 bytes, version: 17
228           0xE4            LZMA compressed data, properties: 0x6D, dictionary size: 8388608 bytes, uncompressed size: 9882912 bytes
2956560       0x2D1D10        Flattened device tree, size: 11387 bytes, version: 17
2969300       0x2D4ED4        Squashfs filesystem, little endian, version 4.0, compression:xz, size: 5224166 bytes, 1061 inodes, blocksize: 262144 bytes, created: 2022-11-02 15:53:34

and boots fine:

SF: Detected mx25l25635f with page size 256 Bytes, erase size 64 KiB, total 32 MiB
reading kernel 0 from: 0x1d0000, size: 0x002d5000
## Loading kernel from FIT Image at 86000000 ...
   Using 'config@1' configuration
   Verifying Hash Integrity ... OK
   Trying 'kernel-1' kernel subimage
     Description:  MIPS OpenWrt Linux-5.15.76
     Type:         Kernel Image
     Compression:  lzma compressed
     Data Start:   0x860000e4
     Data Size:    2956009 Bytes = 2.8 MiB
     Architecture: MIPS
     OS:           Linux
     Load Address: 0x80001000
     Entry Point:  0x80001000
     Hash algo:    crc32
     Hash value:   7f050b69
     Hash algo:    sha1
     Hash value:   a97d4c098d8dfd67864e9a17c35754371c883cf3
   Verifying Hash Integrity ... crc32+ sha1+ OK
## Loading fdt from FIT Image at 86000000 ...
   Using 'config@1' configuration
   Trying 'fdt-1' fdt subimage
     Description:  MIPS OpenWrt ubnt_unifi-6-lite device tree blob
     Type:         Flat Device Tree
     Compression:  uncompressed
     Data Start:   0x862d1d10
     Data Size:    11387 Bytes = 11.1 KiB
     Architecture: MIPS
     Hash algo:    crc32
     Hash value:   16bb5a14
     Hash algo:    sha1
     Hash value:   40f26bf28e33bbe661fec716929f2003301f5e4d
   Verifying Hash Integrity ... crc32+ sha1+ OK
   Booting using the fdt blob at 0x862d1d10
   Uncompressing Kernel Image ... OK
   Using Device Tree in place at 862d1d10, end 862d798a
[    0.000000] Linux version 5.15.76 (bjorn@canardo) (mipsel-openwrt-linux-musl-gcc (OpenWrt GCC 11.3.0 r21167-1673b7dca384) 11.3.0, GNU ld (GNU Binutils) 2.37) #0 SMP Wed Nov 2 15:53:34 2022
[    0.000000] SoC Type: MediaTek MT7621 ver:1 eco:3
[    0.000000] printk: bootconsole [early0] enabled
[    0.000000] CPU0 revision is: 0001992f (MIPS 1004Kc)
[    0.000000] MIPS: machine is Ubiquiti UniFi 6 Lite
[    0.000000] Initrd not found or empty - disabling initrd
[    0.000000] VPE topology {2,2} total 4
[    0.000000] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    0.000000] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.000000] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000000000000-0x000000000fffffff]
[    0.000000]   HighMem  empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x000000000fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x000000000fffffff]
[    0.000000] percpu: Embedded 11 pages/cpu s15632 r8192 d21232 u45056
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 64960
[    0.000000] Kernel command line: console=ttyS0,115200
[    0.000000] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes, linear)
[    0.000000] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
[    0.000000] Writing ErrCtl register=0004a000
[    0.000000] Readback ErrCtl register=0004a000
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 248440K/262144K available (7250K kernel code, 660K rwdata, 1536K rodata, 1228K init, 242K bss, 13704K reserved, 0K cma-reserved, 0K highmem)
[    0.000000] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.000000] rcu: Hierarchical RCU implementation.
[    0.000000] 	Tracing variant of Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[    0.000000] NR_IRQS: 256
[    0.000000] clocksource: GIC: mask: 0xffffffffffffffff max_cycles: 0xcaf478abb4, max_idle_ns: 440795247997 ns
[    0.000004] sched_clock: 64 bits at 880MHz, resolution 1ns, wraps every 4398046511103ns
[    0.008066] Calibrating delay loop... 586.13 BogoMIPS (lpj=2930688)
[    0.066217] pid_max: default: 32768 minimum: 301
[    0.071042] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.078254] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.089214] rcu: Hierarchical SRCU implementation.
[    0.095142] smp: Bringing up secondary CPUs ...
[    0.100604] Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
[    0.100632] Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
[    0.100647] MIPS secondary cache 256kB, 8-way, linesize 32 bytes.
[    0.100694] CPU1 revision is: 0001992f (MIPS 1004Kc)
etc

Check out the discussion thread:
https://www.mail-archive.com/u-boot@lists.denx.de/msg386990.html

Basically, you can try passing -B 8 to mkimage in OpenWrt, that should force everything to be 8 byte aligned.

I can't make that work. The -B doesn't seem to make any difference unless paired with -E.

So I'm still looking for a way to make mkimage align the dtb without changing anything else. Or at least without creating a broken image for some other reason.

The code that makes this fail seems to come from mainline U-Boot:

"disable_relocation" is set if the "fdt_high" environment variable is ~0UL. This, and the alignment issues it causes, is documented on https://u-boot.readthedocs.io/en/latest/usage/environment.html :

If this is set to the special value 0xffffffff (32-bit machines) or 0xffffffffffffffff (64-bit machines) then the fdt will not be copied at all on boot. For this to work it must reside in writable memory, have sufficient padding on the end of it for u-boot to add the information it needs into it, and the memory must be accessible by the kernel. This usage is strongly discouraged however as it also stops U-Boot from ensuring the device tree starting address is properly aligned and a misaligned tree will cause OS failures.

Now, the "fdt_high" variable does not exist in the default env, and attempting to define it to something else makes no difference. So I assume that Ubnt in their wisdom have patched this U-boot to always use this "strongly discouraged" feature.

Been knocking my head against a brick wall for a while now. This is NOT trivial to fix.

What we need to make the Unifi 6 Lite boot loader work is a way to make sure the "data" property of the "/images/fdt-1" node is aligned to 8 bytes. And to make it generic we should be able to guarantee that for any number of "/images/fdt*" nodes.

I really cannot see any way this can be done perfectly without extending dts and dtc with an alignment flag or similar, and let dtc insert FDT_NOPs (0x00000004) if necessary.

I have a strong feeling that something like that is impossible.....
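For what it's worth, this is roughly what such a hypothetical dtc alignment feature would have to do when emitting the property. It is only a sketch in Python of the idea, not anything dtc actually offers: `emit_aligned_prop` is an invented name, offsets are counted from the start of the FIT blob, and the blob itself is assumed to sit at an 8 byte aligned address (86000000 in the log). In the flattened format a property is the FDT_PROP token followed by a 4 byte length and a 4 byte name offset, so the value starts 12 bytes after the token.

```python
import struct

FDT_PROP = 0x00000003  # token that introduces a property
FDT_NOP = 0x00000004   # token that parsers skip


def emit_aligned_prop(buf: bytearray, nameoff: int, value: bytes, align: int = 8) -> int:
    """Append a property to buf so that its value starts on an `align` boundary.

    Returns the offset of the value within buf.
    """
    if len(buf) % 4:
        raise ValueError("structure block must stay 4-byte aligned")
    # The value begins 12 bytes after the FDT_PROP token (token + len + nameoff).
    while (len(buf) + 12) % align:
        buf += struct.pack(">I", FDT_NOP)
    buf += struct.pack(">III", FDT_PROP, len(value), nameoff)
    data_offset = len(buf)
    buf += value
    buf += b"\0" * (-len(value) % 4)  # property values are padded to 4 bytes
    return data_offset


# Toy usage: 8 bytes already emitted, so one FDT_NOP is needed.
blob = bytearray(8)
offset = emit_aligned_prop(blob, nameoff=0, value=b"\xd0\x0d\xfe\xed")
print(offset, offset % 8)
```

With an 8 byte alignment target at most one FDT_NOP is ever needed per property, because the structure block only moves in 4 byte steps.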

Failing that solution, the next best I can come up with is to try to create the image, and optionally redo it with 4 dummy bytes added in front of "data" properties containing embedded fdt(s). The "description" property of the "/images/fdt-1" node looks like a good place to put those dummy bytes, making it local to the node and allowing the method to work for any number of embedded fdt blobs.

It would obviously be better to pre-calculate the required alignment dummy injections instead of having to do a two-pass image creation. But I find that calculation very hard to do without actually doing all the work of mkits and dtc outside those tools. Which will be both fragile and rather pointless. That's why I ended up with the "let's just do it twice and adjust as necessary".
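The decision step of that two-pass idea could look something like the sketch below. This is Python pseudo-tooling with invented helper names, not the real build rule. It assumes the "data" offset has been measured from the first-pass image (for example with binwalk), and it relies on the fact that property values in a flattened tree are padded to a multiple of 4 bytes, so growing the "description" string by 4 characters moves everything after it by exactly 4 bytes. Whether trailing filler characters survive mkits and dtc unchanged is also an assumption.

```python
def padded_len(n: int) -> int:
    """Space a property value of n bytes occupies (padded to 4 bytes)."""
    return (n + 3) & ~3


def dummy_bytes_needed(data_offset: int, align: int = 8) -> int:
    """Filler bytes to add in front of the fdt "data" after the first pass."""
    gap = -data_offset % align
    if gap % 4:
        raise ValueError("property data is expected to be 4-byte aligned")
    return gap


def pad_description(description: str, extra: int) -> str:
    """Append `extra` filler characters to the description string."""
    return description + " " * extra


# First pass: offset of the embedded DTB as reported by binwalk.
first_pass_offset = 0x2D1D0C
description = "MIPS OpenWrt ubnt_unifi-6-lite device tree blob"

extra = dummy_bytes_needed(first_pass_offset)
padded = pad_description(description, extra)

# +1 accounts for the terminating NUL of the string property.
shift = padded_len(len(padded) + 1) - padded_len(len(description) + 1)
print(extra, shift, hex(first_pass_offset + shift))
```

If the first pass already comes out aligned, `dummy_bytes_needed` returns 0 and no second pass is required.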

What do you think? I guess this will be a one-off rule for the Unifi 6 Lite for now.

Except that there is something I'm missing here. I took a quick look at the OEM firmware, which embeds no fewer than 5 fdt blobs for different models. I assumed they would all be 8 byte aligned, but they are not:

$ hexdump -C /tmp/mtd6| egrep 'd0 *0d *fe *ed'
00000000  d0 0d fe ed 00 b7 da af  00 00 00 38 00 b7 d3 70  |...........8...p|
00b6e2b0  00 00 00 27 d0 0d fe ed  00 00 2b 51 00 00 00 38  |...'......+Q...8|
00b71060  00 00 00 03 00 00 2b 76  00 00 00 27 d0 0d fe ed  |......+v...'....|
00b73e40  00 00 2a 7f 00 00 00 27  d0 0d fe ed 00 00 2a 7f  |..*....'......*.|
00b76b30  d0 0d fe ed 00 00 2a 83  00 00 00 38 00 00 27 64  |......*....8..'d|
00b79810  00 00 2b 55 00 00 00 27  d0 0d fe ed 00 00 2b 55  |..+U...'......+U|

Doh! Of course. This is related to changes in the kernel, isn't it? We know the boot loader on the Unifi 6 Lite accepts the unaligned blobs - my failing boot log shows that. The problem is with the 5.15 kernel. I seem to recall that there were some libfdt changes wrt alignment, weren't there?

Yes, back to solving the problem. No help from OEM here. We do need to align the blobs if we are to boot our kernels. Or re-add the kernel alignment fixups, which I think I hate more than trying to extend dts...

FYI

The kernel commit introducing the 8 byte alignment enforcement was https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=79edff12060f in v5.12

This imported the dtc commit https://git.kernel.org/pub/scm/utils/dtc/dtc.git/commit/?id=5e735860c478 into the kernel, and has caused quite a bit of fallout. Just go look for Fixes tags referring to 79edff12060f.

But in this case, fixing means replacing U-Boot. And I don't think we want to make that mandatory for running OpenWrt on these devices unless we have to.

The other alternative is to create FIT images with 8 byte alignment of the /images/fdt-1 "data" property. I've not found a way to do that using the dtc and mkimage tools, except manipulating the input to make the end result work. So that's what I've attempted to do: https://patchwork.ozlabs.org/project/openwrt/patch/20221103175447.2941098-1-bjorn@mork.no/

(feel free to laugh :slight_smile:)

Finally, for the record, the OEM images are not ensuring 8 byte aligned fdt blobs either. They survive because they run an old kernel. Grepping for the fdt magic in an OEM image shows this:

$ hexdump -C /tmp/mtd6|grep 'd0 0d fe ed'
00000000  d0 0d fe ed 00 b7 da af  00 00 00 38 00 b7 d3 70  |...........8...p|
00b6e2b0  00 00 00 27 d0 0d fe ed  00 00 2b 51 00 00 00 38  |...'......+Q...8|
00b71060  00 00 00 03 00 00 2b 76  00 00 00 27 d0 0d fe ed  |......+v...'....|
00b73e40  00 00 2a 7f 00 00 00 27  d0 0d fe ed 00 00 2a 7f  |..*....'......*.|
00b76b30  d0 0d fe ed 00 00 2a 83  00 00 00 38 00 00 27 64  |......*....8..'d|
00b79810  00 00 2b 55 00 00 00 27  d0 0d fe ed 00 00 2b 55  |..+U...'......+U|

The magic at 0 is of course the FIT image itself. The other 5 are different fdts for different models supported by this image. As you can see, the 1st and 2nd of these are only aligned to 4 bytes. That's the "u6Lite-fdt" and "u6IW-fdt".
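A grep over hexdump output can miss a magic that straddles two 16 byte lines, so here is a rough Python equivalent that scans the raw bytes and reports alignment directly. It is only a sketch: `find_fdt_blobs` is an invented name, and a bare magic match can be a false positive inside compressed data, so treat hits as candidates.

```python
FDT_MAGIC = bytes.fromhex("d00dfeed")


def find_fdt_blobs(data: bytes, align: int = 8):
    """Return (offset, totalsize, is_aligned) for every FDT magic in data."""
    hits = []
    pos = data.find(FDT_MAGIC)
    while pos != -1:
        if pos + 8 <= len(data):
            # totalsize is the big-endian 32-bit field right after the magic.
            totalsize = int.from_bytes(data[pos + 4:pos + 8], "big")
            hits.append((pos, totalsize, pos % align == 0))
        pos = data.find(FDT_MAGIC, pos + 1)
    return hits


# Synthetic sample: one blob header at offset 0, one at offset 20.
sample = (FDT_MAGIC + (36).to_bytes(4, "big") + bytes(12)
          + FDT_MAGIC + (16).to_bytes(4, "big") + bytes(8))
for offset, size, ok in find_fdt_blobs(sample):
    print(f"{offset:#x}: size {size}, {'aligned' if ok else 'NOT 8-byte aligned'}")

# Offsets of the 5 embedded fdts, read off the hexdump lines above.
OEM_OFFSETS = [0xB6E2B4, 0xB7106C, 0xB73E48, 0xB76B30, 0xB79818]
print([off % 8 for off in OEM_OFFSETS])
# prints: [4, 4, 0, 0, 0]
```

On a real dump this would be run as `find_fdt_blobs(open("/tmp/mtd6", "rb").read())`, using the path from the hexdump command.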

Nice one bisecting to the commit that exposed this.

The proper way to fix this would be to expand mkimage with an option to enforce 8-byte alignment.

Must admit that I didn't bother bisecting this. I was already onto that commit. Not the first time we hit it. It broke the realtek target as one of the few using raw appended dtbs on MIPS: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6654111c893fec1516d83046d2b237e83e0d5967

That one was a bit easier to find though, since the initramfs code appends a lone 32bit integer to an aligned image, guaranteeing that the dtb always was unaligned.

I tend to use "bisect" loosely, not just for git bisect but for pretty much any way of finding the culprit commit. Usually you can spot potential commits by looking at the history, as you did.

It's always interesting to see the fallout of actually enforcing a spec that has been there for years.