The router
I have a QNAP QHora-322 router. This is an iEi Puzzle M902 with a
different software load. (iEi owns QNAP, so it's an in-house reuse.)
The software on both the QHora-322 and the Puzzle M902 is derived
from OpenWRT, I believe. The QHora software adds a lot of extra stuff,
however, to enable scalable management.
It's a pretty nice box: six 2.5 Gb/s Ethernet ports and three 10 Gb/s
ports, plenty of flash & RAM, and an M.2 slot for an NVMe SSD.
OpenWRT supports this hardware and has a page for it, with a link from
its entry in the Table of Hardware:
https://openwrt.org/toh/iei/puzzle-m902
But there are some problems with the information on that page, as I'll
relate below. Some of these stem from tiny differences in the boot
setup between the Puzzle M902 box and the QHora-322 that go all the
way down to the u-boot configuration. The OpenWRT web page gives
firmware-flashing details for the Puzzle M902; they're close to, but
not identical to, what you need to do when the hardware is running the
QHora-322 firmware.
I'm trying to understand the low-level details of the setup and have
run into some mysteries. I'm wondering if anyone knowledgeable can
educate me.
Puzzle M902 & QHora-322 boot are not the same
There's a mystery environment variable, $current_entry, that must
be set in u-boot. It is not mentioned on the OpenWRT web page, and
because of this, the firmware-flashing instructions will fail on a
QHora-322. If you hunt around this forum, you can find one or two
posts that show this setting in firmware-installation instructions,
but it's not explained, just presented.
Here's the root of the problem: u-boot is configured to load the
kernel and its accompanying device-tree blob (dtb) from a pair of
files that live on a filesystem in partition /dev/mmcblk0p1 of the 4GB
eMMC chip. Here are the names of these two files that u-boot loads:
System | kernel | DTB |
---|---|---|
iEi Puzzle M902 stock | Image | cn9132-puzzle-m902.dtb |
OpenWRT 24.10.0 | Image | cn9132-puzzle-m902.dtb |
QHora-322 stock | Image | cn9132-db-A.dtb |
This means that on a system running Puzzle or OpenWRT, u-boot does a
ext4load mmc 0:1 0x6000000 cn9132-puzzle-m902.dtb
to load the DTB into RAM; but booting QHora-322 requires
ext4load mmc 0:1 0x6000000 cn9132-db-A.dtb
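If you're not sure which of the two layouts a given box has, you can check which DTB file is actually sitting on the boot partition from a running linux. Here is a minimal sketch, assuming the partition is already mounted somewhere. The helper name dtb_in_dir and the mount point /mnt/boot are made up for illustration; the two file names are the ones from the table above.

```shell
#!/bin/sh
# dtb_in_dir DIR: print the name of whichever known DTB is present in DIR.
# Hypothetical helper; it only knows the two file names from the table.
dtb_in_dir() {
    for f in cn9132-puzzle-m902.dtb cn9132-db-A.dtb; do
        if [ -f "$1/$f" ]; then
            echo "$f"
            return 0
        fi
    done
    echo "no known DTB in $1" >&2
    return 1
}

# Example use on the router (needs root):
#   mount -o ro /dev/mmcblk0p1 /mnt/boot
#   dtb_in_dir /mnt/boot
```

Whatever name this prints is the one the ext4load line has to reference. If both files are present, the sketch reports the Puzzle name first.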
This little piece of the boot process is contained in a u-boot
environment variable, $bootcmd. This inconsistency is annoying, but
it's not that big a deal to fix. You can change $bootcmd to the
right thing from the serial-port console, or from a running linux with
the fw_setenv(1) program. Fine.
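For the fw_setenv route, something like the following sketch would do the edit without retyping the whole command string. It assumes fw_printenv and fw_setenv are installed and that /etc/fw_env.config points at the environment u-boot really uses. The function names swap_dtb and fix_bootcmd are made up for illustration, and nothing is written unless you call fix_bootcmd yourself.

```shell
#!/bin/sh
# swap_dtb STRING: replace the QHora DTB name with the OpenWRT one.
swap_dtb() {
    printf '%s\n' "$1" | sed 's/cn9132-db-A\.dtb/cn9132-puzzle-m902.dtb/g'
}

# fix_bootcmd: read the current bootcmd, swap the DTB name, write it back.
# Assumption: fw_printenv/fw_setenv work on this box. Not called here.
fix_bootcmd() {
    old=$(fw_printenv -n bootcmd) || return 1
    new=$(swap_dtb "$old")
    echo "old: $old"
    echo "new: $new"
    fw_setenv bootcmd "$new"
}
```

Unlike setenv at the u-boot prompt, fw_setenv writes the change to flash straight away, so there is no separate saveenv step on this path.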
But it doesn't work. Here's what happens. I install the OpenWRT DTB
cn9132-puzzle-m902.dtb onto the flash, change u-boot's $bootcmd to
refer to it by its right name, do a u-boot saveenv command to save
the new $bootcmd back to flash, and reboot. The hardware comes back
up to u-boot and I interrupt to get a u-boot interactive prompt --
where I discover that $bootcmd has somehow been reset to the old
command string, the one that tries to load the DTB from
cn9132-db-A.dtb! Which won't work, of course: the kernel will come up
without a DTB and quietly fail.
The mystery variable $current_entry
As far as I can tell, the culprit is this other, mystery u-boot
environment variable: $current_entry. The QHora-322 install sets it
to 1. If you change it to 0 (and change $bootcmd to the correct boot
commands for OpenWRT, and then save both changes with a saveenv)...
then you win.
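The same change can be sketched from a running linux with fw_setenv. This assumes fw_setenv writes to the same environment that u-boot reads, which is not confirmed for this box. The helper names run and select_entry are hypothetical, and limiting the value to 0 or 1 is an assumption based only on the two values seen so far. By default the sketch just prints what it would do; set DRY_RUN=0 to really write.

```shell
#!/bin/sh
# run CMD...: print the command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = 0 ]; then
        "$@"
    fi
}

# select_entry N: set current_entry to N in the u-boot environment.
# Assumption: only 0 and 1 are meaningful values.
select_entry() {
    case "$1" in
        0|1) run fw_setenv current_entry "$1" ;;
        *)   echo "current_entry must be 0 or 1" >&2; return 1 ;;
    esac
}

# Example: select_entry 0            (prints the command only)
#          DRY_RUN=0 select_entry 0  (writes to flash)
```

The dry-run default is deliberate: a wrong environment write on this hardware can leave you needing the serial console to recover.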
This little bit of $current_entry magic is not shown in the
firmware-flashing instructions on the OpenWRT web page for the Puzzle
M902. If you poke around on the OpenWRT forum, you will stumble across
firmware-flashing instructions for the QHora-322 that do specify
it... but they don't say why. Even more mysteriously, you can print
out the entire set of u-boot environment variables and look for a
command string that uses $current_entry -- or, indeed, any
reference to this variable at all. Nothing. It's simply not
referenced. I couldn't find any mention of it in the documentation on
the u-boot web site https://docs.u-boot.org/en/latest/. I looked
around with Google: nada.
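The search through the environment can be done mechanically. This sketch reads name=value lines, the format that fw_printenv and u-boot's printenv both produce, and lists every variable whose value mentions a given name. The helper name refs_to is made up for illustration.

```shell
#!/bin/sh
# refs_to NAME: read "name=value" lines on stdin and print the names of
# variables whose value mentions NAME. The variable NAME itself is skipped.
refs_to() {
    awk -v name="$1" '
        {
            eq = index($0, "=")
            if (eq == 0) next
            var = substr($0, 1, eq - 1)
            val = substr($0, eq + 1)
            if (var != name && index(val, name) > 0) print var
        }'
}

# Example use on the router:
#   fw_printenv | refs_to current_entry
```

An empty result here matches what is described above. One possible reading, and it is only a guess, is that the variable is consumed by code compiled into this u-boot build rather than by any script stored in the environment.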
Some people clearly know about it, because some of the
firmware-updating instructions provided by posts in this forum
manipulate it correctly. But none of these posts explain how it works,
what it does. It's just part of the firmware-flashing voodoo you are
told to type in.
I suspect/guess that this variable has something to do with the fact
that the Puzzle M902's 4GB eMMC comes with a dual-boot partition
structure. There are two partitions for a kernel/dtb pair, and two
more partitions for the root filesystem. Here's the total partition
structure:
Device | Label | What's in the filesystem |
---|---|---|
mmcblk0p1 | kernel_1 | Ext4: Image, cn9132-puzzle-m902.dtb, boot.scr |
mmcblk0p2 | kernel_2 | Ext4: Image, cn9132-db-A.dtb |
mmcblk0p3 | rootfs_1 | Squashfs: read-only root |
mmcblk0p4 | rootfs_2 | Ext4: usr.squashfs rootfs.squashfs |
mmcblk0p5 | sys_log | Ext4: /var/log QHora setup |
mmcblk0p6 | reserved | Ext4: empty |
mmcblk0p7 | rootfs_data | Ext4: r/w overlay root |
(Oddly, on the QHora setup, the rootfs partition has an ext4 filesystem
that contains only two files, filesystem images for /usr and /, which
are mounted via loop devices. I don't know why they do this; they could
have just made them two subdirectories and then mounted them into the
correct places with bind mounts.)
Theories
Here are my guesses as to what's going on:
- The QNAP people have a dual-boot structure for ease of upgrade,
  so that the upgrade can be done from a linux running out of
  kernel_1 and rootfs_1 writing into kernel_2 and rootfs_2. Once
  the new firmware is installed, the running linux can frob
  $current_entry so that the bootloader will use the newly
  written partitions on a reboot. On the next upgrade, the roles
  of the partitions are reversed. Or maybe it's to provide rollback
  ability in case an upgrade turns out to have been a mistake. Maybe?
- The $current_entry env-var setting affects some piece of u-boot that
  I can't see that is supposed to select between the two boot choices.
- And somehow there are two different places on the flash where u-boot
  stores two independent sets of env vars, and somehow the
  $current_entry setting affects which set u-boot uses? Which is odd
  and which I really don't understand.
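One cheap check bearing on the third guess: the fw_printenv/fw_setenv tools read /etc/fw_env.config, where each non-comment line describes one environment location. Two lines mean the tools are set up for u-boot's redundant-environment feature, which keeps two copies of the same environment for safety rather than two independent sets. The sketch below just counts those lines. The helper name env_copies is made up, and the file only shows what the installed firmware's packagers configured, not necessarily what this u-boot build does.

```shell
#!/bin/sh
# env_copies FILE: count the non-blank, non-comment lines in a
# fw_env.config-style file. Each such line is one environment location.
env_copies() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" | wc -l | tr -d ' '
}

# Example use on the router:
#   env_copies /etc/fw_env.config
```

A result of 2 would at least show that more than one environment area exists on the flash. It would not by itself explain how $current_entry picks between boot choices.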
But that's all conjecture. In short, I can see the effects of this machinery,
but I don't understand what the specific machinery is.
Can anyone clue me in?
Thanks.
EKH