Sysupgrade 22.03 to 22.03.1 on x86 fails with unable to determine upgrade device

I've got an x86_64 EFI mini PC running 22.03 with an image built by the firmware selector. OpenWrt runs from an NVMe SSD, onto which I installed it from a USB stick. I now want to upgrade to the new 22.03.1 service release.

I've tried upgrading via luci-app-attendedsysupgrade and also on the CLI with auc. Neither worked. I then requested a new build from the firmware selector for the 22.03.1 release, downloaded the *.img.gz, unzipped it and then ran sysupgrade -i -v --test openwrt-22.03.1-6e69c7720449-x86-64-generic-ext4-combined-efi.img . I get the same result as in the log output of the attended sysupgrade:

Fri Oct 14 11:10:33 CEST 2022 upgrade: Image metadata not present
Fri Oct 14 11:10:33 CEST 2022 upgrade: Unable to determine upgrade device
Image check failed.

I've traced the origin of this message back to /lib/upgrade/platform.sh, there's a function platform_check_image(). There's a reference to 2 other functions: export_bootdevice() and export_partdevice. This is probably where it goes wrong, but I don't know what exactly. If I source /lib/upgrade/common.sh and then run the export_bootdevice() function, I get an error:

root@TopTon:/lib/upgrade# . /lib/upgrade/common.sh 
root@TopTon:/lib/upgrade# export_bootdevice
-ash: cmdline_get_var: not found

This latest error, cmdline_get_var: not found, comes from the rootpart assignment inside export_bootdevice(), which calls cmdline_get_var:

export_bootdevice() {
        local cmdline uuid blockdev uevent line class
        local MAJOR MINOR DEVNAME DEVTYPE
        local rootpart="$(cmdline_get_var root)"
...

What do I need to fix? I basically want to upgrade from 22.03 to 22.03.1, and since this is my first x86 device for OpenWrt I wanted to get familiar with the sysupgrade paths I have. The options presented are great (attended via LuCI, attended via CLI, or a download from the firmware selector), but they all seem to fail on the same check.

You need to source all scripts in /lib/upgrade/ before trying to invoke any procedure.

(
    for inc in /lib/upgrade/*.sh; do
        . "$inc"
    done

    export_bootdevice

    echo "$BOOTDEV_MAJOR:$BOOTDEV_MINOR"
)

Thank you, I've done that:

root@TopTon:/lib/upgrade# (
>     for inc in /lib/upgrade/*.sh; do
>         . "$inc"
>     done
> 
>     export_bootdevice
> 
>     echo "$BOOTDEV_MAJOR:$BOOTDEV_MINOR"
> )
-ash: cmdline_get_var: not found
:

Perhaps there's a file missing that needs to be sourced as well?

root@TopTon:/lib/upgrade# ll
drwxr-xr-x    3 root     root          4096 Oct 14 11:51 ./
drwxr-xr-x   10 root     root          4096 Aug 24 08:25 ../
-rw-r--r--    1 root     root          6860 Sep  3 04:55 common.sh
-rwxr-xr-x    1 root     root           451 Sep  3 04:55 do_stage2*
-rw-r--r--    1 root     root             0 Oct 14 11:51 done
-rw-r--r--    1 root     root             0 Oct 14 11:51 echo
-rw-r--r--    1 root     root             0 Oct 14 11:51 export_bootdevice
-rw-r--r--    1 root     root          2706 Sep  3 04:55 fwtool.sh
drwxr-xr-x    2 root     root          4096 Oct 14 10:29 keep.d/
-rw-r--r--    1 root     root           360 Sep  3 04:55 luci-add-conffiles.sh
-rw-r--r--    1 root     root          3381 Sep  3 04:55 platform.sh
-rwxr-xr-x    1 root     root          4144 Sep  3 04:55 stage2*

Ah yes, it appears that you need to source /lib/functions.sh as well.

Ah, good catch, things look a lot better now:

root@TopTon:/lib/upgrade# . /lib/functions.sh
root@TopTon:/lib/upgrade# export_bootdevice
root@TopTon:/lib/upgrade# set -x
root@TopTon:/lib/upgrade# export_bootdevice
+ export_bootdevice
+ local cmdline uuid blockdev uevent line class
+ local MAJOR MINOR DEVNAME DEVTYPE
+ cmdline_get_var root
+ local 'var=root'
+ local cmdlinevar tmp
+ cat /proc/cmdline
+ tmp='BOOT_IMAGE=/boot/vmlinuz'
+ '[' '=' '=' B ]
+ tmp='=PARTUUID=4fa135cd-1f1f-e744-b9e5-afe86ecd5deb'
+ '[' '=' '=' '=' ]
+ echo 'PARTUUID=4fa135cd-1f1f-e744-b9e5-afe86ecd5deb'
+ tmp=wait
+ '[' '=' '=' w ]
+ tmp='console=tty0'
+ '[' '=' '=' c ]
+ tmp='console=ttyS0,115200n8'
+ '[' '=' '=' c ]
+ tmp=noinitrd
+ '[' '=' '=' n ]
+ local 'rootpart=PARTUUID=4fa135cd-1f1f-e744-b9e5-afe86ecd5deb'
+ '[' -e  ]
+ return 1
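
The trace above shows cmdline_get_var walking the kernel command line token by token until it finds root=. As a rough illustration only, here is a minimal sketch of that kind of lookup. The helper name get_cmdline_var is hypothetical (it is not the function from /lib/functions.sh), it takes the command line as an argument instead of reading /proc/cmdline, and the sample command line is reconstructed from the trace.

```shell
# Hypothetical helper, not the real cmdline_get_var: print the value of
# variable $1 found in the kernel command line string $2.
get_cmdline_var() {
	local var="$1" cmdline="$2" tok
	for tok in $cmdline; do
		case "$tok" in
			"$var"=*)
				# Strip the "name=" prefix and print the rest.
				printf '%s\n' "${tok#"$var"=}"
				return 0
			;;
		esac
	done
	# Variable not present on the command line.
	return 1
}

# Command line reconstructed from the trace (assumption).
cmdline='BOOT_IMAGE=/boot/vmlinuz root=PARTUUID=4fa135cd-1f1f-e744-b9e5-afe86ecd5deb rootwait console=tty0 console=ttyS0,115200n8 noinitrd'
get_cmdline_var root "$cmdline"
```

Note that rootwait is not mistaken for root here, because the pattern requires the literal "root=" prefix.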

So does this mean I need to include the sourcing of /lib/functions.sh in /lib/upgrade/common.sh somewhere? Or is this an indication of something else not being in place?

Notice the return 1 at the end, which means that the function is failing. It cannot figure out the corresponding block device for the partition UUID 4fa135cd-1f1f-e744-b9e5-afe86ecd5deb

The reason is that the partition UUID does not follow a known format. It does not end in 02 or 0x/PARTNROFF=1, and it does not match any of the other known UUID formats.

It likely is a filesystem UUID which OpenWrt cannot obtain without invoking external utils such as blkid. You basically would need to extend the existing switch statement with a final catchall case that attempts to figure out the related blockdev by looking for the block device name in the output of blkid --match-token "$rootpart"

Here's an example from my workstation, using a randomly chosen PARTUUID from one of the mounted filesystems:

$ sudo blkid --match-token PARTUUID=6156991c-f8e1-11e9-9297-00d861a5d2da
/dev/sdf2: PARTLABEL="Microsoft reserved partition" PARTUUID="6156991c-f8e1-11e9-9297-00d861a5d2da"

So we can infer that partition UUID 6156991c-f8e1-11e9-9297-00d861a5d2da in my case corresponds to block device /dev/sdf2.
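
The inference above can be sketched in shell: everything before the first colon of a blkid output line is the device node. This is only an illustration; the helper name blkid_line_to_dev is hypothetical, and the input is the sample line from my workstation rather than live blkid output.

```shell
# Hypothetical helper: extract the device node from one line of blkid
# output of the form "/dev/xyz: KEY="value" ...".
blkid_line_to_dev() {
	case "$1" in
		/dev/*:*)
			# Keep everything before the first colon.
			printf '%s\n' "${1%%:*}"
		;;
		*)
			# Empty or unexpected output: no device found.
			return 1
		;;
	esac
}

line='/dev/sdf2: PARTLABEL="Microsoft reserved partition" PARTUUID="6156991c-f8e1-11e9-9297-00d861a5d2da"'
blkid_line_to_dev "$line"
```

With the sample line this prints /dev/sdf2. An empty string, which is what you get when blkid finds no match, makes the helper return 1.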

Might the notation of the NVMe SSD be the culprit?:

blkid --match-token PARTUUID=4fa135cd-1f1f-e744-b9e5-afe86ecd5deb
/dev/nvme0n1p2: LABEL="rootfs" UUID="ff313567-e9f1-5a5d-9895-3ba130b4a864" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="4fa135cd-1f1f-e744-b9e5-afe86ecd5deb"

The notation of partitions on an NVMe SSD is different from USB (or SATA) naming. The device name is not /dev/sd* but /dev/nvme0n1. I ran into this when installing OpenWrt on the NVMe device using scripts from the wiki. I posted my experience.

No, the notation is not the culprit, the UUID is. Seems you somehow generated a new FS and/or partition table with a random UUID not understood by OpenWrt‘s scripts.

The blkid fallback makes sense though and should be implemented anyway as it will make upgrades on at least x86 more robust.

I had to create a new UUID since I have a 250GB NVMe SSD and wanted to enlarge the partition and filesystem. I followed the article in the wiki on how to do that. The example scripts shown there didn't work for me, so I replayed the steps manually. One of the steps is deleting and re-creating the root partition, but this time larger. Doing so generates a new UUID, probably by fdisk, because the next step is to retrieve the new PARTUUID:

lsblk -o PATH,SIZE,PARTUUID

If I manually replay one of the cases in the switch statement, I need to eventually end up with a uevent file of the root device. In the particular one I'm after this file says:

root@TopTon:/lib/upgrade# cat /sys/class/block/nvme0n1p2/uevent 
MAJOR=259
MINOR=2
DEVNAME=nvme0n1p2
DEVTYPE=partition
PARTN=2

If, as a quick check, I hard-code the path to the correct uevent file, echo "$BOOTDEV_MAJOR:$BOOTDEV_MINOR" prints the expected values:

+ export_bootdevice            
+ local cmdline uuid blockdev uevent line class
+ local MAJOR MINOR DEVNAME DEVTYPE
+ cmdline_get_var root
+ local 'var=root'   
+ local cmdlinevar tmp
+ cat /proc/cmdline         
+ tmp='BOOT_IMAGE=/boot/vmlinuz'
+ '[' '=' '=' B ]
+ tmp='=PARTUUID=4fa135cd-1f1f-e744-b9e5-afe86ecd5deb'
+ '[' '=' '=' '=' ]
+ echo 'PARTUUID=4fa135cd-1f1f-e744-b9e5-afe86ecd5deb'                                                                
+ tmp=wait
+ '[' '=' '=' w ]
+ tmp='console=tty0'
+ '[' '=' '=' c ]
+ tmp='console=ttyS0,115200n8'
+ '[' '=' '=' c ]
+ tmp=noinitrd
+ '[' '=' '=' n ]
+ local 'rootpart=PARTUUID=4fa135cd-1f1f-e744-b9e5-afe86ecd5deb'
+ uevent=/sys/class/block/nvme0n1p2/uevent
+ '[' -e /sys/class/block/nvme0n1p2/uevent ]
+ read line
+ export -n 'MAJOR=259'
+ read line
+ export -n 'MINOR=2'
+ read line
+ export -n 'DEVNAME=nvme0n1p2'
+ read line
+ export -n 'DEVTYPE=partition'
+ read line
+ export -n 'PARTN=2'
+ read line
+ export 'BOOTDEV_MAJOR=259'
+ export 'BOOTDEV_MINOR=2'
+ return 0
+ echo 259:2
259:2

So that seems to be what I'm after. However, when I now rerun the manual sysupgrade test, I get a message about an invalid partition table on /dev/nvme0n1p2:

root@TopTon:/tmp# sysupgrade -i -v --test openwrt-22.03.1-6e69c7720449-x86-64-generic-ext4-combined-efi.img 
Fri Oct 14 13:04:25 CEST 2022 upgrade: Image metadata not present
Fri Oct 14 13:04:25 CEST 2022 upgrade: Reading partition table from bootdisk...
Fri Oct 14 13:04:25 CEST 2022 upgrade: Invalid partition table on /dev/nvme0n1p2
Failed to parse message data
sh: out of range
Keep config files over reflash (Y/n):

The disk layout of the NVMe SSD is as follows:

root@TopTon:/tmp# fdisk -l
Disk /dev/nvme0n1: 232.89 GiB, 250059350016 bytes, 488397168 sectors
Disk model: Samsung SSD 980 250GB                   
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 16384 bytes / 131072 bytes
Disklabel type: gpt
Disk identifier: 266B958D-769A-BCBD-AA7B-1BAC46AB8600

Device           Start       End   Sectors   Size Type
/dev/nvme0n1p1     512     33279     32768    16M Linux filesystem
/dev/nvme0n1p2   33280 488397134 488363855 232.9G Linux filesystem
/dev/nvme0n1p128    34       511       478   239K BIOS boot

It's an EFI system, but I'm a bit confused on the actual boot partition:

root@TopTon:/tmp# mount
/dev/root on / type ext4 (rw,noatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,noatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,noatime)
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,noatime)
tmpfs on /dev type tmpfs (rw,nosuid,noexec,noatime,size=512k,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,noatime,mode=600,ptmxmode=000)
debugfs on /sys/kernel/debug type debugfs (rw,noatime)
none on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,noatime,mode=700)

I'm familiar with plain old BIOS but not so much with EFI based systems.

Well now the special notation for nvme disks comes into play. At some point the partition suffix should have been stripped to resolve the parent block device: /dev/nvme0n1p2 => /dev/nvme0n1.

For the other working UUID cases this happens implicitly because the UUIDs are actually mangled before being looked up (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx02 => xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx00) which causes the parent disk to get looked up.
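
To make the mangling concrete, here is a minimal sketch of the "ends in 02" case only. The helper name partuuid_to_diskuuid is hypothetical, the glob pattern is my approximation of the format described above rather than a copy of common.sh, and the first UUID is made up for illustration.

```shell
# Hypothetical helper mirroring the described mangling: map a partition
# UUID ending in "02" to the disk-level UUID ending in "00".
partuuid_to_diskuuid() {
	case "$1" in
		????????-????-????-????-??????????02)
			# Drop the trailing "02" and append "00".
			printf '%s\n' "${1%02}00"
		;;
		*)
			# Not in the expected format: the trick does not apply.
			return 1
		;;
	esac
}

# Made-up UUID in the expected format.
partuuid_to_diskuuid 12345678-aaaa-bbbb-cccc-012345678902

# The PARTUUID from this thread does not end in 02, so it falls through.
partuuid_to_diskuuid 4fa135cd-1f1f-e744-b9e5-afe86ecd5deb || echo "no known format"
```

The first call prints 12345678-aaaa-bbbb-cccc-012345678900, while the second prints "no known format", which is the situation export_bootdevice ends up in here.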

For truly random UUIDs you can't obtain the parent UUID just by zeroing the last octets, so you do need to resort to sysfs to figure out the actual parent block device:

$ readlink /sys/class/block/nvme0n1p2
../../devices/pci0000:00/0000:00:1c.4/0000:04:00.0/nvme/nvme0/nvme0n1/nvme0n1p2

So in sysfs the directory for the partition block device is a subdirectory of the parent block device's directory, which means we can reach the parent's uevent file directly through the parent directory:

$ cat /sys/class/block/nvme0n1p2/../uevent
MAJOR=259
MINOR=0
DEVNAME=nvme0n1
DEVTYPE=disk
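
As a rough sketch of the same lookup in script form: the helper below reads MAJOR and MINOR of the parent disk through the ../uevent path. The name parent_disk_devnum and the idea of passing the sysfs root as a parameter are assumptions made for illustration; they are not part of common.sh.

```shell
# Hypothetical helper: given a sysfs root ($1) and a partition name ($2),
# print "MAJOR:MINOR" of the parent disk by reading ../uevent.
parent_disk_devnum() {
	local sysfs="$1" part="$2" uevent line major minor
	# class/block/<part> is a symlink into the device tree, so ".."
	# resolves to the directory of the parent disk.
	uevent="$sysfs/class/block/$part/../uevent"
	[ -e "$uevent" ] || return 1
	while read -r line; do
		case "$line" in
			MAJOR=*) major="${line#MAJOR=}" ;;
			MINOR=*) minor="${line#MINOR=}" ;;
		esac
	done < "$uevent"
	printf '%s:%s\n' "$major" "$minor"
}

# Prints the parent disk's numbers if this partition exists on the
# machine running the sketch, otherwise reports that it was not found.
parent_disk_devnum /sys nvme0n1p2 || echo "not found"
```

On the system from the output above, this would be expected to report the numbers of nvme0n1 rather than those of nvme0n1p2.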

This means that once you have figured out the nvme0n1p2 name, you can simply set the uevent path to /sys/class/block/$name/../uevent. Other cases in the switch statement already do exactly that.

Here is an untested diff that should implement the desired functionality. It requires the blkid tool to be installed on the system.

diff --git a/package/base-files/files/lib/upgrade/common.sh b/package/base-files/files/lib/upgrade/common.sh
index 5af061f6a4..6c582f00b0 100644
--- a/package/base-files/files/lib/upgrade/common.sh
+++ b/package/base-files/files/lib/upgrade/common.sh
@@ -183,6 +183,15 @@ export_bootdevice() {
                                fi
                        done
                ;;
+               PARTUUID=*)
+                       line=$(blkid --match-token "$rootpart" 2>/dev/null)
+                       case "$line" in
+                               /dev/*:*)
+                                       blockdev=${line%%:*}
+                                       uevent="/sys/class/block/${blockdev##*/}/../uevent"
+                               ;;
+                       esac
+               ;;
        esac
 
        if [ -e "$uevent" ]; then

Just applying the patch above seems to address the error I got with the boot device:

root@TopTon:/tmp# sysupgrade -i -v --test openwrt-22.03.1-6e69c7720449-x86-64-generic-ext4-combined-efi.img 
Fri Oct 14 14:15:29 CEST 2022 upgrade: Image metadata not present
Fri Oct 14 14:15:30 CEST 2022 upgrade: Reading partition table from bootdisk...
Fri Oct 14 14:15:30 CEST 2022 upgrade: Extract boot sector from the image
Fri Oct 14 14:15:30 CEST 2022 upgrade: Reading partition table from image...
Fri Oct 14 14:15:30 CEST 2022 upgrade: Partition layout has changed. Full image will be written.

I was able to get an image built and downloaded via the LuCI attended sysupgrade; after rebooting, however, the upgrade doesn't seem to have been applied. Running auc has the same result: it all seems well, but after the reboot I'm still on 22.03.0:

...
kmod-igc: 5.10.138-1 -> 5.10.146-1
 kmod-i2c-algo-bit: 5.10.138-1 -> 5.10.146-1
 kmod-bnx2: 5.10.138-1 -> 5.10.146-1
 kmod-mdio-devres: 5.10.138-1 -> 5.10.146-1
 kmod-scsi-core: 5.10.138-1 -> 5.10.146-1
 kmod-slhc: 5.10.138-1 -> 5.10.146-1
 kmod-sched-core: 5.10.138-1 -> 5.10.146-1
Are you sure you want to continue the upgrade process? [N/y] y
Requesting build.....................................................................................
Downloading image from https://sysupgrade.openwrt.org/store/de8a54f86eac544bc3bdaa83ef3fdc23/openwrt-22.03.1-b212ed9c2093-x86-64-generic-ext4-combined-efi.img.gz
Writing to 'openwrt-22.03.1-b212ed9c2093-x86-64-generic-ext4-combined-efi.img.gz'
Fri Oct 14 14:24:11 CEST 2022 upgrade: Image metadata not present
Fri Oct 14 14:24:11 CEST 2022 upgrade: Reading partition table from bootdisk...
Fri Oct 14 14:24:11 CEST 2022 upgrade: Extract boot sector from the image
Fri Oct 14 14:24:11 CEST 2022 upgrade: Reading partition table from image...
Fri Oct 14 14:24:11 CEST 2022 upgrade: Partition layout has changed. Full image will be written.
invoking sysupgrade
Connection to topton closed by remote host.
Connection to topton closed.
dennis@f12:~$ ping topton
PING topton.thuis (192.168.1.86) 56(84) bytes of data.
64 bytes from TopTon.thuis (192.168.1.86): icmp_seq=4 ttl=64 time=1.01 ms
64 bytes from TopTon.thuis (192.168.1.86): icmp_seq=5 ttl=64 time=0.774 ms
64 bytes from TopTon.thuis (192.168.1.86): icmp_seq=6 ttl=64 time=0.870 ms
64 bytes from TopTon.thuis (192.168.1.86): icmp_seq=7 ttl=64 time=0.728 ms
64 bytes from TopTon.thuis (192.168.1.86): icmp_seq=8 ttl=64 time=0.749 ms
^C
--- topton.thuis ping statistics ---
8 packets transmitted, 5 received, 37.5% packet loss, time 7089ms
rtt min/avg/max/mdev = 0.728/0.825/1.006/0.102 ms
dennis@f12:~$ ssh root@topton


BusyBox v1.35.0 (2022-09-27 11:45:03 UTC) built-in shell (ash)

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
 OpenWrt 22.03.0, r19685-512e76967f
 -----------------------------------------------------

I've connected a monitor, but I can't really see what happens at the end of the sysupgrade. A message is written to the console, but the system reboots right after it appears on the monitor. Sysupgrade logs aren't stored anywhere, so I have no idea what the last few lines are.

I believe the fact that my /dev/nvme0n1p1 partition isn't mounted on /boot plays a role in my struggles. I don't know why this boot partition isn't mounted at boot. I've booted the system with the USB stick I used to "install" OpenWrt on the internal NVMe disk. When running from that USB stick I see that /dev/sda1 is actually mounted on /boot. But this is not the case in the installation on my NVMe disk:

root@OpenWrt:/boot/grub# mount
/dev/root on / type ext4 (rw,noatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,noatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,noatime)
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,noatime)
/dev/sda1 on /boot type vfat (rw,noatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
/dev/sda1 on /boot type vfat (rw,noatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
tmpfs on /dev type tmpfs (rw,nosuid,noexec,noatime,size=512k,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,noatime,mode=600,ptmxmode=000)
debugfs on /sys/kernel/debug type debugfs (rw,noatime)
none on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,noatime,mode=700)

-- EDIT --

I've fixed my /boot problem. I simply did dd if=/dev/sda1 of=/dev/nvme0n1p1 bs=1M. Before rebooting I mounted /dev/nvme0n1p1 and updated boot/grub/grub.cfg so it has the correct PARTUUID. After rebooting, the /boot partition is mounted. But my sysupgrade still doesn't do anything. I've tried the attended upgrade via LuCI, auc and sysupgrade -i -v. After a reboot I'm still on 22.03.0 instead of 22.03.1.

I don't see any errors anymore thanks to @jow. Are there any checks I can do to see what actually happens?

-- EDIT 2 --
Actually my original question is answered, I'll mark that as the solution and open a new topic for my sysupgrade not working as expected.

I applied the patch from post #11, but I'm still not able to upgrade from 22.03.1 to 22.03.2. I checked /boot and it is being mounted.

/dev/root on / type ext4 (rw,noatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,noatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,noatime)
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,noatime)
/dev/nvme0n1p1 on /boot type vfat (rw,noatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
/dev/nvme0n1p1 on /boot type vfat (rw,noatime,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
tmpfs on /dev type tmpfs (rw,nosuid,noexec,noatime,size=512k,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,noatime,mode=600,ptmxmode=000)
debugfs on /sys/kernel/debug type debugfs (rw,noatime)
none on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,noatime,mode=700)

Did you also install the blkid package with opkg?

That was the first thing I did.

EDIT:
I just tried again and auc succeeded. Not sure why the first sysupgrade from /tmp didn't work.

Do I understand correctly that you are able to sysupgrade your x86 machine with auc and succeeded? I tried to do that too, also via the luci-app and by downloading an *.img.gz but after a reboot my x86 machine stays at 22.03.

Have you enlarged the root partition (and filesystem)?

EDIT:
My bad. I just noticed my x86 device is still at 22.03.1, and running auc still fails to upgrade. It was my dumb AP that I upgraded last night.

Thanks for getting back, I thought I had done something wrong or had an issue on my x86 machine. I think the reason for any upgrade method not working is because the root partition is enlarged. I’ve got a week off next week so I’m going to try a different approach on “x86 upgrade scheme” for my setup. Probably multiple partitions with room to spare and multi boot grub and a separate larger “data” partition for AdGuard Home and backups and so on.
