RaspberryPi sysupgrade loses overlay when /boot partition gets bigger

I'm trying (and failing so far) to figure out why my 19.07-based RaspberryPi installations lose all config data when sysupgrading to 22.03. The 19.07 setups already use squashfs+f2fs, not ext4. It is a custom build, but basically it just has fewer routing-related packages and more IoT ones enabled in menuconfig.

I narrowed the problem down to the fact that the /boot partition increases from 20M to 64M. If I build the 22.03 image with a 20M boot partition (which is still just barely enough for bcm2708, but too small for bcm2709), the upgrade works.

I also found that it is not a 19.07 vs 22.03 issue: the same problem happens when I upgrade a 22.03 with 20M /boot to one with 64M /boot.

The relevant code seems to be in /target/linux/bcm27xx/base-files/lib/upgrade/platform.sh, in particular the platform_do_upgrade() function. It makes a distinction between partition table unchanged and changed. In the changed case, the entire image with all partitions is written as a whole, followed by an interesting comment and two partx statements:

  # Separate removal and addtion is necessary; otherwise, partition 1
  # will be missing if it overlaps with the old partition 2
  partx -d - "/dev/$diskdev"
  partx -a - "/dev/$diskdev"

The comment exactly describes the case of an expanding /boot partition.
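
To see which of the two steps actually fails, each partx call could be checked separately. This is only a minimal sketch, assuming partx returns a non-zero status on failure; reread_partitions and the PARTX override variable are hypothetical names and not part of platform.sh:

```shell
# Hypothetical helper (not from OpenWrt): re-read the partition table in two
# separate steps and report which one fails.
# PARTX can be set to a different command or shell function for dry runs.
reread_partitions() {
    diskdev="$1"
    partx_cmd="${PARTX:-partx}"

    # Step 1: drop all partitions the kernel currently knows for this disk
    if ! "$partx_cmd" -d - "/dev/$diskdev"; then
        echo "partx -d failed on /dev/$diskdev" >&2
        return 1
    fi

    # Step 2: add the partitions found in the freshly written table
    if ! "$partx_cmd" -a - "/dev/$diskdev"; then
        echo "partx -a failed on /dev/$diskdev" >&2
        return 2
    fi
    return 0
}
```

A return value of 1 would mean the removal step failed, 2 that the add step failed.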

Curiously, sysupgrading from a 22.03 with 64M /boot down to a version with 20M /boot works without losing the overlay backup. This is also a partition table change and triggers the same alternative code path in platform_do_upgrade(), but apparently works.

It seems to me that platform_copy_config() is failing to actually mount /boot after a /boot partition size increase, which would explain why the backup gets lost (including config.txt and cmdline.txt which platform_copy_config() is trying to extract and apply early from the backup so these are already there at reboot).

I totally fail to see what could cause partx to fail to make /boot mountable in this case, and I haven't figured out a way to single-step this to see what happens.

So any ideas how to approach this are very welcome!

In the meantime, I found a way to analyze this, but I have no idea how to fix it yet. I got an interactive console shell after the switchover to ramfs (by modifying /lib/upgrade/stage2), so I could manually enter these partx calls and observe their results, and also call ps, lsof etc.:

  • Problem is that in line 67 of /lib/upgrade/platform.sh (target/linux/brcm2708/base-files/lib/upgrade/platform.sh in buildroot), the partx -d - /dev/$diskdev which expands to

    partx -d - /dev/mmcblk0
    

    fails, because partition 2 is busy ("resource in use").
    In consequence, the larger new /boot partition cannot be added, and cannot be mounted later in platform_copy_config, and thus the config backup cannot be saved in /boot, which means it gets lost.

  • I assume this is because the f2fs overlay is still active somehow. ps shows three kernel threads related to f2fs, [f2fs_flush-7:0], [f2fs_discard-7:], [f2fs_gc-7:0].

  • lsof does not show any files open on /dev/mmcblk0p2, but I guess f2fs internals would not show in lsof anyway.
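
One hint may be in the thread names themselves: f2fs names its kernel threads after the major:minor numbers of the block device it is mounted on, and block major 7 is the Linux loop driver. So 7:0 would point at /dev/loop0 rather than at mmcblk0p2 directly, which would fit the overlay being mounted through a loop device on top of the rootfs partition. That is an inference, not something verified on this setup. A small sketch for checking it from the ramfs shell follows; the function names and the optional sysfs root argument are hypothetical, and it assumes sed, sort, readlink and basename are available there:

```shell
# Read ps output on stdin and print the MAJOR:MINOR suffix of every
# f2fs kernel thread name, e.g. "[f2fs_gc-7:0]" -> "7:0".
f2fs_thread_devs() {
    sed -n 's/.*\[f2fs_[a-z_]*-\([0-9][0-9]*:[0-9][0-9]*\)\].*/\1/p' | sort -u
}

# For a MAJOR:MINOR pair, print the kernel device name and, if it is a
# loop device, the file or device it is backed by.
# The second argument (sysfs root) exists only to allow dry runs.
describe_blockdev() {
    dev="$1"
    sysroot="${2:-/sys}"
    node="$sysroot/dev/block/$dev"

    [ -e "$node" ] || { echo "$dev: no such block device"; return 1; }
    name=$(basename "$(readlink -f "$node")")
    echo "$dev is $name"

    backing="$sysroot/block/$name/loop/backing_file"
    if [ -r "$backing" ]; then
        echo "  backed by: $(cat "$backing")"
    fi
    return 0
}
```

Something like `ps | f2fs_thread_devs` followed by `describe_blockdev 7:0` would then show whether a loop device is what still holds the partition.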

My open questions:

  • are lines 64/65 in /lib/upgrade/stage2 really sufficient to completely unmount a squashfs/f2fs overlay such that the /dev/mmcblk0p2 partition is not in use any more?

    /bin/mount -o noatime,remount,ro /overlay
    /bin/umount -l /overlay
    

    I tried manually to execute these lines, and noticed that the first line (remounting the overlay as readonly) apparently had no effect - /overlay was still rw afterwards.
    Judging from the -l (lazy) umount option in the second line, I assume the author expected the volume to be still in use by something. But by what? And at what later step would it become free?

  • anything else that could keep the /dev/mmcblk0p2 partition busy? At the time the switch_to_ramfs is run, all other processes have been killed, so no userspace process should have any files open that could make the partition busy.

  • any way to force the partition to get released? I guess that would be acceptable at this point in the process, because the only thing left to do is copy the config backup .tgz from /tmp to /boot (and extract config.txt early so it is ready at reboot), then reboot.
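
To make the "still rw after remount" observation repeatable, the mount state could be read straight from /proc/mounts before and after each of the two stage2 lines. A minimal sketch; mount_mode and its optional second argument are hypothetical and exist only for illustration:

```shell
# Print "ro" or "rw" for a mount point as listed in /proc/mounts,
# or print nothing if the mount point is not listed at all.
# If several filesystems are stacked on the same mount point, the last
# (topmost) entry wins.
mount_mode() {
    mnt="$1"
    mounts="${2:-/proc/mounts}"
    awk -v m="$mnt" '
        $2 == m {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro" || opts[i] == "rw")
                    mode = opts[i]
        }
        END { if (mode != "") print mode }
    ' "$mounts"
}
```

Calling `mount_mode /overlay` after the remount line and again after the lazy umount would show whether either step changed anything. Note that after a lazy umount the entry disappears from /proc/mounts even if the filesystem is still alive internally, so empty output does not prove the partition is free.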

Any ideas and hints are welcome!

Also, confirmed experience of successful update of a RPi with squashfs+f2fs with expanding /boot partition would be a helpful indication that something in my setup must be special, apart from not using ext4 for rootfs.


Really nobody else having this problem?

Maybe because OpenWrt on RPi is a niche, and using the squashfs variant seems to be a niche in that niche, and upgrading with /boot size increasing even more, so a niche^3 problem? :wink:

Still - I'd be very interested to learn what can keep the f2fs partition on /dev/mmcblk0p2 busy, when it is unmounted and all user processes except sysupgrade are already gone. Details see the original post and the followup analysis.

Still no solution, but just today, analyzing another bcm27xx sysupgrade problem, I noticed that the regular upgrade function, default_do_upgrade() in package/base-files/files/lib/upgrade/common.sh, does

sync
echo 3 > /proc/sys/vm/drop_caches

The bcm27xx specific implementation, platform_do_upgrade() in target/linux/bcm27xx/base-files/lib/upgrade/platform.sh, does not have that.

The kernel docs say that this includes flushing cached inodes - could it be that not flushing the cache is what keeps f2fs partition busy?

Would it make sense to add echo 3 > /proc/sys/vm/drop_caches to bcm27xx platform_do_upgrade()?
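
If one wanted to try this, the two lines from default_do_upgrade() could be wrapped in a small helper and called right before the partx -d step. This is an untested sketch: whether dropping caches actually releases the partition is exactly the open question, and flush_and_drop_caches with its optional path argument is a hypothetical name:

```shell
# Hypothetical helper: write out dirty data, then ask the kernel to drop
# the page cache plus dentries and inodes (value 3).
# The optional argument replaces the real knob for dry runs.
flush_and_drop_caches() {
    knob="${1:-/proc/sys/vm/drop_caches}"
    sync
    echo 3 > "$knob"
}
```

On a real system it would be called without arguments; writing to /proc/sys/vm/drop_caches requires root, which is the case during sysupgrade.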

Only general ideas:
Are you able to see the f2fs sysfs entries?
/sys/kernel/debug/f2fs/
/sys/fs/f2fs
Either through that sysfs, or the f2fs_io utility, you should be able to make f2fs GC more urgent

There is also a trace config if you wanted to get really deep: CONFIG_F2FS_IO_TRACE

overlay has plenty of debug prints, so you may be able to build with DYNAMIC_DEBUG, then enable debug for it via kernel boot params, or later via /proc/dynamic_debug/control
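
As a rough sketch of the sysfs route, assuming the running kernel exposes a gc_urgent attribute under /sys/fs/f2fs/<device>/ (it is described in the kernel's f2fs sysfs ABI documentation, but may be missing on older kernels). The function name and the optional root argument are made up for illustration:

```shell
# Hypothetical helper: write a value to the gc_urgent attribute of every
# mounted f2fs instance that has one.
# Returns 0 if at least one attribute was written, 1 otherwise.
f2fs_set_gc_urgent() {
    value="$1"
    root="${2:-/sys/fs/f2fs}"
    result=1
    for knob in "$root"/*/gc_urgent; do
        # Skip the unexpanded glob and any read-only attribute
        [ -w "$knob" ] || continue
        if echo "$value" > "$knob"; then
            echo "set $knob to $value"
            result=0
        fi
    done
    return $result
}
```

`f2fs_set_gc_urgent 1` would request urgent GC and `f2fs_set_gc_urgent 0` would switch it off again. If no f2fs directory shows up under /sys/fs/f2fs at that point, that in itself would be useful information.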

Hi @johnth, thanks for these hints!

I will try that when I manage to get the test setup next time (a bit tedious because it is a point in the midst of the sysupgrade, after the pivot to ram-only operation, where a lot of tooling is missing unless I manually copy the needed executables and libs to ram).

A pending f2fs GC sounds plausible as a reason for the underlying partition still being used, but wouldn't unmounting the f2fs also trigger an urgent GC anyway?

Just for information for anybody who might hit the same problem.

Maybe nobody ever will :wink:, because apparently, not many people are using OpenWrt with RaspberryPi at all, and of those few even fewer are using the squashfs/f2fs layout, and of those very few apparently no one remains for whom box-bricking upgrade problems (like this one or another I found and even suggested a patch, with zero feedback) are an actual problem.

Still, for the record:

  • I found no way to make f2fs release the partition reliably.
  • My workaround is now to use losetup to map a loopback partition onto the space of the SD card where the new boot partition was dumped, and mount this, so the config backup and config.txt can be saved. I do this in /target/linux/bcm27xx/base-files/lib/upgrade/platform.sh, only as a fallback when partx -d returns an error.
  • contact me if you want the somewhat ugly patch doing the losetup thing.
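
For readers who only want the general idea of the losetup fallback: the following is a minimal sketch and not the actual patch. It assumes 512-byte sectors, that the start sector of the new boot partition is known from the image's partition table, and that losetup supports `-f` and `-o`. All function names are hypothetical:

```shell
# Convert a start sector into a byte offset (512-byte sectors assumed).
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

# Hypothetical fallback: mount the new boot partition through a loop device
# placed at its byte offset on the whole-disk device, instead of through the
# partition node that partx could not create.
# Prints the loop device on success so the caller can detach it later.
mount_boot_via_loop() {
    diskdev="$1"        # whole disk, e.g. mmcblk0
    start_sector="$2"   # first sector of the new boot partition
    mnt="$3"            # existing directory to mount on

    loopdev=$(losetup -f) || return 1
    losetup -o "$(sectors_to_bytes "$start_sector")" "$loopdev" "/dev/$diskdev" || return 1
    if ! mount -t vfat -o rw,noatime "$loopdev" "$mnt"; then
        losetup -d "$loopdev"
        return 1
    fi
    echo "$loopdev"
}
```

No size limit is set on the loop device here, on the assumption that the FAT filesystem only uses the area described by its own header. After the backup has been copied, the caller would unmount the directory and detach the loop device with `losetup -d`.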

Opinion:
I have a Zero and Pi4b. I currently run a sensor on the OpenWRT’ed Zero, with attached LTE modem. It feels like there is not even a handful of users doing similar stuff.

There are a lot of things you have to know and figure out; it's kind of an expert game. For typical Pi owners/buyers, the learning curve is likely too steep.

On the other hand, for the typical OpenWRT router fan, the WiFi abilities of Pis feel very weak (repeater mode blocked by hardware, weak antenna, Broadcom chip) and none has more than 1 LAN port OOTB.

And having a floppy arrangement of Pi + USB hub, USB-LAN adapter, USB-LTE, USB-WiFi adapters, external antennas and maybe additional PSUs in a corner of the room is maybe not everyone's preferred idea of fun for a main everyday router. You also need to select each piece carefully so that it is properly supported.

All of this seems to make it a very narrow niche OpenWRT device type, resulting in few devs and even less user feedback. I guess such niche devices are more likely to show minor issues here and there, once you step off the main path.

@Pico I fully agree - the RPi is not really a platform that fits OpenWrt's main goal, being a network router, very well. On top of that, while for most of the supported HW OpenWrt is the only alternative to the manufacturer's proprietary FW, on RPi there's a wide range of open firmware options.

Thus, the niche for OpenWrt on RPi is small. But I'd say for those of us who want to run an RPi as a network appliance 24/7 for years without destroying the SD card, with clean separation between config and factory-reset state, in particular the squashfs/f2fs configuration of OpenWrt is ideal, because it is designed for that use case, where more desktop-y OSes are not. And quite obviously, there are devs that care for it a lot - otherwise we wouldn't have the RPi support at all.

The only thing I wonder is what I'm doing wrong in my attempts to contribute to this, given the near-zero response to detailed analysis and even suggested patches. Maybe it is simply input overflow on the part of the devs who would need to take it further...