Increasing mamba and venom kernel partition to 6MB

Yes, this is explicitly supported by the Linksys mvebu upgrade scripts. I also tested it on my Mamba and it worked as expected.

But looking at those scripts now, I wonder if this patchset missed a detail: the kernel/rootfs split is hard-coded into linksys_get_root_magic() (skip=786432 with bs=4 is a 3 MiB offset), and is used by platform_do_upgrade_linksys() to figure out whether both the old and the new rootfs are UBI when doing factory image upgrades:

linksys_get_root_magic() {
	(get_image "$@" | dd skip=786432 bs=4 count=1 | hexdump -v -n 4 -e '1/1 "%02x"') 2>/dev/null
}

platform_do_upgrade_linksys() {
	local magic_long="$(get_magic_long "$1")"

	mkdir -p /var/lock
	local part_label="$(linksys_get_target_firmware)"
	touch /var/lock/fw_printenv.lock

	if [ ! -n "$part_label" ]
	then
		v "cannot find target partition"
		exit 1
	fi

	local target_mtd=$(find_mtd_part $part_label)

	[ "$magic_long" = "73797375" ] && {
		CI_KERNPART="$part_label"
		if [ "$part_label" = "kernel1" ]
		then
			CI_UBIPART="rootfs1"
		else
			CI_UBIPART="rootfs2"
		fi

		nand_upgrade_tar "$1"
	}
	[ "$magic_long" = "27051956" -o "$magic_long" = "0000a0e1" ] && {
		# check firmwares' rootfs types
		local target_mtd=$(find_mtd_part $part_label)
		local oldroot="$(linksys_get_root_magic $target_mtd)"
		local newroot="$(linksys_get_root_magic "$1")"

		if [ "$newroot" = "55424923" -a "$oldroot" = "55424923" ]
		# we're upgrading from a firmware with UBI to one with UBI
		then
			# erase everything to be safe
			mtd erase $part_label
			get_image "$1" | mtd -n write - $part_label
		else
			get_image "$1" | mtd write - $part_label
		fi
	}
}

So this detection fails when either the old or the new image (or both) uses the new 4M split. A minor problem, since the only difference is whether the old UBI partition is erased or not. But I guess that code was put there for a reason...

EDIT: is that reason wear levelling of the unwritten blocks? My knowledge of the UBI internals is close to non-existent, but I imagine this could end up with the untouched blocks being marked as more used than the freshly written ones?

Suggestions on how to make that work for the new world, where we should expect any combination of 3-or-4 MB for new and old?
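
One possible direction, as an untested sketch only: instead of the single hard-coded 3 MiB offset, probe every kernel size we expect to see. The helper names root_magic_at and has_ubi_root and the candidate list of 3, 4 and 6 MiB are my assumptions, not anything in the existing scripts, and this version reads a plain file instead of going through get_image:

```shell
#!/bin/sh
# Hypothetical helper: print the 4 bytes at byte offset $2 of file $1 as hex.
# The existing script uses skip=786432 bs=4, i.e. 786432 * 4 = 3145728 bytes (3 MiB).
root_magic_at() {
	dd if="$1" bs=1 skip="$2" count=4 2>/dev/null | od -An -tx1 | tr -d ' \n'
}

# Hypothetical helper: succeed if the UBI magic "UBI#" (55424923) is found
# at any of the assumed kernel partition sizes.
has_ubi_root() {
	for mib in 3 4 6; do
		if [ "$(root_magic_at "$1" $((mib * 1024 * 1024)))" = "55424923" ]; then
			return 0
		fi
	done
	return 1
}

# Example (hypothetical file name):
#   has_ubi_root factory.img && echo "image carries a UBI rootfs"
```

The obvious weakness is a false positive if kernel data happens to contain "UBI#" at one of the probed offsets, so this is a heuristic at best.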

Out of curiosity, is that the same unmodified script for all mvebu routers?

Even for those like the wrt3200acm that natively have a 6 MB kernel? For them the skip calculation would have been wrong all along. That makes me think that it is not actually needed.

Yes, it's the same for all of them:

        linksys,wrt1200ac|\
        linksys,wrt1900ac-v1|\
        linksys,wrt1900ac-v2|\
        linksys,wrt1900acs|\
        linksys,wrt3200acm|\
        linksys,wrt32x)
                platform_do_upgrade_linksys "$1"
                ;;

Note though that this code isn't exercised often. It's only ever relevant if you sysupgrade to a factory image, and then only if there was a UBI image in the same partition before. Does stock firmware use UBI?
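
To make the branch selection concrete, here is a rough standalone sketch of the check on the first four bytes of the image. The function name image_kind and the labels it prints are made up for illustration; the three magic values are the ones platform_do_upgrade_linksys() compares against ("sysu" at the start of a sysupgrade tar, the uImage magic, and an ARM NOP at the start of a zImage):

```shell
#!/bin/sh
# Hypothetical helper: classify an image file by its first 4 bytes.
image_kind() {
	magic=$(dd if="$1" bs=4 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
	case "$magic" in
		73797375) echo "sysupgrade-tar" ;;        # ASCII "sysu"
		27051956|0000a0e1) echo "raw-factory" ;;  # uImage magic or ARM NOP
		*) echo "unknown" ;;
	esac
}

# Example (hypothetical file name):
#   image_kind openwrt-factory.img
```

Only the "raw-factory" case ever reaches the UBI-to-UBI comparison.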

But I do think you're right that the special handling is unneeded. Comparing my old firmware in "kernel2":

root@wrt1900ac-1:/# ubinfo -d 0
ubi0
Volumes count:                           2
Logical eraseblock size:                 126976 bytes, 124.0 KiB
Total amount of logical eraseblocks:     296 (37584896 bytes, 35.8 MiB)
Amount of available logical eraseblocks: 0 (0 bytes)
Maximum count of volumes                 128
Count of bad physical eraseblocks:       0
Count of reserved physical eraseblocks:  20
Current maximum erase counter value:     28
Minimum input/output unit size:          2048 bytes
Character device major/minor:            249:0
Present volumes:                         0, 1

with the new one in "kernel1", I see that the new maximum erase counter looks pretty sane:

root@wrt1900ac-1:~# ubinfo -d 0
ubi0
Volumes count:                           2
Logical eraseblock size:                 126976 bytes, 124.0 KiB
Total amount of logical eraseblocks:     287 (36442112 bytes, 34.7 MiB)
Amount of available logical eraseblocks: 0 (0 bytes)
Maximum count of volumes                 128
Count of bad physical eraseblocks:       1
Count of reserved physical eraseblocks:  19
Current maximum erase counter value:     2
Minimum input/output unit size:          2048 bytes
Character device major/minor:            249:0
Present volumes:                         0, 1

I don't have the dump from before the upgrade, unfortunately. But I think it's safe to assume that the usage was similar to what I have in "kernel2". Most of the writes are probably due to constantly sysupgrading this router for a number of years. And the automatic image switching should even out that load between the two system partitions. So when we have a maximum of 2 now, then something must have corrected all those old counters that weren't erased by the upgrade.
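
If anyone wants to compare their own partitions, a small filter like this pulls the counter out of the ubinfo output. The function name max_erase_counter is mine, and it only assumes the "Current maximum erase counter value:" line format shown in the dumps above:

```shell
#!/bin/sh
# Hypothetical helper: read `ubinfo -d N` output on stdin and print
# the "Current maximum erase counter value".
max_erase_counter() {
	awk -F: '/Current maximum erase counter value/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Example, run on the router:
#   ubinfo -d 0 | max_erase_counter
```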

EDIT: and that magic something is target/linux/generic/pending-5.10/494-mtd-ubi-add-EOF-marker-support.patch

There are some really smart people working on OpenWrt :)

I can see the result in my boot log after the upgrade:

[    2.533790] UBI: auto-attach mtd5
[    2.537122] ubi0: attaching mtd5
[    2.595576] UBI: EOF marker found, PEBs from 72 will be erased

So that Linksys special UBI to UBI code is definitely not needed.
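
For anyone who wants to check their own boot log, a quick filter along these lines should do. The function name eof_marker_peb is made up, and the pattern simply matches the message format quoted above:

```shell
#!/bin/sh
# Hypothetical helper: read a kernel log on stdin and print the first PEB
# number reported by the "UBI: EOF marker found" message, if any.
eof_marker_peb() {
	sed -n 's/.*EOF marker found, PEBs from \([0-9][0-9]*\) will be erased.*/\1/p'
}

# Example, run on the router after the upgrade:
#   dmesg | eof_marker_peb
```

No output means the message was not found in the log.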


I can confirm this works just fine. I upgraded two mamba devices in this manner, and then repeated the process so that both partitions were updated.


I wondered about that as well. I don't think it's necessary. The normal tarfile sysupgrade procedure should work fine. And it might preserve erase counters for the remaining blocks? I believe we only need the factory image when the runtime partition layout doesn't match the device tree we're flashing.


For these devices you don't need to use stock OEM/Linksys firmware.
The instructions I wrote for my builds:

Verify compatibility
 - $ fw_printenv | grep "pri_kern_size"; #mamba MUST equal 0x400000
 - $ fw_printenv | grep "priKernSize"; #venom MUST equal 0x0600000
Do not try to change them!
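
If you prefer a scripted check, something like the following sketch works on the fw_printenv output. The function name kern_size_matches is hypothetical; it compares numerically, so 0x0600000 and 0x600000 count as the same value:

```shell
#!/bin/sh
# Hypothetical helper: read fw_printenv-style "name=value" lines on stdin and
# succeed only if variable $1 is present and numerically equal to $2.
kern_size_matches() {
	val=$(sed -n "s/^$1=//p" | head -n 1)
	[ -n "$val" ] || return 1
	[ "$(($val))" -eq "$(($2))" ]
}

# Example, run on the router:
#   fw_printenv | kern_size_matches pri_kern_size 0x400000 && echo "mamba OK"
#   fw_printenv | kern_size_matches priKernSize 0x0600000 && echo "venom OK"
```

This only reads the environment. It does not touch it.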

Flashing process:
You must always use a factory image when flashing to or away from a resized build.
- Create a backup tar
- Flash the corresponding factory image via sysupgrade. 'Force' enabled, 'Keep settings' disabled.
- Restore the backup

It is my understanding that you have to wipe: as the data partition is 1MB smaller, the first 1MB can/will end up corrupt if you don't.

I don't understand what you mean here...

The first 1MB of the old rootfs partition is erased and overwritten with kernel data or padding when you flash a factory image. As for the config backup: that's always stored in a separate partition ("syscfg"). This partition is not affected by anything you do to any of the kernel or rootfs partitions.

Wiping configuration creates unnecessary restore hassle, and nothing else. In this particular case, that is. Every forced upgrade will be different.


@bmork

It is my understanding that when you sysupgrade with keep settings:

  • it backs up configs into ram
  • flashes the image
  • restores the configs

But in this case it will restore the configs with the old values, causing it to overwrite the kernel and leave it corrupt.

So you must back up, upgrade without keep settings, then restore in order to have it placed in the correct location.

If that is not correct, I stand corrected.
However I will leave my instructions as I have not tested it your way.

Sorry, but you are wrong.

The OpenWrt config files contain no info on partition structure. It is directly defined in the DTS attached to the kernel.

Linksys mvebu routers do it differently, as the config backup gets stored to /tmp/syscfg, which is an extra partition. But whether it is in RAM or there makes no difference, as the partition structure is not part of the config files.


It's not correct for the Linksys devices in the mvebu target. The restore is delayed until next boot on these.

On the devices:

        linksys,wrt1200ac|\
        linksys,wrt1900ac-v1|\
        linksys,wrt1900ac-v2|\
        linksys,wrt1900acs|\
        linksys,wrt3200acm|\
        linksys,wrt32x)

platform_copy_config_linksys() in target/linux/mvebu/cortexa9/base-files/lib/upgrade/linksys.sh will copy the backup to /tmp/syscfg/$BACKUP_FILE. There is no writing to the new image.

Restoring happens when the device boots from the new image. It loads target/linux/mvebu/cortexa9/base-files/lib/preinit/81_linksys_syscfg which mounts /tmp/syscfg and unpacks the $BACKUP_FILE if found.

This process is completely safe against any partition changes.
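
Roughly, the two stages look like this. This is a simplified stand-in, not the actual OpenWrt code: the function names stash_backup and restore_backup are made up, the directories are passed as arguments so it can be tried anywhere, and the file name sysupgrade.tgz is my assumption for $BACKUP_FILE:

```shell
#!/bin/sh
# Stage 1 (old system, during sysupgrade): park the backup on the separate
# syscfg partition. Nothing is written to the kernel or rootfs partitions.
# $1 = backup tarball, $2 = syscfg mount point
stash_backup() {
	cp -f "$1" "$2/sysupgrade.tgz"
}

# Stage 2 (new system, early boot): unpack the parked backup into the new
# root if it is there, then remove it so it is not applied twice.
# $1 = syscfg mount point, $2 = root directory to unpack into
restore_backup() {
	[ -f "$1/sysupgrade.tgz" ] || return 0
	tar -xzf "$1/sysupgrade.tgz" -C "$2" && rm -f "$1/sysupgrade.tgz"
}
```

The point is that the restore only writes files into whatever filesystem the new image has mounted, so it cannot land on a raw flash offset left over from the old layout.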


@bmork
I was not aware of that. Thank you for the clarification.

I did test this way: the sysupgrade command I used (-F as the only flag) keeps settings, and I used the factory .img file, not the sysupgrade .bin. This worked just fine, and I did it twice to cover both partitions, keeping settings both times. This makes the upgrade essentially painless.

The question on the table is: can the second flashing use sysupgrade, with potentially some flash wear leveling benefits? I chose to be on the safe side and not test this.


@InkblotAdmirer
Thank you for testing @bmork's steps.

Yes, as your currently running system is already using the new partition structure, the sysupgrade would be made according to that and the factory image has the correct layout thanks to padding between kernel and rootfs.

(There is no real "partition table". It is hardcoded in the currently running kernel/DTS.)
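
To illustrate what "padding between kernel and rootfs" means, here is a toy sketch. It is not the OpenWrt image build code: the function name make_padded_image and its arguments are made up, and it simply pads the kernel with zero bytes up to the kernel partition size before appending the rootfs:

```shell
#!/bin/sh
# Hypothetical helper: build a factory-style image in which the rootfs starts
# exactly at the kernel partition size.
# $1 = kernel file, $2 = rootfs file, $3 = kernel partition size in bytes, $4 = output
make_padded_image() {
	ksize=$(($(wc -c < "$1")))
	[ "$ksize" -gt 0 ] && [ "$ksize" -le "$3" ] || return 1
	# conv=sync pads the single (short) input block with NUL bytes up to bs
	{ dd if="$1" bs="$3" conv=sync 2>/dev/null; cat "$2"; } > "$4"
}

# Example (hypothetical file names, 4 MiB kernel partition):
#   make_padded_image kernel.bin root.ubi 4194304 factory.img
```

Written linearly from the start of the kernel partition, such an image puts the first rootfs byte at the offset where the new DTS expects the rootfs partition to begin.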

As a user I have some questions that I hope won't derail the topic here.
Is this gonna affect my router, which is the 1200ac version? I think it is called caiman.
With my limited knowledge I looked at the makefile for it, and it seems to inherit the 6MB size from the linksys makefile. Is that correct?
So the 1200ac had a 6MB kernel size but newer versions like the wrt32x didn't?

Also, I read in the git logs that it needs manual intervention during installation and can't be upgraded with sysupgrade because it is a backward-incompatible change. Is that correct, and is that gonna apply to my router as well, or, because mine doesn't change the kernel size, is it just like an 18 --> 19 upgrade?

Also, I read something about swconfig to DSA (or something like that) migration.
Does that necessitate a manual update process too? Does it affect mine too?

Your router is not affected.
The affected routers in the series were the oldest wrt1900ac and, surprisingly, the newest WRT32x. The others, 1200ac, 1900acs, 3200acm etc., already had a larger kernel area.

Regarding DSA: master and 21.02 use DSA, and the wired config changes. Best to flash from 19.07 without settings.


Thank you for the concise answer.
Can you point me to something on DSA vs swconfig? Are they just related to the switch part of the hardware, so that as a casual user they don't affect me, or do I need to know about DSA?

even as a user I am always curious about new stuff, and pros and cons of software and technology changes.


If you are not using VLANs, you can just use the default config. (But the old swconfig-based /etc/config/network is invalid, so you need the new defaults that are automatically generated if there is no network config file at boot.)

But DSA is off-topic here...


sorry to bring this up again, but I just started wondering if I am wrong.

If I am, it means the OpenWrt "factory" image is risky whether you use it from OEM or from another OpenWrt installation. The only safe way to install OpenWrt with a partition split different from the currently running system will then be to temporarily boot an initramfs with the new layout and sysupgrade from that.

The reason for my doubts is that I just got (a completely different) device with a bad block in the part of the flash I'd like to use for the kernel. Which means that the kernel partition shrinks by one block. The problem with the factory image is that it uses padding to put the UBI partition at the right flash address. This does not take bad blocks into account. Neither OEM nor OpenWrt knows the difference between padding and other data in the image, and will write everything linearly to flash, skipping the bad blocks. So instead of skipping some padding, we end up with one or more padding blocks at the beginning of the UBI partition. And a soft-bricked device unable to find any UBI image where it was supposed to be.

I sort of hope I'm wrong here....

(note that "it worked for me" doesn't prove anything unless you actually have a device with a bad block between the start of the kernel and the start of the ubi partitions)
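
A toy model of the failure, with made-up numbers (48 eraseblocks of kernel plus padding, which would be a 6 MiB kernel partition if the eraseblocks are 128 KiB). The function name landing_block is hypothetical; it just simulates a linear write that skips bad blocks and reports where a given image block ends up:

```shell
#!/bin/sh
# Hypothetical model: print the physical block where image block $1 lands when
# the image is written linearly and the physical blocks listed in the remaining
# arguments are bad and therefore skipped.
landing_block() {
	want=$1; shift
	phys=0; written=0
	while :; do
		bad=0
		for b in "$@"; do
			if [ "$b" -eq "$phys" ]; then bad=1; fi
		done
		if [ "$bad" -eq 0 ]; then
			if [ "$written" -eq "$want" ]; then echo "$phys"; return 0; fi
			written=$((written + 1))
		fi
		phys=$((phys + 1))
	done
}

# Example: first UBI block is image block 48, physical block 10 is bad.
#   landing_block 48 10
```

With no bad blocks, image block 48 lands on physical block 48, where the UBI partition starts. With one bad block inside the kernel area it lands one block later, so the last padding block sits at the start of the UBI partition.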


Probably you are not wrong about the bad blocks in kernel area (or padding before rootfs) causing problems with factory images.

The same issue has also been noticed with R7800 with bad blocks, and the initramfs approach is seen as the fix in those cases (and there is forum discussion about it.)

However, note that this issue has nothing to do with the partition changes here, but with the "flash unified image without bad block handling logic" in general.
