Belkin RT3200/Linksys E8450 WiFi AX discussion

I don't think the partition (volume) size changed; rather, the executable code size changed. Many highly qualified people are studying the OKD oddity, but I don't think anything is certain yet. The change in the number of data bytes is a known change.

I thought you read and re-wrote the same FIP partition. If you used a different FIP image, then it makes sense.

Well, back to square one.

I have intentionally power-failed and restarted it half a dozen times, and it has answered the call every time so far. Was the uboot code different between v1.1.0 and v1.1.1? I jumped on v1.1.0 as soon as it was made available.

To put it another way: was I running the v1.1.0 uboot, and did I then update to the v1.1.1 uboot?

There is no v1.1.0 u-boot by itself.
Daniel's installer uses the uboot binaries compiled by OpenWrt buildbot and included in the imagebuilder. The same that you get when you download them directly from the OpenWrt download server. The one included in 1.1.0 is just the most recent as of Daniel's release compile time of 1.1.0. Same goes for 1.1.1.

There have been several small u-boot changes in the last few weeks, mostly unrelated to e8450/rt3200, but still changing the compiled uboot-mediatek binary.

https://git.openwrt.org/?p=openwrt/openwrt.git;a=history;f=package/boot/uboot-mediatek;hb=HEAD

Nobody really knows yet what causes OKD, so there is no specific fix for it in the uboot.
I think the main hypothesis is that the apparent fix is just about rewriting the fip again, possibly with the current firmware having better flash writing power levels than earlier, or something similar.
(And nobody knows if the written flash stays correctly readable forever, or if its condition deteriorates gradually, so that a new OKD condition will materialize later.)

All right, I finally got a router death on update for analysis. I had started from the UBI layout (installer v1.0.2), run installer v1.1.1, and then upgraded to today's snapshot:

Watchdog handover: fd=3
- watchdog -
Watchdog does not have CARDRESET support
Thu Jan  1 00:00:59 UTC 1970 upgrade: Sending TERM to remaining processes ...
Thu Jan  1 00:00:59 UTC 1970 upgrade: Sending signal TERM to netifd (1478)
Thu Jan  1 00:01:03 UTC 1970 upgrade: Sending KILL to remaining processes ...
Thu Jan  1 00:01:03 UTC 1970 upgrade: Sending signal KILL to netifd (1478)
[   71.656616] stage2 (2644): drop_caches: 3
Thu Jan  1 00:01:11 UTC 1970 upgrade: Switching to ramdisk...
Thu Jan  1 00:01:12 UTC 1970 upgrade: Performing system upgrade...
[   73.549836] block ubiblock0_5: released
[   73.591123] block ubiblock0_5: created from ubi0:5(fit)
Volume ID 5, size 119 LEBs (15110144 bytes, 14.4 MiB), LEB size 126976 bytes (124.0 KiB), dynamic, name "fit", alignment 1
Set volume size to 92438528
Volume ID 7, size 728 LEBs (92438528 bytes, 88.1 MiB), LEB size 126976 bytes (124.0 KiB), dynamic, name "rootfs_data", alignment 1
sysupgrade successful
umount: can't unmount /dev: Resource busy
umount: can't unmount /tmp: Resource busy
[   76.673518] reboot: Restarting system

F0: 102B 0000
F6: 0000 0000
V0: 0000 0000 [0001]
00: 0000 0000
BP: 0400 0041 [0000]
G0: 1190 0000
T0: 0000 02F1 [000F]
Jump to BL

NOTICE:  BL2: v2.9.0(release):OpenWrt v2023-10-13-0ea67d76-1 (mt7622-snand-ubi-1ddr)
NOTICE:  BL2: Built : 01:42:36, Mar  1 2024
NOTICE:  WDT: [40000000] Software reset (reboot)
NOTICE:  CPU: MT7622
NOTICE:  SPI-NAND: FM35Q1GA (128MB)
NOTICE:  UBI: scanning [0x80000 - 0x8000000] ...
NOTICE:  UBI: scanning is finished
NOTICE:  UBI: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
NOTICE:  UBI: VID header offset: 2048 (aligned 2048), data offset: 4096
NOTICE:  UBI: Volume fip (Id #0) size is 2097152 bytes
NOTICE:  BL2: Booting BL31
NOTICE:  BL31: v2.9(release):OpenWrt v2023-07-24-00ac6db3-2 (mt7622-snand-1ddr)
NOTICE:  BL31: Built : 22:09:42, Mar 22 2024


U-Boot 2023.07.02-OpenWrt-r23809-234f1a2efa (Mar 22 2024 - 22:09:42 +0000)

CPU:   MediaTek MT7622
Model: mt7622-linksys-e8450-ubi
DRAM:  512 MiB
Core:  48 devices, 21 uclasses, devicetree: separate
MMC:
Loading Environment from UBI... SPI-NAND: FM35Q1GA (128MB)
ubi0 error: ubi_eba_init: no enough physical eraseblocks (0, need 1)
ubi0 error: ubi_attach_mtd_dev: failed to attach mtd5, error -28
UBI error: cannot attach mtd5
UBI error: cannot initialize UBI, error -28
UBI init error 28
Please check, if the correct MTD partition is used (size big enough?)

** Cannot find mtd partition "ubi"
In:    serial@11002000
Out:   serial@11002000
Err:   serial@11002000
reset button found
Loading Environment from UBI... ubi0 error: ubi_eba_init: no enough physical eraseblocks (0, need 1)
ubi0 error: ubi_attach_mtd_dev: failed to attach mtd5, error -28
UBI error: cannot attach mtd5
UBI error: cannot initialize UBI, error -28
Please check, if the correct MTD partition is used (size big enough?)

** Cannot find mtd partition "ubi"
Net:
Error: ethernet@1b100000 address not set.
No ethernet found.

No EFI system partition
No EFI system partition
Failed to persist EFI variables

Error: ethernet@1b100000 address not set.

Error: ethernet@1b100000 address not set.
Reading 131072 byte(s) (64 page(s)) at offset 0x00220000
ubi0 error: ubi_eba_init: no enough physical eraseblocks (0, need 1)
ubi0 error: ubi_attach_mtd_dev: failed to attach mtd5, error -28
UBI error: cannot attach mtd5
UBI error: cannot initialize UBI, error -28
UBI init error 28
Please check, if the correct MTD partition is used (size big enough?)
Erasing 0x00000000 ... 0x07cfffff (1000 eraseblock(s))
Skipping bad block at 0x03c20000
resetting ...

F0: 102B 0000
F6: 0000 0000
V0: 0000 0000 [0001]
00: 0000 0000
BP: 0400 0041 [0000]
G0: 1190 0000
T0: 0000 02ED [000F]
Jump to BL

NOTICE:  BL2: v2.9.0(release):OpenWrt v2023-10-13-0ea67d76-1 (mt7622-snand-ubi-1ddr)
NOTICE:  BL2: Built : 01:42:36, Mar  1 2024
NOTICE:  WDT: [40000000] Software reset (reboot)
NOTICE:  CPU: MT7622
NOTICE:  SPI-NAND: FM35Q1GA (128MB)
NOTICE:  UBI: scanning [0x80000 - 0x8000000] ...
NOTICE:  UBI: scanning is finished
NOTICE:  UBI: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
NOTICE:  UBI: VID header offset: 2048 (aligned 2048), data offset: 4096
ERROR:   UBI error: No volume named fip could be found
NOTICE:  UBI: scanning [0x80000 - 0x8000000] ...
NOTICE:  UBI: scanning is finished
NOTICE:  UBI: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
NOTICE:  UBI: VID header offset: 2048 (aligned 2048), data offset: 4096
ERROR:   UBI error: No volume named fip could be found
ERROR:   BL2: Failed to load image id 3 (-2)

Interesting. It looks like the upgrade process failed where it wasn't expected, the process aborted after the point of no return, and it resulted in the UBI volumes getting trashed. It seems that this is pointing to a different issue than the classic OKD, but it has the same end result.
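
As a side note, the figures printed in the log above can be cross-checked with a little arithmetic. The following is only a sketch in Python (nothing in this thread uses Python, and the function names are made up for illustration), and it is not a diagnosis. It just checks that the LEB size equals the PEB size minus the data offset, that the two volume sizes match their LEB counts, and that the erase range printed by U-Boot spans 1000 eraseblocks and includes both the start of the UBI scan range reported by BL2 (0x80000) and the skipped bad block.

```python
# All constants are copied from the serial log quoted in this thread.
PEB_SIZE = 131072          # "PEB size: 131072 bytes (128 KiB)"
DATA_OFFSET = 4096         # "data offset: 4096"
LEB_SIZE = PEB_SIZE - DATA_OFFSET

ERASE_START = 0x00000000   # "Erasing 0x00000000 ... 0x07cfffff"
ERASE_END = 0x07CFFFFF     # inclusive end address
UBI_SCAN_START = 0x80000   # "UBI: scanning [0x80000 - 0x8000000]"
BAD_BLOCK = 0x03C20000     # "Skipping bad block at 0x03c20000"


def volume_bytes(leb_count):
    """Size in bytes of a UBI volume made of `leb_count` LEBs."""
    return leb_count * LEB_SIZE


def eraseblock_count(start, end_inclusive):
    """Number of whole PEBs in the address range [start, end_inclusive]."""
    return (end_inclusive - start + 1) // PEB_SIZE


def in_erase_range(addr):
    """True if `addr` lies inside the erase range printed by U-Boot."""
    return ERASE_START <= addr <= ERASE_END


if __name__ == "__main__":
    print("LEB size:", LEB_SIZE)
    print("fit volume bytes:", volume_bytes(119))
    print("rootfs_data volume bytes:", volume_bytes(728))
    print("eraseblocks erased:", eraseblock_count(ERASE_START, ERASE_END))
    print("UBI start inside erase range:", in_erase_range(UBI_SCAN_START))
    print("bad block index:", BAD_BLOCK // PEB_SIZE)
```

The numbers agree with what the log prints, which at least suggests the log itself is internally consistent.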

I apologize; I didn't mean to say that uboot has a version number. I was just saying that when imagebuilder compiles a file set, it includes a fresh copy of the uboot .fip file.

Whenever there is a new snapshot revision, I always build with the "openwrt-imagebuilder-mediatek-xxxxxx.Linux-x86_64.tar.zst" archive from that snapshot's Supplementary Files group, although I am not sure that's necessary.

Anyway, like you say, the uboot.fip does get updated from time to time.

But come to think of it, OKD dates all the way back to v1.0.3, doesn't it?

I'm not sure what happened. I came back to the router and found its lights off, so I suspected yet another OKD. I booted it up on serial and saw what I attached, even when using the mtk_uartboot tool.

Does anyone have a clue whether there is anything I can try to do about this? :frowning:

It looks to me like your flash chip has developed a fault. If the boot sequence can get past that error and get you to a U-Boot terminal, you might be able to resolve the issue.

If it's nothing more than a bad block that the code has yet to confirm, then replacing the partition or volume containing said bad block should be enough. If it's in a bad spot or there's more than one, you might need to erase the flash chip and re-write all partitions. If that's the case, then hopefully you have a backup of your factory partition, since that's the only one you can't get from the OpenWrt site.

If instead it's a bad chip, then that router is sadly kaput unless you have the skills, equipment, and spare parts to replace the flash chip.

I think I do still have backups. However, as I suspected, I presume the sticking point is getting to uboot properly. I expected to be able to reach the menu with the mtk_uartboot tool, but it seems not. Any ideas?

When using mtk_uartboot, did you also provide the .fip via mtk_uartboot? If not, then it would try to access the .fip on the chip, resulting in this issue. If you did provide the .fip via mtk_uartboot but it still can't get past the UBI checks, then this gets a lot more messy and into territory where my knowledge and experience run short.

Additionally, which bl2 did you use? A quick look at the build code suggests that you might have better success with the bl2-for-mtk_uartboot.bin file than you would with bl2-for-debug-snand-issue.bin.

The only solutions that come to my mind at that point require using a custom or modified boot chain that can get past that error, or potentially using JTAG to issue the commands for a flat wipe on the flash chip and then starting from a blank slate. Either way, I suspect you're going to need to make sure you do have that factory partition backup in order to start rebuilding things.

When building 23.05-SNAPSHOT, the kernel 5.15.158 commit breaks the 5 GHz radio. It shows as an unknown device in LuCI. Sorry, I forgot to capture a kernel log.

I can confirm that kernel 5.15.157 works.

I have been following this thread closely because of the OKD problem.

Quick question:

  1. I keep seeing references to backing up the factory partition. Is the factory partition /dev/mtd2?

So far no OKD on three devices with UBI installer v1.0.3 and the latest stable release. I don't have any backups, but I wanted to make sure I back up the correct partition.

Thanks

Yes, mtd2 is the factory partition. If you haven't yet, you should back up everything as described in the section "Backup stock/vendor bootchain" on the installer GitHub page (https://github.com/dangowrt/owrt-ubi-installer).

mtd2 is the factory partition for people who have run one of the earlier UBI installers (v1.0.x or older). For those who ran installer v1.1.x or newer, the factory partition is stored as a UBI volume instead. For those still on the original factory layout (non-UBI) who didn't run one of the installers, the factory partition should be mtd4 according to the wiki documentation.

For devices that ran an installer older than v1.1.x, you can use LuCI to back up the data in the partition. On the "Backup/Flash Firmware" page, factory will be listed as one of the mtdblock items in the select list.

For people that have run v1.1.x, you will need to ssh into the device and find the factory partition via ubinfo -a. The volume number returned for the factory partition will match its entry in /dev/ubi0_{volume}. From there, you can retrieve the partition contents via dd and then scp it to your computer.
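
To make the volume lookup concrete, here is a rough sketch in Python that you would run on your computer against saved `ubinfo -a` output, not on the router. Several things in it are assumptions: the sample text is invented (including the volume IDs), the "Volume ID: N (on ubiX)" and "Name:" line shapes are assumed to match what ubinfo prints on your build, and the helper names and the /tmp/factory.bin output path are made up for illustration.

```python
import re

# Illustrative excerpt only: the volume IDs are invented, and the line
# format is an assumption about what `ubinfo -a` prints.
SAMPLE_UBINFO = """\
Volume ID:   0 (on ubi0)
Type:        dynamic
Name:        fip
-----------------------------------
Volume ID:   2 (on ubi0)
Type:        static
Name:        factory
"""


def find_volume_device(ubinfo_text, name):
    """Return /dev/ubiX_Y for the volume called `name`, or None if absent."""
    current = None  # (ubi device, volume id) of the block being read
    for line in ubinfo_text.splitlines():
        m = re.match(r"Volume ID:\s+(\d+)\s+\(on (ubi\d+)\)", line)
        if m:
            current = (m.group(2), m.group(1))
            continue
        m = re.match(r"Name:\s+(\S+)", line)
        if m and current and m.group(1) == name:
            return "/dev/{}_{}".format(*current)
    return None


def backup_command(device, out_path="/tmp/factory.bin"):
    """Build the dd command line to run on the router (out_path is hypothetical)."""
    return "dd if={} of={}".format(device, out_path)


if __name__ == "__main__":
    dev = find_volume_device(SAMPLE_UBINFO, "factory")
    print(dev)
    print(backup_command(dev))
```

Once you have the resulting file on the router, copy it off with scp as described above, and check the volume name in your own ubinfo output before trusting the device path.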

Really excellent post. This should be in the wiki.

Funny, one of my RT3200s has an mtd3 partition after UBI conversion.

Thanks.

That sounds like it is partitioned in the stable UBI layout (installer v. 1.0.3 or earlier). mtd0 = bl2, mtd1 = fip, mtd2 = factory, and mtd3 = ubi. The wiki has been updated already to show the new (snapshot) UBI layout and it doesn't show the stable layout anymore.

If you're curious about the layout and offsets of the partitions in mtd for your device, you can run fw_printenv from the command line and look for the 'mtdparts' line. That will show the order and size of each recognized partition on the flash chip.

Please keep in mind that mtd isn't partitioned the way most computers are. There's no partition table on the device. It instead uses a simple lookup table of names, offsets, and sizes. That is what the mtdparts line present in the U-Boot environment represents.
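
To illustrate how an mtdparts line turns into that lookup table, here is a minimal Python sketch. It assumes the simple "size(name)" form only (it does not handle explicit "size@offset" entries or flags), treats "-" as "the rest of the chip", and the function names are invented for this example. The sample string is one reported in this thread for the stable layout.

```python
def parse_size(token):
    """Convert a size such as '512k' or '1m' to bytes; '-' means 'rest of the chip'."""
    if token == "-":
        return None
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    suffix = token[-1].lower()
    if suffix in units:
        return int(token[:-1], 0) * units[suffix]
    return int(token, 0)


def parse_mtdparts(value):
    """Return (device, [(name, offset, size), ...]) from an mtdparts string."""
    value = value.strip()
    # fw_printenv output can repeat the prefix ("mtdparts=mtdparts=..."),
    # so strip every leading copy.
    while value.startswith("mtdparts="):
        value = value[len("mtdparts="):]
    device, _, spec = value.partition(":")
    parts, offset = [], 0
    for item in spec.split(","):
        size_token, _, rest = item.partition("(")
        name = rest.rstrip(")")
        size = parse_size(size_token)
        parts.append((name, offset, size))
        if size is not None:
            offset += size  # each partition starts where the previous one ends
    return device, parts


if __name__ == "__main__":
    line = ("mtdparts=mtdparts=spi-nand0:512k(bl2),1280k(fip),"
            "1024k(factory),256k(reserved),-(ubi)")
    device, parts = parse_mtdparts(line)
    print(device)
    for name, offset, size in parts:
        size_text = "rest" if size is None else hex(size)
        print("{:<10} offset {:#09x} size {}".format(name, offset, size_text))
```

The point of the sketch is just that the offsets are implied by the order and the sizes: nothing on the flash itself records where each partition starts.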

Thanks. Mine reads like this:
mtdparts=mtdparts=spi-nand0:512k(bl2),1280k(fip),1024k(factory),256k(reserved),-(ubi)
I am still suffering from OKD whenever I have a power outage. I would appreciate any advice on a proper solution.

That's the stable layout as seen from U-Boot. The 'reserved' area doesn't show up as a partition when OpenWrt is running.

I suggest following the instructions on the wiki for OKD recovery (https://openwrt.org/toh/linksys/e8450#recovery_from_openwrt_kiss_of_death_okd) labeled 'Did not run the v1.1.1 UBI installer (e.g. anyone on OpenWrt 23.05.x or below, or older snapshots)'. These instructions at least temporarily resolve the boot issues in the majority of OKD cases.

For people who may have forgotten which version of the installer they used, I did some testing and this seems to be a reliable method to figure it out, as long as you haven't updated the preloader (probably unlikely).

Run command: grep Built /dev/mtd0ro

These dates correspond to the installer version:

Build date     Installer version
Sep 3, 2022    1.0.0
Oct 7, 2022    1.0.1
Jan 3, 2023    1.0.2
Oct 9, 2023    1.0.3

This is the build date for the U-Boot preloader, but it seems to be pretty consistent. (I'm sure someone will correct me if needed :grinning:).
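
If you want to automate the lookup, something like the following Python sketch could work on the text that grep returns. It assumes the matched line has the same "Built : HH:MM:SS, Mon D YYYY" shape as the BL2 banner in the serial logs earlier in this thread. The function names are invented, and the table only contains the four dates listed above, so anything else is reported as unknown.

```python
import re
from datetime import date

# Build dates reported in this thread for each installer release.
INSTALLER_BY_BUILD_DATE = {
    date(2022, 9, 3): "1.0.0",
    date(2022, 10, 7): "1.0.1",
    date(2023, 1, 3): "1.0.2",
    date(2023, 10, 9): "1.0.3",
}

# Month names are mapped by hand so the result does not depend on the locale.
MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def parse_built_date(line):
    """Extract the date from text like 'Built : 01:42:36, Mar  1 2024'."""
    m = re.search(r"Built : \d{2}:\d{2}:\d{2}, (\w{3})\s+(\d{1,2}) (\d{4})", line)
    if not m or m.group(1) not in MONTHS:
        return None
    return date(int(m.group(3)), MONTHS[m.group(1)], int(m.group(2)))


def installer_version(line):
    """Map a 'Built' line to an installer version, or 'unknown' if not listed."""
    return INSTALLER_BY_BUILD_DATE.get(parse_built_date(line), "unknown")


if __name__ == "__main__":
    # This banner comes from the serial log in this thread; its date is not
    # in the table above, so the lookup falls through to "unknown".
    print(installer_version("NOTICE:  BL2: Built : 01:42:36, Mar  1 2024"))
```

As noted above, this only holds as long as the preloader hasn't been updated since the installer was run.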
