Belkin RT3200/Linksys E8450 WiFi AX discussion

Now that the OKD issue is resolved, I can safely update my two routers to the next major stable release when it arrives. Seriously, congrats. :grinning:

Or you could do this, like what @daniel's installer does:

echo "redundantly write bl2 into the first 4 blocks"
for bl2start in 0x0 0x20000 0x40000 0x60000 ; do
	mtd -p $bl2start write $PRELOADER /dev/mtd0
done

Edit:
I assume BL1 is hard-coded to look for BL2 at those 4 MTD0 positions?

Sounds interesting, as many users have no serial access. Finding a semi-safe way to update bl2 from the SSH console of a live system would be beneficial.

mtd0/1/2 are defined as read-only in the DTS though, so I think the MTD module may need to be force-reloaded. I have not done it before, so I'm not really sure how to do it, to be honest.

Edit:
It's posted in @daniel's installer page actually ... haha. Do this first and write away:

insmod mtd-rw i_want_a_brick=1

That's assuming you already have the mtd-rw module ready.

Edit 2:
@NullDev , @wavejumper00 fancy having a go with your OKDed router?

Edit 3:
Should have been more specific. This can be done by using mtk_uartboot to boot into OpenWrt, transferring @daniel's patched BL2 and the mtd-rw module onto it, and then running the command to write into mtd0.

Edit 4:
Apologies if I appear long-winded. Use bl2.bin with mtk_uartboot. Then use bl2.img to write into mtd0.

It might just be safer to wait for daniel to update the installer and then just re-run the installer.
Or maybe he can make a special version of the installer that only upgrades bl2.

There is no need to be in a hurry here.

Congrats everyone on this find! Thank you. I've been tracking the topic for months hoping for a fix. Let me try to sum it up:

  • Issue affects those who are running TF-A v2.9 (those who ran installer 1.0.3 or newer)
  • Issue does not affect those who are still running TF-A v2.4 (those who ran 1.0.2 or earlier)
  • You can find out which TF-A you are running by executing, for example:
root@E8450:~# grep "(release)" /dev/mtd0ro
v2.4(release):OpenWrt v2021-05-08-d2c75b21-3 (mt7622-snand-1ddr)
  • If you are on v2.4 => You can safely sysupgrade to any 23.x
  • If you are on v2.9 => We still need to agree on the best way to proceed (wait for the new installer, patch bl2 from openwrt, patch bl2 from Uboot...)
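The check in the list above could be wrapped in a small helper, roughly like this. It is only a sketch: the function name `tfa_status` is made up for illustration, and it only knows about the two versions named in this summary (anything else is reported as unknown rather than guessed).

```shell
#!/bin/sh
# Classify a TF-A release string as printed by: grep "(release)" /dev/mtd0ro
# tfa_status is a hypothetical helper name; the v2.4 / v2.9 split follows
# the summary in this thread, any other version is reported as "unknown".
tfa_status() {
    case "$1" in
        *"v2.4(release)"*) echo "v2.4: not affected" ;;
        *"v2.9(release)"*) echo "v2.9: affected" ;;
        *) echo "unknown" ;;
    esac
}

# On the router (only reads the read-only bl2 partition):
# tfa_status "$(grep '(release)' /dev/mtd0ro)"
```

This only reads from /dev/mtd0ro, so it should be harmless to run on a live system.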

With today's knowledge, how would you now comment on previous attempts to fix this? Are these still relevant on 23.x?

  • changing governor back to performance?
  • bumping min freq?
  • pin drive current?

Please correct me if I'm wrong, but everybody who has TF-A v2.9 can expect this to happen on their very next reboot - all it takes is a bit flip, normally correctable, but here interpreted as "fatal error, abort mission".

Maybe not "very next reboot" but you're mostly correct in that it can happen to anyone on TF-A v2.9 or later during any boot-up, power-on, reboot, or power-cycle, as far as I understand. It's why my RT3200 has been sitting unused since the first time it experienced OKD; replaced with a GL-MT3000 which has been flawless on SNAPSHOT flashed right out of the box.

I'm hopeful the breakthrough discovery here will restore reliability to the newer builds and versions going forward so that RT3200 may be used with confidence once again. It's not a bad AP at all... just had a run of bad luck with this pesky problem, which is now hopefully over!

Also excited to take another look at these settings, after the bootloader issue is behind us!

Yes, there are different steps in that you have to use a different BL2 file built with the configuration and the instruction to look for the fip in UBI. Unfortunately, it looks like my build set is currently producing broken builds for the snapshot version, so I'm going to need to dig into that to find out what's going wrong. No doubt I did something dumb out of rushing and/or exhaustion. (Edit: The build issue has been corrected.)

If your device is booting properly right now, you probably should wait until the patches have been accepted and the build bot has processed and built the proper and official binary anyway.

Very true, that also works! I haven't directly looked at the BL1 code, but I expect it's designed to skip over anything in a bad block. If so, it would immediately move to the next block and look for the proper signature there.

Indeed so. However, it's still a risky operation since you won't know if something went wrong until you go to reboot and find it won't boot. At that point, only the serial console or JTAG will be viable ways of recovering.

Daniel has merged the patch to main/master, so the next build round (after the currently ongoing one, which missed the patch) should then produce correct bl2 binaries (preloader.bin).

And the firmware itself will then also have the scrubbing feature trigger.

That will be true in any case for all the forthcoming solutions. Serial might ultimately be needed if flashing bl2 fails for any reason. But that has already been true for every earlier "let's use Daniel's installer" action, as the bootchain is modified.

Running the full installer to fix bl2 would be overkill, as other partitions need no modifications. (and e.g. fip can be rewritten from a live system via ubiupdatevol. I have done that twice so far for my RT3200.)

So, figuring out a simple "serial-free" way by applying the mtd-rw kmod (in the build or installed separately) and repeating what Daniel's installer does for bl2 sounds straightforward enough. (Of course, you need to pay attention to write the correct image etc., but it is still straightforward.)
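As a rough sketch of what that "serial-free" sequence could look like, here are the two commands already quoted earlier in this thread (the `insmod mtd-rw i_want_a_brick=1` line and the installer's four-offset `mtd -p` loop) put together. The names `run` and `write_bl2` and the image path are made up for illustration, and this is untested on real hardware. By default it only prints the commands; nothing is written unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only, not a tested procedure. Prints the commands by default;
# set DRY_RUN=0 to actually execute them. Writing a wrong or truncated
# image to bl2 leaves the device recoverable only via serial or JTAG.
DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical helper that either echoes or executes a command
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# write_bl2: hypothetical helper; $1 is the path to the patched bl2 image
write_bl2() {
    image="$1"
    # make the read-only MTD partitions writable (module must be present)
    run insmod mtd-rw i_want_a_brick=1
    # same four offsets as the installer snippet quoted earlier in the thread
    for bl2start in 0x0 0x20000 0x40000 0x60000; do
        run mtd -p "$bl2start" write "$image" /dev/mtd0
    done
}

# Example (dry run): write_bl2 /tmp/bl2.img
```

Reviewing the printed commands before setting `DRY_RUN=0` is the whole point of the dry-run default, since a mistake here only shows up at the next reboot.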

I'm certainly willing to continue to use my OKD'd RT3200 for science. However, since this would be a write operation which would change its state, I'm going to wait until those that code have a specific end user fix that they would like me to test.

EDIT: For what it's worth, I have a 2nd RT3200 which also has the buggy BL2 code (v1.0.3), but hasn't OKD'd, if someone wants me to test a fix from SSH.

So for those on old installers with TF-A v2.4 do we need to do anything?

If you're on TF-A 2.4 and stable firmware, you're fine exactly as you are until you want to perform a major version upgrade or an upgrade to SNAPSHOT.

Ah. So we will need manual intervention for upgrade to 24.xx. What would be the process for that for those who don’t want to open up their devices?

Yes, and I expect it will be as simple as using the new UBI installer package followed by the new firmware.

I've always had this doubt: when flashing a new installer from LuCI, do I need to keep the settings checked? And when rebooting to the recovery environment, do I need to keep the settings again when flashing the new sysupgrade?

I mean this in the case a new installer drops and I want to test the new Snapshots with kernel 6.6.

Or do I need to export a backup, not keep settings, and restore the config backup at the end?

Cheers!

I'd need to go back through the installer to be sure, but the change in the compat_version flag might get in the way even if the data volume is retained during the repartitioning.

As for backing up and restoring, it might work. This platform has been stable on the config compatibility, but not all of them are. Even with it being as stable as it has been, there are a few things that must be manually edited before any further upgrades would be possible. (edit) ...and then you also have config changes with some package versions, especially when dealing with snapshot. That's a whole new can of worms. Some of those changes have occasionally crashed critical processes, so be prepared for oddities regardless if you choose to do so.

In general, when the compat_version flag changes (and it has done so between 23.x and snapshot/24.x), it's best not to try and restore a configuration to or from a different version.

That is actually the same as "keep settings" but without benefiting from the possible settings migration scripts, increasing the risk for bricking.

If there are changes that cause some default values to change (e.g. dnsmasq internal config file directory location in 2020), there may be a built-in uci-defaults script for migration, which is run once at the first boot after flashing and is then removed. If you later restore an old backup with old settings, there is no migration script any more and you are stuck with the old invalid settings...
Example from 2020: https://github.com/openwrt/openwrt/commit/6a2855212096d2c486961a0841b037bae4b75de7

Restoring incompatible settings is one typical reason for bricks.

In general, restoring settings from backup is only safe when the backup was made on the same major version of OpenWrt. (And if using the development snapshots from main/master, even that may not hold true...)
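That rule of thumb can be written down as a quick check before restoring anything. This is a minimal sketch: `major_of` and `same_major` are made-up names, the version the backup came from is assumed to be something you noted down yourself when creating it, and snapshots are treated as never matching, in line with the caveat above.

```shell
#!/bin/sh
# major_of / same_major are hypothetical helper names for illustration.
# "Major version" here means the first two fields, e.g. 23.05 in 23.05.4.
major_of() {
    echo "$1" | cut -d. -f1,2
}

# Succeeds only if both versions share the same major release.
# Snapshots are treated as never matching, since their defaults can
# change between builds.
same_major() {
    case "$1 $2" in
        *SNAPSHOT*) return 1 ;;
    esac
    [ "$(major_of "$1")" = "$(major_of "$2")" ]
}

# Example on the router, assuming the backup's version was noted as 23.05.2:
# . /etc/openwrt_release
# same_major "$DISTRIB_RELEASE" "23.05.2" && echo "same major version"
```

A passing check does not guarantee a safe restore; it only rules out the obvious cross-version case.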

No, as the recovery environment in E8450/RT3200 does not know anything about your normal settings... Keeping the (possibly old) recovery settings, which are the (old) default settings, is practically the same as flashing without keeping settings...
(...the caveat is actually that the new firmware to be flashed might already have different defaults than the older recovery instance.)

So, the correct procedure when upgrading to snapshot/OpenWrt 24, once the new installer is ready, is to reconfigure the device from scratch? That's mildly inconvenient, but I can still do it because there's not that much for me to set up (WAN and WAN6 DNS, DDNS settings, WiFi SSIDs, some FW rules, etc).

So, in summary would be:

  1. Flash the imminent new installer. Don't enable the checkmark to keep settings.
  2. The device reboots to recovery environment. Flash the new sysupgrade from there. Again, do not keep settings.
  3. Reconfigure the device
  4. Profit ?

I safely upgraded from 22.03.5 to 23.05.x while keeping settings because I didn't have to use a new installer, that's why I'm asking. In the end I'd love not to reconfigure everything, but yeah, many times it's better to start from scratch (mostly Windows user here).

Cheers!

I just flashed the upgrade bootloader and then will reflash the latest snapshot upgrade when the merged change gets built

Hi, currently running on 23.05.4 stable with UBI layout like this

root@RT3200:~# cat /proc/mtd
dev:    size   erasesize  name
mtd0: 00080000 00020000 "bl2"
mtd1: 00140000 00020000 "fip"
mtd2: 00100000 00020000 "factory"
mtd3: 07d00000 00020000 "ubi"

is there any risk of catching OKD?
If YES - the only way to avoid OKD is to apply new UBI layout and move to latest snapshot?

p.s. looks safe:

root@RT3200:~# grep "(release)" /dev/mtd0ro
v2.4(release):OpenWrt v2021-05-08-d2c75b21-3 (mt7622-snand-1ddr)
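For reading a /proc/mtd listing like the one above, a small helper can convert the hex sizes into erase-block counts. This is just a sketch with a made-up function name (`mtd_blocks`). With the numbers shown, bl2 is 0x80000 / 0x20000 = 4 blocks, which lines up with the four offsets the installer loop quoted earlier writes to.

```shell
#!/bin/sh
# mtd_blocks: hypothetical helper. Reads /proc/mtd-style text on stdin and
# prints "<name> <number of erase blocks>" for each partition.
mtd_blocks() {
    while read -r dev size erasesize name; do
        case "$dev" in
            mtd*:) ;;               # partition line, process it
            *) continue ;;          # skip the "dev: size erasesize name" header
        esac
        # strip the surrounding double quotes from the partition name
        name="${name#\"}"
        name="${name%\"}"
        # size and erasesize are hex strings without the 0x prefix
        echo "$name $(( 0x$size / 0x$erasesize ))"
    done
}

# On the router: mtd_blocks < /proc/mtd
```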