sounds interesting, as many of the users have no serial access. Finding a semi-safe way to update bl2 from the SSH console of a live system would be beneficial.
mtd0/1/2 are defined as read-only in the DTS though, so we may need to force a reload of the MTD module, I think. I have not done it before, so I'm not really sure how to do it, to be honest.
Edit:
It's posted in @daniel's installer page actually ... haha. Do this first and write away:
insmod mtd-rw i_want_a_brick=1
That's assuming you already have the mtd-rw module ready.
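For anyone who wants to confirm the module actually took effect before writing anything, here is a minimal sketch. It assumes the kernel exposes each partition's flags under /sys/class/mtd/<dev>/flags and that the writeable bit is 0x400 (MTD_WRITEABLE in the kernel's mtd-abi.h). The function names and the SYSFS_ROOT override are made up for illustration and are not part of the installer.

```shell
#!/bin/sh
# MTD_WRITEABLE as defined in the Linux MTD ABI header (mtd-abi.h).
MTD_WRITEABLE=0x400

# Return 0 if the given flags value (e.g. "0x400" or "0xc00") has the
# writeable bit set, 1 otherwise.
mtd_flags_writable() {
	[ $(( $1 & MTD_WRITEABLE )) -ne 0 ]
}

# Report the state of one MTD device, e.g. "mtd0".
# SYSFS_ROOT is a hypothetical override so this can be tried off-device.
mtd_report() {
	dev="$1"
	flags_file="${SYSFS_ROOT:-/sys/class/mtd}/$dev/flags"
	[ -r "$flags_file" ] || { echo "$dev: no flags file"; return 2; }
	flags="$(cat "$flags_file")"
	if mtd_flags_writable "$flags"; then
		echo "$dev: writable"
	else
		echo "$dev: read-only"
	fi
}
```

Running `mtd_report mtd0` before and after the insmod should, if the assumptions hold, flip from read-only to writable.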
Edit 3:
Should have been more specific. This can be done using mtk_uartboot to boot into OpenWrt, transfer @daniel's patched BL2 and the mtd-rw module into it, and run the command to write to mtd0.
Edit 4:
Apologies if I appear long-winded. Use bl2.bin with mtk_uartboot. Then use bl2.img to write to mtd0.
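Since there are two similarly named files in play, a checksum guard before the write seems prudent. This is only a sketch: it assumes you have a known-good SHA-256 for bl2.img from a trusted source, and that OpenWrt's `mtd write <image> <device>` is the write command you intend to use (I have not checked the exact command the installer runs). The helper names are hypothetical, and the second function only prints the command rather than running it.

```shell
#!/bin/sh
# Return 0 only if FILE exists and its SHA-256 matches EXPECTED.
verify_sha256() {
	file="$1"; expected="$2"
	[ -f "$file" ] || return 1
	actual="$(sha256sum "$file" | cut -d' ' -f1)"
	[ "$actual" = "$expected" ]
}

# Print (do NOT run) the write command once the image checks out.
# The /dev/mtd0 target is taken from the posts above; verify it on your device.
plan_bl2_write() {
	img="$1"; expected="$2"
	if verify_sha256 "$img" "$expected"; then
		echo "mtd write $img /dev/mtd0"
	else
		echo "refusing: checksum mismatch for $img" >&2
		return 1
	fi
}
```

Printing instead of executing is deliberate: a wrong write here can only be recovered over serial, so the final step stays a manual decision.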
Might just be safer to wait for Daniel to update the installer and then just re-run the installer.
Or maybe he can make a special version of the installer that only upgrades bl2.
If you are on v2.4 => You can safely sysupgrade to any 23.x
If you are on v2.9 => We still need to agree on the best way to proceed (wait for the new installer, patch bl2 from OpenWrt, patch bl2 from U-Boot...)
Knowing what we know today, how would you now comment on the previous attempts to fix this? Are these still relevant on 23.x?
changing governor back to performance?
bumping min freq?
pin drive current?
Please correct me if I'm wrong, but everybody who has TF-A v2.9 can expect this to happen on their very next reboot - all it takes is a bit flip, normally correctable, but here interpreted as "fatal error, abort mission".
Maybe not "very next reboot" but you're mostly correct in that it can happen to anyone on TF-A v2.9 or later during any boot-up, power-on, reboot, or power-cycle, as far as I understand. It's why my RT3200 has been sitting unused since the first time it experienced OKD; replaced with a GL-MT3000 which has been flawless on SNAPSHOT flashed right out of the box.
I'm hopeful the breakthrough discovery here will restore reliability to the newer builds and versions going forward so that RT3200 may be used with confidence once again. It's not a bad AP at all... just had a run of bad luck with this pesky problem, which is now hopefully over!
Also excited to take another look at these settings, after the bootloader issue is behind us!
Yes, there are different steps in that you have to use a different BL2 file built with the configuration and the instruction to look for the fip in UBI. Unfortunately, it looks like my build setup is currently producing broken builds for the snapshot version, so I'm going to need to dig into that to find out what's going wrong. No doubt I did something dumb out of rushing and/or exhaustion. (Edit: The build issue has been corrected.)
If your device is booting properly right now, you probably should wait until the patches have been accepted and the build bot has processed and built the proper and official binary anyway.
Very true, that also works! I haven't directly looked at the BL1 code, but I expect it's designed to skip over anything in a bad block. If so, it would immediately move to the next block and look for the proper signature there.
Indeed so. However, it's still a risky operation since you won't know if something went wrong until you go to reboot and find it won't boot. At that point, only the serial console or JTAG will be viable ways of recovering.
Daniel has merged the patch to main/master, so the next build round (after the currently ongoing one, which missed the patch) should then produce correct bl2 binaries (preloader.bin).
And the firmware itself will then also have the scrubbing feature trigger.
That will be true in any case for all the forthcoming solutions. Serial might ultimately be needed if flashing bl2 fails for any reason. But that has already been true for every earlier "let's use Daniel's installer" action, as the boot chain is modified.
Running the full installer to fix bl2 would be overkill, as other partitions need no modifications. (and e.g. fip can be rewritten from a live system via ubiupdatevol. I have done that twice so far for my RT3200.)
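For reference, a rough sketch of how the fip volume could be located before calling ubiupdatevol. It assumes the volume is literally named "fip" and that UBI volume names are readable from /sys/class/ubi/<vol>/name. The function name and the UBI_SYSFS override are hypothetical, added only so the lookup can be tried off-device.

```shell
#!/bin/sh
# Print the device node (e.g. /dev/ubi0_1) of the UBI volume called NAME.
# Returns 1 if no volume with that name is found.
ubi_vol_by_name() {
	name="$1"
	for d in "${UBI_SYSFS:-/sys/class/ubi}"/ubi*_*; do
		[ -r "$d/name" ] || continue
		if [ "$(cat "$d/name")" = "$name" ]; then
			echo "/dev/$(basename "$d")"
			return 0
		fi
	done
	return 1
}
```

Usage would then be along the lines of `vol="$(ubi_vol_by_name fip)" && ubiupdatevol "$vol" /tmp/new-fip.bin`, where /tmp/new-fip.bin is a placeholder path for whatever fip image you have verified beforehand.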
So, figuring out a simple "serial-free" way by applying the mtd-rw kmod (built in or installed separately) and repeating what Daniel's installer does for bl2 sounds straightforward enough. (Of course, you need to pay attention to write the correct image, etc., but it's still straightforward.)
I'm certainly willing to continue to use my OKD'd RT3200 for science. However, since this would be a write operation which would change its state, I'm going to wait until those that code have a specific end user fix that they would like me to test.
EDIT: For what it's worth, I have a 2nd RT3200 which also has the buggy BL2 code (v1.0.3) but hasn't OKD'd, if someone wants me to test a fix from SSH.
I've always had this doubt: When flashing a new installer from Luci, do I need to keep the settings checked? And, when rebooting to the recovery environment, do I need to keep the settings again when flashing the new Sysupgrade???
I mean this in the case a new installer drops and I want to test the new Snapshots with kernel 6.6.
Or do I need to export a backup, not keep settings, and restore the config backup at the end??
I'd need to go back through the installer to be sure, but the change in the compat_version flag might get in the way even if the data volume is retained during the repartitioning.
As for backing up and restoring, it might work. This platform has been stable on the config compatibility, but not all of them are. Even with it being as stable as it has been, there are a few things that must be manually edited before any further upgrades would be possible. (edit) ...and then you also have config changes with some package versions, especially when dealing with snapshot. That's a whole new can of worms. Some of those changes have occasionally crashed critical processes, so be prepared for oddities regardless if you choose to do so.
In general, when the compat_version flag changes (and it has done so between 23.x and snapshot/24.x), it's best not to try and restore a configuration to or from a different version.
That is actually the same as "keep settings" but without benefiting from the possible settings migration scripts, increasing the risk for bricking.
If there are changes that cause some default values to change (e.g. the dnsmasq internal config file directory location in 2020), there may be a built-in uci-defaults script for migration, which gets run once at the first boot after flashing and is then removed. If you later restore an old backup with old settings, there is no migration script any more and you are stuck with the old invalid settings...
Example from 2020: https://github.com/openwrt/openwrt/commit/6a2855212096d2c486961a0841b037bae4b75de7
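To make the "run once, then removed" behaviour concrete, here is a simplified model of the first-boot pass over /etc/uci-defaults. It is a sketch of the idea only, not the actual OpenWrt boot code: the function name is made up, and the directory is passed as a parameter so it can be tried in a scratch directory.

```shell
#!/bin/sh
# Simplified model: every script in DIR is sourced in a subshell and
# deleted if it succeeds, so it never runs a second time.
# A script that fails is left in place.
apply_defaults_once() {
	dir="$1"
	for f in "$dir"/*; do
		[ -f "$f" ] || continue
		( . "$f" ) && rm -f "$f"
	done
}
```

Once a migration script has run and been deleted this way, restoring an older backup afterwards brings the old settings back with nothing left on the device to convert them.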
Restoring incompatible settings is one typical reason for bricks.
In general, restoring settings from a backup is only safe when the backup was made on the same major version of OpenWrt. (And if using the development snapshots from main/master, even that may not hold true...)
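As a rough illustration of that rule, here is a small check that compares two release strings. It assumes "major version" means the YY.MM release series (e.g. 23.05) and treats anything that is not a numbered release, such as SNAPSHOT, as never matching. Both function names are hypothetical.

```shell
#!/bin/sh
# Reduce a release string such as "23.05.2" or "23.05.0-rc1" to "23.05".
# Anything that does not start with NN.NN (e.g. "SNAPSHOT") fails.
release_series() {
	case "$1" in
		[0-9][0-9].[0-9][0-9]*) echo "$1" | cut -d. -f1,2 | cut -d- -f1 ;;
		*) return 1 ;;
	esac
}

# Return 0 only if both releases belong to the same series.
backup_is_restorable() {
	a="$(release_series "$1")" || return 1
	b="$(release_series "$2")" || return 1
	[ "$a" = "$b" ]
}
```

On a running device the current release can be read from DISTRIB_RELEASE in /etc/openwrt_release; the release the backup came from is something you have to note down yourself when making it.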
No, as the recovery environment in E8450/RT3200 does not know anything about your normal settings... Keeping the (possibly old) recovery settings that are the (old) default settings is practically the same as flashing without keeping settings...
(...the caveat is actually that the new firmware to be flashed might already have different defaults than the older recovery instance.)
So, the correct procedure when upgrading to snapshot/OpenWrt 24 when the new installer is ready is to reconfigure the device from scratch?? That's mildly inconvenient, but I can still do it because there's not that much for me to set up (WAN and WAN6 DNS, DDNS settings, WiFi SSIDs, some FW rules, etc).
So, in summary, it would be:
Flash the imminent new installer. Don't enable the checkmark to keep settings.
The device reboots to the recovery environment. Flash the new sysupgrade from there. Again, do not keep settings.
Reconfigure the device
Profit ?
I safely upgraded from 22.03.5 to 23.05.x while keeping settings because I didn't have to use a new installer, that's why I'm asking. In the end I'd love not to reconfigure everything, but yeah, many times it's better to start from scratch (mostly Windows user here).