Yes, it was ECC. It affects not only what was written by the vendor but also all writes done using the old driver...
And yes, the installer only writes back the EEPROM data and MAC addresses inside mtd2. Writing the whole partition didn't seem to be needed, and it would also be slightly more tricky: there may be bad blocks between the EEPROM data at the beginning and the MAC addresses in the middle, so the offset of the MACs would change.
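To illustrate why a bad block shifts the MAC offset, here is a minimal sketch of the usual "skip bad blocks" mapping. It is not taken from the installer; the function name, the 128 KiB erase block size and the example offsets are all assumptions made up for the illustration.

```python
def logical_to_physical(logical_offset, erase_size, bad_blocks, num_blocks):
    """Map an offset counted over good blocks only to a raw partition offset.

    bad_blocks is a set of erase block indices that are skipped.
    All parameters are hypothetical; nothing here is read from real flash.
    """
    if logical_offset < 0:
        raise ValueError("offset must be non-negative")
    good_index, within_block = divmod(logical_offset, erase_size)
    good_seen = 0
    for block in range(num_blocks):
        if block in bad_blocks:
            continue  # bad block: data continues in the next good block
        if good_seen == good_index:
            return block * erase_size + within_block
        good_seen += 1
    raise ValueError("not enough good blocks for this offset")


ERASE_SIZE = 0x20000  # assumed 128 KiB erase block

# Same logical offset, with and without a bad block in front of it.
print(hex(logical_to_physical(0x40004, ERASE_SIZE, set(), 8)))
print(hex(logical_to_physical(0x40004, ERASE_SIZE, {1}, 8)))
```

With block 1 marked bad, everything behind it moves up by one erase block, which is the reason a plain copy of the whole partition would not be safe.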
As @981213 explained well in https://github.com/openwrt/openwrt/pull/4179 , the root cause is a mix of API misuse by mt76 and mtdblock (not handling EUCLEAN) and apparently broken ECC data for things that had been written using the old driver.
I seem to be having a similar issue with the snapshot, where the 5 GHz driver isn't loaded. When I rolled back to commit 7119fd32d397567931e63dbbf72014e95624018f, everything worked fine again.
Hi @daniel , would you know if the old (snfi) NAND driver works with the mtd2 ECC data that's re-written by the new v0.5.3 installer? Or is it the case that once we use the new v0.5.3 installer, we must use the new snand driver?
Thank you for reporting that @hlew . This confirms what I was guessing but could not empirically verify: it comes with ECC errors out of the factory. I re-wrote the flash completely several times during development, and the installer may also re-write parts of the factory partition in case that is needed to restore offsets without BMT, so I wasn't sure whether only things written with the old driver (also used in old versions of the installer) cause trouble for the new driver.
As you never ran the UBI installer and still have the device with its original bootchain, the fact that it broke for you too confirms that the problem of corrupted ECC data also applies when the device comes right out of the factory....
btw: do you see read errors when trying to read other /dev/mtdblock* devices?
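A rough way to run that check is sketched below. It assumes a Python interpreter is available, which a stock OpenWrt image usually does not include, so treat it as an illustration of the idea rather than something to paste onto the router; `check_readable` is a made-up helper name.

```python
import glob


def check_readable(paths, chunk_size=64 * 1024):
    """Read every path to the end and return {path: None or error text}."""
    results = {}
    for path in paths:
        try:
            with open(path, "rb") as dev:
                # Read and discard until EOF; only the errors matter here.
                while dev.read(chunk_size):
                    pass
            results[path] = None
        except OSError as exc:
            results[path] = f"{exc.strerror} (errno {exc.errno})"
    return results


if __name__ == "__main__":
    for path, error in sorted(check_readable(glob.glob("/dev/mtdblock*")).items()):
        print(path, "OK" if error is None else error)
```

Any device that fails partway through shows up with the errno the kernel returned, which helps tell an I/O error apart from a permission problem.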
Unfortunately, the only way out is either to give the new driver an option (?) to be more tolerant of pre-existing ECC errors or to re-write at least the factory partition using the new driver.
Since correct ECC data is nice to have if you plan to use this thing for a decade or so with that cheap-brand SPI-NAND chip, I'd recommend re-writing so you have working ECC in the future.
Doing so requires booting an initramfs image in which you have removed the read-only attribute from the MTD partitions in the device tree.
The OOB area for user data is reordered in the old driver, but the ECC part shouldn't be affected, as that data is placed by hardware, not by our drivers.
The old driver just doesn't report any bitflips in the _mtd_read return value (due to how the spi-nand driver was hacked), which makes everything appear fine.
We just need to fix mt76 to ignore -EUCLEAN and everything should work again without other steps.
edit: I'd still recommend a rewrite to clear those bitflips, as there are already 3 bits flipped when -EUCLEAN is reported and mt7622-snand can only correct up to 4 bitflips.
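For anyone following along, here is a small model of what those return codes mean. It is a sketch, not kernel or mt76 code: the function names are made up, the errno numbers are the generic Linux values, and the strength of 4 and threshold of 3 are just the figures mentioned above.

```python
EUCLEAN = 117  # generic Linux errno: bitflips were corrected, data is valid
EBADMSG = 74   # generic Linux errno: too many bitflips, data is not valid


def mtd_read_status(max_bitflips, ecc_strength=4, bitflip_threshold=3):
    """Model the status of an MTD read from the worst bitflip count seen."""
    if max_bitflips > ecc_strength:
        return -EBADMSG  # uncorrectable
    if max_bitflips >= bitflip_threshold:
        return -EUCLEAN  # corrected, but the block should be rewritten soon
    return 0


def data_is_usable(status):
    """A caller should accept the buffer on 0 and on -EUCLEAN alike."""
    return status in (0, -EUCLEAN)


for flips in range(6):
    status = mtd_read_status(flips)
    print(flips, status, "usable" if data_is_usable(status) else "rejected")
```

The point is that -EUCLEAN is a warning on a successful read, so a caller that treats every non-zero status as a failure throws away good data. With only one bit of margin between the warning level and the correction limit, rewriting is still the safer choice.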
Last week, I was able to flash the non-UBI image through the Belkin web interface. I decided to go back to stock by installing the Belkin firmware through LuCI. Now I want to go back to the non-UBI image, but nothing happens when I flash it in Belkin's web interface.
I was having the same issue and assumed that the image was not flashing correctly hence the board was defaulting to the stock firmware. As I mentioned a few posts ago, I had to revert to an earlier commit/build to get everything working again.
Just wanted to confirm that I am doing this correctly: I should first flash the recovery installer and then, once OpenWrt boots into recovery, flash the sysupgrade firmware, correct?
By the way, I am seeing the message below when I try to flash recovery:
Yes, you have to use the force (i.e. override the check) to re-run the installer once OpenWrt is already running.
If you have already used the 0.2.4 installer with the stock firmware, it is not necessary to do that, and it will also not solve any problems.
In that case (i.e. problems even though you already ran the 0.2.4 installer when installing for the first time from stock), something else must be wrong. Please log in to the device via SSH and send the output of the following commands to me (by PM if you prefer):