The dangowrt image and the official image are the same, but the official image has also received some patches since then. The dangowrt images are really meant only for the installation process, and then you should switch to using the openwrt images after you've completed the first install.
I discovered that I did make a backup a few months ago and have all the mtd?.bin files! I can't find any clear guidance on how to restore these files in the context of this situation.
Assuming I can successfully run:
$ sudo ./mtk_uartboot -a -p ./bl2-for-mtk_uartboot.bin -f ./openwrt-23.05.3-mediatek-mt7622-linksys_e8450-ubi-bl31-uboot.fip && sudo screen /dev/ttyUSB0 115200
What are the next steps to properly restore the mtd?.bin files and get back to a running system?
Thank you!
Just follow the steps as mentioned here
I think one of the advantages of the official image is the ability to use attended sysupgrade / auc and keep the system up-to-date. Not sure, but it may be possible to run auc from custom builds anyway.
Btw, I've been running official 23.05.3 for more than 2 months (WED, sw+hw flow offload, default CPU governor and clock speeds), rock stable.
On the installer version theme, I'm still running the very old 0.6.5 from 2022, and it has never crashed or OKD'ed.
Wifi AX has been very stable, but I'm using 80MHz, not 160MHz.
Hope it helps.
Hi, a couple of issues I have noticed:
I have 2 radios: MediaTek MT7622 802.11b/g/n and MediaTek MT7915E 802.11ac/ax/n
On MediaTek MT7622 802.11b/g/n, I have a SSID on Mode N
On MediaTek MT7915E 802.11ac/ax/n, I have 2 SSIDs (1 of the SSID is linked to guest network).
The 'weird' thing is that on MT7915E 802.11ac/ax/n, If I change mode from AC to AX for 1 SSID, it changes the mode for both SSIDs - is that expected behaviour?
One of my devices, a Redmi Note 8 Pro, supports Wi-Fi 802.11 a/b/g/n/ac, dual-band, Wi-Fi Direct. However, when wireless is set to AX, it cannot see those wireless networks. Is this normal?
https://www.gsmarena.com/xiaomi_redmi_note_8_pro-9812.php
Occasionally, when making changes to one of the existing SSIDs, a new SSID called 'OpenWRT' is magically created (with no password)! I then have to remove this manually.
#1 The E8450 only supports AX mode on the 5 GHz radio.
If you have 2 SSIDs on the same radio, the physical interface is shared by all logical SSIDs on that radio. You cannot mix modes on the same radio.
#2 The Redmi Note 8 Pro works perfectly fine on the 5 GHz radio in AX mode (WPA2 PSK). But this smartphone sometimes has trouble with Wifi on channels other than 36. A simple Wifi off/on on the smartphone seems to solve the problem (temporarily).
#3 Are you sure you are not running in recovery mode when this is happening ?
I would not recommend WED for the RT3200/E8450 (i.e. mt7915e) as it has a nasty bug that causes the router to hang, where only a power cycle will resolve it. This bug was fixed in early Feb 2024 in the snapshot builds. As far as I know, the fix was not backported to 23.05 or earlier.
Right now, my E8450 is running my own snapshot builds from early Feb 2024, just before the snapshot tree was transitioned to the all-UBI format for flash storage (i.e. installer v1.1.x). It's been up and running stable with WED enabled for more than 127 days now. Before that it was the luck of the draw, and the router would randomly hang with WED enabled.
Hi @Underworld ,
It seems several people may have gathered the data you asked for at this point? How are things looking with mediatek's diagnosis?
Hi,
Thanks for asking. Besides @jp707's OKD logs and @grauerfuchs's logs, I do not see any others.
I haven't heard whether we need more logs, but I believe we as a community can contribute more.
Sorry if I missed any other logs posted here, but MT is also following the topic.
Kr
K
@grauerfuchs @Underworld
I have 3 of these devices. No OKD yet.
Is there something I can do to provoke an 'OKD'? Would anything be useful from a non-OKD'd device?
Unfortunately not. I have another one at home, and @daniel tried hard to provoke OKD as well, but it is sporadic and you cannot force it.
The request is just to run the commands before you are about to restore the router from OKD. Of course, please be careful and have your backup.
Kr
K
From what I can see of the logs provided so far, it doesn't look like they show anything that could help.
I believe the expectation from those logs is to check if the OKDed flash chips are going bad. But in the provided logs, the flash chip contents all show 0xFF when erased, which is the expected value.
So it doesn't look like we're any closer to getting at the root cause.
I haven't been able to intentionally provoke the classic appearance of OKD either. The only issue I encountered after numerous attempts is that during a later and unrelated set of monitored experiments, one block of the flash went bad and it crashed an install/upgrade after the point of no return. That was the device whose log I supplied.
Although the subsequent serial log output would have looked like an OKD that wouldn't resolve by reboots and/or the freezer trick, the failure I encountered was a direct result of the crashed update and not a spontaneous/random appearance.
I may have successfully re-created an OKD on my device (Belkin AX3200).
modelNumber=RT3200
cert_region=US
hw_revision=1
hw_version=48SAR601.0GA
manufacturer_date=2021/03/09
Here's the serial output when it won't boot:
F0: 102B 0000
F6: 0000 0000
V0: 0000 0000 [0001]
00: 0000 0000
BP: 0400 0041 [0000]
G0: 1190 0000
T0: 0000 02D4 [000F]
Jump to BL
NOTICE: BL2: v2.9(release):OpenWrt v2023-07-24-00ac6db3-2 (mt7622-snand-1ddr)
NOTICE: BL2: Built : 21:45:35, Oct 9 2023
NOTICE: CPU: MT7622
NOTICE: WDT: Cold boot
NOTICE: WDT: disabled
NOTICE: SPI-NAND: FM35Q1GA (128MB)
ERROR: BL2: Failed to load image id 5 (-2)
A few things about this device which may be relevant:
I created full OEM backups using dd before installing OpenWRT.
It definitely has bad NAND blocks, which has made it difficult to restore the OEM firmware backups. The bad blocks raise the possibility that this may not be the same OKD everyone else is seeing.
ubinfo
UBI: number of bad PEBs: 5
mtd bad spi-nand0
MTD device spi-nand0 bad blocks list:
0x015c0000
0x015e0000
0x01600000
0x01620000
0x071e0000
And when running Dan's installer:
ubiformat: 5 bad eraseblocks found, numbers: 150, 151, 152, 153, 887
I can clearly see the bad blocks in the mtd dumps when I compare to dumps of other devices I have. Some of the bad areas amusingly have the hex values dead c0de sporadically in the block.
When restoring the OEM backups, I need to use mtd -f when writing mtd3, and sometimes need to do this multiple times to get the router to boot.
Skipping bad block at 0x012c0000
Skipping bad block at 0x012e0000
Skipping bad block at 0x01300000
Skipping bad block at 0x01320000
Skipping bad block at 0x06ee0000
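The three bad-block lists above (whole-chip addresses, ubiformat eraseblock numbers, and the "Skipping bad block" offsets) look different but appear to describe the same five blocks. A minimal Python sketch of the arithmetic, assuming 128 KiB erase blocks (the addresses are spaced 0x20000 apart) and inferring the partition start from the constant difference between the two address lists rather than from any documented layout:

```python
# Assumption: 128 KiB erase blocks, inferred from the 0x20000 address spacing.
ERASE_BLOCK = 0x20000


def block_numbers(offsets, base=0):
    """Convert byte offsets to eraseblock indexes, counted from `base`."""
    return [(off - base) // ERASE_BLOCK for off in offsets]


# Whole-chip addresses from "mtd bad spi-nand0" / "nand bad".
chip_offsets = [0x015C0000, 0x015E0000, 0x01600000, 0x01620000, 0x071E0000]
# Partition-relative addresses from the "Skipping bad block" messages.
part_offsets = [0x012C0000, 0x012E0000, 0x01300000, 0x01320000, 0x06EE0000]
# Eraseblock numbers reported by ubiformat.
ubiformat_blocks = [150, 151, 152, 153, 887]

# Inferred partition start: the difference between the two address lists.
deltas = {c - p for c, p in zip(chip_offsets, part_offsets)}
assert len(deltas) == 1, "address lists do not differ by a constant"
part_start = deltas.pop()

print(hex(part_start))                               # 0x300000
print(block_numbers(chip_offsets))                   # [174, 175, 176, 177, 911]
print(block_numbers(chip_offsets, base=part_start))  # [150, 151, 152, 153, 887]
print(block_numbers(part_offsets) == ubiformat_blocks)  # True
```

So all three reports agree once the 0x300000 offset (24 erase blocks) is taken into account; no additional bad blocks are hiding in the different numbering.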
I had restored the OEM firmware a few times and re-ran different versions of the openwrt installer. In this case, I restored to OEM, then used openwrt installer 1.0.3 in the hopes of getting the OKD issue.
To trigger the crash, the device was powered on for over 48 hours, and then I pulled the plug. I had tried various combinations with shorter time periods, but in those cases it would reboot successfully.
After booting with mtk_uartboot:
nand bad
'spi-nand0' is now active device
015c0000
015e0000
01600000
01620000
071e0000
Here's a link to the erase/dump log file:
https://drive.google.com/file/d/10ql95Or1Np86Uf5kyryPIKr5oH-8lhOr/view?usp=sharing
I don't have a pressing need to get this unit working again soon, so please let me know if there's anything else I can run on it for diagnostics.
This scenario fits what I have observed the two times I've experienced OKD: running for several days and then losing power.
I completed a full dump (md.b 0x0 0x000008000000) of the erased flash (mtd erase.dontskipbad spi-nand0), and there are some blocks that are not zeroed (0x00200000 to 0x002ffff0). I can make this available if someone thinks it will be useful (600 MB).
This took more than 14 hours over a serial line, so I don't recommend anyone else do this unless it's really needed.
EDIT: I just did another dump of a device with the OEM firmware (not an erased chip), and the results are the same (no data except in the range 0x00200000 to 0x002ffff0), so clearly my dump method isn't correct.
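For a rough sense of why a full hex dump over serial is so large and slow, here is a back-of-the-envelope Python sketch. The 77 characters per 16-byte row (address, hex bytes, ASCII column, CR/LF) is an assumption about the md.b output format, and 8N1 framing (10 bits on the wire per byte) at 115200 baud is assumed as well:

```python
def dump_estimate(flash_bytes, baud=115200, bytes_per_row=16, chars_per_row=77):
    """Estimate text size and transfer time of a hex dump over a serial line.

    chars_per_row is an assumed figure for one md.b output row, not a
    measured one. 8N1 framing means 10 bits are sent per character.
    """
    rows = flash_bytes // bytes_per_row
    text_bytes = rows * chars_per_row
    seconds = text_bytes / (baud / 10)
    return text_bytes, seconds / 3600


# 0x08000000 bytes = 128 MiB, the length used in the dump command.
text_bytes, hours = dump_estimate(0x08000000)
print(f"{text_bytes / 1e6:.0f} MB of text, about {hours:.1f} hours")

# For comparison: the same 128 MiB sent as raw binary at the same speed.
raw_hours = 0x08000000 / (115200 / 10) / 3600
print(f"raw binary: about {raw_hours:.1f} hours")
```

Under these assumptions the estimate lands in the same range as the reported 600 MB and 14+ hours. The hex text costs several times more than the raw data would, which is the main reason the dump takes most of a day.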
Sounds like it could be compressed fairly well with something like 7zip. Could you try to compress it and upload it somewhere, just to have it in case it would be useful at some point?
Well, I've finally had enough of the OKD issue making my network unreliable. I have an e8450 currently in OKD state, free for anyone that wants it. If you live in Australia I'll even post it to you for free! Overseas, you'll need to cover postage.
I would however like to offer it to you first, @daniel (totally free, including postage), to help with the issue for everyone else suffering. If @daniel does not want it then it'll be first come, first served.
I'll throw in my experience in case it's useful (though I don't suspect it is):
I have three of these devices and two of them have experienced OKDs. In keeping with what others have reported, they all occurred after at least several days of uptime and power disturbances. Two occurred during severe thunderstorms that did not visibly cause lost power (who knows if there was a power surge from lightning or a faster than noticeable power loss and recovery) while the last occurred during a 3-5 second power outage during a sunny day as a fine how-do-you-do...
All three were converted to the latest UBI layout using an installer I have labeled in my files as 1.0.4 pre-release from Feb 24th. It appears that one was withdrawn, as it didn't stay on the github releases page. They are all running snapshot r25310-79dae14157, and I found out about the OKD not long after the install, so I decided not to touch anything while it was being investigated.
The first device has OKD'd twice, both times recovered using the serial recovery method outlined on the wiki page. Unfortunately I did not see the call for nand dumps prior or I would have looked into generating those.
The second has OKD'd once and I did generate the nand dumps but I've run into issues while recovering from the MTD backups I have after erasing the memory. Admittedly I did not research enough into the process beforehand to realize exactly what that entailed thanks to a combination of boredom and liquid courage. I have since found out there's something odd about my MTD backup that has made recovery difficult. To be clear I blame only myself.
Now two things about the latter router:
- The logs were very uninteresting, no reported bad blocks and it shows "FF" for all values. I can gather/share/post more information if desired, dumps included.
- The MTD backup for MTD0 is 512KiB as expected, but MTD1 is only 4MiB and the wiki's flash layout table shows it should be 128MiB. Even more curious, the MTD1 backup for the third router is also 4MiB, so this is not an isolated incident.
- I generated the MTD backups from the LuCI web interface about 3 weeks ago (unfortunately these are the only ones I have and they are from after the layout conversion).
- I regenerated an MTD1 backup for the third router today and it is the expected ~128MiB
I tried searching for more information but haven't seen anyone else mention incorrectly sized backups. If anyone offhand knows what is in the 4MiB file (e.g. if it corresponds to a specific ubi block), and whether I can use a copy of the full-size backup from the third router plus some selective overwriting to create a recovery file for the second router, I'd be quite appreciative. I did compare the 4MiB and ~128MiB files in a hex editor and there are portions that are the same (e.g. the mac addresses are in both), but as best I can tell the 4MiB file does not match a contiguous 4MiB portion of the ~128MiB file. I plan to continue researching it but have a feeling it'll take me a while to figure out due to lack of domain knowledge.
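One way to go beyond eyeballing the two files in a hex editor is to check, chunk by chunk, where pieces of the small backup turn up in the full-size one. A minimal Python sketch, assuming a 2048-byte chunk size (a guess at a sensible granularity, not a value taken from the flash layout); the function name is made up for illustration:

```python
def locate_chunks(small, large, chunk_size=2048):
    """Return (offset_in_small, offset_in_large) pairs for each chunk.

    offset_in_large is -1 when the chunk is not found anywhere.
    Chunks that are entirely 0xFF or 0x00 are skipped, since blank
    data would match almost anywhere and tells us nothing.
    """
    results = []
    for off in range(0, len(small), chunk_size):
        chunk = small[off:off + chunk_size]
        if chunk.count(0xFF) == len(chunk) or chunk.count(0x00) == len(chunk):
            continue
        results.append((off, large.find(chunk)))
    return results
```

To use it, read both backups with something like `pathlib.Path("mtd1-small.bin").read_bytes()` (file names are placeholders) and pass the two byte strings in. If the reported offsets in the large file rise in step with the offsets in the small file, the small file is a contiguous slice. Scattered or missing matches would suggest it was produced some other way. Note that two different routers will never match in device-specific areas, so this is most meaningful when both backups come from the same device.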
Anyways hope there was something useful in my rambling, even if it's just a warning to at least sanity check your backups!
Take this with a grain of salt and know that you might need to make a few attempts with this method. I'm not an openwrt developer, just someone who can hack around pretty well.
The most important part of the backup is the factory partition. Looking at the MTD1 layout, it first has fip, then factory, then uboot. fip is a bit less than 1MB, and then factory is 1MB. So I would bet that within the 4MB MTD1 backups, the first 2MB contain what you need.
Here's how I would try to recover this:
- Make backup copies of everything and save them somewhere else
- Convert all your backup files to hex using a tool like xxd.
- Open the 4MB hex file and take note of the starting and ending addresses (first column).
- Make a copy of the 128MB hex file (it will be much larger now, and your text editor may be very slow dealing with this file, so just be patient), and find the ending address you see in the 4MB hex file. Delete that line and all lines above it.
- Copy/paste the data from the 4MB file into where you just deleted those lines.
- Convert the hex file back to binary using xxd -r.
At this point you will have hopefully replaced the factory partition from the one device with the factory data of the device where you don't have the full backup.
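The same splice can be done directly on the binary files, which avoids editing a huge hex file by hand. A minimal Python sketch of the steps above, assuming (as the steps do) that the 4MB backup covers the start of the partition, so its bytes simply replace the first 4MB of a copy of the full-size backup. The function name and file names are placeholders:

```python
from pathlib import Path


def splice_head(small_path, large_path, out_path):
    """Write a copy of the large backup whose start is replaced by the small one.

    Assumption: the small backup corresponds to offset 0 of the large one.
    Neither input file is modified; the result goes to out_path.
    Returns the number of bytes written.
    """
    small = Path(small_path).read_bytes()
    large = Path(large_path).read_bytes()
    if len(small) > len(large):
        raise ValueError("small backup is larger than the full-size backup")
    merged = small + large[len(small):]
    Path(out_path).write_bytes(merged)
    return len(merged)
```

A call would look like `splice_head("router2-mtd1-4M.bin", "router3-mtd1-128M.bin", "router2-mtd1-rebuilt.bin")`. The output has the same size as the full backup, so it is worth checking that before flashing anything, and keeping the untouched originals somewhere safe.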
Then you need to flash it to the device somehow. Booting via mtk_uartboot into the uboot console might be the way to go, but then you'll have to somehow get the binary file into memory. You could upload over the serial port using loadx, however this will take at least a few hours (probably many hours). There might be a way to do it over tftp. Then you could flash it from the memory location to the flash chip.
There are also some other interesting uboot menu options like "Load via tftp then write to flash" that might work better. And the uboot console from mtk_uartboot also seems to include support for usb storage devices.
One concern would be if there are any bad blocks in your flash, it could throw off the whole process. Maybe you'll get far enough to get the device recovered where you could then do a real reflash that would handle the bad blocks better.
This might be too off-topic for this thread, so maybe message me directly if you want to discuss some ideas. I'm sure others will correct me if I'm way off base.