Belkin RT3200/Linksys E8450 WiFi AX discussion

Unfortunately this looks like the bootloader got corrupted and can't be loaded. Probably a bit flipped on the NAND chip...

The only way to recover from this is to use JTAG to re-write the bootloader. The JTAG pads are exposed and accessible on the board; you can use OpenOCD to load U-Boot into RAM and then use that to re-run the installer, which will re-write everything.

Yeah I figured it was something like that.
I've not done something like that before, but it can't be too bad. It looks like I need a JTAG adapter. I can probably figure out all the hardware bits, but after some searching I'm still not sure how to compile U-Boot for this specific machine. Do I need someone to pull down a working version for this machine, or is it simpler than that?

I found this as an example but will need to do some more reading.

There are OpenOCD scripts and a ready made U-Boot binary from MediaTek here: https://github.com/mtk-openwrt/openocd-scripts/tree/main/mt7622

I use an FT2232H as a JTAG adapter; other people in this very thread managed with a CH341A or a Raspberry Pi.

See for example

Thanks, in my searching I missed those threads and that link to the firmware. I wasn't even sure what I was looking for, so I really appreciate it.

I’ll have to get the adapter or look into the Pi instructions as I have a few of those floating around :grinning:

Mike

I'm here to report the exact same issue. I have two RT3200s and one of them refused to boot after a normal power-down.

The buttons on the unit have no effect on the boot output.

Serial access shows the identical errors/output:

F0: 102B 0000
F6: 0000 0000
V0: 0000 0000 [0001]
00: 0000 0000
BP: 0400 0041 [0000]
G0: 1190 0000
T0: 0000 02D4 [000F]
Jump to BL

NOTICE:  BL2: v2.9(release):OpenWrt v2023-07-24-00ac6db3-2 (mt7622-snand-1ddr)
NOTICE:  BL2: Built : 21:45:35, Oct  9 2023
NOTICE:  CPU: MT7622
NOTICE:  WDT: Cold boot
NOTICE:  WDT: disabled
NOTICE:  SPI-NAND: FM35Q1GA (128MB)
ERROR:   BL2: Failed to load image id 5 (-2)

This sounds like a more widespread problem?

I have a CH341A and a few other programmers (STLink, Segger, USB Blaster), and I'll try to document the recovery process if I'm successful.

Is there anything that can be done to mitigate this issue, or verify flash manually after install?

Guess I'll use the "good ol" Archer C7v2 again for now :slight_smile:

That's a bit concerning. Maybe NAND dying?

In my case, sometimes my 5GHz radio goes crazy and I have to either restart it or reboot the device. I guess that's because I use 160MHz channels, but IDK. I'll try 80 and see how it goes.

Cheers!

Should all RT3200 users be panicking right now? Are our beloved and faithful devices all about to fail one day or the next?

It looks like the bootloader can no longer be read from the flash. It could be a bit-flip on the NAND. After all, that Fidelix FM35Q1GA is a rather low-end, cheap flash chip, so it is not too surprising for this to happen to some devices after a longer time in use.

If you have the option to do so, try connecting the JTAG pads and re-flash the bootloader by booting U-Boot into RAM using OpenOCD.
Before that you could dump what is in the flash right now (incl. OOB data) and we should find out what has happened (or at least what the damage looks like).

Is there any preemptive action that can be taken to safeguard against this sword of Damocles, such as a modified bootloader?

I was thinking of finally trying to repair mine this weekend, as I will have a little free time. I'm going to fire up the Raspberry Pi and try that method for now.

As far as dumping what is in the flash right now, is that something I can do as part of re-flashing through the JTAG pads? If so, I can try to do that as well and see what I can get.

I haven't had any hardware issues, or at least no visible signs of one that I'm aware of. But something interesting did happen to my unit once: when I rebooted it from LuCI, it seemingly powered off, with no LEDs or indications of any kind. I always wondered whether Linux on embedded devices like these could power the device off, and it turns out it can, though not intentionally. I powered it back on by flicking the device's power switch off and on again. Nothing has seemed wrong with it since that little episode.

@mikewagnercmp Please document :slight_smile:

This happens because the default min clock speed is too low and the device fails to boot up. You need to set a min speed of 600MHz in the /etc/rc.local file.

Add this

echo 600000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_min_freq

In System > Startup > local startup.
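For anyone editing /etc/rc.local by hand, a slightly more defensive variant of that line might look like the sketch below. The function name `set_min_freq` and its optional second argument are made up for illustration; only the sysfs path and the 600000 value come from the post above.

```shell
#!/bin/sh
# Hypothetical helper: raise the minimum CPU frequency (value in kHz).
# The optional second argument overrides the policy directory, which
# allows a dry run against a scratch directory instead of real sysfs.
set_min_freq() {
    khz="$1"
    policy_dir="${2:-/sys/devices/system/cpu/cpufreq/policy0}"
    target="$policy_dir/scaling_min_freq"

    # Do nothing (and say so) if the kernel does not expose the knob.
    if [ ! -w "$target" ]; then
        echo "cpufreq: $target is not writable, skipping" >&2
        return 1
    fi
    echo "$khz" > "$target"
}
```

In rc.local this would be followed by a call such as `set_min_freq 600000`, placed above the final `exit 0`.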

Oh? How did you find out? Also, this happened only once, so does that mean it automatically tries to operate at higher clock, and only this time had to run at min clock for whatever reason?

It was mentioned here a long time ago.

This only sets the minimum frequency the CPU runs at. It doesn't mean the CPU won't go faster; it sets how slow it can go. It's actually related to the SoC voltage: at such low speeds the voltage is too low for the system to boot up properly. Setting the minimum to 600MHz keeps the SoC voltage high enough to boot properly.
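To see the difference between the floor and the speed the CPU is actually running at, the standard cpufreq sysfs files can be read back. This is a rough sketch; the helper name `show_cpufreq` and its optional directory argument are assumptions for illustration, and the default path is the policy0 directory mentioned earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper: print the governor and frequency limits (kHz)
# for one cpufreq policy. Files that do not exist are skipped.
show_cpufreq() {
    policy_dir="${1:-/sys/devices/system/cpu/cpufreq/policy0}"
    for knob in scaling_governor scaling_min_freq scaling_cur_freq scaling_max_freq; do
        if [ -r "$policy_dir/$knob" ]; then
            printf '%s=%s\n' "$knob" "$(cat "$policy_dir/$knob")"
        fi
    done
}
```

Running `show_cpufreq` a few times under load should show `scaling_cur_freq` moving while `scaling_min_freq` stays at whatever floor was set.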

Cheers!

I was fairly sure the scaling_min_freq issue was mentioned on the toh wiki page, however it doesn't seem to be there anymore and I can't find it in the history either. I must be misremembering. Should it be added near the governor section? The only other way to know about it is to read through all messages in this thread.

I thought about this a little more and there is another possibility apart from flipped bit(s) on the flash: the RAM being broken. And now that this cpufreq issue has come up again, it makes me think: what happens if the device gets stuck trying to calibrate DDR RAM at too low a voltage for many hours or days? Can that break the RAM? Maybe we should have listened to the MediaTek engineers, who very clearly and repeatedly stated that they recommend running MT7622 only at full speed and only ever did QA that way?

Given this newly considered possibility, would you now recommend that all RT3200 users run their devices at ‘performance’?

If the RAM were broken, I would guess that it would also show up as other kinds of failures. It is a bit hard to imagine a failure that would specifically affect just the bootloader.

But running at full speed with "performance" might still be preferred for several reasons:

  • some QoS qdiscs are vulnerable to changing CPU speeds, so a stable CPU speed might help with their calculations, especially with bursty traffic that causes the CPU speed to vary.
  • CPU cache drivers may have problems with scaling. After a long process of trying to get it fixed in ipq806x, we threw in the towel and set performance as the default in June. https://github.com/openwrt/openwrt/commit/6f5ea752d7c95ba426ca21a6588cae8812bb3e7c
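
For those who want to try the "performance" governor, a minimal sketch using the standard cpufreq sysfs files could look like this. The helper name `set_governor` and its optional directory argument are hypothetical, and the default path assumes the same policy0 directory used elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper: switch the cpufreq governor, but only if the
# kernel lists the requested governor as available for this policy.
set_governor() {
    wanted="$1"
    policy_dir="${2:-/sys/devices/system/cpu/cpufreq/policy0}"
    avail="$policy_dir/scaling_available_governors"

    # scaling_available_governors is a space-separated list of names.
    if ! grep -qw -- "$wanted" "$avail" 2>/dev/null; then
        echo "cpufreq: governor '$wanted' not offered in $policy_dir" >&2
        return 1
    fi
    echo "$wanted" > "$policy_dir/scaling_governor"
}
```

Calling `set_governor performance` from the local startup script would apply it on every boot; reading `scaling_governor` back afterwards confirms whether it took effect.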

Despite my intuition telling me that the additional thermal cycling associated with scaling up and down under varying loads would be worse for degradation (compared with a more constant, albeit higher, temperature), I had understood that maintaining lower temperatures overall might help increase longevity:

Any thoughts?

@mikewagnercmp and @NodeNovelty, regarding your failed devices: were you both using 'ondemand', and how regularly were you rebooting your devices? Any special, relevant considerations like proximity to magnets or ambient temperature? Has either of you been able to fix it?