Recommendation on next steps with bricked WRT1900ACv1

Bought a WRT1900ACv1 from craigslist "with issues"

  • Managed to reset it and get stock firmware running; radios worked, etc., but it never ran for more than a few hours before rebooting itself and resetting flash.
  • Flashing other firmware, including newer stock or LEDE, usually failed.
  • Many flash resets, many "3 reboots" to flip to the alternate image.
  • Had one lucky flash and got LEDE 17.01.4 running with a fine base config, but just like stock it rebooted itself after several hours and reset flash (back to stock).
  • Screwed around with it enough today (resets, failed LEDE reflash, ...) that it now does only two things on a power cycle: the power LED flashes, or occasionally the power LED and eSATA LED stay solid. It never responds to ping from a client with a static IP and doesn't respond to reset button holds.

So I am thinking my next step is USB-TTL to see if I can reflash. Have flashed many dd-wrt and lede routers but have not done the serial route yet. Also honestly not sure if corrupt flash is the only issue with this thing. So main question is ...

Since this thing booted and ran OK for a while with both LEDE and stock firmware, but then rebooted and reset after a few hours, do you guys think corrupt flash would do this? Or is it more likely hardware? I'm trying to gauge whether it is worth getting the USB-TTL adapter and cables for this thing.

TIA

This sounds like a hardware problem. Try a different power adapter, because they sometimes go bad, causing voltage drops and crashes. You could also open up the unit and look for swollen or leaky capacitors; those would cause power drops too.

A serial adapter is something that everyone who does much with flashing routers should own though.

Thanks mk. This helps, and you are right on the USB-TTL. At this point I am far enough into this stuff and have enough junk routers that I will get one. This router has a 4A power supply, and everything I have around the house is <= 2A. One question: I think it is OK to have more amps, correct? I.e., a 5A supply would be OK?

Yes, a supply with the same voltage but rated for more amps is OK.
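The rule of thumb in this answer can be written down as a tiny check. This is only a sketch: the function name and parameters are made up for illustration, the 12V figure comes from later in the thread, and it does not cover barrel size or polarity, which also have to match.

```python
def adapter_is_suitable(required_volts, required_amps, adapter_volts, adapter_amps):
    """Return True if a replacement adapter matches the voltage and can supply enough current."""
    # Voltage has to match the original supply.
    same_voltage = adapter_volts == required_volts
    # The amp rating is the most the adapter can deliver; the router draws only what it needs.
    enough_current = adapter_amps >= required_amps
    return same_voltage and enough_current


print(adapter_is_suitable(12, 4, 12, 5))  # 12V 5A standing in for the 12V 4A original
print(adapter_is_suitable(12, 4, 12, 2))  # 12V 2A standing in for the 12V 4A original
```

A higher amp rating passes the check; a lower one does not, which is why the 2A adapters around the house are not a useful test.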

Hi, my vote is on the power adapter too. Please let us know how it goes. Good luck D:

A broken fan would be another simple cause. These devices run hot, so they will overheat after a while if the fan doesn't work.

Thanks all. I am on the hunt for a 12V 4A+ power supply. Went through the whole house and came up short. Thanks for the suggestion on the fan, I know it works though. When LEDE boots it will run the fan so I know it works at least.

OK, I have updates, some good, some not. First, thanks for the tips so far. I managed to get a 12V 5A power supply with the correct barrel, a USB/TTL chip and 2.54mm to 2.00mm cables. I can boot and bring up the serial interface in PuTTY, and I was able to flash both images (primary/alt) from the Marvell command line. So all that is good. The bad news is the router will not boot either image yet. The most ominous boot log is the first one below. All the snipped parts seem to match the boot logs here: https://wiki.openwrt.org/toh/linksys/wrt1x00ac_series

<snipped>

 __   __                      _ _
|  \/  | __ _ _ ____   _____| | |
| |\/| |/ _` | '__\ \ / / _ \ | |
| |  | | (_| | |   \ V /  __/ | |
|_|  |_|\__,_|_|    \_/ \___|_|_|
         _   _     ____              _
        | | | |   | __ )  ___   ___ | |_
        | | | |___|  _ \ / _ \ / _ \| __|
        | |_| |___| |_) | (_) | (_) | |_
         \___/    |____/ \___/ \___/ \__|
 ** LOADER **


U-Boot 2011.12 (Feb 06 2014 - 17:14:13) Marvell version: v2011.12 2013_Q1.2

Boot version:v1.3.25

Board: RD-AXP-GP rev 1.0
SoC:   MV78230 B0
       running 2 CPUs
       Custom configuration
CPU:   Marvell PJ4B (584) v7 (Rev 2) LE
       CPU 0
       CPU    @ 1200 [MHz]
       L2     @ 600 [MHz]
       TClock @ 250 [MHz]
       DDR    @ 600 [MHz]
       DDR 32Bit Width, FastPath Memory Access
       DDR ECC Disabled
DRAM:

It hangs on DRAM and the power and SATA LEDs are solid. It did this both before and after the flashes, but only sometimes. I am guessing this might be the death knell for this router, i.e. bad DRAM?

Sometimes the bootloader does get farther. I reflashed LEDE onto the primary image and Linksys onto the alt image via TFTP. These are the boot logs afterward.

<snipped>

U-Boot 2011.12 (Feb 06 2014 - 17:14:13) Marvell version: v2011.12 2013_Q1.2

Boot version:v1.3.25

Board: RD-AXP-GP rev 1.0
SoC:   MV78230 B0
       running 2 CPUs
       Custom configuration
CPU:   Marvell PJ4B (584) v7 (Rev 2) LE
       CPU 0
       CPU    @ 1200 [MHz]
       L2     @ 600 [MHz]
       TClock @ 250 [MHz]
       DDR    @ 600 [MHz]
       DDR 32Bit Width, FastPath Memory Access
       DDR ECC Disabled
DRAM:  256 MiB

Map:   Code:            0x0fea7000:0x0ff5e2d4
       BSS:             0x0ffefd80
       Stack:           0x0f9a6ef8
       Heap:            0x0f9a7000:0x0fea7000

<snipped>

FPU initialized to Run Fast Mode.
USB 0: Host Mode
USB 1: Host Mode
USB 2: Device Mode
Modules Detected:
mvEthE6171SwitchBasicInit finished
Net:   mvSysNetaInit enter
set port 0 to rgmii enter
set port 1 to rgmii enter
egiga0 [PRIME], egiga1
modify Phy Status
auto_recovery_check changes bootcmd: run nandboot
Hit any key to stop autoboot:  0

NAND read: device 0 offset 0xa00000, size 0x400000
 4194304 bytes read: OK
## Booting kernel from Legacy Image at 02000000 ...
   Image Name:   ARM LEDE Linux-4.4.92
   Created:      2017-10-17  17:46:20 UTC
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    2158384 Bytes = 2.1 MiB
   Load Address: 00008000
   Entry Point:  00008000
   Verifying Checksum ... OK
   Loading Kernel Image ... OK
OK

Starting kernel ...

So it hangs while attempting to start the LEDE image from primary, with the power LED blinking.
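For anyone reading the numbers in that log, here is a small sketch that converts them. The values are copied from the log above; the 64-byte header size is the standard legacy uImage header, and the helper name is just for illustration.

```python
MIB = 1024 * 1024


def to_mib(n_bytes):
    """Convert a byte count to MiB."""
    return n_bytes / MIB


nand_offset = 0xA00000        # "offset 0xa00000" from the NAND read line
nand_read_size = 0x400000     # "size 0x400000" from the NAND read line
kernel_data_size = 2158384    # "Data Size" reported for the LEDE image
legacy_header_size = 64       # legacy uImage header that precedes the kernel data

print(nand_read_size)                      # decimal value of 0x400000
print(to_mib(nand_offset))                 # where the primary kernel starts in NAND, in MiB
print(round(to_mib(kernel_data_size), 1))  # kernel size in MiB
# The whole image has to fit inside the window U-Boot read from NAND.
print(kernel_data_size + legacy_header_size <= nand_read_size)
```

The image fits in the 4 MiB read and the checksum verifies, so this log on its own does not point at a truncated read; the hang comes after U-Boot hands control to the kernel.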

Next attempt flips to alt image (Linksys) and I get the following.

<snipped>

FPU initialized to Run Fast Mode.
USB 0: Host Mode
USB 1: Host Mode
USB 2: Device Mode
Modules Detected:
mvEthE6171SwitchBasicInit finished
Net:   mvSysNetaInit enter
set port 0 to rgmii enter
set port 1 to rgmii enter
egiga0 [PRIME], egiga1
modify Phy Status
Saving Environment to NAND...
Erasing Nand...
Writing to Nand... done
#### auto_recovery:2 ####
auto_recovery_check changes bootcmd: run altnandboot
Hit any key to stop autoboot:  0

NAND read: device 0 offset 0x3200000, size 0x400000
 4194304 bytes read: OK
## Booting kernel from Legacy Image at 02000000 ...
   Image Name:   Linux-3.2.40
   Created:      2017-07-07  17:20:18 UTC
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    3902992 Bytes = 3.7 MiB
   Load Address: 00008000
   Entry Point:  00008000
   Verifying Checksum ... OK
   Loading Kernel Image ... OK
OK

Starting kernel ...

Uncompressing Linux...

uncompression error

 -- System halted

Hangs with power led blinking. It is possible I pulled down the wrong Linksys image, but it is the one from the Linksys site. I will mess with this router more but main questions for you guys are:

  • Is this DRAM hang a HW failure for sure?
  • Is it possible the bootloader environment or the bootloader itself is corrupt and causing this? I am thinking this is unlikely, especially for the bootloader itself, since it seems to run fine when the router doesn't hang on DRAM.
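Since the failure point moves around between boots, it can help to save each serial capture and tally where it stopped. The following is a rough sketch only: the stage labels are made up for illustration, and the matched strings are the ones that appear in the logs in this thread.

```python
def classify_boot_log(log_text):
    """Guess the stage at which a captured serial boot log stopped."""
    lines = [line.strip() for line in log_text.splitlines() if line.strip()]
    if not lines:
        return "no output"
    # Explicit error messages take priority over where the log happens to end.
    if any("uncompression error" in line for line in lines):
        return "kernel decompression failed"
    if any("Oops" in line or "Kernel panic" in line for line in lines):
        return "kernel crashed after boot"
    last = lines[-1]
    if last == "DRAM:":
        return "hung during DRAM init"
    if last.startswith("Starting kernel"):
        return "hung after handing off to kernel"
    return "unclassified"


dram_hang = "DDR ECC Disabled\nDRAM:\n"
kernel_hang = "Loading Kernel Image ... OK\nOK\n\nStarting kernel ...\n"
bad_unpack = "Starting kernel ...\n\nUncompressing Linux...\n\nuncompression error\n\n -- System halted\n"

for capture in (dram_hang, kernel_hang, bad_unpack):
    print(classify_boot_log(capture))
```

If identical images and settings produce a mix of these outcomes across power cycles, that is more consistent with a hardware fault than with a single corrupt image, though it does not prove it.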

TIA

A couple updates and questions

I have been messing with this router and at this point I am 98% sure this box has bad memory. As noted before, it sometimes hangs on "DRAM" in the bootloader and sometimes fails to load firmware with "uncompression error" in the logs. Sometimes it will load the firmware and then eventually take a dump with various "oops" messages or full-on kernel panics. Since the errors don't happen at consistent points even with no changes, I am assuming HW. Here are a couple of examples. I am not a Linux genius, but these seem to be in the area of "memory". These are all from stock firmware, but LEDE barfs also.

Kernel panic - not syncing: Fatal exception in interrupt
[<c0013c88>] (unwind_backtrace+0x0/0xe4) from [<c054db88>] (panic+0x50/0x180)
BUG: Bad page state in process zebra  pfn:0d2bc

---

/usr/bin/lua: /etc/init.d/service_devidentd/deviceupdate.lua:54: invalid UUID string
stack traceback:
        [C]: in function 'uuid'
        /etc/init.d/service_devidentd/deviceupdate.lua:54: in main chunk
        [C]: ?
Internal error: Oops - undefined instruction: 0 [#1] SMP
Modules linked in: ap8x(P+) nf_nat_ftp nf_conntrack_ftp orion_wdt mod_bdutil(O)

---

Unable to handle kernel paging request at virtual address 4f480074
pgd = ccc98000
[4f480074] *pgd=00000000
Internal error: Oops: 15 [#4] SMP

I am running the simple U-Boot memory test now, not sure if it will show anything. I will post back if it does. Will likely take hours to run. My last ditch after that is to try and recover the bootloader itself. I don't have high hopes on this, since the bootloader seems to run fine usually.
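One of the traces above gives a page frame number ("pfn:0d2bc"), which can be turned into a physical address to get a rough idea of where to point a memory test. This is a sketch under two assumptions that are not confirmed anywhere in this thread: that the kernel uses 4 KiB pages and that DRAM starts at physical address 0.

```python
PAGE_SIZE = 4096                # assumption: 4 KiB pages
DRAM_SIZE = 256 * 1024 * 1024   # "DRAM:  256 MiB" from the U-Boot log


def pfn_to_phys(pfn_hex):
    """Convert a page frame number (hex string) to the physical address of that page."""
    return int(pfn_hex, 16) * PAGE_SIZE


addr = pfn_to_phys("0d2bc")     # from "BUG: Bad page state in process zebra  pfn:0d2bc"
print(hex(addr))
print(addr < DRAM_SIZE)         # does the page fall inside the 256 MiB of DRAM?
```

A "Bad page state" report names the page whose bookkeeping looked wrong, which is not necessarily where the faulty cell is, so treat the address as a hint rather than a diagnosis.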

Only other possibility I can think of is corrupt flash somewhere. I am in over my head on that clearly but a couple questions anyway...

  1. Is this the correct U-Boot command to reset the bootloader environment?

env default -a

I got it from here (https://wiki.openwrt.org/toh/linksys/wrt1x00ac_series), but it doesn't work and just gives me usage output. The usage lists a -f flag, which does run.

  2. Or is it this?

env default -f

  3. Should I expect the bootloader env vars to be back at defaults after this runs? I thought they would be, but printenv shows openwrt* vars, which I am assuming are from LEDE boots and are NOT default.

  4. Should I run any manual flash erase functions in the bootloader? KNOW beforehand that I have little idea of what I am doing in this space.

Any feedback is appreciated!

If all else fails, McDebian might work, given how it works.

I have a WRT1900AC V1 with the power and eSATA lights constantly on. It also fails when it gets to the DRAM portion of the boot process. Did you ever get your router working, or did you trash it? I am in over my head on this router, but if it can be reflashed I would jump for joy.

I didn't. I wrestled with it, flashing firmware and the bootloader, corrupting NAND, etc. It still hung on DRAM during boot at times and had other random freak-outs. I am fairly confident there is bad mem in it. I did use mtest from U-Boot and it reported bad mem in a specific spot; however, when I ran mtest on different address ranges I got varying reports of other bad spots. It is entirely possible I am using mtest incorrectly.
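One thing that may matter when choosing mtest ranges: mtest writes to the memory it tests, and the Map lines in the boot log earlier in this thread show U-Boot's own stack, heap and code sitting near the top of the 256 MiB (stack at 0x0f9a6ef8 and up). A range that overlaps those could upset the bootloader itself. Below is a sketch that splits the lower part of DRAM into chunks and prints one command per chunk. The start address, the upper limit and the chunk size are assumptions picked for illustration, and the exact mtest argument syntax should be checked with "help mtest" on the device.

```python
def mtest_commands(start, end, chunk):
    """Build one 'mtest <start> <end>' command string per chunk of [start, end)."""
    commands = []
    addr = start
    while addr < end:
        stop = min(addr + chunk, end)
        commands.append("mtest 0x%08x 0x%08x" % (addr, stop))
        addr = stop
    return commands


UBOOT_STACK = 0x0F9A6EF8      # "Stack:" from the U-Boot map in the boot log
TEST_START = 0x00100000       # assumption: skip the first 1 MiB
TEST_END = 0x0F000000         # assumption: stop with a margin below U-Boot's stack
CHUNK = 32 * 1024 * 1024      # assumption: 32 MiB per run

assert TEST_END < UBOOT_STACK  # keep the tested range clear of the bootloader

for command in mtest_commands(TEST_START, TEST_END, CHUNK):
    print(command)
```

Running the same chunk several times and comparing the reported addresses would show whether the failures are stable (one bad region) or wander around (more like power, timing or heat).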

I punted on this router and it is in the brick pile now. The remaining things I considered trying were:

  • Find a better memtest that could be loaded and run. It appears that mtest has a more detailed option, but you have to recompile. There may be other, better low-level tests.
  • Kernel memmap options don't work on this router. I think the processor has to support ACPI for it to work.
  • It might be possible to use DTS and a patch to exclude memory if you know where it is. There is some more info in the thread "Is it possible to use memmap or equiv to skip bad RAM with LEDE?"

All this said, even if you can get Linux to ignore the bad mem, it won't matter if you can't get U-Boot to load the kernel consistently.

HTH

If somehow you get yours running please let me know how you did it. When it runs it works well.

Oops one more note
I can't remember the LED sequences to see if they match yours. Happy to boot it a couple times if that somehow helps. Just let me know.