Bricked Linksys WRT1200AC after Upgrading OpenWrt CC to LEDE

Hi folks,

last weekend I wanted to update my router from its current OpenWRT Chaos Calmer (latest) installation to LEDE (current version 17.01.3) to address all the open/critical security issues (mainly dnsmasq). I made backups and package lists, drew up the whole battle plan, checked all online resources for potential issues I could hit, and then did the reflashing with the sysupgrade tool. And that was the last I saw of my router. Luckily the trick of rebooting and switching the router off three times to fall back to the old OpenWRT installation worked, so I got a working router back after cursing for about 2 hours in the middle of the night. :slight_smile:

root@sapphire /tmp # sysupgrade -v /tmp/lede-17.01.3-mvebu-linksys-wrt1200ac-squashfs-sysupgrade.bin
Saving config files...
etc/collectd.conf
etc/config/dhcp
etc/config/dropbear
etc/config/firewall
etc/config/luci
etc/config/luci_statistics
...
[cut]
...
etc/uhttpd.key
etc/uhttpd.crt
killall: watchdog: no process killed
Sending TERM to remaining processes ... uhttpd ntpd collectd pppd ubusd askfirst dnsmasq logd rpcd netifd odhcpd crond 
Sending KILL to remaining processes ... askfirst 
Switching to ramdisk...
Performing system upgrade...
Unlocking kernel1 ...

Writing from <stdin> to kernel1 ...     
ubiattach: error!: cannot attach mtd5
           error 22 (Invalid argument)
ubiformat: mtd5 (nand), size 35651584 bytes (34.0 MiB), 272 eraseblocks of 131072 bytes (128.0 KiB), min. I/O size 2048 bytes
libscan: scanning eraseblock 271 -- 100 % complete  
ubiformat: 81 eraseblocks are supposedly empty
ubiformat: warning!: 191 of 272 eraseblocks contain non-UBI data
ubiformat: warning!: only 0 of 272 eraseblocks have valid erase counter
ubiformat: erase counter 0 will be used for all eraseblocks
ubiformat: note, arbitrary erase counter value may be specified using -e option
ubiformat: use erase counter 0 for all eraseblocks
ubiformat: formatting eraseblock 271 -- 100 % complete  
UBI device number 2, total 272 LEBs (34537472 bytes, 32.9 MiB), available 248 LEBs (31490048 bytes, 30.0 MiB), LEB size 126976 bytes (124.0 KiB)
Volume ID 0, size 22 LEBs (2793472 bytes, 2.7 MiB), LEB size 126976 bytes (124.0 KiB), dynamic, name "rootfs", alignment 1
Set volume size to 28696576
Volume ID 1, size 226 LEBs (28696576 bytes, 27.4 MiB), LEB size 126976 bytes (124.0 KiB), dynamic, name "rootfs_data", alignment 1
sysupgrade successful

The above log is the last I saw from my router before the SSH connection terminated. I waited about 20 minutes and it didn't come back up; rebooting didn't bring it back either.

tokai@beryl ~/Desktop $ ping sapphire
PING sapphire (192.168.0.10): 56 data bytes
ping: sendto: No route to host
ping: sendto: Host is down
Request timeout for icmp_seq 0
ping: sendto: Host is down
Request timeout for icmp_seq 1
^C
--- sapphire ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss

The router was nowhere to be seen. The LAN itself still worked (I could ping my other machines). I even switched to the default 192.168.1.* IP range to see if perhaps it was reset for some reason; did full LAN scans… nothing. No router, just my other machines.

I have no idea what went wrong or what a potential issue could be. I was told LEDE is basically just an OpenWRT update and it should just work. :slight_smile: I never upgraded OpenWRT before. CC 15.05.1 was my first installation. So basically on the first partition the original firmware was still present, and OpenWRT CC was located on the second partition. Flashing LEDE overwrote the original Linksys firmware (and since that partition is now broken, I no longer have a fail-safe in case I get a problem with my CC installation).

It dumped a whole bunch of errors and warnings during the sysupgrade, but I have no idea whether any of those are normal or a serious concern, or what the problem could have been.

So my questions:

  • does the sysupgrade output look normal? Are those errors and warnings normal or something to be concerned about?
  • are there known incompatibilities between OpenWRT config files and LEDE that could prevent the network from initializing/coming up properly?

I sadly don't have a USB debug cable and the router is still in warranty (so I couldn't open it up anyway) and I don't know if the network fail-safe packet stuff was triggered (I first read about that after I was back up online with the old OpenWRT CC installation).

Thanks in advance for any kind of extra information and help!

Some advice for you:

  • support for the AC1900/1200/3200ACx series is not complete in the old stale CC15.05 branch. Some core settings have changed since then. So sysupgrade with settings is not safe.
    • If you sysupgrade to the LEDE 17.01 branch or LEDE master, make sure that you do NOT keep your settings during sysupgrade. Just let the default config be applied and then manually reconfigure things. Do NOT restore settings from a CC15.05 backup.
  • LEDE, OpenWrt and the Linksys OEM firmware all flash the "other" partition and keep the currently running partition as a safe fallback partition. So it is quite expected that the other partition gets overwritten.
    • You can use CC15.05 to flash the Linksys OEM firmware to the other partition, so that you have two working partitions (in case you are hesitant to try flashing LEDE again).
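
A minimal sketch of the "do NOT keep your settings" advice, assuming the image has already been copied to /tmp on the router. The `-n` switch tells sysupgrade not to save the configuration over the reflash. `flash_clean` and the `DRY_RUN` variable are hypothetical names invented for this sketch; by default it only prints the command instead of flashing anything.

```sh
#!/bin/sh
# Sketch only: flash_clean and DRY_RUN are made-up names, not part of sysupgrade.

flash_clean() {
    image="$1"
    # Refuse to continue if the image file is missing.
    if [ ! -f "$image" ]; then
        echo "image not found: $image" >&2
        return 1
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: show what would be run, flash nothing.
        echo "sysupgrade -n -v $image"
    else
        # -n: do not save configuration over reflash, -v: verbose output
        sysupgrade -n -v "$image"
    fi
}
```

Calling `DRY_RUN=0 flash_clean /tmp/lede-17.01.3-mvebu-linksys-wrt1200ac-squashfs-sysupgrade.bin` on the router would do the actual flash; the router then comes up with the LEDE defaults and has to be reconfigured by hand.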

I am using WRT3200ACM myself, and sysupgrading between LEDE versions works quite normally. But like I said earlier, CC15.05 is hopelessly out of sync.

You might want to wait for the forthcoming 17.01.4 release, which is likely to get released in a few days.

In general, the dual-partition scheme makes recovery from a faulty flash relatively easy as you have noticed :wink:
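
If the router is still reachable over SSH, the partition switch can also be done from the shell rather than with the power-switch trick. This is a hedged sketch: it assumes the WRT AC series bootloader keeps the active partition in a U-Boot variable named `boot_part` with the values 1 and 2, readable with `fw_printenv` and writable with `fw_setenv`. Please verify that against the device wiki before relying on it. `other_boot_part` is a hypothetical helper name.

```sh
#!/bin/sh
# Sketch only: other_boot_part is a made-up helper.
# Assumption: boot_part is the U-Boot variable that selects partition 1 or 2.

other_boot_part() {
    case "$1" in
        1) echo 2 ;;
        2) echo 1 ;;
        *) echo "unexpected boot_part: $1" >&2; return 1 ;;
    esac
}

# On the router (deliberately left as comments so nothing runs by accident):
#   current=$(fw_printenv -n boot_part)
#   fw_setenv boot_part "$(other_boot_part "$current")" && reboot
```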

@hnyman

Thanks for the reply.

If sysupgrading with settings is not safe then this should be added to the documentation, IMHO, before the next person runs into trouble like me. Preferably in big red letters. :slight_smile: In fact LEDE's documentation states the exact opposite. Similarly, there is no mention of any of this on the Linksys WRT1200/etc. page of OpenWRT's documentation, even though they are now recommending the LEDE branch.

Is there a list of these breaking changes somewhere? I checked the release notes of all 3 LEDE releases, but couldn't find any hint about this. In theory one could simply mount the mtd5 partition (never tried that) and then fix the offending options manually, no?

Personally I don't really understand why a new config option or deprecated config option should break the whole thing (in worst case it simply should fall back to default, no?). Sounds like a conceptual problem. :smiley: Just as an idea: maybe there could be at least some kind of simple "migration" script that can be run optionally on the current installation which checks the config files for breaking options and warn the user.

I have to admit I'm not too keen on rebuilding my fine-tuned config from scratch (the initial installation took me about a month to get to the stage where it is now; and then it ran stable for over 200 days, as one would expect from a router :slight_smile: ).

The problematic changes can be router-specific, so there is no problem for most routers.

The critical parts are the different switch config and some network config options that are directly related to hardware. WRT1x00ACx series support in 15.05 is at such an early stage that a lot has changed since then. Here is the mvebu-specific change log for 17.01:
https://git.lede-project.org/?p=source.git;a=history;f=target/linux/mvebu;hb=refs/heads/lede-17.01

You probably can use pretty much all other config files in /etc/config except system and network. But e.g. LED and switch definitions have changed etc.
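
As a rough illustration of that rule of thumb, the sketch below lists which files from a saved copy of the old /etc/config are candidates for reuse, skipping `network` and `system`. `reusable_configs` is a hypothetical helper name, and the backup directory is whatever location you saved the old files to; it only prints names and copies nothing.

```sh
#!/bin/sh
# Sketch only: reusable_configs is a made-up helper name.
# Prints the config files in a backup directory that are not hardware-specific.

reusable_configs() {
    backup_dir="$1"
    for f in "$backup_dir"/*; do
        [ -f "$f" ] || continue
        name="${f##*/}"
        case "$name" in
            network|system) ;;      # switch/LED definitions changed: redo by hand
            *) echo "$name" ;;
        esac
    done
}
```

For example, `reusable_configs /tmp/old-etc/config` (path is an assumption) would print names such as `dhcp` and `firewall`, which can then be compared against the fresh files with `diff` before copying anything over.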

The good news is that most of the config data is in files under /etc/config, so you
should be able to go through the new files and cut-n-paste most of the stuff
from your old config.

I'll give it another try. This will become a long/ busy weekend. :slight_smile:

Earlier this week, I was in the same situation with a WRT1900AC unit with the lede-17.01.2-mvebu-linksys-wrt1900ac-squashfs-factory.img firmware.

I figure people running OpenWRT CC era firmwares will be looking to upgrade due to the KRACK vulnerability, and will be hitting the forums due to breakage.

I have a USB-TTL cable, so I was able to check where it was failing to boot.

I could get to the Marvell prompt, and used the openwrt.org WRT1x00ac corrupt bootloader recovery process.

However, this was not working.
I was getting the following error:

Verifying Checksum ... Bad Data CRC
ERROR: can't get kernel image!

After a while I realised I was overlooking the checksum part of the error.
Although the hash for the firmware matched the source, I found a reference about the firmware image size - https://github.com/Chadster766/McDebian/issues/7#issuecomment-202169021
I set the size as per the reply, and the checksum error was resolved, only to be replaced with:

Wrong Image Format for bootm command
ERROR: can't get kernel image!

Another search found a resolution - https://community.linksys.com/t5/Wireless-Routers/WRT1900AC-and-OpenWrt/m-p/918073#M293531 and I ended up running the following to get it booting. Some of this runs counter to the instructions given in the openwrt.org recovery steps.

setenv ipaddr 192.168.1.1
setenv serverip 192.168.1.2
tftp 2000000 u-boot-nand.kwb
nand erase 0 e0000
nand write 2000000 0 e0000
run update_both_images
boot

Lastly, if you have a saved configuration and you were running a firmware from the OpenWRT CC days, check these dnsmasq group bugs:
OpenWRT bug 22271 - https://dev.openwrt.org/ticket/22271
OpenWRT bug 22300 - https://dev.openwrt.org/ticket/22300

I hope this helps out others.
(First post in these forums and the software does not allow more than two links per post for new users ... wtf ? I had to deliberately break the hyperlinks)

This is an antispam measure, the link limit disappears once a user has been active for a while in the forum. Meanwhile, I fixed the links in your post.

I'd been holding out waiting for the pending OpenWRT/LEDE merge but in light of KRACK and its rapid patching in LEDE have decided to upgrade my Linksys WRT1900ACS to LEDE-17.01.4 so this thread and your post are very timely.

After reading the original post in this thread by tokai I'll be taking a backup from the current install, but I'm mindful that I should not simply restore it. To that end I'll manually back up all configuration files under /etc/ so I can copy and paste the configuration back by hand.

It's slightly concerning that the checksum for your WRT1900AC didn't check out and prevented it from booting (mindful of bricking a > £100 router!), especially as I've no USB-TTL cable.

Fingers crossed things go well.

Pleased to say the upgrade has gone pretty seamlessly.

I backed up /etc and moved it off the router, as well as making a backup through LuCI just in case I had to go back to OpenWRT.
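
For anyone wanting to do the same from the command line, here is a minimal sketch of the /etc backup step. `backup_etc` is a hypothetical helper name, and the archive path and the router address in the usage note are assumptions.

```sh
#!/bin/sh
# Sketch only: backup_etc is a made-up helper name.
# Packs the etc directory found under the given root into a gzip-compressed tarball.

backup_etc() {
    root="$1"   # "/" on the router
    out="$2"    # archive to create, e.g. /tmp/etc-backup.tar.gz
    tar -czf "$out" -C "$root" etc
}
```

On the router this would be `backup_etc / /tmp/etc-backup.tar.gz`, followed by something like `scp root@192.168.1.1:/tmp/etc-backup.tar.gz .` from the PC to move the archive off the device before flashing (/tmp does not survive a reboot).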

Flashing the sysupgrade went fine and a fresh LEDE install booted. I manually modified settings by editing files in /etc/config/ to match those I'd backed up from the old OpenWRT install, rebooted and all is back as it was. Most pleased it all went well.

The final step is to install the mwlwifi drivers that have just been rebuilt. On the first attempt the package was treated as a downgrade, so the --force-reinstall option is required.

Just joined the club. Unfortunately, only now I see this thread.
After upgrade from openwrt 15.05.1 to lede 17.01.4 the internet
connection stopped working as well as other ethernet ports. Right
now my router is utterly bricked since I was stupid enough to
update both boot partitions. No connectivity whatsoever anymore.
Have to order the serial cable.

Is there a way to somehow recover the thing without the serial
cable? Can it perhaps boot from the USB port?

Any idea why all the ports would be borked? Were the VLAN switch settings left in place?
What about wifi, you might be able to put an image on /tmp and flash, wee bit risky, but you are already there.

Sounds like the Linksys WRTxx00ACx series bootloader-based dual-partition fallback is not an option anymore, but you should be able to enter the Openwrt/LEDE failsafe mode (that can be entered during early Openwrt/LEDE boot process).

In this failsafe mode the router bypasses all your config files and only uses the contents of the firmware itself = the LEDE defaults (like 192.168.1.1 etc.). In failsafe mode you can then use "firstboot" to wipe the jffs2 /overlay partition, so that on the next normal boot the router boots up fresh.

Entering failsafe requires a button push inside a definite two-second time window, so it may take a few tries to get in...

I have used failsafe with WRT3200ACM, so it works for the mvebu devices. (there is the WPS button on the back panel) In WRT3200ACM the correct moment is about 10-15 seconds after power-on. (first the power LED blinks a few seconds during u-boot bootloader phase, then the LED remains steadily lit for a few seconds, after which it starts to blink that 0.1 sec indicator rhythm for two seconds during which you need to push the button.)

Read advice at wiki. There are also the key commands explained...
https://lede-project.org/docs/user-guide/failsafe_and_factory_reset

On most routers, LEDE will blink a LED (usually “Power”, but it may be another one) during the boot process after it gets control from the initial bootloader (like u-boot). Rather early in the boot cycle, LEDE checks whether the user wants to enter failsafe mode instead of a normal boot. It listens for a button press inside a specific two-second window, which is indicated with LEDs and by transmitting a UDP packet.
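
The UDP packet can be used as a timing cue when the LED rhythm is hard to judge. The sketch below assumes, as far as I recall from the failsafe wiki page, that the announcement is broadcast to 192.168.1.255 on UDP port 4919; treat the port, the address and the interface name as assumptions to verify. `failsafe_sniff` and `DRY_RUN` are hypothetical names, and by default the function only prints the tcpdump command.

```sh
#!/bin/sh
# Sketch only: failsafe_sniff and DRY_RUN are made-up names.
# Assumption: the failsafe announcement is sent to UDP port 4919.

failsafe_sniff() {
    iface="${1:-eth0}"   # PC interface cabled to a LAN port of the router
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: only show the command.
        echo "tcpdump -Ani $iface port 4919 and udp"
    else
        # Needs root; press the button as soon as a packet shows up.
        tcpdump -Ani "$iface" port 4919 and udp
    fi
}
```

Run with `DRY_RUN=0 failsafe_sniff eth0` on a PC that has a static address in 192.168.1.0/24, then power-cycle the router and watch for the packet.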

There are three different (power) LED blinking speeds during boot for most of the routers:

  • first, a moderate 0.1 second blinking rhythm during those two seconds, when the router waits for the user to trigger failsafe mode
  • then either
    • a slow 0.2 second blink continuing to the end of boot, if failsafe was not triggered and the normal boot continues, or
    • a rapid 0.05 second blink, if the user pressed a button and failsafe mode was triggered

To enter failsafe mode, follow one of the procedures listed below:

Wait for a flashing LED and press a button. This is usually the easiest method once you figure out the correct moment.

One way, naturally, is to push the WPS button rapidly and repeatedly after turning the power on.

Successful entry into failsafe is indicated by very rapid continuous LED blinking, as explained in the wiki.

mount_root
firstboot
reboot

Forgot to mention that I've already tried the failsafe option,
but no cigar.

The "Internet" port is completely dead. Surprisingly, on port 1
there were bootp requests coming in, so I gave it an IP address
and offered a file for tftp, but no further communication, not
even ping worked.

Obtained a USB TTL thing, but nothing comes at the serial port.
Then tried with a null modem cable (have a serial port on the
workstation), but again nothing. Neither with 115200 nor with
9600 baud. Not sure what else I can try.

First and foremost, there's a reason "It is highly recommended to invest in a USB - TTL cable" is the 2nd main bullet in the WRT AC Series' Wiki Introduction:

  • No one should be flashing 3rd party firmware without some means of serial access (USB-TTL, UART, etc.)

Please detail exactly how you've connected your USB-TTL cable to the serial header. The likelihood of the main board being damaged is so astronomically small that this is most likely user error. Have you read the Serial Port, Serial Interfaces, and Serial Firmware Flash sections of the WRT AC Series Wiki?

Watch this tutorial...

I have 24 hours to try to fix this before getting on a plane. Any ideas of a shortcut to get back up and running? I have a WRT1200AC and I loaded

openwrt-18.06.1-mvebu-cortexa9-linksys-wrt1200ac-squashfs-sysupgrade

Now I no longer get an IP address assigned. I guess I used the wrong version to upgrade the factory image.

169.254.10.87

Have you reviewed the WRT AC Series ToH wiki that's been mentioned several times in this thread? Did you follow the instructions in Flashing Firmware?

Actually I am taken aback after reading more about this router. I'm not sure about the flashing from partitions, etc. I have been using various custom firmwares since the old WRT54G and switched to OpenWrt a few years ago since it seems more "serious minded". After struggling with learning the interface, I normally just look for a recommended router, download the firmware and flash. I always assume that if a firmware is listed in the Table of Hardware, it's pretty much download and go, well, other than spending weeks at a time trying to TFTP it if the destination router is a Buffalo LOL. Looks like this whole setup is a different animal, and I may have to slide this back in the box and use an old Buffalo that I have until I can invest several weeks in the learning curve to get familiar with what all this dual boot and other talk is about for this router. I found this unit on sale for a good price considering the specs. Should have read a little more about it ...... always assume the world continues to get more complicated.