OpenWrt Forum Archive

Topic: Be careful with "ipkg remove"

The content of this topic has been archived on 13 Apr 2018. There are no obvious gaps in this topic, but there may still be some posts missing at the end.

Yesterday I did some cleaning on my WRT54GS... I removed packages not used anymore, being happy to see "can't remove XXX because it is necessary for YYY". Then my ipkg stopped working. Scrolling back up, I noticed I had removed libgcc, which was obviously necessary for the system to work.

The network still worked OK after that, but I was no longer able to get into the router (I guess the shell needed libgcc :) )

After a reboot - total failure, nothing went up. I bricked it.

But - I'm stubborn. If I bricked it already, I can't do much more harm. I had archived pages from OpenWRT.org, a few different FLASH images and, as a last resort, a JTAG cable (screw the guarantee, anyway voided by OpenWRT).

First of all, I checked and boot_wait fortunately was operational. Then (against the rules) I tried to tftp WhiteRussian again. No success, but boot_wait still worked. Sigh...

Next, I tried to flash dd - it started flashing and the PWR led flashed to the et(h)ernity. Screw it, I did a power cycle. It didn't wake up, but boot_wait was STILL OPERATIONAL.

Next I flooded it with the original software from Linksys. It did flash correctly, did a reset and woke up. I opened the long-unseen WWW pages to discover - these were not the original ones but DD pages! What the heck? Long-term memory?

Last, I got into the firmware update page and finally flashed back my beloved OpenWRT. To my surprise, my /root/ directory somehow survived the whole mess and after switching from telnet to ssh I was able to log in using my pubkey... Weird...

So as last words:

1) If you brick your router, don't panic. Most firmware activates boot_wait, which is your friend.
2) Having boot_wait, feed your router with ORIGINAL LINKSYS software.
3) It should wake up, at least enough to feed it with desired flash.

And last-last word - please somehow protect critical packages from deletion. Perhaps you should make a dummy read-only package SYSTEM and make it depend on all necessary ones. This would give at least a warning when doing something stupid...

Ain't this FUN!

- DL

*cough* failsafe

mbm wrote:

*cough* failsafe

*burp* failsafe

- DL

(Last edited by dl on 28 Oct 2005, 09:05)

At least the important packages listed below (list is for White Russian) should NOT be removed!

cat OpenWrt-ImageBuilder-Linux-i686/lists/micro.brcm-2.4

base-files
base-files-brcm
bridge
busybox
dnsmasq
dropbear
hotplug
ipkg
iptables
kmod-brcm-et
kmod-brcm-wl
kmod-diag
kmod-wlcompat
libgcc
mtd
nvram
uclibc
wireless-tools
wificonf
zlib
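
A guard along these lines could be sketched as follows. This is only an illustration, not something ipkg actually does: the function name split_removal_request and the example package "nano" are made up, and only the package names are taken from the list above.

```python
# Hypothetical guard, not part of ipkg. The names below are copied from
# the micro.brcm-2.4 list for White Russian quoted above.
CRITICAL_PACKAGES = {
    "base-files", "base-files-brcm", "bridge", "busybox", "dnsmasq",
    "dropbear", "hotplug", "ipkg", "iptables", "kmod-brcm-et",
    "kmod-brcm-wl", "kmod-diag", "kmod-wlcompat", "libgcc", "mtd",
    "nvram", "uclibc", "wireless-tools", "wificonf", "zlib",
}


def split_removal_request(requested):
    """Split a removal request into (safe, refused), keeping the input order."""
    safe = [pkg for pkg in requested if pkg not in CRITICAL_PACKAGES]
    refused = [pkg for pkg in requested if pkg in CRITICAL_PACKAGES]
    return safe, refused


if __name__ == "__main__":
    # "nano" is just a made-up example of a non-critical package.
    safe, refused = split_removal_request(["libgcc", "nano", "zlib"])
    print("would remove:", safe)
    print("refusing to remove:", refused)
```

A wrapper like this would only pass the "safe" names on to the real package manager and print a warning for the rest.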

(Last edited by olli_04 on 28 Oct 2005, 09:23)

For dl, because you asked so nicely -

pkraszewski is using a squashfs firmware image; how do I know that? keep reading ...

The squashfs firmwares use two filesystems, a squashfs filesystem that's compiled into the firmware image and a jffs2 filesystem outside the firmware. Outside? let me explain -

The flash structure on most routers is as follows:

[ bootloader ] [ firmware .................... ]  [ .......................... ] [ nvram ]

The bootloader is always at the start of flash, the nvram is always at the end of flash. The firmware gets stored immediately after the bootloader and can be any length as long as it fits between the bootloader and nvram. The firmware generally takes less than half of that space, leaving you with a large chunk of unused flash, which OpenWrt uses for a jffs2 filesystem.

[ bootloader ] [ firmware (kernel+squashfs) ] [ jffs2 .................... ] [ nvram ]
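
To put numbers on that picture, here is a rough sketch that computes the four regions from their sizes. All sizes in the example are invented for illustration (they are not the real WRT54GS values), and alignment details are ignored, so the jffs2 region here simply starts where the firmware ends.

```python
from collections import namedtuple

# "end" is exclusive, so a region's size is end - start.
Region = namedtuple("Region", "name start end")

KIB = 1024


def flash_layout(flash_size, bootloader_size, firmware_size, nvram_size):
    """Return the regions in flash order: bootloader, firmware, jffs2, nvram."""
    fw_start = bootloader_size
    fw_end = fw_start + firmware_size
    nvram_start = flash_size - nvram_size
    if fw_end > nvram_start:
        raise ValueError("firmware does not fit between bootloader and nvram")
    return [
        Region("bootloader", 0, bootloader_size),
        Region("firmware", fw_start, fw_end),
        Region("jffs2", fw_end, nvram_start),  # whatever is left over
        Region("nvram", nvram_start, flash_size),
    ]


if __name__ == "__main__":
    # Made-up sizes: 4 MiB flash, 256 KiB bootloader, 1.5 MiB firmware, 32 KiB nvram.
    for r in flash_layout(4096 * KIB, 256 * KIB, 1536 * KIB, 32 * KIB):
        size_kib = (r.end - r.start) // KIB
        print(f"{r.name:<10} 0x{r.start:06x}-0x{r.end:06x} ({size_kib} KiB)")
```

With these invented sizes the jffs2 region is simply what remains once the other three are placed, which is why a smaller firmware image leaves more room for it.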

Wait a sec, I only flashed with a firmware, how the hell did that jffs2 filesystem get there? Good question, so glad you asked. When the firmware boots, it actually boots from the squashfs filesystem compiled into the firmware. The squashfs filesystem attempts to mount the jffs2 filesystem, and if that fails, it will use firstboot to format the space as jffs2 and copy files; either way it'll mount the jffs2 filesystem and continue the bootup from jffs2. (This also explains why it seems to take so long for openwrt to boot during the initial install -- it's running firstboot)

Now, the two key points that everyone needs to learn -

* The squashfs filesystem is read-only; it's impossible to actually modify the squashfs filesystem from within openwrt. This is why we have the jffs2 partition; the jffs2 partition stores only files that have been changed, unchanged files are simply symlinks to squashfs. This means that it was pointless to even consider removing any package that came with the firmware since all you'd be doing is deleting a symlink from jffs2 and not recovering any space.

* The jffs2 filesystem is outside the firmware. I can't stress this enough. The jffs2 filesystem is not part of the squashfs image. When you flash via tftp, all that happens is that it starts writing the firmware to flash overwriting the existing firmware. If you attempt to reflash using the same squashfs image, the total amount of change is absolutely nothing -- the jffs2 data is after the firmware.

So, the jffs2 data is untouched, the firmware loads, finds the existing jffs2 partition, attempts to boot from it.. nothing has changed. If there's any valid jffs2 data anywhere in the jffs2 partition, the squashfs image will mount it and attempt to boot it. This means that even if you've loaded other firmwares there will still be some jffs2 data that wasn't overwritten and the squashfs firmware will attempt to mount and boot from it, recovering whatever data is still valid. This is why pkraszewski's /root directory was still intact, even after flashing with the linksys firmware.
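
A toy model of this, with flash as a plain byte array, might look like the sketch below. The sizes and contents are made up, and real flash is erased and written in blocks rather than byte by byte, so take it only as an illustration of why data behind the firmware survives a reflash.

```python
def write_firmware(flash, firmware, bootloader_size):
    """Overwrite flash in place, starting right after the bootloader."""
    flash[bootloader_size:bootloader_size + len(firmware)] = firmware


# Made-up toy sizes: 16-byte bootloader, 128 bytes of flash in total.
BOOTLOADER = 16
flash = bytearray(b"B" * BOOTLOADER + b"\xff" * 112)

# Install a 40-byte "squashfs firmware"; jffs2 data lives right behind it.
squashfs_fw = b"K" * 40
write_firmware(flash, squashfs_fw, BOOTLOADER)
jffs2_start = BOOTLOADER + len(squashfs_fw)
flash[jffs2_start:jffs2_start + 9] = b"root-data"

# Reflashing the very same image changes nothing at all.
snapshot = bytes(flash)
write_firmware(flash, squashfs_fw, BOOTLOADER)
same_image_unchanged = bytes(flash) == snapshot

# A different firmware that is 4 bytes longer clobbers only the first
# 4 bytes of the old jffs2 area; everything behind that is still there.
other_fw = b"L" * 44
write_firmware(flash, other_fw, BOOTLOADER)
surviving = bytes(flash[jffs2_start + 4:jffs2_start + 9])

if __name__ == "__main__":
    print("same image left flash unchanged:", same_image_unchanged)
    print("jffs2 bytes surviving the other firmware:", surviving)
```

In this model the longer firmware eats into the start of the old jffs2 area, but the bytes further back are untouched, which is the situation described above for data that is still valid after loading other firmwares.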

So what the hell do you do to fix a corrupted jffs2 partition when you're using squashfs? Simple; boot failsafe. Power up, and hold the reset button for ~2 seconds immediately when you see the DMZ led. When you boot failsafe it won't attempt to mount the jffs2 filesystem, allowing you to telnet in and either fix the problem or just run firstboot again to reset the jffs2 partition.

While in failsafe the ip address is forced to 192.168.1.1 with a mac address of 00:00:BA:DC:0D:ED. You will only be able to get in via telnet; ssh won't work because your ssh configuration is stored on the jffs2 partition.
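
Put together, the boot decisions described in this post can be summarised in a small sketch. This is a simplified model and not the actual OpenWrt init scripts; the function name boot_steps and the step descriptions are invented for illustration.

```python
def boot_steps(failsafe_requested, jffs2_valid):
    """Return the boot steps, in order, for this simplified model."""
    steps = ["boot kernel", "mount squashfs (read-only)"]
    if failsafe_requested:
        # Failsafe never touches jffs2, so a broken jffs2 cannot stop the boot.
        steps.append("skip jffs2; wait for telnet on 192.168.1.1")
        return steps
    if not jffs2_valid:
        # First boot after flashing, or nothing mountable was found.
        steps.append("run firstboot to format jffs2 and copy files")
    steps.append("mount jffs2 and continue boot from it")
    return steps


if __name__ == "__main__":
    for failsafe, valid in [(False, True), (False, False), (True, True)]:
        print(f"failsafe={failsafe}, jffs2_valid={valid}:")
        for step in boot_steps(failsafe, valid):
            print("   ", step)
```

The point of the model is the first branch: a jffs2 partition that mounts but contains a broken system (say, one with libgcc removed) is still "valid" here, so only failsafe gets you around it.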

So ..

*cough* failsafe.

or maybe

*cough* RTFM

@mbm:

Wow, the flash layout is explained really well there! :)

Would be really cool if someone could contribute this to the new Faq under "The flash layout" or so.

(Last edited by olli_04 on 28 Oct 2005, 10:20)

mbm wrote:

While in failsafe the ip address is forced to 192.168.1.1 with a mac address of 00:00:BA:DC:0D:ED. You will only be able to get in via telnet; ssh won't work because your ssh configuration is stored on the jffs2 partition.

Yes, very nice explanation. But my comment was prompted by the fact that many of us have reported that failsafe does not always boot to a known "telnetable" (from the lan) configuration. Presumably this is because some part of our custom configuration (startup script?, nvram settings?) is not being ignored. I have run into this many times and have not been able to track down a specific cause/setting, so I just end up reflashing out of it. As I reported in the other thread it seems that br0 is not complete because eth0 is dead.

PS: and my "fun" comment was genuine. I haven't had this much fun since punching tape on a DDP-516 or perhaps writing a CPM macro cross-assembler for the 6502. Wish I had the time to get into the guts of this but I'll leave that up to you guys and worry about getting another "hotspot in a tube" up.

- DL

(Last edited by dl on 31 Oct 2005, 10:43)

pkraszewski wrote:

And last-last word - please somehow protect critical packages from deletion. Perhaps you should make a dummy read-only package SYSTEM and make it depend on all necessary ones. This would give at least a warning when doing something stupid...

Duhh... That's why you run Windows CE Wireless for WRT in Safe mode... and not OpenWrt. Geeezzz... and to think that with the stock firmware this post would never have happened.

(Last edited by /usr/local/fox on 28 Oct 2005, 13:14)

mbm wrote:

So what the hell do you do to fix a corrupted jffs2 partition when you're using squashfs? Simple; boot failsafe. Power up, and hold the reset button for ~2 seconds immediately when you see the DMZ led. When you boot failsafe it won't attempt to mount the jffs2 filesystem, allowing you to telnet in and either fix the problem or just run firstboot again to reset the jffs2 partition.

While in failsafe the ip address is forced to 192.168.1.1 with a mac address of 00:00:BA:DC:0D:ED. You will only be able to get in via telnet; ssh won't work because your ssh configuration is stored on the jffs2 partition.

And how do you get a router like the SE505 V2 into failsafe? It doesn't have a DMZ led, and OpenWRT doesn't use the other leds either. I tried it many times, but I never got it into failsafe, no matter when or how long I held the reset button.

Stupid you are. Manual didn't you read.

Shame on me!

It was a misunderstanding - I assumed removing a package removes it from all filesystems (so, I assumed package management works like on non-squashfs). I didn't remember part of it was read-only... So - what did I actually remove? Just symlinks to the ROM area?

Well, it is usually me who cries RTFM the loudest... Shame on me.

Well, thanks everybody for answers. My OpenWRT is up and running and I'll be more careful next time.

Anyway, critical packages should be somehow protected.

pkraszewski wrote:

Stupid you are. Manual didn't you read.
Anyway, critical packages should be somehow protected.

Chroot?

Well, it is usually me who cries RTFM the loudest... Shame on me.

Can someone PM me and tell me what RTFM means?

RTFM = Read The Fine Manual

You can substitute Fine with your choice of words beginning with the letter F.

The discussion might have continued from here.