Reflashing corrupt partition on LS-EA8300

Hello Experts!

I have a curious case of what I believe is UBI partition corruption. I have a Linksys EA8300 (a.k.a. the test device) whose main partition (mtd10) is not bootable. It had OpenWrt 24.10.2 installed. The alt partition (mtd12) boots fine and is running OpenWrt 22.03.6. When I try to boot into the main partition via System --> Advanced Reboot, the router just hangs. The triple on-off switch trick lets me boot back into the working alt partition.

Here are some listings from the test device:

1. root@iLink-02:~# fw_printenv

altkern=5f80000
auto_recovery=yes
baudrate=115200
boot_part=2
boot_part_ready=3
boot_ver=1.2.9
bootcmd=if test $auto_recovery = no; then bootipq; elif test $boot_part = 1; then run bootpart1; else run bootpart2; fi
bootdelay=2
bootpart1=set bootargs $partbootargs && nand read $loadaddr $prikern $kernsize && bootm $loadaddr
bootpart2=set bootargs $partbootargs2 && nand read $loadaddr $altkern $kernsize && bootm $loadaddr
ethact=eth0
ethaddr=01:23:45:67:89:ab
flash_type=2
flashimg=tftp $loadaddr $image && nand erase $prikern $imgsize && nand write $loadaddr $prikern $filesize
flashimg2=tftp $loadaddr $image && nand erase $altkern $imgsize && nand write $loadaddr $altkern $filesize
image=dallas.img
imgsize=5800000
ipaddr=192.168.1.1
kernsize=500001
loadaddr=84000000
machid=8010006
netmask=255.255.255.0
partbootargs=init=/sbin/init rootfstype=ubifs ubi.mtd=11,2048 root=ubi0:ubifs rootwait rw
partbootargs2=init=/sbin/init rootfstype=ubifs ubi.mtd=13,2048 root=ubi0:ubifs rootwait rw
prikern=780000
serverip=192.168.1.254
stderr=serial
stdin=serial
stdout=serial

2. root@iLink-02:~# block detect

config 'global'
        option  anon_swap       '0'
        option  anon_mount      '0'
        option  auto_swap       '1'
        option  auto_mount      '1'
        option  delay_root      '5'
        option  check_fs        '0'

3. root@iLink-02:~# block info

/dev/mtdblock11: UUID="9405828549" VERSION="1" TYPE="ubi"
/dev/ubiblock0_0: UUID="abcdef01-8e62ab57-452e94ab-68f51160" VERSION="4.0" MOUNT="/rom" TYPE="squashfs"
/dev/ubi0_1: UUID="89abcdef-80dd-457d-968b-8ea05a90d276" VERSION="w5r0" MOUNT="/overlay" TYPE="ubifs"

4. root@iLink-02:~# df -h

Filesystem                Size      Used Available Use% Mounted on
/dev/root                 4.5M      4.5M         0 100% /rom
tmpfs                   121.2M   1012.0K    120.2M   1% /tmp
/dev/ubi0_1              65.4M    756.0K     61.3M   1% /overlay
overlayfs:/overlay       65.4M    756.0K     61.3M   1% /
tmpfs                   512.0K         0    512.0K   0% /dev

5. root@iLink-02:~# cat /proc/mtd

dev:    size   erasesize  name
mtd0: 00100000 00020000 "sbl1"
mtd1: 00100000 00020000 "mibib"
mtd2: 00100000 00020000 "qsee"
mtd3: 00080000 00020000 "cdt"
mtd4: 00080000 00020000 "appsblenv"
mtd5: 00080000 00020000 "ART"
mtd6: 00200000 00020000 "appsbl"
mtd7: 00080000 00020000 "u_env"
mtd8: 00040000 00020000 "s_env"
mtd9: 00040000 00020000 "devinfo"
mtd10: 05800000 00020000 "kernel"
mtd11: 05500000 00020000 "rootfs"
mtd12: 05800000 00020000 "alt_kernel"
mtd13: 05500000 00020000 "alt_rootfs"
mtd14: 00100000 00020000 "sysdiag"
mtd15: 04680000 00020000 "syscfg"

6. root@iLink-02:~# ubus call system board

{
        "kernel": "5.10.201",
        "hostname": "iLink-02.lan",
        "system": "ARMv7 Processor rev 5 (v7l)",
        "model": "Linksys EA8300 (Dallas)",
        "board_name": "linksys,ea8300",
        "rootfs_type": "squashfs",
        "release": {
                "distribution": "OpenWrt",
                "version": "22.03.6",
                "revision": "r20265-f85a79bcb4",
                "target": "ipq40xx/generic",
                "description": "OpenWrt 22.03.6 r20265-f85a79bcb4"
        }
}

I tried to mount /dev/mtdblock11 on /tmp/mtd11 using

mount -t ubi /dev/mtdblock11 /tmp/mtd11/

But I got this error:

mount: mounting /dev/mtdblock11 on /tmp/mtd11/ failed: No such device

I tried a similar mount on the production device and it succeeded. Does this mean the main partition on the test device is so corrupt that it cannot be mounted? What are my options now? I guess I can still flash the old OpenWrt version it once had. I just wanted to confirm that this makes sense.
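A note on the failed mount: as far as I know, `ubi` is not a filesystem type the kernel can mount (the filesystem is `ubifs`, and it lives in a volume on an attached UBI device), and "No such device" is the error mount reports for an unknown filesystem type. So the error alone may not prove corruption. A minimal sketch of a less ambiguous check follows. `mtd_index` is a hypothetical helper written for this thread, and the `ubiattach`/`ubinfo` lines assume those tools are installed (on OpenWrt they normally come from the `ubi-utils` package).

```shell
#!/bin/sh
# Hypothetical helper: look up the mtd index for a partition name.
# The second argument is the table to read; it defaults to /proc/mtd.
mtd_index() {
    name="$1"
    table="${2:-/proc/mtd}"
    # Lines look like: mtd11: 05500000 00020000 "rootfs"
    awk -v n="\"$name\"" '$4 == n { sub(/^mtd/, "", $1); sub(/:$/, "", $1); print $1 }' "$table"
}

# Assumed usage on the router (not run here):
#   ubiattach -m "$(mtd_index rootfs)"   # try to attach mtd11 as a new UBI device
#   ubinfo -a                            # list the volumes if the attach succeeds
```

If the attach itself fails, the kernel log (`dmesg`) usually says why, which is more informative than the mount error.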

Thank you all! :slight_smile:

-Gamma

Postscript:

Where did it all begin? This test device, the LS-EA8300, was a production device some time back. One day it just switched off and hung. Fortunately the alt partition was still working back then. But I thought it was too risky to continue, so I bought another device (a Linksys WRT1900AC) as my production device. Afterwards the test device never booted well.

While looking at the test device's power adapter I realised that it delivers only 2 A, though the specs call for 3 to 4 A. That prompted me to power the test device from a 5 A adapter. And yes, it magically came back to life. But alas, the main partition was still corrupt. I think the underpowered system behaved erratically and corrupted the running partition. I believe this is how it all began. Your views are welcome!

Can we assume that, before flashing 24.10, you read the wiki doc and performed the boot variable tweak? I can see kernsize = 500001 here.

In general, the solution is easy: boot into the working partition and sysupgrade from there (which will flash 'the other' partition, the non-working one, leaving the currently booted, working partition alone). But badulesia raises an important point, so make sure, and be specific, which partitions hold which versions, and that you have adapted the kernsize.
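To be specific about which side would be overwritten, the `boot_part` variable can be read before running sysupgrade. The sketch below is only an illustration: `describe_boot_part` is a hypothetical helper, the mtd numbers are taken from the `/proc/mtd` listing above, and it assumes `boot_part` reflects the partition that is currently running.

```shell
#!/bin/sh
# Hypothetical helper: read fw_printenv-style text on stdin and report
# which partition set is selected and which one sysupgrade would overwrite.
describe_boot_part() {
    # Matches only the exact variable boot_part, not boot_part_ready.
    part=$(sed -n 's/^boot_part=//p')
    case "$part" in
        1) echo "booted: 1 (mtd10/mtd11), sysupgrade target: 2 (mtd12/mtd13)" ;;
        2) echo "booted: 2 (mtd12/mtd13), sysupgrade target: 1 (mtd10/mtd11)" ;;
        *) echo "unknown boot_part: $part" >&2; return 1 ;;
    esac
}

# Sample input copied from the fw_printenv listing in the first post.
printf 'boot_part=2\nboot_part_ready=3\n' | describe_boot_part
```

On the router itself the input would come from `fw_printenv | describe_boot_part`.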

Hi, yes, I increased the size from 300000 to 500001. I deliberately avoided 500000 so that I would not accidentally drop a zero. It was a working partition. All of a sudden it went down without any warning. I suppose it was due to the insufficient supply current. Thanks.

-Gamma

So you assume it is a power supply failure. In that case it should happen whichever partition is booted.

The partition scheme of the OpenWrt image (since 23.05) depends on kernsize = 500000. I'm not sure what will happen if you manually set it to 500001, but it may interfere. Booting 22.03 succeeds because the tweak is not necessary there. Booting 24.10 fails because it is. I recommend setting it to 500000 and seeing what happens.

Other possibilities: reflash the OEM firmware to start afresh, or monitor the boot process over a serial link.
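For what it is worth, the two kernsize values can be compared against the 0x20000 erase block size shown in `/proc/mtd`. This is only arithmetic on the numbers already posted in this thread; it does not show how U-Boot actually treats an unaligned `nand read` length. The function names are made up for the example.

```shell
#!/bin/sh
# All inputs are hex strings without the 0x prefix, as they appear in the listings.

# Remainder of a size against the 0x20000 erase block size from /proc/mtd.
# 0 means the size is a whole number of erase blocks.
erase_block_remainder() {
    echo $(( 0x$1 % 0x20000 ))
}

# Difference between two hex sizes, printed in hex.
hex_diff() {
    printf '%x\n' $(( 0x$1 - 0x$2 ))
}

erase_block_remainder 500000   # prints 0
erase_block_remainder 500001   # prints 1
hex_diff 5800000 5500000       # prints 300000 (mtd10 size minus mtd11 size)
```

The last line suggests that the 22.03 layout in the listing still reserves 0x300000 for the kernel, which matches the old kernsize of 300000 mentioned above.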

Hi @Slh,

I flashed 24.10.2 while booted from mtd13. Mission successful. The router has booted into 24.10.2. But fw_printenv shows boot_part_ready=3.

I set it to 1 and everything is back to normal. Thank you! :slight_smile:

-Gamma

Hi @badulesia ,

Thank you for your input. Unfortunately I have no tools for serial access, and I am not at all into hardware-level access. Having said that, I have now reflashed OpenWrt 24.10.2 into mtd11 and it is booting reliably.

Regarding the kernsize, I have always used 500001 for OpenWrt 23.*.* and newer without any problems. If something goes wrong, I will change it to the minimum, 500000. Stopping the fixing as soon as it is no longer 'broken' seems the safer way to me.

Thanks once again! :slight_smile:

-Gamma

Good. Now consider upgrading to latest 25.12.4.