EA7300, Flash to "Wrong" Partition

Well, they are still different targets, which might mean it's easy for one and impossible for the other, or somewhere in between.

but how does the alt_kernel know that it must load the alt_rootfs? in other words, it knows the offset it is at somehow

when we worked with OKLI, for example, we had to give the offset of the real kernel, but whatever is used to find the rootfs always finds the right one by itself?

Getting closer, I think :rofl:

OK, found out how to get the environment - seems I need to add in fw_init_cmdline(); prior to the check for IS_ENABLED(CONFIG_MIPS_CMDLINE_DTB_EXTEND). If I do this, I can get bootargs! I found this in the EA3500 (though I'm so confused what the real arch for that one is :stuck_out_tongue_winking_eye:) - but this code is included in the stock kernel actually (for that device). So, adding it in, I can then get the needed variable(s). And to your question - that's how the other devices also "signal" the correct partition (boot_part sets the kernel target, rootfs comes from bootargs).
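For anyone following along, here is a minimal sketch (Python, purely for illustration, nothing from the OpenWrt tree) of pulling root= out of a bootargs string and spotting repeated parameters. The sample string is the visible part of the kernel command line quoted later in this thread, with its elided tail left out. The helper names are made up, and "the last occurrence wins" is an assumption on my part about how repeats are treated.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into (key, value) pairs, keeping order."""
    pairs = []
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        pairs.append((key, value if sep else None))
    return pairs


def last_value(pairs, key):
    """Return the last value given for key (assumption: last one wins)."""
    value = None
    for k, v in pairs:
        if k == key:
            value = v
    return value


def duplicated_keys(pairs):
    """List keys that appear more than once, in order of first repeat."""
    seen, dupes = set(), []
    for k, _ in pairs:
        if k in seen and k not in dupes:
            dupes.append(k)
        seen.add(k)
    return dupes


# Visible part of the console line from the thread (the "..." tail is omitted).
CMDLINE = ("console=ttyS0,115200 root=/dev/mtdblock8 "
           "rootfstype=squashfs,jffs2 rootfstype=squashfs,jffs2")

pairs = parse_cmdline(CMDLINE)
print("root device:", last_value(pairs, "root"))
print("repeated parameters:", duplicated_keys(pairs))
```

The repeated rootfstype is harmless here because both copies carry the same value; it only matters if the two sources ever disagree.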

So close it seems, just two odd things,

  1. I manually added the correct rootfs (more on that below), and it does then load that rootfs. But, I seem to be getting a bunch of errors like the following. Ever seen this? Almost thinking it just has some offset not quite right? BTW, from the console, Kernel command line: console=ttyS0,115200 root=/dev/mtdblock8 rootfstype=squashfs,jffs2 rootfstype=squashfs,jffs2 ... OK, it adds one on top of my manual entry. LOL!
[    2.633725] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x00000000: 0x4255 instead
[    2.643216] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x00000004: 0x0001 instead
[    2.652668] jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x00000018: 0xdd8f instead
  2. The reason for the manual entry - pulling my hair out with this one, can't find the trigger, but it seems like sometimes / often U-Boot modifies (resets, of sorts?) the bootargs => destroys what OpenWrt set, which breaks using dual partition (with OpenWrt control). Still trying to figure this one out. If I can, I think this is all possible now.
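A hedged side note on the jffs2 messages in item 1: the values it reports look like the start of a UBI erase-counter header rather than a bad offset. The sketch below assumes jffs2 prints each 16-bit value in the CPU's byte order and that the MT7621 is running little-endian; the function names are invented for illustration. Under those assumptions 0x4255 is the byte pair "UB" (the UBI header magic is "UBI#"), and 0x0001 at offset 4 matches a header version byte of 1. If that reading is right, the partition holds UBI data and jffs2 is simply the wrong filesystem to try on /dev/mtdblock8.

```python
import struct

JFFS2_MAGIC = 0x1985          # what jffs2 expects at the start of a node
UBI_EC_MAGIC = b"UBI#"        # UBI erase-counter header magic (55 42 49 23)
UBI_VERSION = 1               # version byte that follows the magic


def le16_to_bytes(value):
    """Turn a 16-bit value read on a little-endian CPU back into raw bytes."""
    return struct.pack("<H", value)


def looks_like_ubi(halfword_at_0, halfword_at_4):
    """True if the two reported values match the start of a UBI EC header."""
    head = le16_to_bytes(halfword_at_0)        # first two bytes of the block
    version = le16_to_bytes(halfword_at_4)[0]  # byte at offset 4
    return UBI_EC_MAGIC.startswith(head) and version == UBI_VERSION


# Values taken from the kernel log lines above.
print(le16_to_bytes(0x4255))            # raw bytes at offset 0
print(looks_like_ubi(0x4255, 0x0001))   # consistent with a UBI header?
```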

Thanks!

Hmmm, and booting back to the original - seeing a bunch of,

[   21.513149] mt7621-nand 1e003000.nand: Uncorrectable ECC error at page 561.0
[   21.520363] mt7621-nand 1e003000.nand: Uncorrectable ECC error at page 561.1
[   21.527714] mt7621-nand 1e003000.nand: Uncorrectable ECC error at page 561.2
...

Poking around, I see a lot of notes and changes related to this. Need to figure out where this page is.

Hmmm ... seems some NAND related bugs may have reappeared just recently (below), thinking this may be part of my grief. Dang it!
https://bugs.openwrt.org/index.php?do=details&task_id=1926

Will try to keep an eye on NAND related fixes - but that may stall me for a bit here. FYI, I booted over TFTP (initramfs), and my kernel + initramfs works just fine ... and from there, I could erase the alt_kernel and alt_rootfs (no errors reported). Just need to resolve the NAND issue now (not really believing it, but also not sure how to really check the NAND :frowning_face:). Then I can get back to the couple of open items.

Thanks for the pointers!

BTW, hoping someone understands this better than I do (which would be pretty easy :rofl:). Thinking that for 2k pages in NAND (from the datasheet of the flash, also U-Boot output), then the noted pages above (starting at 561.0) ... they would be at address 2048 x 561 => 1148928 (0x118800). Agreed? Checking the partition map, this is inside mtd1 (u_env). I installed nand-utils (running from initramfs boot), and ran nanddump --file /tmp/mtd1.nanddump /dev/mtd1, the output is,

ECC failed: 0
ECC corrected: 0
Number of bad blocks: 0
Number of bbt blocks: 0
Block size 131072, page size 2048, OOB size 64
Dumping data starting at 0x00000000 and ending at 0x00040000...

So the flash really does seem to be OK? Just to match to the backup (kernel, rootfs) partition(s), I also tried this on mtd7 and mtd8 (alt_kernel and alt_rootfs), also no ECC failed or corrected.
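A quick way to double-check the page arithmetic is sketched below. It assumes the page number in the ECC message counts 2048-byte pages from the start of the chip, and that the partitions sit back to back from offset 0 in the order and sizes shown in the /proc/mtd listing posted later in this thread (only the u_env offset of 0x80000 is actually confirmed by the dts snippet). Under those assumptions page 561 falls in the fourth partition (mtd3, "s_env"), not in mtd1, which might be why the mtd1 dump comes back clean. Treat that as a lead to check, not a diagnosis.

```python
PAGE_SIZE = 2048  # from the NAND datasheet / kernel log

# (name, size) in /proc/mtd order; ASSUMED contiguous starting at offset 0.
PARTITIONS = [
    ("boot", 0x00080000), ("u_env", 0x00040000), ("factory", 0x00040000),
    ("s_env", 0x00040000), ("devinfo", 0x00040000), ("kernel1", 0x00400000),
    ("rootfs1", 0x02400000), ("kernel2", 0x00400000), ("ubi", 0x02400000),
    ("sysdiag", 0x00100000), ("syscfg", 0x02d00000),
]


def page_to_address(page, page_size=PAGE_SIZE):
    """Byte address of the first byte of a NAND page."""
    return page * page_size


def locate(address, partitions=PARTITIONS):
    """Return (mtd index, name, offset inside partition) or None."""
    start = 0
    for index, (name, size) in enumerate(partitions):
        if start <= address < start + size:
            return index, name, address - start
        start += size
    return None


addr = page_to_address(561)
print(hex(addr))      # 0x118800
print(locate(addr))   # which partition that address falls into
```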

Thoughts?

Thanks!

OK, hunting through, found the bad flash partition ... guessing it got messed up at some point - flashing, debugging, etc. But I had a backup :stuck_out_tongue_winking_eye:. Re-flashed that partition, all good. Not sure I understand the page numbering, but that aside ... all good again.

Flashing, etc. is all working => the only issue is that the U-Boot environment is not being used within the command line, so it's booting from the correct kernel, but not the correct rootfs (always defaults to mtd6). Need to find where / how to select the correct rootfs.

Thanks!

OK, hunting around, and looking at my other dual partition device - I found the key! Now, how to use it ... LOL. It seems that on sysupgrade, for the (working) EA3500, the partitions are renamed (somehow :rofl:). Then, this code takes over,

It auto-loads the partition labeled "ubi" ... which is not happening on the EA7300. Errr, it is, but the names are not getting updated. I just need to find the code where the partitions are renamed.

Thanks!

OK, finally getting somewhere on this :stuck_out_tongue_winking_eye:. Have been pulling my hair out, trying to get OpenWrt to flash and run on the alternative partition (mtd8), but it seems there are a few things that need to be in place to make it all work. Some of these, below,

  1. Need to change the dts file, alternative kernel and rootfs cannot / should not be read-only.
  2. Partition naming is key! I have been flashing from U-Boot, for debug - and the partition that is named "ubi" is automatically set to be the rootfs (and it's because of this, https://github.com/openwrt/openwrt/blob/master/target/linux/generic/pending-5.4/490-ubi-auto-attach-mtd-device-named-ubi-or-data-on-boot.patch).
  3. Also, because of this patch (https://github.com/kkkgo/openwrt/blob/master/target/linux/generic/patches-4.3/480-mtd-set-rootfs-to-be-root-dev.patch), "rootfs" cannot be used as a partition name in the dts. Given 2 and 3 ... partition naming has to change, to allow both partitions to work correctly with OpenWrt. Right now, with the dts file as it is, it's all hard-coded to only run from mtd6.
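Since the naming is what drives the auto-attach, a small check of which mtd index currently carries a given label can save some head scratching. This is only a sketch in Python with made-up helper names; the sample text is the /proc/mtd listing posted later in this thread, and on a live device you would read /proc/mtd instead.

```python
import re

# Sample copied from the /proc/mtd output later in this thread.
PROC_MTD = '''dev:    size   erasesize  name
mtd0: 00080000 00020000 "boot"
mtd1: 00040000 00020000 "u_env"
mtd2: 00040000 00020000 "factory"
mtd3: 00040000 00020000 "s_env"
mtd4: 00040000 00020000 "devinfo"
mtd5: 00400000 00020000 "kernel1"
mtd6: 02400000 00020000 "rootfs1"
mtd7: 00400000 00020000 "kernel2"
mtd8: 02400000 00020000 "ubi"
mtd9: 00100000 00020000 "sysdiag"
mtd10: 02d00000 00020000 "syscfg"
'''

LINE = re.compile(r'^mtd(\d+): ([0-9a-fA-F]+) ([0-9a-fA-F]+) "(.*)"$')


def parse_proc_mtd(text):
    """Parse /proc/mtd text into a list of partition dicts."""
    parts = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:  # the header line does not match and is skipped
            parts.append({
                "index": int(m.group(1)),
                "size": int(m.group(2), 16),
                "erasesize": int(m.group(3), 16),
                "name": m.group(4),
            })
    return parts


def find_by_label(parts, label):
    """Return the mtd indices whose label equals the given name."""
    return [p["index"] for p in parts if p["name"] == label]


parts = parse_proc_mtd(PROC_MTD)
print("ubi    ->", find_by_label(parts, "ubi"))
print("rootfs ->", find_by_label(parts, "rootfs"))
```

In that listing exactly one partition is labeled "ubi" and none is labeled "rootfs", which is the state points 2 and 3 call for.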

I also changed the u_env partition to be r/w, which is needed to be able to try to swap partitions (OpenWrt needs to set the boot_part variable). This is where I get stuck ... even with the partition set to r/w, and fw_printenv working, fw_setenv is not. Huh? I did check, and setting environment variables from U-Boot shows the same start address (0x80000) and erase block size (0x20000). Perhaps something odd with the 1k long environment? This also matches the U-Boot info (says 4092 bytes ... 4k less a 4 byte header I assume?).
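On the "4092 bytes" question: that figure fits the simple (non-redundant) U-Boot environment layout, where a 4-byte CRC32 is followed by NUL-separated name=value strings. The sketch below builds and checks such a blob offline. It assumes that layout, a little-endian CRC field, and a 0x1000-byte environment as in fw_env.config; the function names and the boot_part value are made up for illustration, and this is not the fw_setenv code.

```python
import struct
import zlib

ENV_SIZE = 0x1000  # environment length from fw_env.config


def build_env(variables, env_size=ENV_SIZE):
    """Build a non-redundant env blob: CRC32 (4 bytes) + NUL-separated data."""
    body = b"".join(f"{k}={v}".encode() + b"\0" for k, v in variables.items())
    body += b"\0"                      # an empty string ends the list
    data_len = env_size - 4            # 4092 bytes when env_size is 4096
    if len(body) > data_len:
        raise ValueError("environment too large")
    data = body.ljust(data_len, b"\0")
    crc = zlib.crc32(data) & 0xFFFFFFFF
    return struct.pack("<I", crc) + data


def check_env(blob, env_size=ENV_SIZE):
    """True if the stored CRC matches the CRC of the data area."""
    blob = blob[:env_size]
    (stored,) = struct.unpack("<I", blob[:4])
    return stored == (zlib.crc32(blob[4:]) & 0xFFFFFFFF)


def parse_env(blob, env_size=ENV_SIZE):
    """Return the name=value pairs stored in the data area."""
    result = {}
    for entry in blob[4:env_size].split(b"\0"):
        if not entry:                  # first empty string ends the list
            break
        key, _, value = entry.partition(b"=")
        result[key.decode()] = value.decode()
    return result


blob = build_env({"boot_part": "2"})   # hypothetical value
print(len(blob) - 4, check_env(blob), parse_env(blob))
```

Running check_env on the first 0x1000 bytes of a u_env dump would show whether the length in fw_env.config agrees with what U-Boot wrote.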

Any suggestions on how to get fw_setenv working would be greatly appreciated. I think with that, it may be possible to have this set up to flash to the desired partition => keeping or overwriting Linksys becomes a personal option then. But I can explain that later ... LOL. Need to confirm first that it does work correctly (but need fw_setenv to get there).

Thanks!!

I think you mean length

I'm pretty sure the syntax is

offset length erasesize

what does the kernel detect as the flash chip model?

Yes, agreed. And sorry, bad wording on my part. The syntax is matching to what I see with hexdump.

Here you go!

[    0.595010] nand: device found, Manufacturer ID: 0xef, Chip ID: 0xf1
[    0.601360] nand: Winbond W29N01HV
[    0.604765] nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64

Thanks!

Hmmm ... this is really odd. I tried a few things,

  1. add kmod-mtd-rw to the build. No joy, still behaves the same (so really not a r/w issue, agreed?)
  2. reinstalled Linksys firmware ... fw_setenv works! Also works from U-Boot of course.
  3. I grabbed the Linksys source code from https://www.linksys.com/us/support-article?articleNum=114663 (as /etc/fw_env.config doesn't seem to exist in Linksys, rather it's hard coded). Nothing that I see there any different than the values already used.

Really pulling my hair out. Seems very odd that this isn't working. Are there any known issues with fw_setenv?

Thanks!

I feel like this is something to do with the flash being NAND instead of NOR

just to be clear, what's the output of both

cat /proc/mtd
and the config
cat /etc/fw_env.config

and what is the error you get exactly?

Was thinking exactly the same thing! :laughing:. Have been digging, something really is odd. Linksys fw_setenv works, but not OpenWrt, so something is hopefully just misconfigured. But odd, I tried the following (added nand-utils, thinking NAND vs. NOR like you mentioned) - and yes, I did a backup first. LOL!

root@OpenWrt:/# mtd erase /mtd/mtd1
Could not open mtd device: /mtd/mtd1
Could not open mtd device: /mtd/mtd1

Hmm, so then,

root@OpenWrt:/# nanddump /dev/mtd1 > mtd1.nanddump
ECC failed: 0
ECC corrected: 0
Number of bad blocks: 0
Number of bbt blocks: 0
Block size 131072, page size 2048, OOB size 64
Dumping data starting at 0x00000000 and ending at 0x00040000...
root@OpenWrt:/# nandwrite /dev/mtd1 mtd1.nanddump
nandwrite: error!: /dev/mtd1
           error 13 (Permission denied)

Now it has me wondering about read-only, but that's not the setting in the dts, and I don't see this in the kernel log.

Here you go!

root@OpenWrt:/# cat /proc/mtd
dev:    size   erasesize  name
mtd0: 00080000 00020000 "boot"
mtd1: 00040000 00020000 "u_env"
mtd2: 00040000 00020000 "factory"
mtd3: 00040000 00020000 "s_env"
mtd4: 00040000 00020000 "devinfo"
mtd5: 00400000 00020000 "kernel1"
mtd6: 02400000 00020000 "rootfs1"
mtd7: 00400000 00020000 "kernel2"
mtd8: 02400000 00020000 "ubi"
mtd9: 00100000 00020000 "sysdiag"
mtd10: 02d00000 00020000 "syscfg"
root@OpenWrt:/# cat /etc/fw_env.config
/dev/mtd1 0x0 0x1000 0x20000
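For reference, a small consistency check of that fw_env.config line against the u_env entry in /proc/mtd is sketched below. It follows the "device offset length erasesize" field order mentioned earlier in the thread; the function names are invented, and the partition numbers are copied from the listing above.

```python
def parse_fw_env_line(line):
    """Parse 'device offset length erasesize' into a dict of integers."""
    fields = line.split()
    offset, env_size, erase_size = (int(f, 0) for f in fields[1:4])
    return {"device": fields[0], "offset": offset,
            "env_size": env_size, "erase_size": erase_size}


def check_against_partition(cfg, part_size, part_erasesize):
    """Return a list of problems; an empty list means the values agree."""
    problems = []
    if cfg["erase_size"] != part_erasesize:
        problems.append("erase size differs from /proc/mtd")
    if cfg["offset"] % part_erasesize:
        problems.append("offset is not erase-block aligned")
    if cfg["offset"] + cfg["env_size"] > part_size:
        problems.append("environment does not fit in the partition")
    return problems


cfg = parse_fw_env_line("/dev/mtd1 0x0 0x1000 0x20000")
# u_env from /proc/mtd: size 0x00040000, erasesize 0x00020000
print(check_against_partition(cfg, 0x00040000, 0x00020000))
```

The values agree with the partition table, so the config file itself does not look like the culprit here.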

Thanks!

Yep, as suspected! :slight_smile:. Added lsblk to my build, and checked (note RO column),

root@OpenWrt:/# lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
mtdblock0    31:0    0  512K  1 disk
mtdblock1    31:1    0  256K  1 disk
mtdblock2    31:2    0  256K  1 disk
mtdblock3    31:3    0  256K  0 disk
mtdblock4    31:4    0  256K  1 disk
mtdblock5    31:5    0    4M  0 disk
mtdblock6    31:6    0   36M  0 disk
mtdblock7    31:7    0    4M  0 disk
mtdblock8    31:8    0   36M  0 disk
mtdblock9    31:9    0    1M  1 disk
mtdblock10   31:10   0   45M  1 disk
ubiblock0_1 259:0    0  4.7M  0 disk /rom
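Another way to cross-check the RO column is the per-device flags file the kernel exposes under /sys/class/mtd. The sketch below assumes that file holds a hex value in which bit 0x400 (MTD_WRITEABLE) is set for writable partitions; the helper names are made up, and the loop at the end does nothing useful unless it runs on the router itself.

```python
import os

MTD_WRITEABLE = 0x400  # bit in /sys/class/mtd/mtdN/flags


def is_writeable(flags_text):
    """True if the hex flags string has the MTD_WRITEABLE bit set."""
    return bool(int(flags_text.strip(), 16) & MTD_WRITEABLE)


def read_flags(index, sysfs_root="/sys/class/mtd"):
    """Read the raw flags string for mtd<index>; None if it is not there."""
    path = os.path.join(sysfs_root, f"mtd{index}", "flags")
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return None


if __name__ == "__main__":
    for index in range(11):            # mtd0 .. mtd10 as in /proc/mtd
        flags = read_flags(index)
        if flags is not None:
            state = "rw" if is_writeable(flags) else "ro"
            print(f"mtd{index}: {flags.strip()} {state}")
```

If mtd1 shows up without that bit, the running kernel still treats u_env as read-only, whatever the edited dts says.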

Hmmm ... they all match the dts file (actually, dtsi, target/linux/ramips/dts/mt7621_linksys_ea7xxx.dtsi) - except that one partition (mtd1). Now I'm really confused ... LOL. Where the heck is that getting changed? Could it be in U-Boot, locking out the dts? Will keep looking.

Thanks!

the mtd program accepts the partition label not the path
(btw the path would be /dev/mtdX)

so in your case
mtd erase u_env

now what is the error for fw_setenv?

Yes, sorry! Typing too quick ... LOL. I did try the name, and /dev/mtd1 => same error in all cases.

And, for fw_setenv,

root@OpenWrt:/# fw_setenv test new
Can't open /dev/mtd1: Permission denied
Error: can't write fw_env to flash

Really seems to match the lsblk above - and seems like (somehow) it's read-only? Very odd, as this is what is in the dts (now, i.e. I removed the read-only),

                partition@80000 {
                        label = "u_env";
                        reg = <0x80000 0x40000>;
                };

Thanks!

sooooo

you removed the read-only; property from DTS

did you do a make clean before building again?

Yep! I can try to whack all the temp directories (full clean), if that may make a difference?