NAND and Bad-Block Handling

How are bad blocks handled by OpenWrt with NAND-based devices?

In particular, I see

[...]
Sun Dec 23 04:07:56 2018 kern.info kernel: [    0.678919] spi-nand: Giga SPI NAND was found.
Sun Dec 23 04:07:56 2018 kern.info kernel: [    0.683515] spi-nand: 128 MiB, block size: 128 KiB, page size: 2048, OOB size: 128
Sun Dec 23 04:07:56 2018 kern.notice kernel: [    0.691737] 1 cmdlinepart partitions found on MTD device spi0.1
Sun Dec 23 04:07:56 2018 kern.notice kernel: [    0.697880] Creating 1 MTD partitions on "spi0.1":
Sun Dec 23 04:07:56 2018 kern.notice kernel: [    0.702829] 0x000000000000-0x000008000000 : "ubi"
Sun Dec 23 04:07:56 2018 kern.warn kernel: [    1.157138] found bad block 7fe0000
[...]

which suggests to me that there may be a problem with the NAND chip on a new router.

0x7fe0000 is 134086656 bytes, and 134086656 / 1024^2 = 127.875 MiB, on a new device with "128 MB NAND".

Curiously, or coincidentally, that looks like the "last" block of 128 KiB blocks.
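
A quick way to check that arithmetic is sketched below. It assumes the geometry reported in the kernel log above (128 MiB chip, 128 KiB erase blocks); the helper name locate_block is made up for illustration.

```python
# Geometry taken from the kernel log above: 128 MiB chip, 128 KiB erase blocks.
CHIP_SIZE = 128 * 1024 * 1024
BLOCK_SIZE = 128 * 1024


def locate_block(offset, block_size=BLOCK_SIZE, chip_size=CHIP_SIZE):
    """Return (block_index, total_blocks, is_last_block) for a byte offset."""
    if not 0 <= offset < chip_size:
        raise ValueError("offset is outside the chip")
    total = chip_size // block_size
    index = offset // block_size  # 0-based erase block number
    return index, total, index == total - 1


if __name__ == "__main__":
    bad = 0x7FE0000  # address from the "found bad block" log line
    index, total, is_last = locate_block(bad)
    print(f"offset {bad:#x} = {bad} bytes = {bad / 1024**2} MiB")
    print(f"erase block {index} (0-based) of {total}, last block: {is_last}")
```

Under those assumptions the reported address falls in erase block 1023 of 1024, which is indeed the last block of the chip.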

Is this a problem?

From https://openwrt.org/docs/techref/flash#bad_blocks:

Bad blocks happen, especially on NAND flash. Even never used NAND flash, right from the factory, might contain bad blocks.

Thanks for posting that.
TIL that using dd to read from or write to NAND storage is not going to work, and even if it does, it's very risky.

Ok, so I've poked around a bit on master and for a similar device it looks like it is partitioning the NAND flash only through 0x7e00000, rather than to the end of the address space at 0x8000000 (-1). Am I interpreting the DTS correctly?

From target/linux/ath79/dts/qca9531_glinet_ar300m-nand.dts

partition@1 {
        label = "ubi";
        reg = <0x200000 0x7e00000>;
};

So, if I've interpreted that right (treating the second "reg" value as an end address), OpenWrt is only going to write a 130,023,424-byte (124 MiB) "ubi" partition (which I assume is shared between ROM and JFFS) after the 2 MiB "kernel" partition, for a total of 126 MiB (of 128 MiB).
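
On the interpretation question: as far as I know, the device-tree binding for fixed MTD partitions defines reg as <offset size>, not <start end>. The sketch below works through the arithmetic under both readings; the function names are made up for illustration and the 128 MiB chip size is taken from the kernel log earlier in the thread.

```python
CHIP_SIZE = 0x8000000  # 128 MiB, per the kernel log earlier in the thread
MIB = 1024 * 1024


def span_as_offset_size(offset, size):
    """Read reg = <offset size>: return (start, end) with end exclusive."""
    return offset, offset + size


def span_as_start_end(start, end):
    """Read reg = <start end>: return (start, end) unchanged."""
    return start, end


if __name__ == "__main__":
    reg = (0x200000, 0x7E00000)  # values from the "ubi" partition node

    for name, reader in (("offset/size", span_as_offset_size),
                         ("start/end", span_as_start_end)):
        start, end = reader(*reg)
        length = end - start
        print(f"{name}: {start:#x}..{end:#x}, "
              f"{length} bytes = {length / MIB} MiB, "
              f"reaches end of chip: {end == CHIP_SIZE}")
```

Read as offset and size, the "ubi" partition is 126 MiB and runs to the end of the chip at 0x8000000, so it would include the block at 0x7fe0000. The 124 MiB figure only comes out of the start/end reading.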

Is there any way to repair the bad blocks? The ECC algorithm included in U-Boot doesn't seem to help. I've seen some YouTube videos where they repair bad blocks on Xbox NAND storage.
If not, could NAND init from U-Boot clear the bad-block information? As I read on another site, bad blocks caused by improper writes are actually artificial bad blocks, so I guess they should be recoverable.

dd can read NAND without problems, but writing will produce bad blocks.

Based on the wiki page that @tmomas linked to, dd will read both data and parity information off a NAND chip, which makes the backup files it produces a bit useless. Only nanddump and nandwrite should be used, as they are both aware of the NAND parity info.

I used dd from the Linksys firmware here to dump the whole NAND and ended up with a 134,217,728-byte file. In the empty parts of the dump and in the calibration data I can see that no OOB records are stored.
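
The file size alone is a useful hint here. The sketch below compares a dump size against what a data-only dump and a data-plus-OOB dump would be. It assumes the geometry from the kernel log at the top of the thread (128 MiB, 2048-byte pages, 128 bytes of OOB per page); the Linksys chip may differ, and the function names are made up for illustration.

```python
def expected_dump_sizes(chip_size, page_size, oob_size):
    """Return (data_only_bytes, data_plus_oob_bytes) for a full-chip dump."""
    pages = chip_size // page_size
    return pages * page_size, pages * (page_size + oob_size)


def classify_dump(dump_size, chip_size, page_size, oob_size):
    """Guess from the size alone whether a dump contains OOB bytes."""
    data_only, with_oob = expected_dump_sizes(chip_size, page_size, oob_size)
    if dump_size == data_only:
        return "data only"
    if dump_size == with_oob:
        return "data + OOB"
    return "unknown"


if __name__ == "__main__":
    # Assumed geometry: 128 MiB chip, 2048-byte pages, 128-byte OOB per page.
    chip, page, oob = 128 * 1024 * 1024, 2048, 128
    data_only, with_oob = expected_dump_sizes(chip, page, oob)
    print(f"data only:  {data_only} bytes")
    print(f"data + OOB: {with_oob} bytes")
    print("134,217,728-byte dump looks like:",
          classify_dump(134_217_728, chip, page, oob))
```

With that geometry a dump that included OOB would be 142,606,336 bytes, so a file of exactly 134,217,728 bytes is consistent with dd having read the data area only.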

I connected directly to my Linksys EA8100 router over the serial port and, by mistake and without knowing the dangers, ran these two commands (never try them):
nand erase ubifs
nand erase syscfg
After I turned the router off and on again, it did not boot. Its lights do not even turn on, no serial communication is established, and nothing boots at all.

Is there a way to fix the router?
How can I check if the bootloader is damaged or not?
Should I use CFE files if I want to rewrite the bootloader or ubifs? Where should I find the file for this router?

I don't know much about electronics or about the destructive effects of these two commands on NAND.