NAND and Bad-Block Handling

How are bad blocks handled by OpenWrt with NAND-based devices?

In particular, I see this in the boot log:

[...]
Sun Dec 23 04:07:56 2018 kern.info kernel: [    0.678919] spi-nand: Giga SPI NAND was found.
Sun Dec 23 04:07:56 2018 kern.info kernel: [    0.683515] spi-nand: 128 MiB, block size: 128 KiB, page size: 2048, OOB size: 128
Sun Dec 23 04:07:56 2018 kern.notice kernel: [    0.691737] 1 cmdlinepart partitions found on MTD device spi0.1
Sun Dec 23 04:07:56 2018 kern.notice kernel: [    0.697880] Creating 1 MTD partitions on "spi0.1":
Sun Dec 23 04:07:56 2018 kern.notice kernel: [    0.702829] 0x000000000000-0x000008000000 : "ubi"
Sun Dec 23 04:07:56 2018 kern.warn kernel: [    1.157138] found bad block 7fe0000
[...]

which suggests to me that there may be a problem with the NAND chip on a new router.

0x7fe0000 is 134086656 bytes, and 134086656 / 1024^2 = 127.875 MiB, on a new device with "128 MB NAND".

Curiously, or coincidentally, that looks like the last of the 128 KiB blocks.
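
To double-check that arithmetic, here is a minimal Python sketch. It assumes the geometry printed in the log above (128 MiB chip, 128 KiB blocks) and that the "found bad block" value is a byte offset from the start of the device; `block_index` is just a helper name made up for this post.

```python
# Geometry taken from the kernel log above: 128 MiB chip, 128 KiB erase blocks.
CHIP_SIZE = 128 * 1024 * 1024
BLOCK_SIZE = 128 * 1024


def block_index(offset: int) -> int:
    """Return the number of the erase block that contains a byte offset."""
    return offset // BLOCK_SIZE


bad_offset = 0x7FE0000  # from "found bad block 7fe0000" (assumed to be a byte offset)
total_blocks = CHIP_SIZE // BLOCK_SIZE

print("offset in bytes:", bad_offset)
print("offset in MiB:", bad_offset / 1024**2)
print("block", block_index(bad_offset), "of", total_blocks, "(numbered from 0)")
print("is last block:", block_index(bad_offset) == total_blocks - 1)
```

Under those assumptions the offset falls in block 1023 of 1024, i.e. the final block of the chip.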

Is this a problem?

From https://openwrt.org/docs/techref/flash#bad_blocks:

Bad blocks happen, especially on NAND flash. Even never used NAND flash, right from the factory, might contain bad blocks.

Thanks for posting that. :+1:
TIL that using dd to read from or write to NAND storage is not going to work, and even if it does, it's very risky.

Ok, so I've poked around a bit on master and for a similar device it looks like it is partitioning the NAND flash only through 0x7e00000, rather than to the end of the address space at 0x8000000 (-1). Am I interpreting the DTS correctly?

From target/linux/ath79/dts/qca9531_glinet_ar300m-nand.dts

partition@1 {
        label = "ubi";
        reg = <0x200000 0x7e00000>;
};

So, if I've interpreted that right, it seems that OpenWrt is only going to write a 130,023,424 byte (124 MiB) "ubi" partition (which I assume is shared between ROM and JFFS), after the 2 MiB for the "kernel" partition, for a total of 126 MiB (of 128 MiB).
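
One way to sanity-check that reading: in the upstream fixed-partitions device tree binding, the two `reg` cells are `<offset size>`, not `<start end>`. The sketch below works out both readings for the values in the snippet above. It is only arithmetic on those two numbers; `partition_end` is a made-up helper name, and I have not checked anything else in that DTS.

```python
MIB = 1024 * 1024

# The two cells from: reg = <0x200000 0x7e00000>;
first_cell = 0x200000
second_cell = 0x7E00000


def partition_end(offset: int, size: int) -> int:
    """End address (exclusive) of a partition given as <offset size>."""
    return offset + size


# Reading 1: <offset size>, as in the fixed-partitions binding.
end = partition_end(first_cell, second_cell)
print("offset/size reading: ends at", hex(end), "size", second_cell // MIB, "MiB")

# Reading 2: <start end>, the interpretation used in the post above.
size_if_start_end = second_cell - first_cell
print("start/end reading: size", size_if_start_end, "bytes =", size_if_start_end // MIB, "MiB")
```

With the `<offset size>` reading, the "ubi" partition is 126 MiB and ends at 0x8000000, which is the end of a 128 MiB chip. The 124 MiB figure only comes out of the `<start end>` reading.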

Is there any way to repair the bad blocks? The ECC algorithm included in U-Boot seems helpless. I've seen some YouTube videos where they repair bad blocks on Xbox NAND storage.
If not, could NAND init from U-Boot clear the bad-block information? As I read on another site, bad blocks caused by an improper write are actually artificial bad blocks, so I guess they should be recoverable.

dd can read NAND without problems, but writing with it will produce bad blocks.

Based on the wiki page that @tmomas linked to, dd will read both data and parity information off a NAND chip, which makes the backup files it produces a bit useless. Only nanddump and nandwrite should be used, as they are both aware of the NAND parity info.

I used dd from the Linksys firmware here to dump the whole NAND and ended up with a 134,217,728-byte file. Looking at the empty parts of the dump and at the calibration data, I can see there are no OOB records stored.
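
The file size alone points the same way. Here is a rough check, assuming a 128 MiB chip with 2048-byte pages and 128 bytes of OOB per page, which is the geometry from the kernel log at the top of the thread. The Linksys chip may well use a different OOB size, so treat the "with OOB" figure as illustrative only.

```python
# Assumed geometry (from the log earlier in the thread, not from the Linksys device).
CHIP_SIZE = 128 * 1024 * 1024
PAGE_SIZE = 2048
OOB_SIZE = 128

pages = CHIP_SIZE // PAGE_SIZE

data_only_size = pages * PAGE_SIZE              # dump without OOB
with_oob_size = pages * (PAGE_SIZE + OOB_SIZE)  # dump with OOB after every page

dump_size = 134_217_728  # size of the dd dump reported above

print("pages:", pages)
print("expected size, data only:", data_only_size)
print("expected size, data + OOB:", with_oob_size)
print("dump matches data-only size:", dump_size == data_only_size)
```

A dump that carried OOB for every page would be larger than the chip's nominal capacity, and this one is exactly 128 MiB.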