mt7621-nand / JFFS2 broken with uncorrectable ECC errors

I'm working on adding support for a new device with an MT7621 SoC and an ESMT F59L1G81MB-25T 128 MiB NAND flash (which also seems to be used by the already supported Xiaomi AC2100).

However, it appears that writes to the NAND are totally broken in both OpenWrt master and 21.02.2. Reading from the NAND works, but any data written to it appears to be unreadable afterwards and results in "Uncorrectable ECC error" messages. The following session illustrates this, using a separate mtd partition that I created for testing:

root@OpenWrt:~# cat /proc/mtd
dev:    size   erasesize  name
mtd0: 00080000 00020000 "Bootloader"
mtd1: 00080000 00020000 "Config"
mtd2: 00080000 00020000 "Factory"
mtd3: 01300000 00020000 "ImageA"
mtd4: 00380000 00020000 "kernel"
mtd5: 00f60000 00020000 "rootfs"
mtd6: 00140000 00020000 "blah"
root@OpenWrt:~# mtd erase blah
Unlocking blah ...
Erasing blah ...
root@OpenWrt:~# mkdir qwer
root@OpenWrt:~# mount -t jffs2 /dev/mtdblock6 qwer
[  109.015790] jffs2: notice: (2842) jffs2_build_xattr_subsystem: complete building xattr subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0.
root@OpenWrt:~# [  109.131210] mt7621-nand 1e003000.nand: Using programmed access timing: 31c07388
[  109.138526] mt7621-nand 1e003000.nand: Using programmed access timing: 21005134
[  109.191270] mt7621-nand 1e003000.nand: Using programmed access timing: 31c07388
[  109.198573] mt7621-nand 1e003000.nand: Using programmed access timing: 21005134
[  109.250776] mt7621-nand 1e003000.nand: Using programmed access timing: 31c07388
[  109.258081] mt7621-nand 1e003000.nand: Using programmed access timing: 21005134
[  109.310325] mt7621-nand 1e003000.nand: Using programmed access timing: 31c07388
[  109.317632] mt7621-nand 1e003000.nand: Using programmed access timing: 21005134
[  109.370315] mt7621-nand 1e003000.nand: Using programmed access timing: 31c07388
[  109.377621] mt7621-nand 1e003000.nand: Using programmed access timing: 21005134
[  109.430286] mt7621-nand 1e003000.nand: Using programmed access timing: 31c07388
[  109.437591] mt7621-nand 1e003000.nand: Using programmed access timing: 21005134
[  109.489791] mt7621-nand 1e003000.nand: Using programmed access timing: 31c07388
[  109.497096] mt7621-nand 1e003000.nand: Using programmed access timing: 21005134
[  109.549313] mt7621-nand 1e003000.nand: Using programmed access timing: 31c07388
[  109.556665] mt7621-nand 1e003000.nand: Using programmed access timing: 21005134
[  109.609395] mt7621-nand 1e003000.nand: Using programmed access timing: 31c07388
[  109.616700] mt7621-nand 1e003000.nand: Using programmed access timing: 21005134
[  109.668916] mt7621-nand 1e003000.nand: Using programmed access timing: 31c07388
[  109.676220] mt7621-nand 1e003000.nand: Using programmed access timing: 21005134

root@OpenWrt:~# cd qwer
root@OpenWrt:~/qwer# echo hello > hello.txt
root@OpenWrt:~/qwer# cat hello.txt
hello
root@OpenWrt:~/qwer# cd ..
root@OpenWrt:~# umount qwer
root@OpenWrt:~# ls qwer
root@OpenWrt:~# mount -t jffs2 /dev/mtdblock6 qwer
[  143.452239] mt7621-nand 1e003000.nand: Uncorrectable ECC error at page 11264.0
[  143.459632] mt7621-nand 1e003000.nand: Uncorrectable ECC error at page 11264.1
[  143.467137] mt7621-nand 1e003000.nand: Uncorrectable ECC error at page 11264.2
[  143.474518] mt7621-nand 1e003000.nand: Uncorrectable ECC error at page 11264.3
[  143.481768] jffs2: cannot read OOB for EB at 00100000, requested 8 bytes, read 0 bytes, error -77
mount: mounting /dev/mtdblock6 on qwer failed: I/O error

The problem seems to occur even if the "blah" partition is moved to various locations, which probably indicates that the problem is with software rather than actual bad blocks.
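As a quick sanity check, the page number in the ECC messages can be matched against the eraseblock offset that JFFS2 reports. This is only a sketch: it assumes 2048-byte pages (common for 1 Gbit NAND parts, but check the datasheet), and the partition base it prints is inferred from the two log lines, not read from the device.

```shell
# Assumption: 2048-byte NAND pages. Not confirmed in this thread.
PAGE_SIZE=2048
ERASE_SIZE=$((0x20000))      # erasesize column from /proc/mtd
ERR_PAGE=11264               # "Uncorrectable ECC error at page 11264.0"
EB_OFFSET=$((0x00100000))    # "cannot read OOB for EB at 00100000"

# Absolute flash offset of the failing page.
abs_offset=$((ERR_PAGE * PAGE_SIZE))
# If both messages refer to the same block, this is where "blah" starts.
part_base=$((abs_offset - EB_OFFSET))
# Index of the failing eraseblock inside the partition.
eb_index=$((EB_OFFSET / ERASE_SIZE))

printf 'absolute offset:        0x%08x\n' "$abs_offset"   # 0x01600000
printf 'implied partition base: 0x%08x\n' "$part_base"    # 0x01500000
printf 'eraseblock in partition: %d\n' "$eb_index"        # 8
```

Under that assumption the failing page sits in eraseblock 8 of the 10-block "blah" partition, so the error is hitting the test partition itself and not some unrelated area of the flash.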

Has anyone seen this before, or does anyone know what might be wrong? Thanks!

Does this problem also apply to e.g. the kernel image or other things written to the flash outside of JFFS2?
Be aware that JFFS2 is not intended to be used on NAND and will always lead to uneven wear and fast deterioration, so you should use UBI/UBIFS instead. I assume that JFFS2 is also largely untested on NAND nowadays, simply because nobody should be using JFFS2 on NAND.
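For reference, the same write-then-remount test could be repeated on UBI/UBIFS roughly as follows. This is a sketch, not a tested recipe: the mtd index, UBI device number, volume name and mount point are all hypothetical, it assumes the ubi-utils tools are installed, and it assumes the test partition is large enough for UBIFS (the 10-eraseblock partition from the first post may be too small). By default the script only prints the commands; run it with RUN set to an empty string to actually execute them.

```shell
# Dry run by default: each command is echoed. Use `RUN= sh script.sh` to execute.
RUN=${RUN-echo}
MTD_NUM=${MTD_NUM-6}   # hypothetical: index of the test partition in /proc/mtd
UBI_NUM=${UBI_NUM-1}   # hypothetical: a UBI device number that is not in use yet
VOL=ecctest            # hypothetical volume name
MNT=/tmp/ecctest       # hypothetical mount point

# Erase the partition and write fresh UBI headers (destroys its contents).
$RUN ubiformat "/dev/mtd$MTD_NUM" -y
# Attach it as /dev/ubi$UBI_NUM.
$RUN ubiattach -p "/dev/mtd$MTD_NUM" -d "$UBI_NUM"
# Create one volume using all available space.
$RUN ubimkvol "/dev/ubi$UBI_NUM" -N "$VOL" -m

# Mount, write a file, unmount, mount again and check the file survived.
$RUN mkdir -p "$MNT"
$RUN mount -t ubifs "ubi$UBI_NUM:$VOL" "$MNT"
$RUN touch "$MNT/hello.txt"
$RUN umount "$MNT"
$RUN mount -t ubifs "ubi$UBI_NUM:$VOL" "$MNT"
$RUN ls "$MNT"
```

If the second mount succeeds without ECC errors in the kernel log, page writes through the driver are working and the problem is more likely specific to how JFFS2 uses the OOB area.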


I just tried UBI/UBIFS instead of JFFS2, and it appears to work fine. So I guess JFFS2 might be broken with mt7621-nand now.

Thanks for your helpful insights!
