Impossible to install/update any packages on WZR-HP-G300NH 18.06.1

Not done sha256sum , but I downloaded both sysupgrade and tftp images from https://downloads.openwrt.org/releases/18.06.1/targets/ar71xx/generic/ and problem is the same.

Hello,
I've just tried to install 17.01.6 via tftp and it can install packages without any issue

[    0.000000] Kernel command line:  board=WZR-HP-G300NH console=ttyS0,115200 rootfstype=squashfs,jffs2 noinitrd
[    0.742270] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME) (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
[    8.185396] mount_root: jffs2 not ready yet, using temporary tmpfs overlay
[   21.121849] jffs2_scan_eraseblock(): End of filesystem marker found at 0x0
[   21.145208] jffs2_build_filesystem(): unlocking the mtd device... done.
[   21.151877] jffs2_build_filesystem(): erasing all blocks after the end marker... 
[  179.333503] jffs2: notice: (937) jffs2_build_xattr_subsystem: complete building xattr subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.

later I will try with development snapshot.

Any idea, how to debug or to solve issue with 18.06.1 ?

Fabio

I've just installed fresh OpenWrt SNAPSHOT, r8087-38a88ad,
and problem is still present, but something new is shown on dmesg

root@OpenWrt:~# opkg install luci-ssl uhttpd
Installing luci-ssl (git-18.264.24000-89a4fdd-1) to root...
Downloading http://downloads.openwrt.org/snapshots/packages/mips_24kc/luci/luci-ssl_git-18.264.24000-89a4fdd-1_all.ipk
Installing uhttpd (2018-08-23-b741dec3-1) to root...
Downloading http://downloads.openwrt.org/snapshots/packages/mips_24kc/base/uhttpd_2018-08-23-b741dec3-1_mips_24kc.ipk
Installing liblua (5.1.5-1) to root...
Downloading http://downloads.openwrt.org/snapshots/packages/mips_24kc/base/liblua_5.1.5-1_mips_24kc.ipk
Installing lua (5.1.5-1) to root...
Downloading http://downloads.openwrt.org/snapshots/packages/mips_24kc/base/lua_5.1.5-1_mips_24kc.ipk
Installing luci-lib-nixio (git-18.264.24000-89a4fdd-1) to root...
Downloading http://downloads.openwrt.org/snapshots/packages/mips_24kc/luci/luci-lib-nixio_git-18.264.24000-89a4fdd-1_mips_24kc.ipk
Installing luci-lib-ip (git-18.264.24000-89a4fdd-1) to root...
Downloading http://downloads.openwrt.org/snapshots/packages/mips_24kc/luci/luci-lib-ip_git-18.264.24000-89a4fdd-1_mips_24kc.ipk
Package uhttpd (2018-08-23-b741dec3-1) installed in root is up to date.
Configuring liblua.
Configuring lua.
Configuring uhttpd.
ln: /etc/rc.d/S50uhttpd: I/O error
Configuring luci-lib-nixio.
Collected errors:
 * pkg_write_filelist: Failed to open //usr/lib/opkg/info/luci-lib-ip.list: I/O error.
 * opkg_install_pkg: Failed to extract data files for luci-lib-ip. Package debris may remain!
 * opkg_install_cmd: Cannot install package luci-ssl.
 * opkg_conf_write_status_files: Can't open status file //usr/lib/opkg/status: I/O error.
[    0.780920] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME) (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
[    8.406396] jffs2: notice: (415) jffs2_build_xattr_subsystem: complete building xattr subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.
[    8.423476] mount_root: switching to jffs2 overlay
[  270.902671] jffs2: Write of 1989 bytes at 0x005ce6f8 failed. returned -5, retlen 962
[  270.931965] jffs2: Write of 1989 bytes at 0x005ceec0 failed. returned -5, retlen 0
[  270.939631] jffs2: Not marking the space at 0x005ceec0 as dirty because the flash driver returned retlen zero
[  270.950397] jffs2: Write of 68 bytes at 0x005ceec0 failed. returned -5, retlen 0
[  270.957838] jffs2: Not marking the space at 0x005ceec0 as dirty because the flash driver returned retlen zero
[  270.968584] jffs2: Write of 68 bytes at 0x005ceec0 failed. returned -5, retlen 0
[  270.976027] jffs2: Not marking the space at 0x005ceec0 as dirty because the flash driver returned retlen zero
[  270.986735] jffs2: Write of 68 bytes at 0x005ceec0 failed. returned -5, retlen 0
[  270.994225] jffs2: Not marking the space at 0x005ceec0 as dirty because the flash driver returned retlen zero
[  271.004876] jffs2: Write of 68 bytes at 0x005ceec0 failed. returned -5, retlen 0
[  271.012364] jffs2: Not marking the space at 0x005ceec0 as dirty because the flash driver returned retlen zero
[  271.023050] jffs2: Write of 68 bytes at 0x005ceec0 failed. returned -5, retlen 0
[  271.030556] jffs2: Not marking the space at 0x005ceec0 as dirty because the flash driver returned retlen zero
[  271.041191] jffs2: Write of 68 bytes at 0x005ceec0 failed. returned -5, retlen 0
[  271.048676] jffs2: Not marking the space at 0x005ceec0 as dirty because the flash driver returned retlen zero
[  271.060416] jffs2: Write of 68 bytes at 0x005ceec0 failed. returned -5, retlen 0
[  271.067868] jffs2: Not marking the space at 0x005ceec0 as dirty because the flash driver returned retlen zero
[  271.078720] jffs2: Write of 68 bytes at 0x005ceec0 failed. returned -5, retlen 0
[  271.086162] jffs2: Not marking the space at 0x005ceec0 as dirty because the flash driver returned retlen zero
[  271.440425] jffs2: Write of 68 bytes at 0x005ceec0 failed. returned -5, retlen 0
[  271.447878] jffs2: Not marking the space at 0x005ceec0 as dirty because the flash driver returned retlen zero
[  271.458607] jffs2: Write of 68 bytes at 0x005ceec0 failed. returned -5, retlen 0
[  271.466047] jffs2: Not marking the space at 0x005ceec0 as dirty because the flash driver returned retlen zero
[  272.129752] jffs2: Write of 68 bytes at 0x005ceec0 failed. returned -5, retlen 0
[  272.137202] jffs2: Not marking the space at 0x005ceec0 as dirty because the flash driver returned retlen zero
[  272.156365] jffs2: Write of 68 bytes at 0x005ceec0 failed. returned -5, retlen 0
[  272.163889] jffs2: Not marking the space at 0x005ceec0 as dirty because the flash driver returned retlen zero

Looks like your flash memory has failed.

Yes, could be, but why it doesn't happen with 17.01.6 ?

You can test it by writing a large file

Maybe I will try to write a big file on 17.01.6 to check if errors happen, but,
why does kernel does not log errors at mtd driver level ? Is jffs2 writing directly to flash?
Is it possible to raise verbosity of mtd or jffs2 modules ?

Maybe I will give a try to dd-wrt also to check how it behaves

https://openwrt.org/docs/techref/filesystems?s[]=overlayfs#overlayfs

https://openwrt.org/docs/techref/flash.layout#partitioning_of_the_flash

Interesting, but I've not seen informations to help me how to find out the problem. Should I try to fill the mtd/jffs2 partion with dd program while in failsafe mode to see if kernel log write/erase errors ?

No just create a regular file in "/" for example, don't bypass the the filesystem by accessing mtd device directly.

root@DGN3500:/tmp# dd if=/dev/urandom bs=1M count=1 | tee /test1 | sha256sum
1+0 records in
1+0 records out
8206102f4a4ed4ecdb1ee66764621cac38219288f414199eb6698b3573be3be5  -
root@DGN3500:/tmp# sha256sum /test1
8206102f4a4ed4ecdb1ee66764621cac38219288f414199eb6698b3573be3be5  /test1

Repeat and increment the file name testN each time until you fill up the memory.

Check how much space is available before each test so you don't run into other problems
df -h /

Ok, the following with development snapshot

root@OpenWrt:~# dd if=/dev/urandom bs=1M count=1 | tee /test1 | sha256sum
tee: /test1: I/O error
1+0 records in
1+0 records out
56ae867796503949c41fad31de5323bbff87e4274a6c7e26dca2dd30aa1ce570  -
[34234.280201] jffs2: Write of 68 bytes at 0x00de2b3c failed. returned -5, retlen 0
[34234.287652] jffs2: Not marking the space at 0x00de2b3c as dirty because the flash driver returned retlen zero
[34234.328529] jffs2: Write of 68 bytes at 0x00de2b3c failed. returned -5, retlen 0
[34234.335977] jffs2: Not marking the space at 0x00de2b3c as dirty because the flash driver returned retlen zero

I will put back 17.06.1 and try on that

I did the following test with 17.01.6 and no errors occured.
So it doesn't seem flash problem, but I suspect jffs2 or mtd modules used on new kernels on this platform/model.

root@LEDE:~# dd if=/dev/urandom bs=1M count=1 | tee /test1 | sha256sum
1+0 records in
1+0 records out
0d0d01ef5abfe275a50f53c49f1387972c9c6c02f7d4dd7743cd2dc47d2305a2  -
root@LEDE:~# sha256sum /test1
0d0d01ef5abfe275a50f53c49f1387972c9c6c02f7d4dd7743cd2dc47d2305a2  /test1
root@LEDE:~# df -h /
Filesystem                Size      Used Available Use% Mounted on
overlayfs:/overlay       27.9M      2.0M     25.9M   7% /
root@LEDE:~# dd if=/dev/urandom bs=1M count=1 | tee /test2 | sha256sum
1+0 records in
1+0 records out
3b89f91808fe5ecf303bfe4e565b441c8965cd43903f6868de0eb4a7a47cf5f3  -
root@LEDE:~# sha256sum /test2
3b89f91808fe5ecf303bfe4e565b441c8965cd43903f6868de0eb4a7a47cf5f3  /test2
root@LEDE:~# df -h /
Filesystem                Size      Used Available Use% Mounted on
overlayfs:/overlay       27.9M      3.0M     24.9M  11% /
root@LEDE:~# dd if=/dev/urandom bs=1M count=1 | tee /test3 | sha256sum
1+0 records in
1+0 records out
64166d5c9b10506c38b42bf7727cf2efcd41993e09c7daadcd0323e270bcf3f0  -
root@LEDE:~# sha256sum /test3
64166d5c9b10506c38b42bf7727cf2efcd41993e09c7daadcd0323e270bcf3f0  /test3
root@LEDE:~# df -h /
Filesystem                Size      Used Available Use% Mounted on
overlayfs:/overlay       27.9M      4.0M     23.9M  14% /
root@LEDE:~# dd if=/dev/urandom bs=1M count=1 | tee /test4 | sha256sum
1+0 records in
1+0 records out
1e74c856973ee725797d456eacd2bdb726b3bd37e6ee2f81edb1de6a9b31a869  -
root@LEDE:~# sha256sum /test4
1e74c856973ee725797d456eacd2bdb726b3bd37e6ee2f81edb1de6a9b31a869  /test4
root@LEDE:~# df -h /
Filesystem                Size      Used Available Use% Mounted on
overlayfs:/overlay       27.9M      5.1M     22.8M  18% /
root@LEDE:~# dd if=/dev/urandom bs=1M count=1 | tee /test5 | sha256sum
1+0 records in
1+0 records out
46111735482a2c14f97fffe5b53aa780f4d4af5dce658ef4029518f51ce380ae  -
root@LEDE:~# sha256sum /test5
46111735482a2c14f97fffe5b53aa780f4d4af5dce658ef4029518f51ce380ae  /test5
root@LEDE:~# df -h /
Filesystem                Size      Used Available Use% Mounted on
overlayfs:/overlay       27.9M      6.1M     21.8M  22% /
root@LEDE:~# dd if=/dev/urandom bs=1M count=1 | tee /test6 | sha256sum
1+0 records in
1+0 records out
13a72d6ac4d6e840fe23df2e7d26e8195229af8068579298cefb0e76354f12b7  -
root@LEDE:~# sha256sum /test6
13a72d6ac4d6e840fe23df2e7d26e8195229af8068579298cefb0e76354f12b7  /test6
root@LEDE:~# df -h /
Filesystem                Size      Used Available Use% Mounted on
overlayfs:/overlay       27.9M      7.1M     20.7M  26% /
root@LEDE:~# dd if=/dev/urandom bs=1M count=5 | tee /test7 | sha256sum
5+0 records in
5+0 records out
2beeb6bf0388db3c1d8edd0fc33c28782c65c320bd404a4ca44050fa6fe5c468  -
root@LEDE:~# sha256sum /test7
2beeb6bf0388db3c1d8edd0fc33c28782c65c320bd404a4ca44050fa6fe5c468  /test7
root@LEDE:~# df -h /
Filesystem                Size      Used Available Use% Mounted on
overlayfs:/overlay       27.9M     12.3M     15.5M  44% /
root@LEDE:~# dd if=/dev/urandom bs=1M count=5 | tee /test8 | sha256sum
5+0 records in
5+0 records out
1577b095ca80b4fedb7d41f435ad8949b4e477a72de86808f82d3c3216654e27  -
root@LEDE:~# sha256sum /test8
1577b095ca80b4fedb7d41f435ad8949b4e477a72de86808f82d3c3216654e27  /test8
root@LEDE:~# df -h /
Filesystem                Size      Used Available Use% Mounted on
overlayfs:/overlay       27.9M     17.4M     10.4M  63% /
root@LEDE:~# dd if=/dev/urandom bs=1M count=5 | tee /test9 | sha256sum
5+0 records in
5+0 records out
ca97e8ac3d239771493add7e5a85c526c07a830b37fa77e0cc0aae4c0bee2a81  -
root@LEDE:~# sha256sum /test9
ca97e8ac3d239771493add7e5a85c526c07a830b37fa77e0cc0aae4c0bee2a81  /test9
root@LEDE:~# dd if=/test9 bs=1M count=5 | tee /testA | sha256sum

No errors or warning on dmesg

The test took hours to run, and last one is not ending.
Running top, you can see tee is consuming all cpu.

Mem: 46908K used, 13700K free, 76K shrd, 1964K buff, 30376K cached
CPU:   0% usr  89% sys   0% nic  10% idle   0% io   0% irq   0% sirq
Load average: 1.65 1.49 1.45 1/47 1448
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
 1403  1261 root     D     1184   2%  85% tee /testA
  386     2 root     DWN      0   0%   4% [jffs2_gcd_mtd5]
 1442  1274 root     R     1184   2%   0% top
   89     2 root     SW       0   0%   0% [kworker/0:1]
    1     0 root     S     1528   3%   0% /sbin/procd
 1404  1261 root     S     1184   2%   0% sha256sum
 1273   756 root     S     1124   2%   0% /usr/sbin/dropbear -F -P /var/run/dro
 1402  1261 root     S     2212   4%   0% dd if /test9 bs 1M count 5

Hi Guido, I think it should be useful if you could upgrade to 18.06.1 and check if you meet the same problem I have. If it is a flash problem on my Buffalo, it should work for you, and if it is a software problem, it shouldn't work for you too, as I expect.

I filed in bugs https://bugs.openwrt.org/index.php?do=details&task_id=1864

Unfortunately, my Buffalo router is currently remote and "in production", so I'm not able to upgrade or test anytime soon. :disappointed:

The very similar wzr-hp-ag300h has recently been tested successfully on ath79/ master.

Hello,
I've tried to build an image from git with
CONFIG_JFFS2_FS_DEBUG=1 set into openwrt/target/linux/generic/config-4.9

The problem is still present and no extra useful information is available for troubleshooting.

Is available a method to have mtd logging I/O errors ?

Is it safe if I do a dd if=/dev/urandom of=/dev/mtdblock5 bs=64k ?
/dev/mtdblock5 on /overlay type jffs2 (rw,noatime)

[  539.034017] [JFFS2 DBG] (1405) jffs2_do_reserve_space: minsize=196 , jeb->free=40184 ,summary->size=10935 , sumsize=18
[  539.051663] [JFFS2 DBG] (1405) jffs2_sum_add_mem: inode (75) added to summary
[  539.158977] [JFFS2 DBG] (1405) jffs2_link_node_ref: Last node at 83081e88 is (01bb59de,83081e78)
[  539.158999] [JFFS2 DBG] (1405) jffs2_link_node_ref: New ref is 83081e90 (fffffffe becomes 01bb630a,  (null)) len 0xc18
[  539.159175] [JFFS2 DBG] (1405) jffs2_add_full_dnode_to_inode: adding node 0x3000-0x4000 @0x01bb6308 on flash, newfrag *83b59678

[  539.159431] [JFFS2 DBG] (1405) jffs2_do_reserve_space: minsize=196 , jeb->free=37088 ,summary->size=10953 , sumsize=18
[  539.177620] [JFFS2 DBG] (1405) jffs2_sum_add_mem: inode (75) added to summary
[  539.264259] jffs2: Write of 2952 bytes at 0x01bb6f20 failed. returned -5, retlen 2492
[  539.272169] [JFFS2 DBG] (1405) jffs2_link_node_ref: Last node at 83081e90 is (01bb630a,83081e88)
[  539.272186] [JFFS2 DBG] (1405) jffs2_link_node_ref: New ref is 83081e98 (fffffffe becomes 01bb6f21,  (null)) len 0xb88
[  539.272365] [JFFS2 DBG] (1405) jffs2_do_reserve_space: minsize=196 , jeb->free=34136 ,summary->size=10971 , sumsize=18
[  539.290472] [JFFS2 DBG] (1405) jffs2_sum_add_mem: inode (75) added to summary
[  539.291698] jffs2: Write of 2952 bytes at 0x01bb7aa8 failed. returned -5, retlen 0
[  539.299318] jffs2: Not marking the space at 0x01bb7aa8 as dirty because the flash driver returned retlen zero
[  539.309769] [JFFS2 DBG] (1405) jffs2_do_reserve_space: minsize=68 , jeb->free=34136 ,summary->size=10989 , sumsize=18
[  539.309816] [JFFS2 DBG] (1405) jffs2_sum_add_mem: inode (75) added to summary
[  539.310103] jffs2: Write of 68 bytes at 0x01bb7aa8 failed. returned -5, retlen 0
[  539.317586] jffs2: Not marking the space at 0x01bb7aa8 as dirty because the flash driver returned retlen zero

Using git build, in failsafe mode, I tried to write directly to /dev/mtdblock5 (/overlay), skipping overlay and jffs2 filesystems

I checked partition dimension, and wrote to it, I had the following.
It's strange it's not logged also when using jffs2 filesystem

Now, could it be a flash problem, or software problem accessing it ?
I will do the same with Lede soon

root@(none):~# dd if=/dev/mtdblock5 |wc -c
29229056
57088+0 records in
57088+0 records out

root@(none):~# dd if=/dev/urandom of=/dev/mtdblock5 bs=64k count=446

[  420.032294] blk_update_request: I/O error, dev mtdblock5, sector 256
[  420.038700] Buffer I/O error on dev mtdblock5, logical block 32, lost async page write
[  440.333941] MTD do_erase_oneblock(): software timeout
[  442.948139] blk_update_request: I/O error, dev mtdblock5, sector 264
[  442.954537] Buffer I/O error on dev mtdblock5, logical block 33, lost async page write
[  463.243940] MTD do_erase_oneblock(): software timeout
[  465.798161] blk_update_request: I/O error, dev mtdblock5, sector 272
[  465.804563] Buffer I/O error on dev mtdblock5, logical block 34, lost async page write
[  486.093939] MTD do_erase_oneblock(): software timeout
[  488.695413] blk_update_request: I/O error, dev mtdblock5, sector 280
[  488.701819] Buffer I/O error on dev mtdblock5, logical block 35, lost async page write
[  508.993939] MTD do_erase_oneblock(): software timeout

I've tried to fill /dev/mtdblock5 with Lede 17.01.6 and it worked.
It took lot of time, but nothing was logged on dmesg

I will investigate on "lost async page write" and "MTD do_erase_oneblock(): software timeout" errors

I tried to flash dd-wrt, it took some time, but it's working fine.

dd if=wzr-hp-g300nh-dd-wrt-webupgrade-MULTI.bin of=wzr-hp-g300nh-dd-wrt-tftp.bin bs=28 skip=1
and flashed from openwrt webgui.

Anyway I want to use OpenWRT, so I need help to investigate on MTD errors from 18.06.1.

Suspects:

  • some code has been changed
  • some timings or parameters has been changed
  • admit flash is faulty, and old kernel 4.4 in 17.06.1 didn't notify to jffs2 layer, and data errors happened undercover

Please help me