Upgrade from Barrier Breaker 14.07 to latest 21.02 (WD My Net N750)

I have experienced similar issues with a My Net N750 upgraded to 21.02.2. I started out by using the sysupgrade firmware to go from 19.07.1 to 21.02.2. It initially appeared to work properly, but after changing some settings and rebooting a few times I was no longer able to reach the web interface. I reset and flashed the router multiple times, and invariably the web interface would stop working while I was trying to configure it: either it was unresponsive or it returned "Bad Gateway". My last attempt was to boot into the "Emergency Room" and use the factory firmware, then sysupgrade. Thinking it might be related to the new LuCI interface, and based on the release notes for 19.07.9, I installed luci-compat; uhttpd-mod-ubus appears to be included in the image already. That resulted in the same issues with the web interface.

I flashed it back to 19.07 (19.07.9) using the Emergency Room and had no issues configuring the device.

I can report the same exact experience with my WD N750 on 21.02. I downgraded back to 19.07 and the issues have resolved.

Has anyone tried 21.02.3?

I upgraded my N750 from 19.07.9 to 22.03.0 RC1 last night and kept settings.

It has been 24 hours, and so far there are no signs of the SquashFS errors in the kernel log. I will keep monitoring, as previously the errors didn’t appear until after a few days of running 21.02.01.

So far, so good.

Did the settings for 22.03.0-rc1 survive when rebooting the N750?

Yes, survived all reboots.

Hmm, I may have spoken too soon. I woke up this morning to no Internet, and the router is in a boot loop. The router was scheduled via cron to reboot last night, so perhaps that reboot caused the issue. The router survived multiple soft reboots earlier in the week, and I'm unsure why this one might be different.

The cause of this issue is very likely related to the previously reported SquashFS corruption problem, referenced here - and here.

Could an uninitialised clock be the reason for the boot cycle? Periodic reboot.

This linked issue is related to a cron setting.

  • The poster noted that the device was working and soft booted multiple times before issues
    • All others noted something similar
  • In any case, it worked in version 19
  • In my case, I'm sure I don't have a bad cron config
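For readers unfamiliar with the cron-related boot loop raised above: routers without a real-time clock restore their time at boot from recent file timestamps, so a scheduled reboot can fire again right after the device comes back up. A commonly suggested workaround is to wait past the scheduled minute and touch a file under /etc before rebooting. The sketch below assumes a 04:30 schedule; the function name add_safe_reboot_job is hypothetical, and the crontab path is passed in by the caller. It does not address the SquashFS errors discussed in this thread.

```shell
#!/bin/sh
# Hypothetical helper: append a nightly reboot job to a crontab file.
# The job sleeps 70 seconds and touches /etc/banner so that the timestamp
# used to restore the clock after boot is already past 04:30, which keeps
# cron from matching the same minute again after the reboot.
add_safe_reboot_job() {
    crontab_file=$1
    job='30 4 * * * sleep 70 && touch /etc/banner && reboot'
    # -x: match the whole line, -F: fixed string. Only append if missing.
    grep -qxF "$job" "$crontab_file" 2>/dev/null ||
        printf '%s\n' "$job" >> "$crontab_file"
}

# Assumed usage on the router (not run here):
#   add_safe_reboot_job /etc/crontabs/root && /etc/init.d/cron restart
```

Calling the function twice leaves a single entry, so it is safe to re-run after restoring a config archive.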

I'm testing the latest OpenWrt snapshot (r20029-3c06a344e9) with the 5.15.50 testing kernel on a WD N750, but I observed the same SquashFS errors on the initial boot after flashing.

Update: Possible improvement! After a reboot there were no errors, and there have been none for the past 7 days. There were no issues on a second reboot after 7 days of uptime either. In the past, on OpenWrt 21 and 22, I would see a few SquashFS errors immediately upon reboot.

I built the current firmware image from source using the July 7th snapshot, selected the testing kernel, and am running 5.15.50. I'll keep monitoring and will report back.

  • I have one running 22.03.0-rc5 (no extra packages installed), no issues
  • I have an issue with another one. It continued to have issues saving configs, pre-installing more packages into the image with the firmware selector, etc. I believe I reverted to v19 and then accidentally sysupgraded remotely to 22.03.0-rc4 (with needed packages). I will update on this one later

After a month of testing I’m sad to report the SquashFS errors continue with my WD N750, even when running the 5.15 kernel.

I was running a custom image of 22.03.0 for about a month. Yesterday, I noticed I was getting ash I/O errors when typing commands and I/O errors when attempting to scp to device A. In order to reboot device A, I had to execute 'busybox -reboot'.

I had snmpd and Wireguard LuCI packages installed.
I've now flashed an image (without LuCI) to see if that helps.

On a second device, I wanted to flash another 22 image with 'tcpdump' included. It merely reboots without flashing. Same extra packages as device A. I will have to go on-site to troubleshoot device B and install the non-LuCI image with snmpd, tcpdump and Wireguard-tools installed.

A third device, upgraded from 21, runs 22 from the downloads site with only network and wifi configs (dumb AP). Still no issues with device C.

Device D was recently upgraded from 19 to 22 with relayd in the custom image (and LuCI included). No issues thus far.

FYI - I only recall seeing the errors if I installed software or edited a file. I see no errors in the log of device B.

Device B:

root@OpenWrt:~# dmesg | grep CRC
[   13.184693] jffs2: notice: (542) jffs2_get_inode_nodes: Node header CRC failed at 0x00b51c. {677a,ffff,00000044,a4ef223e}
[   13.222067] jffs2: notice: (428) jffs2_get_inode_nodes: Node header CRC failed at 0x00d05c. {6579,ffff,00000044,a4ef223e}
[   41.936318] jffs2: Node CRC 2bdb746e != calculated CRC 2191a72f for node at 0000bf34
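To compare devices over time, it may help to count such lines rather than read them by eye. The following is a minimal sketch, not something used in this thread: count_fs_errors is a hypothetical helper, and the pattern is an assumption that covers JFFS2 CRC messages like the ones above plus kernel lines containing "SQUASHFS error".

```shell
#!/bin/sh
# Hypothetical helper: count kernel-log lines that look like flash
# filesystem errors. Reads log text on stdin, so it can be fed from
# dmesg on the router or from a saved log file.
count_fs_errors() {
    # -c prints the number of matching lines; grep exits non-zero when
    # the count is 0, so "|| true" keeps the function's status clean.
    grep -c -i -E 'squashfs error|jffs2:.*crc' || true
}

# Assumed usage on the router (not run here):
#   dmesg | count_fs_errors
```

A count that grows between reboots, or after installing packages, would line up with the pattern several posters describe.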

EDIT- Device B was repaired by:

  • reset to defaults
  • flash official sysupgrade from Downloads Page (removed image with extra packages)
  • restore device's config archive
  • flashing an image that doesn't contain LuCI from the Firmware Selector (w/ snmpd, wireguard-tools and tcpdump pre-installed as needed)

I have N600s and N750s and have been struggling with these issues. I have some observations that will hopefully be of some use.

I suspect that snmpd writes /usr/lib/snmp/snmpd.conf repeatedly now (not in 19.07), so if there is an issue with writing to jffs2, it will eventually be triggered.

The spi-nor chip in the N750 is a MX25L12835e rather than a MX25L12805d (I opened two of mine and looked at the bottom of the circuit boards). The kernel believes that it sees a MX25L12805d. They are close but in particular the software write protection lock bits are interpreted differently, so if BP0-BP3 are used to lock regions, they will be inconsistent. The MX25L12835e has a minimum lock region of 128K rather than 64K. I'm not certain it matters, but if, for example, the last mtd partition is software locked, the final 64K of the firmware partition will also be inadvertently locked on a MX25L12835e (edited).
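The overlap described above can be checked with a little arithmetic. This is only a sketch under stated assumptions: a 16 MiB (128 Mbit) flash, a last mtd partition of 64K, and a simplified model in which the locked area is rounded up to whole lock units at the top of the flash. The function name locked_start is hypothetical, and the real chips only offer specific block-protect settings, so treat the numbers as an illustration.

```shell
#!/bin/sh
FLASH_SIZE=$((16 * 1024 * 1024))   # 128 Mbit part = 16 MiB

# Hypothetical helper: start offset of the area that ends up locked when
# protecting the last $1 bytes of flash with a minimum lock unit of $2 bytes.
locked_start() {
    want=$1
    gran=$2
    units=$(( (want + gran - 1) / gran ))   # round up to whole lock units
    echo $(( FLASH_SIZE - units * gran ))
}

last_part=$((64 * 1024))   # assumed size of the last mtd partition

printf '64K lock unit:  locked from 0x%x\n' \
    "$(locked_start "$last_part" $((64 * 1024)))"
printf '128K lock unit: locked from 0x%x\n' \
    "$(locked_start "$last_part" $((128 * 1024)))"
```

With these assumptions the 64K case starts at 0xff0000 and the 128K case at 0xfe0000, so the extra 64K that gets locked is the tail of whatever partition sits just below the last one.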

The MX25L12835e will also process RDSFDP, which the MX25L12805d does not. I didn't dig far enough into the kernel source to see if SFDP is checked for our situation, but that may be a difference between 19.07 and current.

(To fill in, if this mystifies you, the handling of serial flash memory changed significantly between (I hope this is right) kernel versions 4 and 5, which is also the boundary between OpenWRT 19 and later, where the N600/N750 became unreliable)

I have built some kernels with modifications that attempt to turn off the software write protection entirely, setting these bits to zero early, and not processing attempts to lock anything. I've still had some issues, but these are my first tries doing openwrt builds at all, so it's too early to draw many conclusions.

I'm still working on this, but I wanted to get some of it on the record.

Hope this helps!

Here's an update: the software write protection bits were not the whole problem; after installing some fairly large packages, the filesystem errors started again.

I have now built a new kernel which omits the SECT_4K flag for the mx25l128[05d,35e] entry in spi-nor/macronix.c, and this seems to improve things: sysupgrade works (at least on a device that was not yet showing filesystem errors), and I was able to install some large packages without developing errors. There was some mikrotik discussion that inspired trying that.

This is using 22.03.2, with everything standard except the tweaks in macronix.c (so kernel 5.10.146)

Have to wait a while to see if that actually fixes things, I guess.

@bradford Thanks for working on this.

I'm almost at the point of going down the extroot path and calling it a day.

Outstanding work @Bradford. Thanks for the thorough investigation and summary. I hope you keep up the testing and report back with your findings, as many of us continue to use the N600 and N750.

I've linked your post here to a GitHub Issues post that started in October 2021. I think the users there would benefit from knowing what you've discovered as well.

@bradford

Since it says ramips/mt7721, how is it related to the N750?

The original post in that thread is from a WD N750 owner. At some point a moderator arbitrarily changed the title. The reality is the issue seems to affect multiple platforms.
