PowerBeam M5 XW configuration loss after reboot

Hi, I upgraded my PowerBeam M5 to the latest snapshot, built with Image Builder. After I configured the device, the configuration changes were lost on reboot. The same issue is reproducible with the master snapshot without LuCI, but it isn't reproducible with the 5.10 kernel.

Logs filtered to jffs and overlay:

[    0.000000] Kernel command line: console=ttyS0,115200 rootfstype=squashfs,jffs2
[    0.299228] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME) (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
[    9.050587] mount_root: jffs2 not ready yet, using temporary tmpfs overlay
[   46.395338] jffs2_scan_eraseblock(): End of filesystem marker found at 0x0
[   46.402362] jffs2_build_filesystem(): unlocking the mtd device...
[   46.422104] jffs2_build_filesystem(): erasing all blocks after the end marker...
[   48.726475] jffs2: Newly-erased block contained word 0xdeadc0de at offset 0x00000000
[   48.743667] jffs2: notice: (1818) jffs2_build_xattr_subsystem: complete building xattr subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.
[   49.082673] overlayfs: upper fs does not support tmpfile.
daemon.info mount_root: performing overlay whiteout
daemon.info mount_root: synchronizing overlay
daemon.err mount_root: failed to sync jffs2 overlay

Full dmesg:

[    0.000000] Linux version 5.15.76 (builder@buildhost) (mips-openwrt-linux-musl-gcc (OpenWrt GCC 11.3.0 r21171-46fbe55971) 11.3.0, GNU ld (GNU Binutils) 2.37) #0 Fri Nov 4 15:21:00 2022
[    0.000000] printk: bootconsole [early0] enabled
[    0.000000] CPU0 revision is: 0001974c (MIPS 74Kc)
[    0.000000] MIPS: machine is Ubiquiti PowerBeam M5 (XW)
[    0.000000] SoC: Atheros AR9342 rev 3
[    0.000000] Initrd not found or empty - disabling initrd
[    0.000000] Primary instruction cache 64kB, VIPT, 4-way, linesize 32 bytes.
[    0.000000] Primary data cache 32kB, 4-way, VIPT, cache aliases, linesize 32 bytes
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000000000000-0x0000000003ffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x0000000003ffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000003ffffff]
[    0.000000] pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
[    0.000000] pcpu-alloc: [0] 0
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 16240
[    0.000000] Kernel command line: console=ttyS0,115200 rootfstype=squashfs,jffs2
[    0.000000] Dentry cache hash table entries: 8192 (order: 3, 32768 bytes, linear)
[    0.000000] Inode-cache hash table entries: 4096 (order: 2, 16384 bytes, linear)
[    0.000000] Writing ErrCtl register=00000000
[    0.000000] Readback ErrCtl register=00000000
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 55268K/65536K available (6179K kernel code, 592K rwdata, 1332K rodata, 1232K init, 217K bss, 10268K reserved, 0K cma-reserved)
[    0.000000] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.000000] NR_IRQS: 51
[    0.000000] CPU clock: 535.000 MHz
[    0.000000] clocksource: MIPS: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7144898866 ns
[    0.000002] sched_clock: 32 bits at 267MHz, resolution 3ns, wraps every 8027976190ns
[    0.008434] Calibrating delay loop... 266.64 BogoMIPS (lpj=1333248)
[    0.095035] pid_max: default: 32768 minimum: 301
[    0.100238] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.108064] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.119570] dyndbg: Ignore empty _ddebug table in a CONFIG_DYNAMIC_DEBUG_CORE build
[    0.131097] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[    0.141646] futex hash table entries: 256 (order: -1, 3072 bytes, linear)
[    0.149047] pinctrl core: initialized pinctrl subsystem
[    0.156611] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[    0.163601] thermal_sys: Registered thermal governor 'step_wise'
[    0.183786] clocksource: Switched to clocksource MIPS
[    0.196827] NET: Registered PF_INET protocol family
[    0.202329] IP idents hash table entries: 2048 (order: 2, 16384 bytes, linear)
[    0.211094] tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 4096 bytes, linear)
[    0.220107] Table-perturb hash table entries: 65536 (order: 6, 262144 bytes, linear)
[    0.228382] TCP established hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.236575] TCP bind hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.244112] TCP: Hash tables configured (established 1024 bind 1024)
[    0.251037] UDP hash table entries: 256 (order: 0, 4096 bytes, linear)
[    0.258085] UDP-Lite hash table entries: 256 (order: 0, 4096 bytes, linear)
[    0.265955] NET: Registered PF_UNIX/PF_LOCAL protocol family
[    0.272039] PCI: CLS 0 bytes, default 32
[    0.280409] workingset: timestamp_bits=14 max_order=14 bucket_order=0
[    0.292929] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[    0.299228] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME) (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
[    0.311025] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 251)
[    0.323063] pinctrl-single 1804002c.pinmux: 544 pins, size 68
[    0.330490] Serial: 8250/16550 driver, 16 ports, IRQ sharing enabled
[    0.340196] printk: console [ttyS0] disabled
[    0.344919] 18020000.uart: ttyS0 at MMIO 0x18020000 (irq = 9, base_baud = 2500000) is a 16550A
[    0.354160] printk: console [ttyS0] enabled
[    0.363059] printk: bootconsole [early0] disabled
[    0.396119] spi-nor spi0.0: mx25l6405d (8192 Kbytes)
[    0.401261] 5 fixed-partitions partitions found on MTD device spi0.0
[    0.407817] OF: Bad cell count for /ahb/spi@1f000000/flash@0/partitions
[    0.414628] OF: Bad cell count for /ahb/spi@1f000000/flash@0/partitions
[    0.421742] OF: Bad cell count for /ahb/spi@1f000000/flash@0/partitions
[    0.428565] OF: Bad cell count for /ahb/spi@1f000000/flash@0/partitions
[    0.435757] Creating 5 MTD partitions on "spi0.0":
[    0.440652] 0x000000000000-0x000000040000 : "u-boot"
[    0.451930] 0x000000040000-0x000000050000 : "u-boot-env"
[    0.458685] 0x000000050000-0x0000007b0000 : "firmware"
[    0.467614] 2 uimage-fw partitions found on MTD device firmware
[    0.473656] Creating 2 MTD partitions on "firmware":
[    0.478757] 0x000000000000-0x000000270000 : "kernel"
[    0.485099] 0x000000270000-0x000000760000 : "rootfs"
[    0.493337] mtd: device 4 (rootfs) set to be root filesystem
[    0.499284] 1 squashfs-split partitions found on MTD device rootfs
[    0.505623] 0x0000005c0000-0x000000760000 : "rootfs_data"
[    0.513505] 0x0000007b0000-0x0000007f0000 : "cfg"
[    0.519633] 0x0000007f0000-0x000000800000 : "art"
[    1.014448] ag71xx 19000000.eth: connected to PHY at mdio.0:04 [uid=004dd072, driver=Qualcomm Atheros AR8035]
[    1.025349] eth0: Atheros AG71xx at 0xb9000000, irq 4, mode: rgmii-id
[    1.032597] i2c_dev: i2c /dev entries driver
[    1.039771] NET: Registered PF_INET6 protocol family
[    1.057429] Segment Routing with IPv6
[    1.061264] In-situ OAM (IOAM) with IPv6
[    1.065484] NET: Registered PF_PACKET protocol family
[    1.070717] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[    1.083912] 8021q: 802.1Q VLAN Support v1.8
[    1.101467] VFS: Mounted root (squashfs filesystem) readonly on device 31:4.
[    1.116724] Freeing unused kernel image (initmem) memory: 1232K
[    1.122757] This architecture does not have kernel memory protection.
[    1.129339] Run /sbin/init as init process
[    1.133500]   with arguments:
[    1.133506]     /sbin/init
[    1.133512]   with environment:
[    1.133518]     HOME=/
[    1.133524]     TERM=linux
[    1.962353] init: Console is alive
[    1.966554] init: - watchdog -
[    3.214455] kmodloader: loading kernel modules from /etc/modules-boot.d/*
[    3.313176] kmodloader: done loading kernel modules from /etc/modules-boot.d/*
[    3.331202] init: - preinit -
[    5.410642] random: jshn: uninitialized urandom read (4 bytes read)
[    5.897940] random: jshn: uninitialized urandom read (4 bytes read)
[    5.998590] random: jshn: uninitialized urandom read (4 bytes read)
[    6.414204] random: jshn: uninitialized urandom read (4 bytes read)
[    6.566414] random: procd: uninitialized urandom read (4 bytes read)
[    9.050587] mount_root: jffs2 not ready yet, using temporary tmpfs overlay
[    9.063199] urandom-seed: Seed file not found (/etc/urandom.seed)
[    9.374076] procd: - early -
[    9.377480] procd: - watchdog -
[   10.133070] procd: - watchdog -
[   10.137076] procd: - ubus -
[   10.315081] random: ubusd: uninitialized urandom read (4 bytes read)
[   10.325413] random: ubusd: uninitialized urandom read (4 bytes read)
[   10.332595] random: ubusd: uninitialized urandom read (4 bytes read)
[   10.348852] procd: - init -
[   11.681698] random: jshn: uninitialized urandom read (4 bytes read)
[   11.869424] random: ubusd: uninitialized urandom read (4 bytes read)
[   11.945309] kmodloader: loading kernel modules from /etc/modules.d/*
[   12.493406] urngd: v1.0.2 started.
[   12.602655] Loading modules backported from Linux version v5.15.74-0-ga3f2f5ac9d61
[   12.610430] Backport generated by backports.git v5.15.74-1-0-ge2d78967
[   13.107746] random: crng init done
[   13.111228] random: 25 urandom warning(s) missed due to ratelimiting
[   13.292306] ath: EEPROM regdomain: 0x0
[   13.292337] ath: EEPROM indicates default country code should be used
[   13.292344] ath: doing EEPROM country->regdmn map search
[   13.292364] ath: country maps to regdmn code: 0x3a
[   13.292374] ath: Country alpha2 being used: US
[   13.292382] ath: Regpair used: 0x3a
[   13.307149] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[   13.309841] ieee80211 phy0: Atheros AR9340 Rev:3 mem=0xb8100000, irq=12
[   13.424767] kmodloader: done loading kernel modules from /etc/modules.d/*
[   46.395338] jffs2_scan_eraseblock(): End of filesystem marker found at 0x0
[   46.402362] jffs2_build_filesystem(): unlocking the mtd device...
[   46.413860] done.
[   46.422104] jffs2_build_filesystem(): erasing all blocks after the end marker...
[   48.266963] br-lan: port 1(eth0) entered blocking state
[   48.279964] br-lan: port 1(eth0) entered disabled state
[   48.285658] device eth0 entered promiscuous mode
[   48.726475] jffs2: Newly-erased block contained word 0xdeadc0de at offset 0x00000000
[   48.741668] done.
[   48.743667] jffs2: notice: (1818) jffs2_build_xattr_subsystem: complete building xattr subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.
[   49.082673] overlayfs: upper fs does not support tmpfile.
[   64.905265] eth0: link up (100Mbps/Full duplex)
[   64.909943] br-lan: port 1(eth0) entered blocking state
[   64.915308] br-lan: port 1(eth0) entered forwarding state
[   64.924328] IPv6: ADDRCONF(NETDEV_CHANGE): br-lan: link becomes ready

Thanks, Regards.

That looks like only 4 blocks for the JFFS2 partition, which is too few to format a writable filesystem. Confirm with cat /proc/mtd. If that is the case, you will need to take some packages out of the build to free up space in the flash.
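To make the check above concrete, here is a minimal sketch that counts erase blocks per partition from pasted `cat /proc/mtd` output. The sample text is made up for illustration (it is not from this device), and the threshold of 5 erase blocks is an assumption based on the minimum-size check in the mainline JFFS2 mount code; adjust it if your kernel differs.

```python
# Minimal sketch: count erase blocks per MTD partition from /proc/mtd text.
# MIN_JFFS2_BLOCKS is an assumption (mainline JFFS2 refuses to mount with
# fewer than 5 erase blocks); SAMPLE is hypothetical, not real device output.

MIN_JFFS2_BLOCKS = 5

def parse_proc_mtd(text):
    """Return {partition_name: (size_bytes, erasesize_bytes)}."""
    parts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("mtd"):
            continue  # skip the header line and anything unrelated
        _dev, size_hex, erase_hex, name = line.split(None, 3)
        parts[name.strip('"')] = (int(size_hex, 16), int(erase_hex, 16))
    return parts

def erase_blocks(parts, name):
    """Number of whole erase blocks in the named partition."""
    size, erasesize = parts[name]
    return size // erasesize

def too_small_for_jffs2(parts, name="rootfs_data"):
    """True if the partition has fewer blocks than the assumed minimum."""
    return erase_blocks(parts, name) < MIN_JFFS2_BLOCKS

SAMPLE = """\
dev:    size   erasesize  name
mtd4: 004f0000 00010000 "rootfs"
mtd5: 00040000 00010000 "rootfs_data"
"""

if __name__ == "__main__":
    parts = parse_proc_mtd(SAMPLE)
    n = erase_blocks(parts, "rootfs_data")
    verdict = "too small" if too_small_for_jffs2(parts) else "large enough"
    print(f"rootfs_data: {n} erase blocks ({verdict} for JFFS2)")
```

Paste the real output from the device in place of SAMPLE; sizes in /proc/mtd are hexadecimal byte counts, so 00010000 is a 64 KiB erase block.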

This issue is reproducible with a snapshot build with no packages. I also removed as many packages as possible from the image with Image Builder (IPv6, USB modules, PPPoE, and opkg), but the configuration was still lost. I have downgraded for now, so I can't check the mtd output at the moment.

I trimmed most of the packages and could still reproduce this. Here is the mtd output, followed by the relevant boot log:

root@OpenWrt:~# cat /proc/mtd
dev:    size   erasesize  name
mtd0: 00040000 00010000 "u-boot"
mtd1: 00010000 00010000 "u-boot-env"
mtd2: 00760000 00010000 "firmware"
mtd3: 00270000 00010000 "kernel"
mtd4: 004f0000 00010000 "rootfs"
mtd5: 00220000 00010000 "rootfs_data"
mtd6: 00040000 00010000 "cfg"
mtd7: 00010000 00010000 "art"
[    0.000000] Kernel command line: console=ttyS0,115200 rootfstype=squashfs,jffs2
[    0.299212] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME) (CMODE_PRIORITY) (c) 2001-2006 Red Hat, Inc.
[    8.476950] jffs2_scan_eraseblock(): End of filesystem marker found at 0x10000
[    8.484359] jffs2_build_filesystem(): unlocking the mtd device...
[    8.492633] jffs2_build_filesystem(): erasing all blocks after the end marker...
[    8.495187] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00210000
[    8.513147] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00200000
[    8.523519] jffs2: Newly-erased block contained word 0x19852003 at offset 0x001f0000
[    8.533921] jffs2: Newly-erased block contained word 0x19852003 at offset 0x001e0000
[    8.544295] jffs2: Newly-erased block contained word 0x19852003 at offset 0x001d0000
[    8.554669] jffs2: Newly-erased block contained word 0x19852003 at offset 0x001c0000
[    8.565063] jffs2: Newly-erased block contained word 0x19852003 at offset 0x001b0000
[    8.575442] jffs2: Newly-erased block contained word 0x19852003 at offset 0x001a0000
[    8.585814] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00190000
[    8.596199] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00180000
[    8.606565] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00170000
[    8.616940] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00160000
[    8.627314] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00150000
[    8.637689] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00140000
[    8.648063] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00130000
[    8.658439] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00120000
[    8.668814] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00110000
[    8.679188] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00100000
[    8.689556] jffs2: Newly-erased block contained word 0x19852003 at offset 0x000f0000
[    8.699932] jffs2: Newly-erased block contained word 0x19852003 at offset 0x000e0000
[    8.710320] jffs2: Newly-erased block contained word 0x19852003 at offset 0x000d0000
[    8.720692] jffs2: Newly-erased block contained word 0x19852003 at offset 0x000c0000
[    8.731066] jffs2: Newly-erased block contained word 0x19852003 at offset 0x000b0000
[    8.741441] jffs2: Newly-erased block contained word 0x19852003 at offset 0x000a0000
[    8.751816] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00090000
[    8.762193] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00080000
[    8.772567] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00070000
[    8.782958] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00060000
[    8.793338] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00050000
[    8.803710] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00040000
[    8.814122] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00030000
[    8.824499] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00020000
[    8.834867] jffs2: Newly-erased block contained word 0xdeadc0de at offset 0x00010000
[    8.844726] jffs2: notice: (376) jffs2_build_xattr_subsystem: complete building xattr subsystem, 0 of xdatum (0 unchecked, 0 orphan) and 0 of xref (0 dead, 0 orphan) found.
[    8.867791] mount_root: switching to jffs2 overlay
[    8.873933] mount_root: switching to jffs2 failed - fallback to ramoverlay

The same configuration works fine with the 22.03.2 build.

It looks to me like either the flash chip has failed or it is clocked too fast: a newly erased block should contain only binary ones (every word reads as 0xffffffff), and here you have some data. Is it reproducible with older builds, say 19.07?
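As a rough aid for reading those "Newly-erased block contained word" lines, here is a small sketch that tallies which words show up. The two sample lines are copied from the log earlier in the thread. The labels rely on well-known constants: 0x19852003 is the JFFS2 magic 0x1985 followed by the cleanmarker node type 0x2003, and 0xdeadc0de is the end-of-filesystem marker OpenWrt writes after the squashfs. Anything else is simply reported as unknown; the function names are my own.

```python
import re
from collections import Counter

# Known 32-bit values as they appear in the kernel message.
KNOWN_WORDS = {
    0xFFFFFFFF: "erased",                            # what an erased NOR word reads as
    0x19852003: "jffs2 cleanmarker",                 # magic 0x1985 + node type 0x2003
    0xDEADC0DE: "openwrt end-of-filesystem marker",
}

LINE_RE = re.compile(
    r"Newly-erased block contained word (0x[0-9a-fA-F]+) at offset (0x[0-9a-fA-F]+)"
)

def classify(word):
    """Map a 32-bit word to a human-readable label."""
    return KNOWN_WORDS.get(word, "unknown data")

def summarize(log_text):
    """Count how often each kind of word is reported in the log text."""
    counts = Counter()
    for match in LINE_RE.finditer(log_text):
        counts[classify(int(match.group(1), 16))] += 1
    return counts

SAMPLE = """\
[    8.824499] jffs2: Newly-erased block contained word 0x19852003 at offset 0x00020000
[    8.834867] jffs2: Newly-erased block contained word 0xdeadc0de at offset 0x00010000
"""

if __name__ == "__main__":
    for label, n in summarize(SAMPLE).items():
        print(f"{n:3d} x {label}")
```

If every reported word turns out to be a known marker rather than random data, that may suggest the blocks still hold their previous contents, i.e. the erase had no effect, but the sketch itself cannot tell a failing chip from a driver or timing problem.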

This issue isn't reproducible in 19.07 or 22.03. There is another issue in 22.03, consistently high latency in ath9k, which is what drove me to try the snapshot in the first place, but it is out of scope for this thread.