More than a year at least, definitely more. Where I live, it's hot.
Get a 120mm USB fan and blow it upwards from the bottom of the router. That's the best cooling effect for RT3200/E8450.
Check both SW and HW acceleration under the firewall section in LuCI, then enable WED by editing /etc/modules.conf and appending:
options mt7915e wed_enable=Y
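If you script this, it helps to make the edit idempotent so that repeated runs don't append the line twice. A minimal sketch, assuming you stage the file somewhere Python is available (a stock router image usually won't have it); the helper name `ensure_line` is made up for illustration, and only the path and option line come from the post above:

```python
from pathlib import Path

# Option line quoted in the post above.
WED_LINE = "options mt7915e wed_enable=Y"


def ensure_line(path, line=WED_LINE):
    """Append `line` to the file at `path` unless it is already present.

    Returns True if the file was modified, False if nothing changed.
    (Hypothetical helper, not part of OpenWrt.)
    """
    p = Path(path)
    existing = p.read_text().splitlines() if p.exists() else []
    if any(entry.strip() == line for entry in existing):
        return False
    # Keep the existing content and put the new option on its own line.
    p.write_text("\n".join(existing + [line]) + "\n")
    return True
```

Calling `ensure_line("/etc/modules.conf")` twice would write the option once and report no change the second time.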
Testing was performed over WiFi 6 (channel 132, 80 MHz, WPA3 mixed, everything else default) to my MacBook Pro.
PS: I also have irqbalance enabled and the performance governor set.
PS2: I re-enabled WAN6 and it seems to be working well; probably something changed on the ISP side.
PS3: Back when I had a 50 Mbps plan with the same ISP, I saw latency spikes of over 180 ms, so bufferbloat on an FTTH connection is most likely caused by traffic shapers that are optimized for high speeds.
And add some kind of dust filter
Well, it looks like after 312 sysupgrades since June 2021, my RT3200 has now been hit with a semi-OKD.
(305 sysupgrades with the old layout, 7 sysupgrades with the new layout)
The last successful upgrade was 4 days ago, to r25567-0dfc0495fc.
Now with r25598-a6991fc7d2 I experienced the "no LEDs, no activity" OKD:
Ps. I changed my vote in the poll
For those uninitiated such as myself, could you clarify what you mean by "old" and "new" layout?
Old layouts are from @daniel's UBI installer prior to 1.1.0?
New layouts are via the 1.1.0 installer and the most current pre-release version, 1.1.1?
So the potential solution described in #4002 still resulted in an OKD?
I'd thought of that. But the dust blowing up into the case, the hassle of having to open it up periodically to clean out said dust, and vibration noise from the router sitting atop the spinning fan motivated me to try blowing air from the side.
@daniel
Sir, I had been suspecting the capacitor I discussed some time ago, but I wanted to capture the serial output for this failure. I can reproduce it here:
[ 36.371107] reboot: Restarting system
F0: 102B 0000
F6: 0000 0000
V0: 0000 0000 [0001]
00: 0000 0000
BP: 0400 0041 [0000]
G0: 1190 0000
T0: 0000 02F1 [000F]
Jump to BL
NOTICE: BL2: v2.9.0(release):OpenWrt v2023-10-13-0ea67d76-1 (mt7622-snand-ubi-)
NOTICE: BL2: Built : 11:23:19, Feb 18 2024
NOTICE: WDT: [40000000] Software reset (reboot)
NOTICE: CPU: MT7622
NOTICE: SPI-NAND: FM35Q1GA (128MB)
NOTICE: UBI: scanning [0x80000 - 0x8000000] ...
NOTICE: UBI: Bad VID magic in block 509 00000000
NOTICE: UBI: scanning is finished
NOTICE: UBI: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
NOTICE: UBI: VID header offset: 2048 (aligned 2048), data offset: 4096
NOTICE: UBI: Volume fip (Id #0) size is 1019644 bytes
ERROR: BL2: Failed to load image id 5 (-2)
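For anyone collecting many of these captures, a small script can pull out the interesting BL2 lines per boot attempt. This is only a sketch: the patterns are copied from the messages in the log above, the function name `summarize_boots` is made up, and splitting attempts on the "Jump to BL" line is an assumption based on where that line appears in these logs.

```python
import re

# Patterns copied from the BL2 messages in the log above.
BAD_VID = re.compile(r"UBI: Bad VID magic in block (\d+)")
LOAD_FAIL = re.compile(r"BL2: Failed to load image id (\d+) \((-?\d+)\)")
WDT = re.compile(r"WDT: (.+)")


def summarize_boots(log_text):
    """Split a serial capture into boot attempts and summarize each one."""
    boots = []
    current = None
    for line in log_text.splitlines():
        if "Jump to BL" in line:
            # In the logs above this line precedes every BL2 banner,
            # so it is used here as the start-of-attempt marker.
            current = {"reset": None, "bad_vid_blocks": [],
                       "failed_image": None, "reached_bl31": False}
            boots.append(current)
            continue
        if current is None:
            continue  # ignore anything before the first attempt
        m = WDT.search(line)
        if m and current["reset"] is None:
            current["reset"] = m.group(1).strip()  # first WDT line only
        m = BAD_VID.search(line)
        if m:
            current["bad_vid_blocks"].append(int(m.group(1)))
        m = LOAD_FAIL.search(line)
        if m:
            current["failed_image"] = int(m.group(1))
        if "BL2: Booting BL31" in line:
            current["reached_bl31"] = True
    return boots
```

Fed a capture saved to a text file, it returns one dictionary per attempt, which makes it easy to count how many reboots end in "Failed to load image id 5" versus reaching BL31.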
@daniel
Here is another serial output. A simple power off and on made it work again, as you can see in the serial console output below; I hope it helps you debug this.
[ 36.371107] reboot: Restarting system
F0: 102B 0000
F6: 0000 0000
V0: 0000 0000 [0001]
00: 0000 0000
BP: 0400 0041 [0000]
G0: 1190 0000
T0: 0000 02F1 [000F]
Jump to BL
NOTICE: BL2: v2.9.0(release):OpenWrt v2023-10-13-0ea67d76-1 (mt7622-snand-ubi-)
NOTICE: BL2: Built : 11:23:19, Feb 18 2024
NOTICE: WDT: [40000000] Software reset (reboot)
NOTICE: CPU: MT7622
NOTICE: SPI-NAND: FM35Q1GA (128MB)
NOTICE: UBI: scanning [0x80000 - 0x8000000] ...
NOTICE: UBI: Bad VID magic in block 509 00000000
NOTICE: UBI: scanning is finished
NOTICE: UBI: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
NOTICE: UBI: VID header offset: 2048 (aligned 2048), data offset: 4096
NOTICE: UBI: Volume fip (Id #0) size is 1019644 bytes
ERROR: BL2: Failed to load image id 5 (-2)
F0: 102B 0000
F6: 0000 0000
V0: 0000 0000 [0001]
00: 0000 0000
BP: 0400 0041 [0000]
G0: 1190 0000
T0: 0000 02F0 [000F]
Jump to BL
NOTICE: BL2: v2.9.0(release):OpenWrt v2023-10-13-0ea67d76-1 (mt7622-snand-ubi-)
NOTICE: BL2: Built : 11:23:19, Feb 18 2024
NOTICE: WDT: Cold boot
NOTICE: CPU: MT7622
NOTICE: WDT: disabled
NOTICE: SPI-NAND: FM35Q1GA (128MB)
NOTICE: UBI: scanning [0x80000 - 0x8000000] ...
NOTICE: UBI: Bad VID magic in block 509 00000000
NOTICE: UBI: scanning is finished
NOTICE: UBI: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
NOTICE: UBI: VID header offset: 2048 (aligned 2048), data offset: 4096
NOTICE: UBI: Volume fip (Id #0) size is 1019644 bytes
NOTICE: BL2: Booting BL31
NOTICE: BL31: v2.9.0(release):OpenWrt v2023-10-13-0ea67d76-1 (mt7622-snand-ubi)
NOTICE: BL31: Built : 11:23:19, Feb 18 2024
U-Boot 2024.01-OpenWrt-r25239-a3a33f02ce (Feb 18 2024 - 11:23:19 +0000)
CPU: MediaTek MT7622
Model: mt7622-linksys-e8450-ubi
DRAM: 512 MiB
Core: 48 devices, 21 uclasses, devicetree: separate
MMC:
Loading Environment from UBI... SPI-NAND: FM35Q1GA (128MB)
Read 126976 bytes from volume ubootenv to 000000005f7bf0c0
Read 126976 bytes from volume ubootenv2 to 000000005f7de100
OK
In: serial@11002000
Out: serial@11002000
Err: serial@11002000
reset button found
Loading Environment from UBI... UBI partition 'ubi' already selected
Read 126976 bytes from volume ubootenv to 000000005f7bf0c0
Read 126976 bytes from volume ubootenv2 to 000000005f7de100
OK
Net: eth0: ethernet@1b100000
No EFI system partition
No EFI system partition
Failed to persist EFI variables
( ( ( OpenWrt ) ) )
1. Run default boot command.
2. Boot system via TFTP.
3. Boot production system from flash.
4. Boot recovery system from flash.
5. Load production system via TFTP then write to flash.
6. Load recovery system via TFTP then write to flash.
7. Load BL31+U-Boot FIP via TFTP then write to flash.
8. Load BL2 preloader via TFTP then write to flash.
9. Reboot.
a. Reset all settings to factory defaults.
0. U-Boot console
Press UP/DOWN to move, ENTER to select, ESC to quit
UBI partition 'ubi' already selected
No size specified -> Using max size (10285056)
Read 10285056 bytes from volume fit to 0000000048000000
## Checking Image at 48000000 ...
FIT image found
FIT description: ARM64 OpenWrt FIT (Flattened Image Tree)
Image 0 (kernel-1)
Description: ARM64 OpenWrt Linux-6.1.81
Type: Kernel Image
Compression: gzip compressed
Data Start: 0x48001000
Data Size: 5717796 Bytes = 5.5 MiB
Architecture: AArch64
OS: Linux
Load Address: 0x44000000
Entry Point: 0x44000000
Hash algo: crc32
Hash value: 057f9a86
Hash algo: sha1
Hash value: 05d4696007697cf48d34cf13716ce59c7e271c26
Image 1 (fdt-1)
Description: ARM64 OpenWrt linksys_e8450-ubi device tree blob
Type: Flat Device Tree
Compression: uncompressed
Data Start: 0x48575000
Data Size: 31487 Bytes = 30.7 KiB
Architecture: AArch64
Hash algo: crc32
Hash value: 84395f01
Hash algo: sha1
Hash value: b7387188109ae379e21d083798e981b32eb2e6f8
Image 2 (rootfs-1)
Description: ARM64 OpenWrt linksys_e8450-ubi rootfs
Type: Filesystem Image
Compression: uncompressed
Data Start: 0x4857d000
Data Size: 4403200 Bytes = 4.2 MiB
Hash algo: crc32
Hash value: df591676
Hash algo: sha1
Hash value: 1fee6cb89868aa5f2d82f9771216f2fb7e37d908
Default Configuration: 'config-1'
Configuration 0 (config-1)
Description: OpenWrt linksys_e8450-ubi
Kernel: kernel-1
FDT: fdt-1
Loadables: rootfs-1
## Checking hash(es) for FIT Image at 48000000 ...
Hash(es) for Image 0 (kernel-1): crc32+ sha1+
Hash(es) for Image 1 (fdt-1): crc32+ sha1+
Hash(es) for Image 2 (rootfs-1): crc32+ sha1+
## Loading kernel from FIT Image at 48000000 ...
Using 'config-1' configuration
Trying 'kernel-1' kernel subimage
Description: ARM64 OpenWrt Linux-6.1.81
Type: Kernel Image
Compression: gzip compressed
Data Start: 0x48001000
Data Size: 5717796 Bytes = 5.5 MiB
Architecture: AArch64
OS: Linux
Load Address: 0x44000000
Entry Point: 0x44000000
Hash algo: crc32
Hash value: 057f9a86
Hash algo: sha1
Hash value: 05d4696007697cf48d34cf13716ce59c7e271c26
Verifying Hash Integrity ... crc32+ sha1+ OK
## Loading fdt from FIT Image at 48000000 ...
Using 'config-1' configuration
Trying 'fdt-1' fdt subimage
Description: ARM64 OpenWrt linksys_e8450-ubi device tree blob
Type: Flat Device Tree
Compression: uncompressed
Data Start: 0x48575000
Data Size: 31487 Bytes = 30.7 KiB
Architecture: AArch64
Hash algo: crc32
Hash value: 84395f01
Hash algo: sha1
Hash value: b7387188109ae379e21d083798e981b32eb2e6f8
Verifying Hash Integrity ... crc32+ sha1+ OK
Booting using the fdt blob at 0x48575000
Working FDT set to 48575000
## Loading loadables from FIT Image at 48000000 ...
Trying 'rootfs-1' loadables subimage
Description: ARM64 OpenWrt linksys_e8450-ubi rootfs
Type: Filesystem Image
Compression: uncompressed
Data Start: 0x4857d000
Data Size: 4403200 Bytes = 4.2 MiB
Hash algo: crc32
Hash value: df591676
Hash algo: sha1
Hash value: 1fee6cb89868aa5f2d82f9771216f2fb7e37d908
Verifying Hash Integrity ... crc32+ sha1+ OK
Uncompressing Kernel Image
Loading Device Tree to 000000005e7c4000, end 000000005e7ceafe ... OK
Working FDT set to 5e7c4000
Add 'ramoops@42ff0000' node failed: FDT_ERR_EXISTS
Starting kernel ...
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd034]
[ 0.000000] Linux version 6.1.81 (builder@buildhost) (aarch64-openwrt-linux-4
[ 0.000000] Machine model: Linksys E8450 (UBI)
[ 0.000000] earlycon: uart8250 at MMIO32 0x0000000011002000 (options '')
[ 0.000000] printk: bootconsole [uart8250] enabled
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x0000000040000000-0x000000005fffffff]
[ 0.000000] DMA32 empty
[ 0.000000] Normal empty
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000040000000-0x0000000042ffffff]
[ 0.000000] node 0: [mem 0x0000000043000000-0x000000004302ffff]
[ 0.000000] node 0: [mem 0x0000000043030000-0x000000005fffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000040000000-0x000000005fffffff]
[ 0.000000] psci: probing for conduit method from DT.
[ 0.000000] psci: PSCIv1.1 detected in firmware.
[ 0.000000] psci: Using standard PSCI v0.2 function IDs
[ 0.000000] psci: MIGRATE_INFO_TYPE not supported.
[ 0.000000] psci: SMC Calling Convention v1.4
[ 0.000000] percpu: Embedded 18 pages/cpu s33896 r8192 d31640 u73728
[ 0.000000] pcpu-alloc: s33896 r8192 d31640 u73728 alloc=18*4096
[ 0.000000] pcpu-alloc: [0] 0 [0] 1
[ 0.000000] Detected VIPT I-cache on CPU0
[ 0.000000] CPU features: kernel page table isolation disabled by kernel conn
[ 0.000000] CPU features: detected: ARM erratum 843419
[ 0.000000] alternatives: applying boot alternatives
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 129024
[ 0.000000] Kernel command line: earlycon=uart8250,mmio32,0x11002000 consolet
[ 0.000000] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes, )
[ 0.000000] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes, l)
[ 0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[ 0.000000] Memory: 500696K/524288K available (8704K kernel code, 900K rwdat)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[ 0.000000] rcu: Hierarchical RCU implementation.
[ 0.000000] Tracing variant of Tasks RCU enabled.
[ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 ji.
[ 0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[ 0.000000] Root IRQ handler: gic_handle_irq
[ 0.000000] GIC: Using split EOI/Deactivate mode
[ 0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[ 0.000000] arch_timer: cp15 timer(s) running at 12.50MHz (phys).
[ 0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycless
[ 0.000000] sched_clock: 56 bits at 13MHz, resolution 80ns, wraps every 4398s
[ 0.008280] Calibrating delay loop (skipped), value calculated using timer f)
[ 0.018680] pid_max: default: 32768 minimum: 301
[ 0.023596] Mount-cache hash table entries: 1024 (order: 1, 8192 bytes, line)
[ 0.030944] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes,)
[ 0.039989] cblist_init_generic: Setting adjustable number of callback queue.
[ 0.047278] cblist_init_generic: Setting shift to 1 and lim to 1.
[ 0.053522] rcu: Hierarchical SRCU implementation.
[ 0.058334] rcu: Max phase no-delay instances is 1000.
[ 0.064070] smp: Bringing up secondary CPUs ...
[ 0.069013] Detected VIPT I-cache on CPU1
[ 0.069024] CPU features: SANITY CHECK: Unexpected variation in SYS_CNTFRQ_E0
[ 0.069046] CPU features: Unsupported CPU feature variation detected.
[ 0.069127] CPU1: Booted secondary processor 0x0000000001 [0x410fd034]
[ 0.069199] smp: Brought up 1 node, 2 CPUs
[ 0.102122] SMP: Total of 2 processors activated.
[ 0.106839] CPU features: detected: 32-bit EL0 Support
[ 0.111994] CPU features: detected: CRC32 instructions
[ 0.117174] CPU features: emulated: Privileged Access Never (PAN) using TTBRg
[ 0.125560] CPU: All CPU(s) started at EL2
[ 0.129664] alternatives: applying system-wide alternatives
[ 0.140019] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, ms
[ 0.149914] futex hash table entries: 512 (order: 3, 32768 bytes, linear)
[ 0.156861] pinctrl core: initialized pinctrl subsystem
[ 0.163038] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[ 0.169320] DMA: preallocated 128 KiB GFP_KERNEL pool for atomic allocations
[ 0.176437] DMA: preallocated 128 KiB GFP_KERNEL|GFP_DMA pool for atomic alls
[ 0.184228] DMA: preallocated 128 KiB GFP_KERNEL|GFP_DMA32 pool for atomic as
[ 0.192632] thermal_sys: Registered thermal governor 'fair_share'
[ 0.192638] thermal_sys: Registered thermal governor 'bang_bang'
[ 0.198751] thermal_sys: Registered thermal governor 'step_wise'
[ 0.204785] thermal_sys: Registered thermal governor 'user_space'
[ 0.210891] ASID allocator initialised with 65536 entries
[ 0.222826] pstore: Registered ramoops as persistent store backend
[ 0.229036] ramoops: using 0x10000@0x42ff0000, ecc: 0
[ 0.254966] cryptd: max_cpu_qlen set to 1000
[ 0.260634] SCSI subsystem initialized
[ 0.264591] libata version 3.00 loaded.
[ 0.269810] clocksource: Switched to clocksource arch_sys_counter
[ 0.276696] NET: Registered PF_INET protocol family
[ 0.281742] IP idents hash table entries: 8192 (order: 4, 65536 bytes, linea)
[ 0.289669] tcp_listen_portaddr_hash hash table entries: 256 (order: 0, 4096)
[ 0.298086] Table-perturb hash table entries: 65536 (order: 6, 262144 bytes,)
[ 0.305880] TCP established hash table entries: 4096 (order: 3, 32768 bytes,)
[ 0.313684] TCP bind hash table entries: 4096 (order: 5, 131072 bytes, linea)
[ 0.321035] TCP: Hash tables configured (established 4096 bind 4096)
[ 0.327514] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
[ 0.334092] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
[ 0.341254] NET: Registered PF_UNIX/PF_LOCAL protocol family
[ 0.346957] PCI: CLS 0 bytes, default 64
[ 0.352123] workingset: timestamp_bits=46 max_order=17 bucket_order=0
[ 0.362318] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[ 0.368183] jffs2: version 2.2 (NAND) (SUMMARY) (LZMA) (RTIME) (CMODE_PRIORI.
[ 0.411153] Block layer SCSI generic (bsg) driver version 0.4 loaded (major )
[ 0.420017] mt7622-pinctrl 10211000.pinctrl: invalid group "pwm_ch7_2" for f"
[ 0.432958] mt-pmic-pwrap 10001000.pwrap: unexpected interrupt int=0x1
[ 0.455583] Serial: 8250/16550 driver, 16 ports, IRQ sharing enabled
[ 0.464452] printk: console [ttyS0] disabled
[ 0.488939] 11002000.serial: ttyS0 at MMIO 0x11002000 (irq = 118, base_baud 2
[ 0.498274] printk: console [ttyS0] enabled
[ 0.498274] printk: console [ttyS0] enabled
[ 0.506652] printk: bootconsole [uart8250] disabled
[ 0.506652] printk: bootconsole [uart8250] disabled
[ 0.537293] 11004000.serial: ttyS1 at MMIO 0x11004000 (irq = 119, base_baud 2
[ 0.547863] mtk_rng 1020f000.rng: registered RNG driver
[ 0.548041] random: crng init done
[ 0.560639] loop: module loaded
[ 0.564539] mtk-ecc 1100e000.ecc: probed
[ 0.571901] spi-nand spi2.0: Fidelix SPI NAND was found.
[ 0.577222] spi-nand spi2.0: 128 MiB, block size: 128 KiB, page size: 2048, 4
[ 0.585420] mtk-snand 1100d000.spi: ECC strength: 4 bits per 512 bytes
[ 0.592289] 2 fixed-partitions partitions found on MTD device spi2.0
[ 0.598668] OF: Bad cell count for /spi@1100d000/flash@0/partitions
[ 0.604965] OF: Bad cell count for /spi@1100d000/flash@0/partitions
[ 0.611498] Creating 2 MTD partitions on "spi2.0":
[ 0.616304] 0x000000000000-0x000000080000 : "bl2"
[ 0.621984] 0x000000080000-0x000008000000 : "ubi"
[ 0.759552] ubi0: default fastmap pool size: 50
[ 0.764110] ubi0: default fastmap WL pool size: 25
[ 0.768899] ubi0: attaching mtd1
[ 1.050288] ubi0: scanning is finished
[ 1.058835] ubi0: attached mtd1 (name "ubi", size 127 MiB)
[ 1.064355] ubi0: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
[ 1.071232] ubi0: min./max. I/O unit sizes: 2048/2048, sub-page size 2048
[ 1.078012] ubi0: VID header offset: 2048 (aligned 2048), data offset: 4096
[ 1.084972] ubi0: good PEBs: 1020, bad PEBs: 0, corrupted PEBs: 0
[ 1.091061] ubi0: user volume: 8, internal volumes: 1, max. volumes count: 18
[ 1.098274] ubi0: max/mean erase counter: 10/5, WL threshold: 4096, image se8
[ 1.107314] ubi0: available PEBs: 0, total reserved PEBs: 1020, PEBs reserve0
[ 1.116624] ubi0: background thread "ubi_bgt0d" started, PID 242
[ 1.116839] OF: Bad cell count for /spi@1100d000/flash@0/partitions
[ 1.129082] OF: Bad cell count for /spi@1100d000/flash@0/partitions
[ 1.136073] block ubiblock0_5: created from ubi0:5(fit)
[ 1.309008] mtk_soc_eth 1b100000.ethernet eth0: mediatek frame engine at 0xf6
[ 1.318927] i2c_dev: i2c /dev entries driver
[ 1.324935] mtk-wdt 10212000.watchdog: Watchdog enabled (timeout=31 sec, now)
[ 1.335946] NET: Registered PF_INET6 protocol family
[ 1.341771] Segment Routing with IPv6
[ 1.345451] In-situ OAM (IOAM) with IPv6
[ 1.349416] NET: Registered PF_PACKET protocol family
[ 1.354526] bridge: filtering via arp/ip/ip6tables is no longer available by.
[ 1.367690] 8021q: 802.1Q VLAN Support v1.8
[ 1.373477] pstore: Using crash dump compression: deflate
[ 1.391535] mtk-pcie 1a143000.pcie: host bridge /pcie@1a143000 ranges:
[ 1.398077] mtk-pcie 1a143000.pcie: Parsing ranges property...
[ 1.403953] mtk-pcie 1a143000.pcie: MEM 0x0020000000..0x0027ffffff -> 00
[ 1.666311] mtk-pcie 1a143000.pcie: PCI host bridge to bus 0000:00
[ 1.672540] pci_bus 0000:00: root bus resource [bus 00-ff]
[ 1.678031] pci_bus 0000:00: root bus resource [mem 0x20000000-0x27ffffff]
[ 1.684909] pci_bus 0000:00: scanning bus
[ 1.689071] pci 0000:00:00.0: [14c3:3258] type 01 class 0x060400
[ 1.695266] pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x1ffffffff 64bit pr]
[ 1.705555] pci_bus 0000:00: fixups for bus
[ 1.709763] pci 0000:00:00.0: scanning [bus 00-00] behind bridge, pass 0
[ 1.716482] pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reg
[ 1.724562] pci 0000:00:00.0: scanning [bus 00-00] behind bridge, pass 1
[ 1.731690] pci_bus 0000:01: scanning bus
[ 1.735844] pci 0000:01:00.0: [14c3:7915] type 00 class 0x000280
[ 1.742044] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x000fffff 64bit pre]
[ 1.749371] pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x00003fff 64bit pre]
[ 1.756702] pci 0000:01:00.0: reg 0x20: [mem 0x00000000-0x00000fff 64bit pre]
[ 1.764687] pci 0000:01:00.0: supports D1 D2
[ 1.768952] pci 0000:01:00.0: PME# supported from D0 D1 D2 D3hot D3cold
[ 1.775596] pci 0000:01:00.0: PME# disabled
[ 1.780130] pci 0000:01:00.0: 2.000 Gb/s available PCIe bandwidth, limited b)
[ 1.820102] pci_bus 0000:01: fixups for bus
[ 1.824303] pci_bus 0000:01: bus scan returning with max=01
[ 1.829904] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[ 1.836534] pci_bus 0000:00: bus scan returning with max=01
[ 1.842142] pci 0000:00:00.0: BAR 0: no space for [mem size 0x200000000 64bi]
[ 1.849816] pci 0000:00:00.0: BAR 0: failed to assign [mem size 0x200000000 ]
[ 1.857825] pci 0000:00:00.0: BAR 8: assigned [mem 0x20000000-0x201fffff]
[ 1.864632] pci 0000:01:00.0: BAR 0: assigned [mem 0x20000000-0x200fffff 64b]
[ 1.872481] pci 0000:01:00.0: BAR 2: assigned [mem 0x20100000-0x20103fff 64b]
[ 1.880336] pci 0000:01:00.0: BAR 4: assigned [mem 0x20104000-0x20104fff 64b]
[ 1.888172] pci 0000:00:00.0: PCI bridge to [bus 01]
[ 1.893155] pci 0000:00:00.0: bridge window [mem 0x20000000-0x201fffff]
[ 1.900179] pcieport 0000:00:00.0: assign IRQ: got 130
[ 1.905332] pcieport 0000:00:00.0: enabling device (0000 -> 0002)
[ 1.911475] pcieport 0000:00:00.0: enabling bus mastering
[ 1.916956] mtk-pcie 1a143000.pcie: msi#0 address_hi 0x0 address_lo 0x44efd00
[ 1.924482] pcieport 0000:00:00.0: PME: Signaling with IRQ 130
[ 1.930511] pcieport 0000:00:00.0: saving config space at offset 0x0 (readin)
[ 1.938703] pcieport 0000:00:00.0: saving config space at offset 0x4 (readin)
[ 1.946737] pcieport 0000:00:00.0: saving config space at offset 0x8 (readin)
[ 1.954855] pcieport 0000:00:00.0: saving config space at offset 0xc (readin)
[ 1.962785] pcieport 0000:00:00.0: saving config space at offset 0x10 (readi)
[ 1.970453] pcieport 0000:00:00.0: saving config space at offset 0x14 (readi)
[ 1.978111] pcieport 0000:00:00.0: saving config space at offset 0x18 (readi)
[ 1.986387] pcieport 0000:00:00.0: saving config space at offset 0x1c (readi)
[ 1.994576] pcieport 0000:00:00.0: saving config space at offset 0x20 (readi)
[ 2.002858] pcieport 0000:00:00.0: saving config space at offset 0x24 (readi)
[ 2.010528] pcieport 0000:00:00.0: saving config space at offset 0x28 (readi)
[ 2.018185] pcieport 0000:00:00.0: saving config space at offset 0x2c (readi)
[ 2.025852] pcieport 0000:00:00.0: saving config space at offset 0x30 (readi)
[ 2.033521] pcieport 0000:00:00.0: saving config space at offset 0x34 (readi)
[ 2.041275] pcieport 0000:00:00.0: saving config space at offset 0x38 (readi)
[ 2.048932] pcieport 0000:00:00.0: saving config space at offset 0x3c (readi)
[ 2.057751] mtk-pcie 1a145000.pcie: host bridge /pcie@1a145000 ranges:
[ 2.064357] mtk-pcie 1a145000.pcie: Parsing ranges property...
[ 2.070219] mtk-pcie 1a145000.pcie: MEM 0x0028000000..0x002fffffff -> 00
[ 2.409876] mtk-pcie 1a145000.pcie: Port1 link down
[ 2.414969] mtk-pcie 1a145000.pcie: PCI host bridge to bus 0001:00
[ 2.421180] pci_bus 0001:00: root bus resource [bus 00-ff]
[ 2.426669] pci_bus 0001:00: root bus resource [mem 0x28000000-0x2fffffff]
[ 2.433557] pci_bus 0001:00: scanning bus
[ 2.439452] pci_bus 0001:00: fixups for bus
[ 2.443643] pci_bus 0001:00: bus scan returning with max=00
The problem has been resolved. It was an IP address configuration issue on my side. Thank you for your firmware.
Yes, just that.
The first two-plus years were with the old partition layout made with Daniel's older installer (1.0.3 or older).
The last month has been with partitions from the 1.1.0 installer.
The curious part is that power-off did not help, but totally unplugging for a period did help. Points toward heating/cooling. But I am unsure if that is about cooling the router or about cooling the power adapter.
(Room temperature is 22 degrees Celsius, so the router should get enough cooling from the air)
I do not have serial attached to this router, so I have no idea which of the possible U-Boot error messages listed in this thread the "no LEDs" condition corresponds to.
Ps.
I wonder if attaching a memory chip heat sink to the flash chip would help. Ten years ago I had a router whose CPU overheated, and attaching a small memory heat sink to it helped. (Something in the style of the Raspberry Pi heatsinks here: https://www.amazon.com/Enokay-Cooling-Heatsink-Raspberry-Heatsinks/dp/B014KKY3KI )
Pps.
I add here the LuCI stats temperature graph from the period. It clearly shows the "really cold" start after being unplugged for a while.
Could the new installer somehow have rendered the temperature during flash write and/or read significant in terms of integrity? Somehow we have to account for why the freezer trick fixes things.
Flash write itself seems to have gone ok, as router now seems to operate normally and even the settings were carried over ok.
But maybe the flashing process heats the flash chip so that reading it at the boot occasionally fails for some blocks?
That would be consistent with the freezer trick fixing the boot.
And this did not happen before the new installer, right? So doesn't it mean there is something in the software or a hardware effect caused by the software, which suddenly amplifies the significance of the temperature of the router or flash chip specifically in respect of reads to the point that the router may not boot? Does UBI play a part here?
I did a little digging and perhaps one or more of the following may help:
flash-read-disturb-errors_dsn15.pdf (3.17 MB)
https://www.sciencedirect.com/science/article/abs/pii/S0026271412002831
There are several advantages to SLC NAND flash memory, but one of the most important ones is that it comes with an extended temperature range
Hello, I am using C8051F124 processor for one of the project. The processor has 128KB on-chip, reprogrammable Flash memory for program code or non-volatile data
Flash's endurance limits its employment in data-intensive processing in-memory (PiM) applications. We propose a high-temperature environment and Circadian Rhythm adoption to tackle the endurance limitation of flash memory that is used as a processing...
The following, taken from the first link above, seems potentially significant to me:
... and especially the last paragraph.
So might it not help mitigate temperature-related read problems to alter the way in which the flash memory is read during boot, e.g. by avoiding reading from the same block too quickly?
Can I flash 23.05.3 if I have already run the new installer?
I think categorically not.
It seems we've got some very plausible links between:
- intermittent failures
- temperature-dependent boot
- unusual sensitivity to power conditions
- NAND read access mode (x4, x2), changed in the installer and maybe some 6.x kernel versions?
- why read-write would 'fix' it for some
- maybe even the fix for unreliable NAND with bad blocks, which impelled the move to UBI, has made this more of a problem by retrying reads during boot?
Not sure what the mitigation steps are, but it's satisfying to find plausible mechanisms!
You can see the same situation with the new UBI layout: the router goes dead because BL2 fails to load image id 5, which is the BL2 -> booting BL31 step.
A cold start fixes it, as you can see in the serial output I shared one or two posts above.
I think this needs some attention, and that hardware or heating is not the issue.
Yes, like many users, I never faced such an issue with the previous UBI recovery and installer. I hope most are not hitting it like me, but to be honest, with frequent reboots they eventually will,
as it can be reproduced easily in 5/10 reboots. Thanks.
With the new documents revealed here, I think even more strongly that @daniel's hypothesis is likely at play, too. Combine the two problems together, and the end result is a more likely scenario that results in problems. As I understand it, the chip is being driven at a lower current, which might lead to weaker writes with a reduced charge voltage. Combine that with an increased noise floor from heating (read disturbance), and the comparator is left with an ambiguous value that is improperly interpreted. This would also explain why re-writing the weak bits with the very same data can recover the device so easily but full writes can sometimes corrupt again.
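To make the "two problems combine" argument concrete, here is a toy model. It is not a model of the FM35Q1GA or of any real sensing circuit: it simply assumes the value seen by the comparator sits some margin away from the decision threshold and that Gaussian noise is added on each read. The margin and noise numbers are made up purely for illustration.

```python
import math


def misread_probability(margin, noise_sigma):
    """Chance that zero-mean Gaussian noise pushes a value across the threshold.

    margin:      distance between the sensed level and the threshold
    noise_sigma: standard deviation of the read noise
    Both are in arbitrary units (toy model, hypothetical numbers).
    """
    return 0.5 * math.erfc(margin / (noise_sigma * math.sqrt(2)))


# Hypothetical scenarios: a healthy vs. reduced margin, quiet vs. noisy reads.
cases = {
    "full margin, low noise": (4.0, 1.0),
    "reduced margin, low noise": (2.0, 1.0),
    "full margin, high noise": (4.0, 2.0),
    "reduced margin, high noise": (2.0, 2.0),
}

for name, (margin, sigma) in cases.items():
    print(f"{name}: {misread_probability(margin, sigma):.2e}")
```

In this toy picture either effect on its own raises the misread rate, and the two together raise it much further, which is the shape of the argument above; whether the real chip behaves this way is exactly what testing with the higher driving current would tell us.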
We already know that the bootloader is going to read the blocks multiple times in short order, and we may not be able to do much about that in the short term, if at all. However, if we can test with the driving current set to the same higher value as the OEM firmware used, it's very likely that's all we need.
However, if we can test with the driving current set to the same higher value as the OEM firmware used, it's very likely that's all we need.
I've already applied this in OpenWrt:
Set 12mA driving strength for SPI-NAND pins like the stock firmware's bootloader…
Discussion to adjust SPI pin driving strength also already in BL2 is still ongoing:
mtk-openwrt:mtksoc ← dangowrt:fix-FM35Q1GA
Dont allow x2 read and cache read operations on FM35Q1GA as they [seem to be uns…
However, the v1.1.1 installer at this point still uses the old values. Regenerating the installer would fix that. Re-writing the fip volume using an OpenWrt snapshot from after the above commit would fix it as well.