R7800 -> Flashing openwrt causes bootloop (bad block in kernel area)

Yes, you are correct, DD-WRT did not work.

This seems like an infrequent issue, but I'm kinda interested. I'll post in a new thread if I find anything interesting.

Likely this part in mtdpart.c (based on grepping for "correct location of partition"):
R7800-V1.0.2.44_gpl_src\git_home\linux.git\sourcecode\drivers\mtd\mtdpart.c

static void correct_rootfs_partition(struct mtd_part *slave)
{
	uint64_t rootfs_offset = 0;
	int bad_blocks = 0;

	/* Search rootfs header and reset the offset and size of rootfs partition */
	if (slave->mtd.name && !strcmp(slave->mtd.name, "rootfs") &&
	    !(find_rootfs_header(slave->master, &slave->mtd, &rootfs_offset, &bad_blocks))) {
		slave->offset += rootfs_offset;
		slave->mtd.size -= rootfs_offset;
		if (slave->master->_block_isbad)
			slave->mtd.ecc_stats.badblocks -= bad_blocks;

		printk(KERN_INFO "the correct location of partition \"%s\": 0x%012llx-0x%012llx\n", slave->mtd.name,
		       (unsigned long long)slave->offset, (unsigned long long)(slave->offset + slave->mtd.size));

		/* Build partition mapping for rootfs partition */
		create_partition_mapping(&slave->mtd);
	}
}
#endif

EDIT:
It seems to be a larger code block inside ifdef DNI_PARTITION_MAPPING, so something that Netgear has added for their mtd version.


I'm happy to test anything. After so many flashes on this router I don't mind a few more :smile: Plus there seem to be quite a few other cases on the net with what was most likely the same issue.

DD-WRT also failed, only Stock and Voxel's worked.

:+1: on getting something like this into OpenWrt.

Bad-block handling (including appropriate reservations) has been haunting me since I started working with low-level NAND early this year. Some thoughts and questions at http://lists.infradead.org/pipermail/openwrt-devel/2019-November/020023.html

Well, this is the whole mtdpart.c file extracted into Github:

It seems to contain rather extensive logic for parsing the partition table with Netgear's extensions. I have not compared it to the standard mtd.


I think I'm experiencing the same issue with my R7800. It keeps boot looping no matter what I flash. Although I don't have any serial port hardware to confirm it. I'm doubting whether I should try and fix it or just buy another router.

How did you come to the conclusion/calculation that it was 128 kB that should've been deleted? And between which two bytes should that amount of white space be deleted in the factory img?

If this is the problem with your unit you could flash the Netgear factory firmware then check the boot log to see if there is a re-allocation message.

128 kilobytes (131072 bytes in decimal, 0x20000 in hex) is the size of one erase block. In the OpenWrt "factory" image, there are a couple of MB of zeros in between the end of the kernel (around 0x250000) and the start of the rootfs. You can remove any 128 kB of them.
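
As a rough sketch of that edit in Python (not a tested procedure for this router): the function below removes the first erase-block-sized run of zero bytes it finds in the padding area. Only the erase block size 0x20000 and the approximate kernel end 0x250000 come from the post above; the function name and the default `search_start` of 0x260000 are assumptions made for illustration, and the sketch does not touch any header or checksum the image format may carry.

```python
ERASE_BLOCK = 0x20000  # 128 kB, one NAND erase block (from the post above)


def drop_one_padding_block(image: bytes, search_start: int = 0x260000) -> bytes:
    """Return a copy of `image` with one all-zero erase block removed.

    `search_start` is an ASSUMED offset just past the end of the kernel
    (the post says the kernel ends around 0x250000); adjust it per image.
    """
    # Round the search start up to an erase-block boundary.
    offset = -(-search_start // ERASE_BLOCK) * ERASE_BLOCK
    zero_block = bytes(ERASE_BLOCK)
    while offset + ERASE_BLOCK <= len(image):
        if image[offset:offset + ERASE_BLOCK] == zero_block:
            # Cut this block out; everything after it moves 128 kB earlier.
            return image[:offset] + image[offset + ERASE_BLOCK:]
        offset += ERASE_BLOCK
    raise ValueError("no all-zero erase block found after search_start")
```

Usage might look like `patched = drop_one_padding_block(open("factory.img", "rb").read())`, where `factory.img` is a placeholder name. Because only zeros are removed, the kernel bytes stay untouched and the rootfs simply starts one erase block earlier in the file.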


A small update: The sysupgrade to hnyman's R7800-owrt1907-r10833-91dde4291c-20191229-1243-sysupgrade.bin worked flawlessly. It seems I only had to flash the initramfs image once and can now upgrade via the sysupgrades.

@mdevreeze: Sadly it seems that many have this issue. I believe the manufacturers buy chips not of the highest quality to keep costs down. If you get a serial adapter it is not too difficult to get it up and running. Deleting the 128kb didn't work for me. I'd be happy to help if you get a serial adapter and the jumper wires. Something like this would do: USB 2.0 to TTL UART 5PIN Module Serial Converter CP2102


Hi guys, it looks like I have a similar problem, see:

It is sad that r7800 is still recommended by openwrt as one of the best routers to buy. And the manufacturer does not release any documentation on this or provide a patch.

On trying to resolve the issue: I do not have a serial cable either, but I could log in to the Netgear factory firmware shell. Is it possible to get some info there?

Got bitten by the same issue, I think (stock FW boots fine, any OpenWRT I try boot loops). Should there be a warning on the Wiki page that this is a common problem?


Will it help if all of us having this issue telnet to the stock FW and print the content of /proc/mtd, like this?

/proc/mtd

dev: size erasesize name
mtd0: 00c80000 00020000 "qcadata"
mtd1: 00500000 00020000 "APPSBL"
mtd2: 00080000 00020000 "APPSBLENV"
mtd3: 00140000 00020000 "art"
mtd4: 00140000 00020000 "artbak"
mtd5: 00400000 00020000 "kernel"
mtd6: 06080000 00020000 "ubi"
mtd7: 00700000 00020000 "reserve"

The info on telnetting to the official FW can be found here:

No.
Mtd info is the same for all.

The problem is that the "factory" initial installation image is one unified piece like kernel+padding+rootfs. And when there is a bad block in the kernel area, ubifs takes care of it and jumps over it, but the start of the rootfs gets written one block too far.

Like this:

OK:
kkkkkkkkkkkkkk...........rrrrrrrrr
One bad block, with the rootfs starting in the wrong location:
kkkkkkXkkkkkkkk...........rrrrrrrrr
How it should be (sysupgrade does it in two pieces, kernel and rootfs separately):
kkkkkkXkkkkkkkk..........rrrrrrrrr

The OEM firmware has logic that notices the bad block and the shifted rootfs address, but OpenWrt does not.
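
A toy model of the diagram, assuming a writer that simply skips a bad block and continues with the next good one. The helper names and the block counts are made up for illustration; this is not the actual flashing code.

```python
def blank_flash(size, bad):
    """Flash as a list of blocks: 'X' marks a bad block, '.' an empty one."""
    return ["X" if i in bad else "." for i in range(size)]


def write_skipping_bad(flash, data, start):
    """Write `data` block by block from index `start`, skipping bad blocks."""
    pos = start
    for block in data:
        while flash[pos] == "X":
            pos += 1  # bad block: everything that follows slips by one
        flash[pos] = block
        pos += 1


KERNEL, PADDING, ROOTFS = "k" * 8, "." * 4, "r" * 6
ROOTFS_START = len(KERNEL) + len(PADDING)  # block 12, where the rootfs is expected

# "Factory" style: kernel + padding + rootfs written as one piece.
one_piece = blank_flash(24, bad={3})
write_skipping_bad(one_piece, KERNEL + PADDING + ROOTFS, start=0)

# sysupgrade style: kernel and rootfs written separately, rootfs at its own offset.
two_pieces = blank_flash(24, bad={3})
write_skipping_bad(two_pieces, KERNEL, start=0)
write_skipping_bad(two_pieces, ROOTFS, start=ROOTFS_START)

print("".join(one_piece))   # rootfs begins at block 13, one block too far
print("".join(two_pieces))  # rootfs begins at block 12, as expected
```

In the one-piece write the skipped block pushes the padding and the rootfs along with the kernel, while the two-piece write absorbs the bad block inside the kernel area.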


So the question is to find the number of bad blocks (Xs) in kkkkkkkkkkkkkk (without serial cable).
Is it possible to see it somehow from the OEM telnet shell?

Finally got into the console of mine (my PL2303 died and I had to order an FTDI to get in). Just wondering how I calculate the offset now.

U-Boot 2012.07 [local,local] (Sep 03 2015 - 17:33:28)

U-boot 2012.07 dni1 V0.4 for DNI HW ID: 29764958 NOR flash 0MB; NAND flash 128MB; RAM 512MB; 1st Radio 4x4; 2nd Radio 4x4; Cascade
smem ram ptable found: ver: 0 len: 5
DRAM:  491 MiB
NAND:  SF: Unsupported manufacturer 00
ipq_spi: SPI Flash not found (bus/cs/speed/mode) = (0/0/48000000/0)
128 MiB
MMC:
*** Warning - bad CRC, using default environment

PCI0 Link Intialized
PCI1 Link Intialized
In:    serial
Out:   serial
Err:   serial
 131072 bytes read: OK
MMC Device 0 not found
cdp: get part failed for 0:HLOS
Net:   MAC1 addr:bc:a5:11:3e:6f:b9
athrs17_reg_init: complete
athrs17_vlan_config ...done
S17c init  done
MAC2 addr:bc:a5:11:3e:6f:b8
eth0, eth1
Hit any key to stop autoboot:  0

 Client starts...[Listening] for ADVERTISE...TTT
Retry count exceeded; boot the image as usual

 nmrp server is stopped or failed !

Loading from device 0: nand0 (offset 0x1480000)
Skipping bad block 0x017e0000

** check kernel image **
   Verifying Checksum ... OK

** check rootfs image **
   Verifying Checksum ... OK
MMC Device 0 not found

Loading from nand0, offset 0x1480000
   Image Name:   Linux-3.4.103
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    2160064 Bytes = 2.1 MiB
   Load Address: 41508000
   Entry Point:  41508000
Automatic boot of image at addr 0x44000000 ...
   Image Name:   Linux-3.4.103
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    2160064 Bytes = 2.1 MiB
   Load Address: 41508000
   Entry Point:  41508000
   Verifying Checksum ... OK
   Loading Kernel Image ... OK
OK
mtdparts variable not set, see 'help mtdparts'
no partitions defined

defaults:
mtdids  : nand0=msm_nand
mtdparts: none
info: "mtdparts" not set
Using machid 0x136c from environment

Starting kernel ...

Booting Linux on physical CPU 0
Linux version 3.4.103 (li.zhang@CNXMDNICP01) (gcc version 4.6.3 20120201 (prerelease) (Linaro GCC 4.6-2012.02) ) #1 SMP Thu Oct 17 15:17:32 CST 2019
CPU: ARMv7 Processor [512f04d0] revision 0 (ARMv7), cr=10c5387d
CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
Machine: Qualcomm Atheros AP161 reference board
QCA command line: console=ttyHSL1,115200n8
DNI command line: console=ttyHSL1,115200n8 ubi.mtd=netgear root=/dev/mtdblock6
msm_reserve_memory: 0x44600000, 0x200000
memory pool 3 (start 5fc00000 size 400000) initialized
Memory policy: ECC disabled, Data cache writealloc
smem_find(137, 80): wrong size 72
socinfo_init: v6, id=280, ver=3.0, raw_id=17, raw_ver=17, hw_plat=0,  hw_plat_ver=65536
 accessory_chip=0 hw_plat_subtype=0
PERCPU: Embedded 8 pages/cpu @c0d56000 s10624 r8192 d13952 u32768
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 123178
Kernel command line: console=ttyHSL1,115200n8 ubi.mtd=netgear root=/dev/mtdblock6
PID hash table entries: 2048 (order: 1, 8192 bytes)
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Memory: 49MB 436MB = 485MB total
Memory: 482488k/488632k available, 14152k reserved, 0K highmem
Virtual kernel memory layout:
    vector  : 0xffff0000 - 0xffff1000   (   4 kB)
    fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
    vmalloc : 0xdf000000 - 0xff000000   ( 512 MB)
    lowmem  : 0xc0000000 - 0xdeb00000   ( 491 MB)
    pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
    modules : 0xbf000000 - 0xbfe00000   (  14 MB)
      .text : 0xc0008000 - 0xc0630000   (6304 kB)
      .init : 0xc0700000 - 0xc0802980   (1035 kB)
      .data : 0xc0804000 - 0xc08a7940   ( 655 kB)
       .bss : 0xc08a7964 - 0xc09525d8   ( 684 kB)
SLUB: Genslabs=11, HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
Hierarchical RCU implementation.
NR_IRQS:1689
sched_clock: 32 bits at 32kHz, resolution 31240ns, wraps every 134175798ms
Console: colour dummy device 80x30
Calibrating delay using timer specific routine.. 12.55 BogoMIPS (lpj=62792)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 512
CPU: Testing write buffer coherency: ok
CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
Setting up static identity map for 0x41952b58 - 0x41952be0
CPU1: Booted secondary processor
CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
Brought up 2 CPUs
SMP: Total of 2 processors activated (25.11 BogoMIPS).
dummy:
NET: Registered protocol family 16
AXI: msm_bus_fabric_init_driver(): msm_bus_fabric_init_driver
meminfo_init: smem ram ptable found: ver: 0 len: 5
Found 1 memory banks grouped into 8 memory regions
gpiochip_add: registered GPIOs 0 to 151 on device: msmgpio
smem_find(137, 80): wrong size 72
socinfo_init: v6, id=280, ver=3.0, raw_id=17, raw_ver=17, hw_plat=0,  hw_plat_ver=65536
 accessory_chip=0 hw_plat_subtype=0
msm_rpm_init: RPM firmware 3.0.16777364
clk_tbl_nss_fast - loaded
msm_dmov_memcpy_init: Success
sps:BAM 0x12244000 enabled: ver:0x5, number of pipes:20
sps:BAM 0x12244000 is registered.
sps:sps is ready.
msm_pcie_setup: link initialized
PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [mem 0x08000000-0x0fefffff]
PCI: bus0: Fast back to back transfers disabled
pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
PCI: bus1: Fast back to back transfers disabled
msm_pcie_setup: link initialized
PCI host bridge to bus 0000:02
pci_bus 0000:02: root bus resource [mem 0x2e000000-0x31efffff]
PCI: bus2: Fast back to back transfers disabled
pci 0000:02:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
PCI: bus3: Fast back to back transfers disabled
msm_pcie_setup: link initialization failed
pci 0000:02:00.0: BAR 8: assigned [mem 0x2e000000-0x2e1fffff]
pci 0000:03:00.0: BAR 0: assigned [mem 0x2e000000-0x2e1fffff 64bit]
pci 0000:02:00.0: PCI bridge to [bus 03-03]
pci 0000:02:00.0:   bridge window [mem 0x2e000000-0x2e1fffff]
PCI: enabling device 0000:02:00.0 (0140 -> 0143)
pci 0000:00:00.0: BAR 8: assigned [mem 0x08000000-0x081fffff]
pci 0000:01:00.0: BAR 0: assigned [mem 0x08000000-0x081fffff 64bit]
pci 0000:00:00.0: PCI bridge to [bus 01-01]
pci 0000:00:00.0:   bridge window [mem 0x08000000-0x081fffff]
PCI: enabling device 0000:00:00.0 (0140 -> 0143)
bio: create slab <bio-0> at 0
SCSI subsystem initialized
spi_qsd spi_qsd.5: master is unqueued, this is deprecated
spi_qsd spi_qsd.6: master is unqueued, this is deprecated
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
Switching to clocksource gp_timer
NET: Registered protocol family 2
create ipmac proc
IP route cache hash table entries: 4096 (order: 2, 16384 bytes)
TCP established hash table entries: 16384 (order: 5, 131072 bytes)
TCP bind hash table entries: 16384 (order: 5, 131072 bytes)
TCP: Hash tables configured (established 16384 bind 16384)
TCP: reno registered
UDP hash table entries: 256 (order: 1, 8192 bytes)
UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)
NET: Registered protocol family 1
smd: register irq failed on wcnss_a11
smd: deregistering IRQs
SMD: smd_core_platform_init() failed
Partition (from dni partition table) qcadata -- Offset:0 Size:64
Partition (from dni partition table) APPSBL -- Offset:64 Size:28
Partition (from dni partition table) APPSBLENV -- Offset:8c Size:4
Partition (from dni partition table) ART -- Offset:90 Size:a
Partition (from dni partition table) ART.bak -- Offset:9a Size:a
Partition (from dni partition table) kernel -- Offset:a4 Size:11
Partition (from dni partition table) rootfs -- Offset:b5 Size:ef
Partition (from dni partition table) netgear -- Offset:1a4 Size:224
Partition (from dni partition table) firmware -- Offset:a4 Size:100
Partition (from dni partition table) crashdump -- Offset:3c8 Size:4
Partition (from dni partition table) language -- Offset:3cc Size:1c
Partition (from dni partition table) config -- Offset:3e8 Size:9
Partition (from dni partition table) pot -- Offset:3f1 Size:9
smem_find(427, 88): wrong size 96
get_bootconfig_partition 0 0 : v2 magic not found
acpuclk-ipq806x acpuclk-ipq806x: SPEED BIN: 0
acpuclk-ipq806x acpuclk-ipq806x: ACPU PVS: 5
acpuclk-ipq806x acpuclk-ipq806x: CPU0: 6 frequencies supported
acpuclk-ipq806x acpuclk-ipq806x: CPU1: 6 frequencies supported
msm_rpm_log_probe: OK
squashfs: version 4.0 (2009/01/31) Phillip Lougher
msgmni has been set to 942
Asymmetric key parser 'x509' registered
io scheduler noop registered
io scheduler deadline registered (default)
Serial: 8250/16550 driver, 2 ports, IRQ sharing disabled
msm_serial_hs: probe of msm_serial_hs.0 failed with error -2
msm_serial_hs module loaded
msm_serial_hsl: detected port #1
msm_serial_hsl.1: ttyHSL1 at MMIO 0x16340000 (irq = 184) is a MSM
msm_serial_hsl: console setup on port #1
console [ttyHSL1] enabled
msm_serial_hsl: driver initialized
ahci ahci.0: forcing PORTS_IMPL to 0x1
ahci ahci.0: AHCI 0001.0300 32 slots 1 ports 6 Gbps 0x1 impl platform mode
ahci ahci.0: flags: ncq sntf pm led clo only pmp pio slum part ccc apst
scsi0 : ahci_platform
ata1: SATA max UDMA/133 mmio [mem 0x29000000-0x2900017f] port 0x100 irq 241
msm_nand_probe: phys addr 0x1ac00000
msm_nand_probe: dmac 0x3
msm_nand_probe: allocated dma buffer at ffdfc000, dma_addr 5f5b8000
status: 20
nandid: 1580a1c2 maker c2 device a1
ONFI probe : Found an ONFI compliant device MX30UF1G18AC
Found a supported NAND device
NAND Controller ID : 0x4030
NAND Device ID  : 0x1580a1c2
Buswidth : 8 Bits
Density  : 128 MByte
Pagesize : 2048 Bytes
Erasesize: 131072 Bytes
Oobsize  : 64 Bytes
CFG0 Init  : 0xa8d408c0
CFG1 Init  : 0x0004745c
ECCBUFCFG  : 0x00000203
Creating 13 MTD partitions on "msm_nand":
0x000000000000-0x000000c80000 : "qcadata"
0x000000c80000-0x000001180000 : "APPSBL"
0x000001180000-0x000001200000 : "APPSBLENV"
0x000001200000-0x000001340000 : "ART"
0x000001340000-0x000001480000 : "ART.bak"
0x000001480000-0x0000016a0000 : "kernel"
0x0000016a0000-0x000003480000 : "rootfs"
mtd: find squashfs magic at 0x16a0000 of "msm_nand"
the correct location of partition "rootfs": 0x0000016a0000-0x000003480000
0x000003480000-0x000007900000 : "netgear"
0x000001480000-0x000003480000 : "firmware"
0x000007900000-0x000007980000 : "crashdump"
ata1: SATA link down (SStatus 0 SControl 300)
0x000007980000-0x000007d00000 : "language"
0x000007d00000-0x000007e20000 : "config"
0x000007e20000-0x000007f40000 : "pot"
m25p80 spi5.0: found pm25lv512, expected s25fl512s
m25p80 spi5.0: pm25lv512 (64 Kbytes)
UBI: attaching mtd7 to ubi0
UBI: physical eraseblock size:   131072 bytes (128 KiB)
UBI: logical eraseblock size:    126976 bytes
UBI: smallest flash I/O unit:    2048
UBI: VID header offset:          2048 (aligned 2048)
UBI: data offset:                4096
UBI: max. sequence number:       236
UBI: attached mtd7 to ubi0
UBI: MTD device name:            "netgear"
UBI: MTD device size:            68 MiB
UBI: number of good PEBs:        548
UBI: number of bad PEBs:         0
UBI: number of corrupted PEBs:   0
UBI: max. allowed volumes:       128
UBI: wear-leveling threshold:    4096
UBI: number of internal volumes: 1
UBI: number of user volumes:     6
UBI: available PEBs:             33
UBI: total number of reserved PEBs: 515
UBI: number of PEBs reserved for bad PEB handling: 5
UBI: max/mean erase counter: 2/1
UBI: image sequence number:  0
UBI: background thread "ubi_bgt0d" started, PID 662

Reason I'm asking is that this thread states "OpenWrt requires the rootfs to exist exactly at 16a0000", but my stock fw says that's where the rootfs is:

mtd: find squashfs magic at 0x16a0000 of "msm_nand"
the correct location of partition "rootfs": 0x0000016a0000-0x000003480000
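
The numbers in the logs can be cross-checked with a little arithmetic. The sketch below rests on two assumptions that the logs suggest but do not state: that the "dni partition table" values are hexadecimal counts of 128 kB erase blocks, and that the partitions in the /proc/mtd listing posted earlier in the thread lie back to back from offset 0 (/proc/mtd itself shows only sizes). All function names are made up for illustration.

```python
ERASE_BLOCK = 0x20000   # 131072 bytes, "Erasesize" in the stock boot log
BAD_BLOCK = 0x017E0000  # from "Skipping bad block 0x017e0000"


def dni_entry_to_bytes(offset_blocks, size_blocks):
    """Convert a DNI table entry (ASSUMED to count erase blocks) to a byte range [start, end)."""
    start = offset_blocks * ERASE_BLOCK
    return start, start + size_blocks * ERASE_BLOCK


def layout_from_sizes(sizes):
    """Map ordered (name, size) pairs to byte ranges, ASSUMING back-to-back placement from 0."""
    ranges, start = {}, 0
    for name, size in sizes:
        ranges[name] = (start, start + size)
        start += size
    return ranges


def holds(byte_range, address):
    return byte_range[0] <= address < byte_range[1]


# Stock layout, from the "Partition (from dni partition table)" lines.
stock_kernel = dni_entry_to_bytes(0xA4, 0x11)  # 0x1480000-0x16a0000
stock_rootfs = dni_entry_to_bytes(0xB5, 0xEF)  # 0x16a0000-0x3480000

# Layout with the 4 MB kernel partition, sizes from the /proc/mtd listing in the thread.
openwrt = layout_from_sizes([
    ("qcadata", 0x00C80000), ("APPSBL", 0x00500000), ("APPSBLENV", 0x00080000),
    ("art", 0x00140000), ("artbak", 0x00140000), ("kernel", 0x00400000),
    ("ubi", 0x06080000), ("reserve", 0x00700000),
])

for label, rng in [("stock kernel", stock_kernel), ("stock rootfs", stock_rootfs),
                   ("OpenWrt kernel", openwrt["kernel"])]:
    print(f"{label}: 0x{rng[0]:07x}-0x{rng[1]:07x} holds bad block: {holds(rng, BAD_BLOCK)}")
```

If those assumptions hold, the stock ranges reproduce what the stock kernel prints, and the block U-Boot skips at 0x017e0000 falls in the stock rootfs range but inside the larger 4 MB kernel partition, which would fit the thread title.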

Here's the OpenWRT crash:

[    2.384575] NET: Registered protocol family 10
[    2.390253] Segment Routing with IPv6
[    2.393185] NET: Registered protocol family 17
[    2.397014] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[    2.401824] 8021q: 802.1Q VLAN Support v1.8
[    2.414401] Registering SWP/SWPB emulation handler
[    2.430348] qcom_rpm 108000.rpm: RPM firmware 3.0.16777364
[    2.442409] s1a: supplied by regulator-dummy
[    2.442511] s1a: Bringing 0uV into 1050000-1050000uV
[    2.446092] s1b: supplied by regulator-dummy
[    2.450741] s1b: Bringing 0uV into 1050000-1050000uV
[    2.455290] s2a: supplied by regulator-dummy
[    2.459960] s2a: Bringing 0uV into 775000-775000uV
[    2.464380] s2b: supplied by regulator-dummy
[    2.468901] s2b: Bringing 0uV into 775000-775000uV
[    2.478138] UBI error: no valid UBI magic found inside í
[    2.484394] VFS: Cannot open root device "(null)" or unknown-block(0,0): error -6
[    2.484415] Please append a correct "root=" boot option; here are the available partitions:
[    2.490940] 1f00           12800 mtdblock0
[    2.490944]  (driver?)
[    2.503169] 1f01            5120 mtdblock1
[    2.503173]  (driver?)
[    2.509680] 1f02             512 mtdblock2
[    2.509684]  (driver?)
[    2.516260] 1f03            1280 mtdblock3
[    2.516265]  (driver?)
[    2.522698] 1f04            1280 mtdblock4
[    2.522702]  (driver?)
[    2.529210] 1f05            4096 mtdblock5
[    2.529215]  (driver?)
[    2.535786] 1f06           98816 mtdblock6
[    2.535790]  (driver?)
[    2.542230] 1f07            7168 mtdblock7
[    2.542234]  (driver?)
[    2.548741] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[    2.551192] CPU1: stopping
[    2.559432] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.14.171 #0
[    2.562117] Hardware name: Generic DT based system
[    2.568292] Function entered at [<c030f1c4>] from [<c030b390>]
[    2.572973] Function entered at [<c030b390>] from [<c07c0664>]
[    2.578787] Function entered at [<c07c0664>] from [<c030e40c>]
[    2.584602] Function entered at [<c030e40c>] from [<c03014b8>]
[    2.590418] Function entered at [<c03014b8>] from [<c030bf8c>]
[    2.596233] Exception stack(0xdd461f80 to 0xdd461fc8)
[    2.602080] 1f80: 00000001 00000000 00000000 c0315100 ffffe000 c0a03cb8 c0a03c6c 00000000
[    2.607204] 1fa0: 00000000 512f04d0 00000000 00000000 dd461fc8 dd461fd0 c030854c c0308550
[    2.615341] 1fc0: 60000013 ffffffff
[    2.623488] Function entered at [<c030bf8c>] from [<c0308550>]
[    2.626789] Function entered at [<c0308550>] from [<c03589c8>]
[    2.632693] Function entered at [<c03589c8>] from [<c0358d10>]
[    2.638508] Function entered at [<c0358d10>] from [<423017cc>]
[    2.644328] Rebooting in 1 seconds..

Update: Replying to myself but maybe it helps others. I ended up going the "sysupgrade via uboot-initramfs" route and got OpenWRT to boot.

Here's an overview on how I did it. DISCLAIMER: Portions of this may be bad practice or even outright wrong. I assume no responsibility for you bricking your hardware, or any other damages, by following this.

Prerequisites:

  • You will need a way to get to the serial console (I'm using an FT232RL. A PL2303 did not work and even died on me while trying.)
  • A PC set to 192.168.1.10. (You may be able to change the IP in uBoot, or even specify which one to use to download, but I preferred not to mess with it and instead reconfigured my PC.)
  • A TFTP server running on that PC (I'm using atftpd on a Linux machine, but you may be able to use tftpd64 on Windows or others)
  • The initramFS uBoot image, as well as Internet access to get the sysupgrade image (https://downloads.openwrt.org/releases/19.07.2/targets/ipq806x/generic/openwrt-19.07.2-ipq806x-generic-netgear_r7800-initramfs-uImage at the time of writing this). Place this image into the root directory of your TFTP server.
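
Before putting the image on the TFTP server, it may be worth checking that the download is intact. The sketch below assumes the initramfs file is a legacy U-Boot uImage, as its name suggests; that format has a 64-byte big-endian header carrying the same fields U-Boot prints as "Image Name", "Data Size", "Load Address" and "Entry Point" in the boot log above. The function name is made up for illustration.

```python
import struct
import zlib

UIMAGE_MAGIC = 0x27051956
# magic, header CRC, time, data size, load addr, entry point, data CRC,
# then os/arch/type/compression bytes and a 32-byte image name.
HEADER = struct.Struct(">7I4B32s")


def check_uimage(blob: bytes) -> dict:
    """Parse a legacy uImage header and verify the header and data CRC32."""
    if len(blob) < HEADER.size:
        raise ValueError("file shorter than a uImage header")
    (magic, hcrc, _time, size, load, entry, dcrc,
     _os, _arch, _type, _comp, name) = HEADER.unpack_from(blob)
    if magic != UIMAGE_MAGIC:
        raise ValueError("not a legacy uImage (bad magic)")
    # The header CRC is computed with its own field set to zero.
    zeroed = blob[:4] + b"\0\0\0\0" + blob[8:HEADER.size]
    if zlib.crc32(zeroed) & 0xFFFFFFFF != hcrc:
        raise ValueError("header CRC mismatch")
    data = blob[HEADER.size:HEADER.size + size]
    if len(data) != size or zlib.crc32(data) & 0xFFFFFFFF != dcrc:
        raise ValueError("data CRC mismatch or truncated file")
    return {"name": name.rstrip(b"\0").decode("ascii", "replace"),
            "size": size, "load": load, "entry": entry}
```

Calling `check_uimage(open(path, "rb").read())` on the downloaded file either returns the header fields or raises `ValueError`, which is roughly the same check U-Boot reports as "Verifying Checksum ... OK".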

Procedure:

  1. Get to the serial console. Make sure your serial client (PuTTY in my case) is configured for 115200 baud, 8N1, and set 'Flow Control' to None (if this is set to Xon/Xoff, you won't be able to type into your console).
  2. While booting, you should see a prompt that says "Hit any key to stop autoboot". Press a key and you should end up in uBoot, with a prompt that looks like this:
(IPQ) #
  3. You can type 'printenv' to get some information on the uBoot settings. Mine look like this:
(IPQ) # printenv
baudrate=115200
bootargs=console=ttyHSL1,115200n8
bootcmd=sleep 2;   nmrp;  if loadn_dniimg 0 0x1480000 0x44000000 && chk_dniimg 0x44000000; then bootipq2; else fw_recovery; fi
bootdelay=2
eth1addr=bc:a5:11:3e:6f:b8
ethact=eth0
ethaddr=bc:a5:11:3e:6f:b9
ipaddr=192.168.1.1
loadaddr=0x42000000
machid=136c
modelid=R7800
serverip=192.168.1.10
stderr=serial
stdin=serial
stdout=serial
updateloader=ipq_nand linux && nand erase 0x01180000 0x00080000 && imgaddr=0x42000000 && source $imgaddr:script

(note the "serverip" indicating which tftp server uBoot will use by default)

  4. Now download the initramfs image to RAM using this command:
tftpboot openwrt-19.07.2-ipq806x-generic-netgear_r7800-initramfs-uImage

(You may be able to do a "tftpboot 192.168.1.10:openwrt-19.07.2-ipq806x-generic-netgear_r7800-initramfs-uImage" here to specify the IP. I never tested it with anything other than the default.)

  5. Finally, boot that image with this command:
bootm

After a while, OpenWRT should come up and be available at the default IP (192.168.1.1). From there, you can then issue a sysupgrade (without keeping settings).

I did it manually via SSH:

  1. Configured default gateway and /etc/resolv.conf so that I had internet.
  2. Downloaded the sysupgrade image with wget:
    cd /tmp ; wget http://downloads.openwrt.org/releases/19.07.2/targets/ipq806x/generic/openwrt-19.07.2-ipq806x-generic-netgear_r7800-squashfs-sysupgrade.bin
  3. Launched the sysupgrade:
    sysupgrade -n openwrt-19.07.2-ipq806x-generic-netgear_r7800-squashfs-sysupgrade.bin

Here to say that I believe I'm in the same unfortunate boat.

Just received a new R7800 today, and any flash attempt results in a boot loop. I can successfully go back to the vendor R7800-V1.0.2.68.img, but haven't managed to successfully flash OpenWrt yet (despite years of doing so on other equipment).

Should I be looking to just get another R7800, and maybe it won't have any of the same bad blocks?

I've read through all the related threads I can find on this. Did I miss any options for confirming the bad blocks, and/or successfully flashing OpenWrt - without opening the case (and violating the warranty / options for return)? Are there any custom builds I could try over TFTP (no serial or JTAG) that would overcome this?

This was intended to replace my TP-Link AC1750 / Archer C7 V2, on which I cannot rely for stable wireless connections (Archer C7 2.4 GHz wireless dies in 24~48 hours, Archer C7 V2 - Massive Problems). If anyone has any better recommendations for similar hardware that is well-supported under OpenWrt, please share...

Very unlikely, as I assume you tried to flash OpenWrt from the webinterface and not the only correct way of using push-button tftp recovery for the initial flashing of OpenWrt (since 18.06.x the flash partitioning differs between the OEM firmware and OpenWrt, as the OEM partition for the kernel has gotten too small). Only the tftp flashing method can deal with differing flash partitioning layouts (re-create the UBI volume), while flashing from the webinterface will lead to the symptoms you describe.

Of course, it is possible that you still have to deal with bad blocks at an inconvenient location, but my bets would be on you having used the wrong flashing procedure.

Very unlikely, as I assume you tried to flash OpenWrt from the webinterface and not the only correct way of using push-button tftp recovery for the initial flashing of OpenWrt

No, I only used TFTP.

Boot log of stock firmware should show where bad blocks are.

Boot log of stock firmware should show where bad blocks are.

Is there any way I can get that without cracking the case open? (From the original vendor firmware somehow? As Bugfunder was asking above?)