Netgear R7800 exploration (IPQ8065, QCA9984)

Sysupgrade is perfectly fine.

A factory image flash is only needed when the MTD partition layout changes. Here we only change how the reserved memory is allocated and used.
I honestly think we should consider enabling this feature by default. It would help with debugging the ath10k problem and the crash problem. (For example, we might find that the panic is not random but is actually the same for everyone, and we just never noticed.)

Yeah, with mt7622 this revealed a common standard error: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000053 which then led to https://github.com/openwrt/mt76/issues/565 and fix.

Yes I will probably be enabling this on my personal builds at the least.
It would be great if this could be enabled for more devices more easily, instead of through individual DTS mods.

It really depends on how U-Boot initializes the memory. If it's cleared on reboot, then RIP pstore.

Thanks for the example @quarky,

this seems to work nicely with R7800 in master:

Kernel config change + DTS change:

--- a/target/linux/ipq806x/config-5.10
+++ b/target/linux/ipq806x/config-5.10
@@ -365,6 +365,19 @@ CONFIG_POWER_RESET_MSM=y
 CONFIG_POWER_SUPPLY=y
 CONFIG_PPS=y
 CONFIG_PRINTK_TIME=y
+CONFIG_PSTORE=y
+# CONFIG_PSTORE_842_COMPRESS is not set
+CONFIG_PSTORE_COMPRESS=y
+CONFIG_PSTORE_COMPRESS_DEFAULT="deflate"
+# CONFIG_PSTORE_CONSOLE is not set
+CONFIG_PSTORE_DEFLATE_COMPRESS=y
+CONFIG_PSTORE_DEFLATE_COMPRESS_DEFAULT=y
+# CONFIG_PSTORE_LZ4HC_COMPRESS is not set
+# CONFIG_PSTORE_LZ4_COMPRESS is not set
+# CONFIG_PSTORE_LZO_COMPRESS is not set
+# CONFIG_PSTORE_PMSG is not set
+CONFIG_PSTORE_RAM=y
+# CONFIG_PSTORE_ZSTD_COMPRESS is not set
 CONFIG_PTP_1588_CLOCK=y
 # CONFIG_QCOM_A53PLL is not set
 CONFIG_QCOM_ADM=y
@@ -397,6 +410,9 @@ CONFIG_QCOM_WDT=y
 # CONFIG_QCS_TURING_404 is not set
 CONFIG_RAS=y
 CONFIG_RATIONAL=y
+CONFIG_REED_SOLOMON=y
+CONFIG_REED_SOLOMON_DEC8=y
+CONFIG_REED_SOLOMON_ENC8=y
 CONFIG_REGMAP=y
 CONFIG_REGMAP_MMIO=y
 CONFIG_REGULATOR=y
--- a/target/linux/ipq806x/files/arch/arm/boot/dts/qcom-ipq8065-nighthawk.dtsi
+++ b/target/linux/ipq806x/files/arch/arm/boot/dts/qcom-ipq8065-nighthawk.dtsi
@@ -13,6 +13,15 @@
 			reg = <0x5fe00000 0x200000>;
 			reusable;
 		};
+
+		ramoops@42100000 {
+			compatible = "ramoops";
+			reg = <0x42100000 0x40000>;
+			record-size = <0x4000>;
+			console-size = <0x4000>;
+			ftrace-size = <0x4000>;
+			pmsg-size = <0x4000>;
+		};
 	};
 
 	aliases {

pstore file after reboot:

 -----------------------------------------------------
 OpenWrt SNAPSHOT, r18562-0765466a42
 -----------------------------------------------------
root@router1:~# ls -l /sys/fs/pstore/
-r--r--r--    1 root     root         27199 Jan 14 18:30 dmesg-ramoops-0
root@router1:~# cat /sys/fs/pstore/dmesg-ramoops-0
Panic#1 Part1
...
<6>[  131.097393] br-lan: port 2(wlan0) entered forwarding state
<6>[  132.324175] sysrq: Trigger a crash
<0>[  132.324215] Kernel panic - not syncing: sysrq triggered crash
<2>[  132.326486] CPU1: stopping
<4>[  132.332288] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.10.90 #0
<4>[  132.334893] Hardware name: Generic DT based system
<4>[  132.341070] [<c030e32c>] (unwind_backtrace) from [<c030a1ac>] (show_stack+0x14/0x20)
<4>[  132.345668] [<c030a1ac>] (show_stack) from [<c062eac8>] (dump_stack+0x94/0xa8)
<4>[  132.353561] [<c062eac8>] (dump_stack) from [<c030d050>] (do_handle_IPI+0x140/0x184)
<4>[  132.360593] [<c030d050>] (do_handle_IPI) from [<c030d0b0>] (ipi_handler+0x1c/0x2c)
<4>[  132.368145] [<c030d0b0>] (ipi_handler) from [<c0370f7c>] (__handle_domain_irq+0x90/0xf4)
<4>[  132.375789] [<c0370f7c>] (__handle_domain_irq) from [<c0648e20>] (gic_handle_irq+0x90/0xb8)
<4>[  132.384034] [<c0648e20>] (gic_handle_irq) from [<c0300b0c>] (__irq_svc+0x6c/0x90)
<4>[  132.392100] Exception stack(0xc146df18 to 0xc146df60)
<4>[  132.399739] df00:                                                       00000000 0000001e
<4>[  132.404784] df20: 1cd5a000 dd9a0cc0 00000000 cf226ee0 c1c73040 00000000 dd99ffb0 0000001e
<4>[  132.412944] df40: 00000000 0000001e 0003fd40 c146df68 c07b5fec c07b600c 60000013 ffffffff
<4>[  132.421106] [<c0300b0c>] (__irq_svc) from [<c07b600c>] (cpuidle_enter_state+0x180/0x380)
<4>[  132.429256] [<c07b600c>] (cpuidle_enter_state) from [<c07b625c>] (cpuidle_enter+0x3c/0x5c)
<4>[  132.437417] [<c07b625c>] (cpuidle_enter) from [<c034df10>] (do_idle+0x208/0x2a4)
<4>[  132.445487] [<c034df10>] (do_idle) from [<c034e268>] (cpu_startup_entry+0x1c/0x20)
<4>[  132.453040] [<c034e268>] (cpu_startup_entry) from [<4230152c>] (0x4230152c)
root@router1:~#

(I applied the DTS change to the combined R7800+XR500 Nighthawk .dtsi in master, but in 21.02 it would be directly to the R7800 .dts file.)
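
For reading the records back after a crash, here is a minimal sketch of a helper that copies anything found in /sys/fs/pstore to persistent storage and then deletes the originals, which releases their slots in the ramoops region. The function name `save_pstore` and the destination `/root/pstore` are only assumptions for this example, not something from the patch.

```shell
#!/bin/sh
# Sketch: copy pstore crash records to persistent storage, then free them.
# save_pstore [SRC] [DST]
#   SRC defaults to /sys/fs/pstore
#   DST defaults to /root/pstore (hypothetical location chosen for this example)
# Prints the number of records saved.
save_pstore() {
    src="${1:-/sys/fs/pstore}"
    dst="${2:-/root/pstore}"
    mkdir -p "$dst" || return 1
    count=0
    for f in "$src"/dmesg-ramoops-*; do
        [ -f "$f" ] || continue    # glob did not match: nothing to save
        name="$(basename "$f")-$(date +%Y%m%d-%H%M%S)"
        # Remove the record only after a successful copy.
        cp "$f" "$dst/$name" && rm -f "$f" && count=$((count + 1))
    done
    echo "$count"
}
```

It could be called once at boot, for example from /etc/rc.local. The test crash in the log above was triggered through sysrq (`echo c > /proc/sysrq-trigger`), which reboots the router immediately, so only do that on a device you can afford to crash.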

Thanks, I was stuck on the dts part but your diff helped.

Interesting: while bisecting a problem with SMEM and USB in 5.15, I discovered that the R7800 has SMEM entries declared. This is what is present:

[    1.861491] 12 qcomsmem partitions found on MTD device qcom_nand.0
[    1.868867] Creating 12 MTD partitions on "qcom_nand.0":
[    1.874930] 0x000000000000-0x000000040000 : "0:sbl1"
[    1.881801] 0x000000040000-0x000000180000 : "0:mibib"
[    1.888496] 0x000000180000-0x0000002c0000 : "0:sbl2"
[    1.893522] 0x0000002c0000-0x000000540000 : "0:sbl3"
[    1.905073] 0x000000540000-0x000000660000 : "0:ddrconfig"
[    1.908028] 0x000000660000-0x000000780000 : "0:ssd"
[    1.912473] 0x000000780000-0x000000a00000 : "0:tz"
[    1.919771] 0x000000a00000-0x000000c80000 : "0:rpm"
[    1.925380] 0x000000c80000-0x000001180000 : "0:appsbl"
[    1.935811] 0x000001180000-0x000001200000 : "0:appsblenv"
[    1.937573] 0x000001200000-0x000001340000 : "0:art"
[    1.943463] 0x000001340000-0x000005340000 : "ubi"
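
As a quick sanity check on the layout above, partition sizes can be derived from the printed ranges. This is only a small sketch; the helper name `part_size_kib` is made up for the example. For the "ubi" range it works out to 64 MiB.

```shell
#!/bin/sh
# part_size_kib RANGE
#   RANGE is "START-END" in hex, as printed in the boot log above.
#   Prints the partition size in KiB.
part_size_kib() {
    start=${1%%-*}    # text before the dash
    end=${1##*-}      # text after the dash
    echo $(( ($end - $start) / 1024 ))
}

part_size_kib 0x000001340000-0x000005340000    # the "ubi" partition
```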

@robimarko I'm a bit confused: is SMEM something from U-Boot, or something set in the compilation process of the boot partitions?

SMEM is a way to share the items on a HEAP between all of the processors, be it the ARM cores or various other stuff they have.
One of the things it provides is a partition map that, as far as I know, is read from the MIBIB partition. This is something that they modify (or usually don't) via an XML file as part of QSDK, which then spits out an image to be flashed.

It usually also provides the boot media type (NOR, NAND, etc.), block size, and some more details about the boot media, but those are platform-dependent and the upstream parser doesn't implement them.
So if both SPI-NOR and NAND are used and partitions are defined for both types of devices, the parser will populate the partition table with all of them.

U-boot (At least the QCA fork) just has commands to see the table and interact with it (Although limited).

Other than partition table it can provide all kinds of things depending on the platform used.

smem = shared memory ?
Ok as I suspected, it's defined at compile time with the xml file and put in the mibib partition... This is probably the default partition layout from the qsdk. Not usable.

Works nicely using @KONG 5.10-nss repo build + @hnyman ramoops patch.
Definitely +1 for including it by default.

Edit: Forgot to credit @quarky who brought this pstore/ramoops up.

The @KONG build always works flawlessly. Never a problem. Full repository support for all needs.

Ramoops/pstore also works in 21.02 with the patch modified as follows:

--- a/target/linux/ipq806x/config-5.4
+++ b/target/linux/ipq806x/config-5.4
@@ -404,6 +404,19 @@ CONFIG_POWER_RESET_MSM=y
 CONFIG_POWER_SUPPLY=y
 CONFIG_PPS=y
 CONFIG_PRINTK_TIME=y
+CONFIG_PSTORE=y
+# CONFIG_PSTORE_842_COMPRESS is not set
+CONFIG_PSTORE_COMPRESS=y
+CONFIG_PSTORE_COMPRESS_DEFAULT="deflate"
+# CONFIG_PSTORE_CONSOLE is not set
+CONFIG_PSTORE_DEFLATE_COMPRESS=y
+CONFIG_PSTORE_DEFLATE_COMPRESS_DEFAULT=y
+# CONFIG_PSTORE_LZ4HC_COMPRESS is not set
+# CONFIG_PSTORE_LZ4_COMPRESS is not set
+# CONFIG_PSTORE_LZO_COMPRESS is not set
+# CONFIG_PSTORE_PMSG is not set
+CONFIG_PSTORE_RAM=y
+# CONFIG_PSTORE_ZSTD_COMPRESS is not set
 CONFIG_PTP_1588_CLOCK=y
 # CONFIG_QCOM_A53PLL is not set
 CONFIG_QCOM_ADM=y
@@ -439,6 +452,9 @@ CONFIG_RCU_CPU_STALL_TIMEOUT=21
 CONFIG_RCU_NEED_SEGCBLIST=y
 CONFIG_RCU_STALL_COMMON=y
 CONFIG_REFCOUNT_FULL=y
+CONFIG_REED_SOLOMON=y
+CONFIG_REED_SOLOMON_DEC8=y
+CONFIG_REED_SOLOMON_ENC8=y
 CONFIG_REGMAP=y
 CONFIG_REGMAP_MMIO=y
 CONFIG_REGULATOR=y
--- a/target/linux/ipq806x/files/arch/arm/boot/dts/qcom-ipq8065-r7800.dts
+++ b/target/linux/ipq806x/files/arch/arm/boot/dts/qcom-ipq8065-r7800.dts
@@ -16,6 +16,15 @@
 			reg = <0x5fe00000 0x200000>;
 			reusable;
 		};
+
+		ramoops@42100000 {
+			compatible = "ramoops";
+			reg = <0x42100000 0x40000>;
+			record-size = <0x4000>;
+			console-size = <0x4000>;
+			ftrace-size = <0x4000>;
+			pmsg-size = <0x4000>;
+		};
 	};
 
 	aliases {
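
For the sizes in the ramoops node above, the split of the region can be worked out with a little arithmetic. This sketch assumes, based on my reading of the ramoops driver, that the dmesg area is whatever remains after the console, ftrace and pmsg areas are subtracted, and that it is divided into slots of record-size bytes. The helper name `ramoops_dmesg_slots` is made up for the example.

```shell
#!/bin/sh
# ramoops_dmesg_slots REGION RECORD CONSOLE FTRACE PMSG
#   All arguments are sizes in bytes (hex is accepted).
#   Assumption: dmesg area = REGION - CONSOLE - FTRACE - PMSG,
#   split into slots of RECORD bytes each. Prints the slot count.
ramoops_dmesg_slots() {
    region=$1; record=$2; console=$3; ftrace=$4; pmsg=$5
    dump=$(( $region - $console - $ftrace - $pmsg ))
    [ "$dump" -gt 0 ] || { echo 0; return 1; }
    echo $(( dump / $record ))
}

# Values from the DTS node: reg size 0x40000, all other sizes 0x4000.
ramoops_dmesg_slots 0x40000 0x4000 0x4000 0x4000 0x4000
```

With these numbers the 256 KiB region leaves room for 13 crash records of 16 KiB each, under the assumption stated above.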

Does anyone know what happens if ramoops is enabled but there isn't a RAM partition named "ramoops" for it to dump to? Does it just silently fail?
My main question/concern: any other images built without the DTS modification would still be functioning firmware for that device, right?

From what I can remember, the ramoops module will not load if it's not configured properly.
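
To see which case a given image is in, one option is to check the running device tree for the node. This is a rough sketch; it assumes the node sits under reserved-memory (as in the diffs above) and that the device tree is exposed at /proc/device-tree. The helper name `has_ramoops_node` is made up for the example.

```shell
#!/bin/sh
# has_ramoops_node [DT_ROOT]
#   Succeeds if the device tree has a reserved-memory child whose name
#   starts with "ramoops". DT_ROOT defaults to /proc/device-tree.
has_ramoops_node() {
    dt="${1:-/proc/device-tree}"
    for n in "$dt"/reserved-memory/ramoops*; do
        [ -d "$n" ] && return 0
    done
    return 1
}

if has_ramoops_node; then
    echo "ramoops node present"
else
    echo "no ramoops node"
fi
```

If the node is present and a crash has been recorded, the records show up under /sys/fs/pstore after the reboot.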

After 80 days of uptime, my R7800 running 21.02.1 just suddenly hung. I had to power-cycle it. +1 to enable ramoops by default.

Hi,
Is there anyone who has 5.15 + the DSA PR applied? I'm away from my main home and I need to do some tests. Can anyone help me?

Where can I find the PR for the R7800?

I could try it

pr 4828

Note the first post in this pull request indicates you will also have to apply pr 4036 and pr 4748. You will likely want to apply both of these before 4828, in that order (i.e. 4036, 4748, and then 4828).

I have not tried 4828 yet (but 4036 on its own is very stable for me).

EDIT: pr 4828 is no longer needed or used. Apply pr [4036] and pr [4748] as described in the first comment in pr 4748. These pr's likely will continue to change so you'll have to keep up with the discussion.

I've been planning to test 5.15 + DSA, but even if I do, I don't have an R7800, so it might not help @ansuel. Give it a shot if you're feeling adventurous.

HTH

The idea is that I need an R7800 with serial access so I can test an initramfs image. But I guess I will just wait until I'm back home on Tuesday.