I followed the Netgear R7800 exploration thread and learned that hnyman made a pull request that lets ramoops/pstore keep the kernel log across a random reboot.
Basically this means that all you need to do is install the kmod-ramoops package, which pulls in kmod-pstore and kmod-reed-solomon as dependencies. You can either include it in your own diffconfig or install it with opkg later on. When you install it with opkg, you need to reboot once so that everything is set up before a random reboot happens. After that reboot you can check that it is properly set up with:
root@router1:~# logread | grep -i -E "pstore|ramo"
Wed Feb 9 23:49:14 2022 kern.info kernel: [ 16.377464] pstore: Using crash dump compression: deflate
Wed Feb 9 23:49:14 2022 kern.info kernel: [ 16.377494] pstore: Registered ramoops as persistent store backend
Wed Feb 9 23:49:14 2022 kern.info kernel: [ 16.381920] ramoops: using 0x40000@0x42100000, ecc: 0
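If you want to script that check (for example in a post-upgrade sanity script), here is a minimal sketch. The function name ramoops_registered is my own invention, and it only looks for the "Registered ramoops" line shown above, so it assumes that line is still in the log buffer when you run it.

```shell
#!/bin/sh
# Hypothetical helper: reads a kernel log on stdin and succeeds only if
# it contains the line showing ramoops registered as the pstore backend.
ramoops_registered() {
    grep -q "pstore: Registered ramoops as persistent store backend"
}

# Usage on the router (left commented out so the file can be sourced safely):
#   logread | ramoops_registered && echo "ramoops active"
```

On a router with a long uptime the boot messages may already have rotated out of logread, in which case the check fails even though ramoops is fine.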
Then I did:
echo c > /proc/sysrq-trigger
and immediately my ap-R7800 rebooted. Afterwards I found a file in /sys/fs/pstore:
root@ap-R7800:~# ls -l /sys/fs/pstore/
-r--r--r-- 1 root root 27182 Sep 9 14:40 dmesg-ramoops-0
It had the following content:
<4>[877148.946890] br-lan: received packet on eth1.1 with own address as source address (addr:12:33:3b:4c:73:ec, vlan:0)
<4>[877151.426933] br-lan: received packet on eth1.1 with own address as source address (addr:12:33:3b:4c:73:ec, vlan:0)
<4>[877151.427177] br-lan: received packet on eth1.1 with own address as source address (addr:12:33:3b:4c:73:ec, vlan:0)
<4>[877153.621894] br-lan: received packet on eth1.1 with own address as source address (addr:12:33:3b:4c:73:ec, vlan:0)
<4>[877153.622162] br-lan: received packet on eth1.1 with own address as source address (addr:12:33:3b:4c:73:ec, vlan:0)
<4>[882242.977602] ath10k_pci 0000:01:00.0: Invalid peer id 476 peer stats buffer
<4>[886458.948966] ath10k_pci 0000:01:00.0: Invalid peer id 482 peer stats buffer
<4>[886515.128957] ath10k_pci 0000:01:00.0: Invalid peer id 477 peer stats buffer
<4>[928488.983994] ath10k_pci 0000:01:00.0: Invalid peer id 490 peer stats buffer
<6>[940901.364018] sysrq: Trigger a crash
<0>[940901.364058] Kernel panic - not syncing: sysrq triggered crash
<2>[940901.366335] CPU0: stopping
<4>[940901.372228] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.138 #0
<4>[940901.374914] Hardware name: Generic DT based system
<4>[940901.381190] [<c030e46c>] (unwind_backtrace) from [<c030a204>] (show_stack+0x14/0x20)
<4>[940901.385955] [<c030a204>] (show_stack) from [<c062ef48>] (dump_stack+0x94/0xa8)
<4>[940901.393938] [<c062ef48>] (dump_stack) from [<c030d190>] (do_handle_IPI+0x140/0x184)
<4>[940901.401054] [<c030d190>] (do_handle_IPI) from [<c030d1f0>] (ipi_handler+0x1c/0x2c)
<4>[940901.409043] [<c030d1f0>] (ipi_handler) from [<c037174c>] (__handle_domain_irq+0x90/0xf4)
<4>[940901.416429] [<c037174c>] (__handle_domain_irq) from [<c06482e0>] (gic_handle_irq+0x90/0xb8)
<4>[940901.424759] [<c06482e0>] (gic_handle_irq) from [<c0300b8c>] (__irq_svc+0x6c/0x90)
<4>[940901.433252] Exception stack(0xc0d01ee0 to 0xc0d01f28)
<4>[940901.440647] 1ee0: 00000000 000357be 1cd4a000 dd990d80 00000000 abadff60 c1ca8840 00000000
<4>[940901.445773] 1f00: dd990030 000357be 00000000 000357be 826a5280 c0d01f30 c07b64ac c07b64cc
<4>[940901.454003] 1f20: 60000013 ffffffff
<4>[940901.462255] [<c0300b8c>] (__irq_svc) from [<c07b64cc>] (cpuidle_enter_state+0x180/0x380)
<4>[940901.465989] [<c07b64cc>] (cpuidle_enter_state) from [<c07b671c>] (cpuidle_enter+0x3c/0x5c)
<4>[940901.474061] [<c07b671c>] (cpuidle_enter) from [<c034e670>] (do_idle+0x208/0x2a4)
<4>[940901.482216] [<c034e670>] (do_idle) from [<c034e9c8>] (cpu_startup_entry+0x1c/0x20)
<4>[940901.489866] [<c034e9c8>] (cpu_startup_entry) from [<c0c01008>] (start_kernel+0x528/0x538)
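Because the dump only lives in the reserved RAM region, it makes sense to copy it somewhere persistent before the next power cycle. A minimal sketch, assuming /root/pstore as the destination (my choice, any persistent path will do); the function name save_pstore_dumps is hypothetical as well:

```shell
#!/bin/sh
# Hypothetical helper: copy every pstore record from SRC into DEST,
# prefixing each copy with a timestamp. Prints the number of files copied.
save_pstore_dumps() {
    src="${1:-/sys/fs/pstore}"
    dest="${2:-/root/pstore}"      # assumed destination, not from the original post
    stamp="$(date +%Y%m%d-%H%M%S)"
    count=0
    mkdir -p "$dest" || return 1
    for f in "$src"/*; do
        [ -f "$f" ] || continue    # also skips the unexpanded glob when SRC is empty
        cp "$f" "$dest/$stamp-$(basename "$f")" || return 1
        count=$((count + 1))
    done
    echo "$count"
}

# Usage on the router:
#   save_pstore_dumps            # copies /sys/fs/pstore/* to /root/pstore
```

The helper leaves the originals in place. According to the kernel's ramoops documentation, removing a file from /sys/fs/pstore deletes that record from RAM, so you can clean up by hand once the copy looks good.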
So I've now learned and confirmed how to verify that ramoops is working. I got a bit wiser again today, and hopefully sharing this helps others verify it too.
I went through the Netgear R7800 exploration thread and saw that @quarky and @Ansuel were chasing down a kernel crash issue and were looking into CPU frequency changes as a possible cause. Around that time the WiFi slowdown related to ATF/AQL was also showing up, and a lot of joint effort was put into that too. Ansuel found a few strange things in old R7800 code around February this year. Some time later, in March, he discovered some more strange things. It's technical, but I think I can follow most of it. Up until this point I kept getting the idea that setting the CPU frequency to a fixed value might be a solution. When I read this post, I'm even more convinced to try a fixed CPU frequency on an R7800. Maybe not the performance governor, but fixed at 1400MHz or so?
Now, in reading all of this I'm developing a theory as I write. It appears that there's something strange in the R7800 CPU frequency code when scaling up or down. We all know that we need to increase the minimum frequency to at least 600MHz, or perhaps 800MHz, to improve stability, right? This has been a known issue on the R7800 for a long time anyway. Now bear with me: what if there's some kind of regression or enhancement somewhere in the master code for kernel 5.10 that defeats the stability we had on the R7800 platform up until kernel 5.4? That would mean that if you have not fixed the CPU frequency to any value, your R7800 changes frequency on demand. This is where NSS acceleration, and its earlier lack of PPPoE support, comes into play. Before we had NSS acceleration with PPPoE, our CPUs were probably maxed out most of the time, with hardly any on-demand frequency switching between 600/800MHz and 1700MHz, so any issue with CPU frequency switching on kernel 5.10 would probably go unnoticed for a while. With NSS acceleration on PPPoE we now see the CPUs hardly doing any work, until a burst of work comes along and the CPU quickly changes frequency. And after reading all of these discoveries in R7800 code that seemed to be "quickly built without proper documentation", it wouldn't surprise me if it's hit and miss whether a CPU frequency change ends in a random reboot or not...
The latest work from Ansuel is a fix for 5.15 for the cache scaling driver. From the looks of it, he put serious effort into this CPU frequency scaling issue. I don't think this has been backported to 5.10 (yet).
So, if anything, I'm going to build a fresh 22.03 image, keep my config, but set the CPU frequency to a fixed value. Something like this:
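To see whether a given router is actually scaling on demand, you can look at what cpufreq reports. A minimal sketch, assuming the standard Linux cpufreq sysfs layout under /sys/devices/system/cpu/cpufreq; the function name show_cpufreq is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print governor and min/cur/max frequency (in kHz)
# for every cpufreq policy under ROOT (defaults to the usual sysfs path).
show_cpufreq() {
    root="${1:-/sys/devices/system/cpu/cpufreq}"
    for p in "$root"/policy*; do
        [ -d "$p" ] || continue    # no policies found, print nothing
        printf '%s governor=%s min=%s cur=%s max=%s\n' \
            "$(basename "$p")" \
            "$(cat "$p/scaling_governor")" \
            "$(cat "$p/scaling_min_freq")" \
            "$(cat "$p/scaling_cur_freq")" \
            "$(cat "$p/scaling_max_freq")"
    done
}

# Usage on the router:
#   show_cpufreq
```

If min and max differ and cur moves between runs while you generate some load, the CPU is switching frequency on demand, which is exactly the situation the theory above is about.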