Thanks for the extra details. This highlights an additional problem I had forgotten about: I didn't add kmod-ramoops to DEVICE_PACKAGES, so it doesn't get included by default. (I've enabled it for my own local builds.) If you are feeling adventurous, you could enable kmod-ramoops for your own builds too.
Separately, I noticed another problem: the kmod-ramoops package will log panics, but it won't log kernel messages for other occasions, like clean reboots, or the kernel messages leading up to a hardware watchdog event. I can fix that by enabling CONFIG_PSTORE_CONSOLE=y in the kernel -- I might send such a patch, to improve the ramoops logging in the future.
In the meantime, I've already been running 1 OnHub with these logging fixes, and it rebooted with no explanation recently. Unfortunately, the ramoops dump from that reboot shows nothing useful. This suggests the kernel didn't print anything relevant before going down, and that the reboot was likely caused by a hardware watchdog event or something similar.
So, that doesn't give me very many leads at the moment.
One other data point: I have 3 OnHubs running (1 ASUS and 2 TP-Link), and only 1 of them (my TP-Link test device) has rebooted like this. The other 2 have been running without issue for ~18 days. There are two main differences between the good devices and the bad one:
- The good ones are running an image based off commit 895f38ca1efe. The bad one is running off commit 7396263680b9.
- The good ones are actively running a mesh network that I occasionally use. The bad one is just sitting idle on my desk most of the time, with no wireless active.
I'd guess that the second difference (usage pattern) is more relevant than the first (base commit), since the ipq806x-related changes between the two commits are minimal. But I suppose there's always room for regression in there. For one, there are some 5.15.x kernel bumps in there.
So, still no great leads. It might help if others could get kmod-ramoops + CONFIG_PSTORE_CONSOLE=y builds running, and provide the contents of /sys/fs/pstore/ if/when there are failures. I pushed my latest work to my branch again. I don't expect any different result, but it could be good data anyway.
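If it helps with sharing the pstore contents, here is a rough Python sketch that bundles every record in a directory into one tarball. It is only an illustration, not part of my branch: the function name `bundle_pstore` and the output filename are made up, and it assumes Python 3 is available wherever you run it (either on the device, or on a workstation after copying /sys/fs/pstore/ off the router).

```python
import tarfile
from pathlib import Path


def bundle_pstore(src_dir, dest_tar):
    """Archive every regular file in src_dir into a gzip tarball.

    Hypothetical helper for illustration only. Returns the sorted list
    of file names that were added, so the caller can see what was found.
    """
    src = Path(src_dir)
    added = []
    with tarfile.open(dest_tar, "w:gz") as tar:
        for entry in sorted(src.iterdir()):
            # Skip subdirectories and anything that is not a plain file.
            if entry.is_file():
                # Store only the base name so the archive has a flat layout.
                tar.add(entry, arcname=entry.name)
                added.append(entry.name)
    return added
```

Calling `bundle_pstore("/sys/fs/pstore", "pstore-dump.tar.gz")` after an unexpected reboot should produce a single file to attach here. An empty returned list means no records were saved for that reboot.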