Netgear R7800 randomly goes down

Hi all,

I've had this issue since I first got my R7800 running OpenWrt 18.06. I'm now on 19.07.6 and it still happens.

In short, I sometimes notice that my internet connection is gone and find that the router is unresponsive: I cannot log in to the LuCI interface or reach the router over SSH. I can sometimes still access other devices on my local network, so it seems like the WAN is down. The only fix I have found is to power-cycle the router.
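One way to tell "the WAN is down" apart from "the router itself is hung" is to run a small probe on a LAN machine that stays powered on. The following is only a minimal sketch under stated assumptions: the function names, the three states, and the ports are made up for illustration (SSH on port 22 for the router and a LAN peer, HTTPS on port 443 for some outside host), and the addresses are supplied by whoever runs it.

```python
import socket
import sys
import time
from datetime import datetime


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refused connections, DNS failures
        return False


def classify(router_up: bool, lan_peer_up: bool, wan_up: bool) -> str:
    """Map three probe results to a coarse state (labels are hypothetical)."""
    if router_up:
        return "ok" if wan_up else "wan-down"
    if lan_peer_up:
        # LAN traffic still flows, but the router does not answer.
        return "router-unresponsive"
    return "lan-unreachable"


def monitor(router: str, lan_peer: str, wan_host: str, interval: float = 30.0) -> None:
    """Print a timestamped line whenever the state changes."""
    last = None
    while True:
        state = classify(
            tcp_reachable(router, 22),     # assumed: SSH enabled on the router
            tcp_reachable(lan_peer, 22),   # assumed: LAN peer runs SSH
            tcp_reachable(wan_host, 443),  # assumed: outside host serves HTTPS
        )
        if state != last:
            print(f"{datetime.now().isoformat(timespec='seconds')} {state}", flush=True)
            last = state
        time.sleep(interval)


if __name__ == "__main__":
    if len(sys.argv) == 4:
        monitor(sys.argv[1], sys.argv[2], sys.argv[3])
    else:
        print("usage: probe.py ROUTER_IP LAN_PEER_IP WAN_HOST")
```

Redirecting the output to a file gives a timestamp for each state change, which can then be compared with whatever the router logged just before that time.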

The problem is that I can't find any logs indicating unusual behavior. logread only shows entries since the last reboot. Below are some of them; Jan 21 is the last time I restarted the device.

Thu Jan 21 12:07:00 2021 daemon.notice netifd: Interface 'wan6' is now up
Thu Jan 21 12:07:00 2021 daemon.info dnsmasq[638]: reading /tmp/resolv.conf.auto
Thu Jan 21 12:07:00 2021 daemon.info dnsmasq[638]: using local addresses only for domain test
Thu Jan 21 12:07:00 2021 daemon.info dnsmasq[638]: using local addresses only for domain onion
Thu Jan 21 12:07:00 2021 daemon.info dnsmasq[638]: using local addresses only for domain localhost
Thu Jan 21 12:07:00 2021 daemon.info dnsmasq[638]: using local addresses only for domain local
Thu Jan 21 12:07:00 2021 daemon.info dnsmasq[638]: using local addresses only for domain invalid
Thu Jan 21 12:07:00 2021 daemon.info dnsmasq[638]: using local addresses only for domain bind
Thu Jan 21 12:07:00 2021 daemon.info dnsmasq[638]: using local addresses only for domain lan
Thu Jan 21 12:07:00 2021 daemon.info dnsmasq[638]: using nameserver redacted
Thu Jan 21 12:07:00 2021 daemon.info dnsmasq[638]: using nameserver redacted
Thu Jan 21 12:07:00 2021 daemon.info dnsmasq[638]: using nameserver redacted
Thu Jan 21 12:07:00 2021 daemon.info dnsmasq[638]: using nameserver redacted
Thu Jan 21 12:07:00 2021 daemon.info dnsmasq[638]: using nameserver redacted
Thu Jan 21 12:07:00 2021 user.notice firewall: Reloading firewall due to ifup of wan6 (eth0.2)
Tue Jan 26 19:39:42 2021 authpriv.info dropbear[2144]: Child connection from redacted
Tue Jan 26 19:39:43 2021 authpriv.notice dropbear[2144]: Password auth succeeded for 'root' from redacted

Nothing from before or when it went down. If it helps, here's my list of packages:

ath10k-firmware-qca9984-ct
base-files
busybox
cgi-io
dnsmasq
dropbear
firewall
fstools
fwtool
getrandom
hostapd-common
ip6tables
iptables
iw
iwinfo
jshn
jsonfilter
kernel
kmod-ata-ahci
kmod-ata-ahci-platform
kmod-ata-core
kmod-ath
kmod-ath10k-ct
kmod-cfg80211
kmod-gpio-button-hotplug
kmod-hwmon-core
kmod-ip6tables
kmod-ipt-conntrack
kmod-ipt-core
kmod-ipt-nat
kmod-ipt-offload
kmod-leds-gpio
kmod-lib-crc-ccitt
kmod-mac80211
kmod-nf-conntrack
kmod-nf-conntrack6
kmod-nf-flow
kmod-nf-ipt
kmod-nf-ipt6
kmod-nf-nat
kmod-nf-reject
kmod-nf-reject6
kmod-nls-base
kmod-ppp
kmod-pppoe
kmod-pppox
kmod-scsi-core
kmod-slhc
kmod-usb-core
kmod-usb-dwc3
kmod-usb-dwc3-of-simple
kmod-usb-ehci
kmod-usb-ledtrig-usbport
kmod-usb-ohci
kmod-usb-phy-qcom-dwc3
kmod-usb2
kmod-usb3
libblobmsg-json
libc
libgcc1
libip4tc2
libip6tc2
libiwinfo-lua
libiwinfo20181126
libjson-c2
libjson-script
liblua5.1.5
liblucihttp-lua
liblucihttp0
libnl-tiny
libpthread
libubox20191228
libubus-lua
libubus20191227
libuci20130104
libuclient20160123
libxtables12
logd
lua
luci
luci-app-firewall
luci-app-opkg
luci-base
luci-lib-ip
luci-lib-jsonc
luci-lib-nixio
luci-mod-admin-full
luci-mod-network
luci-mod-status
luci-mod-system
luci-proto-ipv6
luci-proto-ppp
luci-theme-bootstrap
mtd
netifd
odhcp6c
odhcpd-ipv6only
openwrt-keyring
opkg
ppp

The only notable thing I'm running on my router is a dynamic DNS service. I'm running my own DHCP and DNS server on a LAN device, which the router points to.

I'd appreciate any pointers to help debug this and hopefully resolve it soon. Thank you very much!

I'm on 19.07.7 and I'm still getting this issue. logread still doesn't indicate anything noteworthy to me. The log output level is currently set to Debug - is there a way to get even more detailed logs?

Mon Feb 15 07:23:41 2021 daemon.info dnsmasq[637]: using nameserver 2001:558:feed::1#53
Mon Feb 15 07:23:41 2021 daemon.info dnsmasq[637]: using nameserver 2001:558:feed::2#53
Mon Feb 15 07:23:41 2021 user.notice firewall: Reloading firewall due to ifup of wan6 (eth0.2)
Fri Apr 23 02:46:07 2021 daemon.err uhttpd[1016]: luci: accepted login on / for root from 10.0.0.90
Fri Apr 23 02:46:23 2021 daemon.info hostapd: wlan0: STA c4:9d:ed:11:80:0d IEEE 802.11: authenticated
Fri Apr 23 02:46:23 2021 daemon.notice hostapd: wlan0: STA-OPMODE-N_SS-CHANGED c4:9d:ed:11:80:0d 2
Fri Apr 23 02:46:23 2021 daemon.info hostapd: wlan0: STA c4:9d:ed:11:80:0d IEEE 802.11: associated (aid 1)
Fri Apr 23 02:46:23 2021 daemon.notice hostapd: wlan0: AP-STA-CONNECTED c4:9d:ed:11:80:0d
Fri Apr 23 02:46:23 2021 daemon.info hostapd: wlan0: STA c4:9d:ed:11:80:0d WPA: pairwise key handshake completed (RSN)
Fri Apr 23 02:46:27 2021 daemon.info hostapd: wlan0: STA ee:0f:d3:24:9c:4a IEEE 802.11: authenticated
Fri Apr 23 02:46:27 2021 daemon.info hostapd: wlan0: STA ee:0f:d3:24:9c:4a IEEE 802.11: associated (aid 2)
Fri Apr 23 02:46:27 2021 daemon.notice hostapd: wlan0: AP-STA-CONNECTED ee:0f:d3:24:9c:4a
Fri Apr 23 02:46:27 2021 daemon.info hostapd: wlan0: STA ee:0f:d3:24:9c:4a WPA: pairwise key handshake completed (RSN)
Fri Apr 23 02:46:39 2021 daemon.info hostapd: wlan0: STA 9c:b6:d0:64:65:49 IEEE 802.11: authenticated
Fri Apr 23 02:46:39 2021 daemon.info hostapd: wlan0: STA 9c:b6:d0:64:65:49 IEEE 802.11: associated (aid 3)
Fri Apr 23 02:46:39 2021 daemon.notice hostapd: wlan0: AP-STA-CONNECTED 9c:b6:d0:64:65:49
Fri Apr 23 02:46:39 2021 daemon.info hostapd: wlan0: STA 9c:b6:d0:64:65:49 WPA: pairwise key handshake completed (RSN)
Fri Apr 23 02:47:27 2021 daemon.info hostapd: wlan0: STA 06:6c:e8:51:bf:62 IEEE 802.11: authenticated
Fri Apr 23 02:47:27 2021 daemon.info hostapd: wlan0: STA 06:6c:e8:51:bf:62 IEEE 802.11: associated (aid 4)
Fri Apr 23 02:47:27 2021 daemon.notice hostapd: wlan0: AP-STA-CONNECTED 06:6c:e8:51:bf:62
Fri Apr 23 02:47:27 2021 daemon.info hostapd: wlan0: STA 06:6c:e8:51:bf:62 WPA: pairwise key handshake completed (RSN)
Fri Apr 23 02:47:57 2021 authpriv.info dropbear[2537]: Child connection from 10.0.0.90:57080
Fri Apr 23 02:47:58 2021 authpriv.notice dropbear[2537]: Password auth succeeded for 'root' from 10.0.0.90:57080
Fri Apr 23 02:50:28 2021 authpriv.info dropbear[2537]: Exit (root): Disconnect received
Fri Apr 23 03:04:07 2021 daemon.notice hostapd: wlan0: AP-STA-DISCONNECTED ee:0f:d3:24:9c:4a
Fri Apr 23 03:04:07 2021 daemon.info hostapd: wlan0: STA ee:0f:d3:24:9c:4a IEEE 802.11: disassociated
Fri Apr 23 03:04:08 2021 daemon.info hostapd: wlan0: STA ee:0f:d3:24:9c:4a IEEE 802.11: authenticated
Fri Apr 23 03:04:08 2021 daemon.info hostapd: wlan0: STA ee:0f:d3:24:9c:4a IEEE 802.11: associated (aid 2)
Fri Apr 23 03:04:08 2021 daemon.notice hostapd: wlan0: AP-STA-CONNECTED ee:0f:d3:24:9c:4a
Fri Apr 23 03:04:08 2021 daemon.info hostapd: wlan0: STA ee:0f:d3:24:9c:4a WPA: pairwise key handshake completed (RSN)
Fri Apr 23 03:17:38 2021 kern.warn kernel: [ 1955.274497] ath10k_pci 0000:01:00.0: Invalid VHT mcs 15 peer stats

This issue happens seemingly at random. The router had been running for several months with absolutely no issues, and now the problem has recurred within a few days.

I would sincerely appreciate any pointers to further debug this.

Logs are stored in a RAM buffer (to avoid flash writes), so if you are really talking about a router crash and reboot, the cause is not preserved in the log. After boot, the log shows only what has happened since the reboot.
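Because the RAM buffer is lost on a power cycle, one option is to have the router forward its log to another machine. OpenWrt's logd can send messages to a remote syslog server through the `log_ip` and `log_port` options in `/etc/config/system`. The following is a minimal sketch of a UDP receiver for a LAN host, not a full syslog implementation: the function names, the output format, and the unprivileged port 5514 are assumptions chosen for illustration, and `log_port` on the router would have to be set to match.

```python
import socket
import sys
from datetime import datetime


def format_record(payload: bytes, sender: str, now: datetime) -> str:
    """Prefix a raw syslog datagram with the receive time and sender address."""
    text = payload.decode("utf-8", errors="replace").rstrip("\r\n\x00")
    return f"{now.isoformat(timespec='seconds')} {sender} {text}"


def serve(path: str, bind_addr: str = "0.0.0.0", port: int = 5514) -> None:
    """Append every received datagram to `path`, one line per message."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock, \
            open(path, "a", encoding="utf-8") as out:
        sock.bind((bind_addr, port))
        while True:
            payload, (sender, _port) = sock.recvfrom(8192)
            out.write(format_record(payload, sender, datetime.now()) + "\n")
            out.flush()  # keep the file current in case the receiver is stopped


if __name__ == "__main__":
    if len(sys.argv) == 2:
        serve(sys.argv[1])
    else:
        print("usage: syslog_sink.py OUTPUT_FILE")
```

The receive timestamp comes from the LAN host's clock, so the last lines written before a hang remain readable even if the router's own clock was wrong or the router had to be power-cycled.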

At this point it would probably make sense to just try 21.02.0-rc1 before spending a lot of time trying to debug 19.07.x. There have been considerable changes since 19.07.x (and more are pending, so testing a kernel 5.10/DSA branch might be even more interesting).