Hi guys. I wish you happy holidays.
I want to report the unexpected return of a year-old issue.
I had no major issues for a year.
Now, suddenly, when a torrent runs on a PC and saves to an SSD (attached to the router's USB3 port and shared over SMB with ksmbd), the LuCI web interface and SSH hang completely.
While the router is hung, no DNS requests can be made (so there is no Internet access). Already established connections and traffic continue to work, but no new connections are possible. There are no OOM kills or other crashes, but it seems that none of the services respond.
Once I stop the torrent, both LuCI and SSH recover immediately. DNS and other services recover too, and I can access the Internet as if nothing had ever gone wrong.
The following errors appear in the log. They are obviously a consequence of the services stalling.
daemon.notice hostapd: nl80211: nl80211_recv_beacons->nl_recvmsgs failed: -5
daemon.warn collectd[4112]: Sleeping only 2s because the next interval is 112.108 seconds in the past!
daemon.notice hostapd: nl80211: wpa_driver_nl80211_event_receive->nl_recvmsgs failed: -5
daemon.warn collectd[4112]: plugin_read_thread: read-function of the `cpufreq' plugin took 518.148 seconds, which is above its read interval (30.000 seconds). You might want to adjust the `Interval' or `ReadThreads' settings.
daemon.warn collectd[4112]: plugin_read_thread: read-function of the `thermal' plugin took 516.716 seconds, which is above its read interval (30.000 seconds). You might want to adjust the `Interval' or `ReadThreads' settings.
If I check the statistics, all the graphs have gaps for the period when the hang occurred.
root@QNAP:~# nss_diag
MODEL: QNAP 301w
OPENWRT: r28489-408dbcb419
IPQ BRANCH: main-nss
IPQ COMMIT: 408dbcb419
IPQ DATE: 2024-12-16
NSS FW: NSS.FW.12.2-161-HK.R
MAC80211: v6.11.2-0-g7aa21fec187b
ATH11K FW: WLAN.HK.2.9.0.1-02146-QCAHKSWPL_SILICONZ-1
INTERFACE: br-lan tx-checksumming: on rx-gro-list: off
10g-1 tx-checksumming: on rx-gro-list: off
10g-2 tx-checksumming: on rx-gro-list: off
lan1 tx-checksumming: on rx-gro-list: off
lan2 tx-checksumming: on rx-gro-list: off
lan3 tx-checksumming: on rx-gro-list: off
lan4 tx-checksumming: on rx-gro-list: off
phy0-ap0 tx-checksumming: on rx-gro-list: off
phy1-ap0 tx-checksumming: on rx-gro-list: off
NSS PKGS: kmod-qca-mcs-6.6.65.12.5.2024.02.27~26d6424-r1 aarch64_cortex-a53 {feeds/nss_packages/qca-mcs} () [installed]
kmod-qca-nss-dp-6.6.65.2024.04.16~5bf8b91e-r1 aarch64_cortex-a53 {feeds/base/kernel/qca-nss-dp} () [installed]
kmod-qca-nss-drv-6.6.65.12.5.2024.04.06~53a0dc1-r15 aarch64_cortex-a53 {feeds/nss_packages/qca-nss-drv} () [installed]
kmod-qca-nss-drv-bridge-mgr-6.6.65.12.5.2024.06.12~1bcef16-r7 aarch64_cortex-a53 {feeds/nss_packages/qca-nss-clients} () [installed]
kmod-qca-nss-drv-igs-6.6.65.12.5.2024.06.12~1bcef16-r7 aarch64_cortex-a53 {feeds/nss_packages/qca-nss-clients} () [installed]
kmod-qca-nss-drv-qdisc-6.6.65.12.5.2024.06.12~1bcef16-r7 aarch64_cortex-a53 {feeds/nss_packages/qca-nss-clients} () [installed]
kmod-qca-nss-drv-vlan-mgr-6.6.65.12.5.2024.06.12~1bcef16-r7 aarch64_cortex-a53 {feeds/nss_packages/qca-nss-clients} () [installed]
kmod-qca-nss-ecm-6.6.65.12.5.5.2024.09.02~bd5057b-r3 aarch64_cortex-a53 {feeds/nss_packages/qca-nss-ecm} () [installed]
kmod-qca-ssdk-6.6.65.2024.06.13~c451136b-r3 aarch64_cortex-a53 {feeds/base/kernel/qca-ssdk} () [installed]
nss-firmware-default-2024.08.04~794fe373-r1 aarch64_cortex-a53 {feeds/nss_packages/firmware/nss-firmware} () [installed]
nss-firmware-ipq8074-2024.08.04~794fe373-r1 aarch64_cortex-a53 {feeds/nss_packages/firmware/nss-firmware} () [installed]
This happens only when the torrent is saved to a ksmbd share on an SSD connected to the router's USB3 port.
If I save the torrent to a local disk on the PC, the hang never occurs.
I cannot tell for sure whether ksmbd causes the hang, but watching htop (until SSH hangs) I can see several services, such as dnsmasq and https-dns-proxy, using a lot of CPU before the router stops responding.
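
Since htop is lost as soon as SSH hangs, the CPU usage could instead be recorded to a file and read back after the router recovers. The following is only a minimal sketch, not something I have run during a hang: `log_snapshot` is a hypothetical helper name, and it assumes the `top` on the router accepts the batch-mode options `-b -n 1` (BusyBox and procps both do) and that the log is written to `/tmp`, which is RAM-backed on OpenWrt, so logging does not touch the USB SSD.

```shell
# Hypothetical helper: append one timestamped load/CPU snapshot to a log file.
# Usage: log_snapshot /tmp/hang.log
log_snapshot() {
    out="$1"
    {
        # Marker line so individual snapshots can be told apart later
        echo "=== $(date '+%Y-%m-%d %H:%M:%S')"
        # 1/5/15-minute load averages (empty if /proc/loadavg is unavailable)
        echo "load: $(cat /proc/loadavg 2>/dev/null)"
        # One batch-mode iteration of top, trimmed to the first lines
        top -b -n 1 2>/dev/null | head -n 15
    } >> "$out"
}
```

It could be started in the background before the torrent with `while true; do log_snapshot /tmp/hang.log; sleep 5; done &`. If the loop itself stalls during the hang, the gap between consecutive `===` timestamps would at least show how long the stall lasted.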