Ipq806x NSS build (Netgear R7800 / TP-Link C2600 / Linksys EA8500)

I was posting on KONG's thread but I assume this works here too. For the past month, without any changes to my network environment, I've been experiencing dropouts on a few 2.4 GHz clients every other minute.

I've tried troubleshooting by reverting all the way down to v19, coming back to v23 with CT and non-CT drivers, restoring factory settings, and connecting each client one at a time to check stability. Unfortunately, none of it was conclusive.

The best result came after flashing the latest 23.05 build and using channel 6: I had 5 hours of stable connections before the router started deauthing clients again.

A few log entries from the past few days are below:

Fri Jun  7 17:54:41 2024 daemon.info hostapd: phy1-ap0: STA ec:fa:bc:7c:72:8e WPA: pairwise key handshake completed (RSN)
Fri Jun  7 17:54:41 2024 daemon.notice hostapd: phy1-ap0: EAPOL-4WAY-HS-COMPLETED ec:fa:bc:7c:72:8e
Fri Jun  7 17:54:42 2024 daemon.info dnsmasq-dhcp[11052]: DHCPDISCOVER(br-lan) ec:fa:bc:7c:72:8e
Fri Jun  7 17:54:42 2024 daemon.info dnsmasq-dhcp[11052]: DHCPOFFER(br-lan) 192.168.1.210 ec:fa:bc:7c:72:8e
Fri Jun  7 17:54:46 2024 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED ec:fa:bc:7c:72:8e
Fri Jun  7 17:54:46 2024 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED c8:2b:96:04:bd:1a
Fri Jun  7 17:54:47 2024 daemon.notice hostapd: phy1-ap0: STA ec:fa:bc:7c:72:8e IEEE 802.11: did not acknowledge authentication response
Fri Jun  7 17:54:47 2024 daemon.info hostapd: phy1-ap0: STA c8:2b:96:04:bd:1a IEEE 802.11: authenticated
Fri Jun  7 17:54:47 2024 daemon.info hostapd: phy1-ap0: STA c8:2b:96:04:bd:1a IEEE 802.11: associated (aid 7)
Fri Jun  7 17:54:56 2024 daemon.info hostapd: phy1-ap0: STA c8:2b:96:04:bd:1a IEEE 802.11: deauthenticated due to local deauth request
Fri Jun  7 17:55:12 2024 daemon.info hostapd: phy1-ap0: STA ec:fa:bc:7c:72:8e IEEE 802.11: authenticated
Fri Jun  7 17:55:12 2024 daemon.info hostapd: phy1-ap0: STA ec:fa:bc:7c:72:8e IEEE 802.11: associated (aid 4)
Fri Jun  7 17:55:21 2024 daemon.info hostapd: phy1-ap0: STA ec:fa:bc:7c:72:8e IEEE 802.11: deauthenticated due to local deauth request
Fri Jun  7 17:55:23 2024 daemon.info hostapd: phy1-ap0: STA c8:2b:96:04:bd:1a IEEE 802.11: authenticated
[   66.024137] br-lan: port 3(phy0-ap0) entered forwarding state
[  676.957838] ath10k_pci 0001:01:00.0: failed to flush transmit queue (skip 0 ar-state 1): 0
[  797.598971] ath10k_pci 0001:01:00.0: failed to flush transmit queue (skip 0 ar-state 1): 0
[ 1140.077592] ath10k_pci 0001:01:00.0: failed to flush transmit queue (skip 0 ar-state 1): 0
[ 1311.187312] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 1311.187349] rcu: 	0-...!: (0 ticks this GP) idle=478/0/0x0 softirq=33440/33440 fqs=0  (false positive?)
[ 1311.192042] 	(detected by 1, t=2102 jiffies, g=53989, q=477)
[ 1311.201413] Sending NMI from CPU 1 to CPUs 0:
[ 1311.207319] NMI backtrace for cpu 0
[ 1311.207325] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.15.158 #0
[ 1311.207331] Hardware name: Generic DT based system
[ 1311.207333] PC is at arch_cpu_idle+0x38/0x3c
[ 1311.207347] LR is at arch_cpu_idle+0x34/0x3c
[ 1311.207353] pc : [<c03071dc>]    lr : [<c03071d8>]    psr: 60000013
[ 1311.207357] sp : c0d01f60  ip : de80400c  fp : c0d04f98
[ 1311.207359] r10: c0d04f08  r9 : ffffe000  r8 : 00000000
[ 1311.207362] r7 : 00000000  r6 : c0d00000  r5 : c0d04f68  r4 : 00000000
[ 1311.207365] r3 : c0316a00  r2 : 00000001  r1 : 00000000  r0 : 00bf6530
[ 1311.207368] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[ 1311.207373] Control: 10c5787d  Table: 45e8c06a  DAC: 00000051
[ 1311.207376] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.15.158 #0
[ 1311.207380] Hardware name: Generic DT based system
[ 1311.207387] [<c030ddd8>] (unwind_backtrace) from [<c0309d28>] (show_stack+0x10/0x14)
[ 1311.207400] [<c0309d28>] (show_stack) from [<c05eb0a4>] (dump_stack_lvl+0x40/0x4c)
[ 1311.207415] [<c05eb0a4>] (dump_stack_lvl) from [<c05f295c>] (nmi_cpu_backtrace+0xc4/0x110)
[ 1311.207427] [<c05f295c>] (nmi_cpu_backtrace) from [<c030c698>] (do_handle_IPI+0x5c/0x12c)
[ 1311.207437] [<c030c698>] (do_handle_IPI) from [<c030c780>] (ipi_handler+0x18/0x20)
[ 1311.207445] [<c030c780>] (ipi_handler) from [<c0370df0>] (handle_percpu_devid_irq+0x78/0x13c)
[ 1311.207457] [<c0370df0>] (handle_percpu_devid_irq) from [<c036afe4>] (handle_domain_irq+0x5c/0x78)
[ 1311.207471] [<c036afe4>] (handle_domain_irq) from [<c03012e4>] (gic_handle_irq+0x7c/0x90)
[ 1311.207482] [<c03012e4>] (gic_handle_irq) from [<c0300b7c>] (__irq_svc+0x5c/0x78)
[ 1311.207491] Exception stack(0xc0d01f10 to 0xc0d01f58)
[ 1311.207496] 1f00:                                     00bf6530 00000000 00000001 c0316a00
[ 1311.207500] 1f20: 00000000 c0d04f68 c0d00000 00000000 00000000 ffffe000 c0d04f08 c0d04f98
[ 1311.207504] 1f40: de80400c c0d01f60 c03071d8 c03071dc 60000013 ffffffff
[ 1311.207507] [<c0300b7c>] (__irq_svc) from [<c03071dc>] (arch_cpu_idle+0x38/0x3c)
[ 1311.207517] [<c03071dc>] (arch_cpu_idle) from [<c034f3e0>] (do_idle+0x23c/0x29c)
[ 1311.207534] [<c034f3e0>] (do_idle) from [<c034f744>] (cpu_startup_entry+0x18/0x1c)
[ 1311.207545] [<c034f744>] (cpu_startup_entry) from [<c0c011a0>] (start_kernel+0x6b4/0x6c4)
[ 1311.208316] rcu: rcu_sched kthread timer wakeup didn't happen for 2103 jiffies! g53989 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[ 1311.418521] rcu: 	Possible timer handling issue on cpu=0 timer-softirq=23471
Jun  8 16:05:03 OpenWrt hostapd: wlan1: STA 8e:85:80:0a:e4:9a IEEE 802.11: authenticated
Jun  8 16:05:03 OpenWrt hostapd: wlan1: STA 8e:85:80:0a:e4:9a IEEE 802.11: associated (aid 3)
Jun  8 16:05:12 OpenWrt hostapd: wlan1: STA 8e:85:80:0a:e4:9a IEEE 802.11: deauthenticated due to local deauth request
Jun  8 16:05:19 OpenWrt hostapd: wlan1: STA b8:2d:28:1b:ef:1c IEEE 802.11: authenticated
Jun  8 16:05:19 OpenWrt hostapd: wlan1: STA 78:0f:77:fd:83:66 IEEE 802.11: authenticated
Jun  8 16:05:35 OpenWrt hostapd: wlan1: STA b8:2d:28:1b:ef:1c IEEE 802.11: did not acknowledge authentication response
Jun  8 16:05:35 OpenWrt hostapd: wlan1: STA 78:0f:77:fd:96:4f IEEE 802.11: authenticated
Jun  8 16:05:38 OpenWrt hostapd: wlan1: STA 78:0f:77:fd:96:4f IEEE 802.11: authenticated
Jun  8 16:05:56 OpenWrt igmpproxy[4565]: MRT_DEL_MFC; Errno(2): No such file or directory
Jun  8 16:06:00 OpenWrt hostapd: wlan1: STA 78:0f:77:fd:96:4f IEEE 802.11: authenticated
Jun  8 16:06:00 OpenWrt hostapd: wlan1: STA 78:0f:77:fd:96:4f IEEE 802.11: associated (aid 3)
Jun  8 16:06:03 OpenWrt hostapd: wlan1: STA 78:0f:77:fd:83:66 IEEE 802.11: authenticated
Jun  8 16:06:03 OpenWrt hostapd: wlan1: STA 78:0f:77:fd:83:66 IEEE 802.11: associated (aid 4)
Jun  8 16:06:05 OpenWrt hostapd: wlan1: AP-STA-CONNECTED 78:0f:77:fd:83:66
Jun  8 16:06:05 OpenWrt hostapd: wlan1: STA 78:0f:77:fd:83:66 RADIUS: starting accounting session 98F5F94467AD77BF
Jun  8 16:06:05 OpenWrt hostapd: wlan1: STA 78:0f:77:fd:83:66 WPA: pairwise key handshake completed (RSN)
Jun  8 16:06:05 OpenWrt hostapd: wlan1: EAPOL-4WAY-HS-COMPLETED 78:0f:77:fd:83:66
Jun  8 16:06:26 OpenWrt hostapd: wlan1: AP-STA-DISCONNECTED b4:8a:0a:c5:3d:30
Jun  8 16:06:26 OpenWrt hostapd: wlan1: STA b4:8a:0a:c5:3d:30 IEEE 802.11: authenticated
Jun  8 16:06:31 OpenWrt hostapd: wlan1: AP-STA-DISCONNECTED 78:0f:77:fd:83:66
Jun  8 16:06:33 OpenWrt hostapd: wlan1: STA 78:0f:77:fd:83:66 IEEE 802.11: did not acknowledge authentication response
Jun  8 16:06:38 OpenWrt igmpproxy[4565]: MRT_DEL_MFC; Errno(2): No such file or directory
Jun  8 16:06:42 OpenWrt hostapd: wlan1: STA 8e:85:80:0a:e4:9a IEEE 802.11: authenticated
Jun  8 16:06:42 OpenWrt hostapd: wlan1: STA 8e:85:80:0a:e4:9a IEEE 802.11: associated (aid 1)
Jun  8 16:06:43 OpenWrt hostapd: wlan1: STA c8:2b:96:04:bd:1a IEEE 802.11: did not acknowledge authentication response
Sun Jun  9 16:06:11 2024 kern.info kernel: [  711.998838] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0  arvif->paused: 0x0
Sun Jun  9 16:06:12 2024 kern.info kernel: [  712.058720] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0  arvif->paused: 0x0
Sun Jun  9 16:06:12 2024 kern.info kernel: [  712.208711] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0  arvif->paused: 0x0
Sun Jun  9 16:06:12 2024 kern.info kernel: [  712.348699] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0  arvif->paused: 0x0
Sun Jun  9 16:06:12 2024 kern.info kernel: [  712.478700] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0  arvif->paused: 0x0
Sun Jun  9 16:06:12 2024 kern.info kernel: [  712.608738] ath10k_pci 0001:01:00.0: mac flush vdev 0 drop 0 queues 0x1 ar->paused: 0x0  arvif->paused: 0x0
Sun Jun  9 16:06:12 2024 kern.info kernel: [  712.617682] ath10k_pci 0001:01:00.0: mac flush null vif, drop 0 queues 0xffff
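
One way to see which clients are affected most in excerpts like the ones above is to count the "local deauth request" lines per station. This is only a rough sketch: it assumes syslog-style hostapd lines where the client MAC directly follows the literal token `STA`, and the function name `count_deauths` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count hostapd "local deauth request" events per client.
# Reads log text on stdin and prints "<count> <mac>", highest count first.
count_deauths() {
    awk '/deauthenticated due to local deauth request/ {
        # The MAC address is the field right after the "STA" token.
        for (i = 1; i <= NF; i++)
            if ($i == "STA") { n[$(i + 1)]++; break }
    }
    END { for (mac in n) print n[mac], mac }' | sort -rn
}

# Typical use on the router (not run here):
#   logread | count_deauths
```

If the same few MAC addresses dominate the output, the problem is more likely tied to those clients than to the radio as a whole.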

I had an Asus RT-AC87U around, so I set it up with the same SSID, channels, clients and wifi configs. Everything has been stable for the past 24h, with no disconnections/deauths. This makes me believe something is wrong with my R7800 firmware/configs.

I'd appreciate some guidance on fixing this issue. 5 GHz works flawlessly.

You have multiple issues, but this line and subsequent dump show a design issue that hasn't been addressed and won't disappear on its own. It is likely separate from the disconnect issues but will probably confound your efforts at a stable network.

It likely began with the move to kernel 5.15. Stick with OpenWrt v22 or OEM firmware if you want to troubleshoot this. In the March-April '24 timeframe, I dug into this and posted initial findings here, but nobody has continued this work.

Since it's beyond me, I dusted off my Netgear R6700 v2, which works brilliantly.

Have you tried recent master snapshot images on your R7800 to see if the RCU problem still persists? With master snapshot images, I can get as much as 650 Mbps on 5 GHz Wi-Fi (with software offloading enabled), so I have stopped using NSS images.

In the past, I think you said your WAN connectivity is not very fast (less than 50 Mbps?), so there's no reason to use NSS images at all.
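
For reference, on a stock (non-NSS) OpenWrt build, software flow offloading is normally a single option in the defaults section of /etc/config/firewall. A minimal sketch, assuming the default firewall config layout:

```
# /etc/config/firewall (fragment; add the option to the existing defaults section)
config defaults
	option flow_offloading '1'	# software flow offloading
```

After editing, `/etc/init.d/firewall restart` should apply the change.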

Have you tried to use a vanilla (non NSS) image to see if the problem persists? Master snapshots on R7800 work very well for me these days.

Also, it looks like you have a few Espressif ESP32 devices that may keep going into low-power mode. You may also want to try these settings on your 2.4 GHz radio.

  • Disable (uncheck): "Disassociate On Low Acknowledgement"

and/or

  • Enable (check): "Disable Inactivity Polling"
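
For anyone who prefers editing the config directly, the two LuCI checkboxes above should correspond to the `disassoc_low_ack` and `skip_inactivity_poll` options in /etc/config/wireless. A sketch only: the section name `default_radio1` is an assumption (it is the usual default name, but check which wifi-iface section belongs to your 2.4 GHz radio).

```
# /etc/config/wireless (fragment; add the two options to the existing 2.4 GHz wifi-iface section)
config wifi-iface 'default_radio1'	# assumed section name
	option disassoc_low_ack '0'	# uncheck "Disassociate On Low Acknowledgement"
	option skip_inactivity_poll '1'	# check "Disable Inactivity Polling"
```

Run `wifi reload` afterwards to apply the change.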

I did try those wifi options, but haven't had the chance to use a non-NSS firmware. I will try this weekend and report back. Thanks for the suggestion!

I have included QAM-256 and will test it on my main router which serves around 20 clients.

Just to note that QAM-256 doesn't work on channel 1 for me. I don't know the reason but the radio simply doesn't start.

I suppose you will push the commits to your repo when you finish testing QAM-256.

I have an XR500 running 23.05. This morning the download speed was very slow (50 Mbps max). I attempted to reboot the router via LuCI but the command failed. Next, I logged in via ssh and issued the reboot command. Nothing happened. How does one reboot an OpenWrt router without power cycling at the plug?

root@OpenWrt:~# reboot
root@OpenWrt:~# uptime
 06:31:47 up 30 days, 12:47,  load average: 0.00, 0.00, 0.00

@KONG Just for your information:
I have built an image from your sources using the ath10k firmware and kmod.
After almost two weeks of working fine, the 5 GHz radio stopped working:

Tue Jun 25 02:18:59 2024 daemon.err hostapd: 20/40 MHz: center segment 0 (=138) and center freq 1 (=5670) not in sync
Tue Jun 25 02:18:59 2024 kern.info kernel: [1175802.738603] device phy0-ap0 left promiscuous mode
Tue Jun 25 02:18:59 2024 kern.info kernel: [1175802.738736] br-home: port 2(phy0-ap0) entered disabled state
Tue Jun 25 02:18:59 2024 daemon.notice netifd: Network device 'phy0-ap0' link is down
Tue Jun 25 02:19:00 2024 kern.warn kernel: [1175802.810781] ath10k_pci 0000:01:00.0: could not get mac80211 beacon
Tue Jun 25 02:19:00 2024 kern.warn kernel: [1175802.913065] ath10k_pci 0000:01:00.0: could not get mac80211 beacon
Tue Jun 25 02:19:00 2024 kern.warn kernel: [1175803.015464] ath10k_pci 0000:01:00.0: could not get mac80211 beacon
Tue Jun 25 02:19:00 2024 kern.warn kernel: [1175803.117865] ath10k_pci 0000:01:00.0: could not get mac80211 beacon
Tue Jun 25 02:19:00 2024 kern.warn kernel: [1175803.220264] ath10k_pci 0000:01:00.0: could not get mac80211 beacon
Tue Jun 25 02:19:00 2024 kern.warn kernel: [1175803.322669] ath10k_pci 0000:01:00.0: could not get mac80211 beacon
Tue Jun 25 02:19:00 2024 kern.warn kernel: [1175803.425068] ath10k_pci 0000:01:00.0: could not get mac80211 beacon
Tue Jun 25 02:19:00 2024 kern.warn kernel: [1175803.527466] ath10k_pci 0000:01:00.0: could not get mac80211 beacon
Tue Jun 25 02:19:00 2024 kern.warn kernel: [1175803.629867] ath10k_pci 0000:01:00.0: could not get mac80211 beacon
Tue Jun 25 02:19:00 2024 kern.warn kernel: [1175803.732266] ath10k_pci 0000:01:00.0: could not get mac80211 beacon
Tue Jun 25 02:19:05 2024 kern.warn kernel: [1175807.819212] ath10k_warn: 39 callbacks suppressed
Tue Jun 25 02:19:05 2024 kern.warn kernel: [1175807.819225] ath10k_pci 0000:01:00.0: peer-unmap-event: unknown peer id 1
Tue Jun 25 02:19:05 2024 kern.warn kernel: [1175807.822973] ath10k_pci 0000:01:00.0: peer-unmap-event: unknown peer id 1
Tue Jun 25 02:19:05 2024 kern.warn kernel: [1175807.829844] ath10k_pci 0000:01:00.0: peer-unmap-event: unknown peer id 1
Tue Jun 25 02:19:05 2024 daemon.info avahi-daemon[1862]: Interface phy0-ap0.IPv6 no longer relevant for mDNS.
Tue Jun 25 02:19:05 2024 daemon.info avahi-daemon[1862]: Leaving mDNS multicast group on interface phy0-ap0.IPv6 with address fe80::deef:9ff:fexx:xxxx.
Tue Jun 25 02:19:05 2024 daemon.info avahi-daemon[1862]: Withdrawing address record for fe80::deef:9ff:fef3:269c on phy0-ap0.
Tue Jun 25 02:19:11 2024 kern.warn kernel: [1175814.108403] ath10k_pci 0000:01:00.0: Unknown eventid: 36933
Tue Jun 25 02:19:11 2024 kern.info kernel: [1175814.122214] br-home: port 2(phy0-ap0) entered blocking state
Tue Jun 25 02:19:11 2024 kern.info kernel: [1175814.122244] br-home: port 2(phy0-ap0) entered disabled state
Tue Jun 25 02:19:11 2024 kern.info kernel: [1175814.127099] device phy0-ap0 entered promiscuous mode
Tue Jun 25 02:19:11 2024 daemon.err hostapd: could not get valid channel
Tue Jun 25 02:49:02 2024 daemon.err hostapd: could not get valid channel
Tue Jun 25 02:49:02 2024 daemon.err hostapd: 20/40 MHz: center segment 0 (=138) and center freq 1 (=5670) not in sync
Tue Jun 25 02:49:02 2024 daemon.err hostapd: Can't set freq params
Tue Jun 25 02:49:02 2024 daemon.err hostapd: DFS start_dfs_cac() failed, -1
Tue Jun 25 02:49:02 2024 daemon.err hostapd: 20/40 MHz: center segment 0 (=138) and center freq 1 (=5670) not in sync
Tue Jun 25 02:49:02 2024 daemon.err hostapd: Can't set freq params
Tue Jun 25 02:49:02 2024 daemon.err hostapd: DFS start_dfs_cac() failed, -1
Tue Jun 25 02:49:02 2024 daemon.err hostapd: 20/40 MHz: center segment 0 (=138) and center freq 1 (=5670) not in sync
Tue Jun 25 02:49:02 2024 daemon.err hostapd: Can't set freq params
Tue Jun 25 02:49:02 2024 daemon.err hostapd: DFS start_dfs_cac() failed, -1

The 2.4 GHz band is still fine.

I've restarted the 5 GHz radio and it is working again:

Tue Jun 25 08:52:30 2024 daemon.err hostapd: rmdir[ctrl_interface=/var/run/hostapd]: Permission denied
Tue Jun 25 08:52:30 2024 daemon.err hostapd: hostapd_free_hapd_data: Interface phy0-ap0 wasn't started
Tue Jun 25 08:52:30 2024 kern.warn kernel: [1199412.827080] ath10k_pci 0000:01:00.0: peer-unmap-event: unknown peer id 1
Tue Jun 25 08:52:30 2024 kern.info kernel: [1199412.911045] br-home: port 2(phy0-ap0) entered disabled state
Tue Jun 25 08:52:30 2024 kern.info kernel: [1199412.912537] device phy0-ap0 left promiscuous mode
Tue Jun 25 08:52:30 2024 kern.info kernel: [1199412.915879] br-home: port 2(phy0-ap0) entered disabled state
Tue Jun 25 08:52:30 2024 kern.info kernel: [1199412.997739] phy0-ap0: Destroyed NSS virtual interface
Tue Jun 25 08:52:30 2024 daemon.notice netifd: Wireless device 'radio0' is now down
Tue Jun 25 08:52:30 2024 daemon.notice netifd: radio0 (7952): WARNING: Variable 'data' does not exist or is not an array/object
Tue Jun 25 08:52:30 2024 kern.err kernel: [1199413.529402] debugfs: File 'virt_if' in directory 'stats' already present!
Tue Jun 25 08:52:30 2024 kern.info kernel: [1199413.529739] phy0-ap0: Created a NSS virtual interface
Tue Jun 25 08:52:37 2024 kern.warn kernel: [1199419.772228] ath10k_pci 0000:01:00.0: Unknown eventid: 36933
Tue Jun 25 08:52:37 2024 kern.info kernel: [1199419.778579] br-home: port 2(phy0-ap0) entered blocking state
Tue Jun 25 08:52:37 2024 kern.info kernel: [1199419.778611] br-home: port 2(phy0-ap0) entered disabled state
Tue Jun 25 08:52:37 2024 kern.info kernel: [1199419.783513] device phy0-ap0 entered promiscuous mode
Tue Jun 25 08:52:37 2024 kern.info kernel: [1199419.789280] br-home: port 2(phy0-ap0) entered blocking state
Tue Jun 25 08:52:37 2024 kern.info kernel: [1199419.794253] br-home: port 2(phy0-ap0) entered forwarding state
Tue Jun 25 08:52:37 2024 kern.info kernel: [1199419.800817] br-home: port 2(phy0-ap0) entered disabled state
Tue Jun 25 08:52:37 2024 daemon.notice netifd: Wireless device 'radio0' is now up
Tue Jun 25 08:53:40 2024 kern.info kernel: [1199483.097509] IPv6: ADDRCONF(NETDEV_CHANGE): phy0-ap0: link becomes ready
Tue Jun 25 08:53:40 2024 kern.info kernel: [1199483.097654] br-home: port 2(phy0-ap0) entered blocking state
Tue Jun 25 08:53:40 2024 kern.info kernel: [1199483.103278] br-home: port 2(phy0-ap0) entered forwarding state
Tue Jun 25 08:53:40 2024 daemon.notice netifd: Network device 'phy0-ap0' link is up
Tue Jun 25 08:53:40 2024 user.info usteer: Creating local node hostapd.phy0-ap0
Tue Jun 25 08:53:40 2024 user.info usteer: Found nl80211 phy on wdev hostapd.phy0-ap0, ssid=MY_WIFI_5G
Tue Jun 25 08:53:40 2024 user.info usteer: Connecting to local node hostapd.phy0-ap0
Tue Jun 25 08:53:42 2024 daemon.info avahi-daemon[1862]: Joining mDNS multicast group on interface phy0-ap0.IPv6 with address fe80::deef:9ff:fexx:xxxx.
Tue Jun 25 08:53:42 2024 daemon.info avahi-daemon[1862]: New relevant interface phy0-ap0.IPv6 for mDNS.
Tue Jun 25 08:53:42 2024 daemon.info avahi-daemon[1862]: Registering new address record for fe80::deef:9ff:fexx:xxxx on phy0-ap0.*.
