Ath11k possible firmware bug - traffic interruptions when a client disconnects from WLAN

I saw it half an hour ago. I recompiled, and now I have a different problem:

Wed Aug  2 17:24:13 2023 daemon.err hostapd: nl80211: disabling multicast to unicast failed with -19 (No such device) on interface phy0-ap0
Wed Aug  2 17:24:13 2023 daemon.warn hostapd: phy0-ap0: Could not connect to kernel driver
Wed Aug  2 17:24:13 2023 daemon.err hostapd: Interface initialization failed

I again need to restart radio0 manually.

Updating to today's latest snapshot via ImageBuilder ASU on the Xiaomi AX3600 is unusable for me because the mesh SSID does not come up.

Is anyone with an account on the ath11k bug report page willing to file a bug?

To the developers: I've just tried the official regular OpenWrt master build (non-NSS) with the latest ath11k firmware, WLAN.HK.2.9.0.1-01862, and the issue is reproducible every time with the setup described in the first post.
I've even tried resetting to defaults, without success.

I've also tried the official regular OpenWrt master build (non-NSS) with ath11k firmware WLAN.HK.2.9.0.1-01385. With the same settings and packages installed, the issue does not occur.

@robimarko, @Ansuel, @nbd, @kirdes, @hnyman, @quarky.

I can still crash ath11k by switching a client from one ordinary SSID (A) to another (B).

Log:

Aug  5 12:17:01 kernel: [72418.620424] qcom-q6v5-wcss-pil cd00000.q6v5_wcss: fatal error received: 
Aug  5 12:17:01 kernel: [72418.620424] QC Image Version: QC_IMAGE_VERSION_STRING=WLAN.HK.2.9.0.1-01862-QCAHKSWPL_SILICONZ-1
Aug  5 12:17:01 kernel: [72418.620424] Image Variant : IMAGE_VARIANT_STRING=8074.wlanfw.eval_v2Q
Aug  5 12:17:01 kernel: [72418.620424] 
Aug  5 12:17:01 kernel: [72418.620424] wal_peer_control.c:2904 Assertion is_graceful_to_handle failedparam0 :zero, param1 :zero, param2 :zero.
Aug  5 12:17:01 kernel: [72418.620424] Thread ID      : 0x00000069  Thread name    : WLAN RT0  Process ID     : 0
Aug  5 12:17:01 kernel: [72418.620424] Register:
Aug  5 12:17:01 kernel: [72418.620424] SP : 0x4bfb9340
Aug  5 12:17:01 kernel: [72418.620424] FP : 0x4bfb9348
Aug  5 12:17:01 kernel: [72418.620424] PC : 0x4b1080c4
Aug  5 12:17:01 kernel: [72418.620424] SSR : 0x00000008
Aug  5 12:17:01 kernel: [72418.620424] BADVA : 0x00020000
Aug  5 12:17:01 kernel: [72418.620424] LR : 0x4b107860
Aug  5 12:17:01 kernel: [72418.620424] 
Aug  5 12:17:01 kernel: [72418.620424] Stack Dump
Aug  5 12:17:01 kernel: [72418.620424] from : 0x4bfb9340
Aug  5 12:17:01 kernel: [72418.620424] to   : 0x4bfb9ba0
Aug  5 12:17:01 kernel: [72418.620424] 
Aug  5 12:17:01 kernel: [72418.669620] remoteproc remoteproc0: crash detected in cd00000.q6v5_wcss: type fatal error
Aug  5 12:17:01 kernel: [72418.691710] remoteproc remoteproc0: handling crash #1 in cd00000.q6v5_wcss
Aug  5 12:17:01 kernel: [72418.699932] remoteproc remoteproc0: recovering cd00000.q6v5_wcss
Aug  5 12:17:01 kernel: [72418.732582] remoteproc remoteproc0: stopped remote processor cd00000.q6v5_wcss
Aug  5 12:17:01 kernel: [72418.828774] ath11k c000000.wifi: failed to find peer aa:bb:cc:dd:ee:ff on vdev 0 after creation
Aug  5 12:17:01 kernel: [72418.828833] ath11k c000000.wifi: failed to find peer vdev_id 0 addr aa:bb:cc:dd:ee:ff in delete
Aug  5 12:17:01 kernel: [72418.836319] ath11k c000000.wifi: failed peer aa:bb:cc:dd:ee:ff delete vdev_id 0 fallback ret -22
Aug  5 12:17:01 kernel: [72418.845328] ath11k c000000.wifi: Failed to add peer: aa:bb:cc:dd:ee:ff for VDEV: 0
Aug  5 12:17:01 kernel: [72418.854060] ath11k c000000.wifi: Failed to add station: aa:bb:cc:dd:ee:ff for VDEV: 0
Aug  5 12:17:01 hostapd: phy1-ap0: STA aa:bb:cc:dd:ee:ff IEEE 802.11: Could not add STA to kernel driver
Aug  5 12:17:01 kernel: [72418.919164] ath11k c000000.wifi: failed to update rx tid queue, tid 0 (-108)
Aug  5 12:17:01 kernel: [72418.919217] ath11k c000000.wifi: failed to update reo for rx tid 0: -108
Aug  5 12:17:01 kernel: [72418.925322] phy1-ap1: HW problem - can not stop rx aggregation for aa:bb:cc:dd:ee:ff tid 0
Aug  5 12:17:06 kernel: [72424.028722] ath11k c000000.wifi: failed to flush transmit queue, data pkts pending 24
Aug  5 12:17:06 kernel: [72424.028835] ath11k c000000.wifi: failed to send WMI_PEER_DELETE cmd
Aug  5 12:17:06 kernel: [72424.035543] ath11k c000000.wifi: failed to delete peer vdev_id 1 addr aa:bb:cc:dd:ee:ff ret -108
Aug  5 12:17:06 kernel: [72424.041663] ath11k c000000.wifi: Failed to delete peer: aa:bb:cc:dd:ee:ff for VDEV: 1
Aug  5 12:17:06 kernel: [72424.050678] ath11k c000000.wifi: Found peer entry aa:bb:cc:dd:ee:ff n vdev 1 after it was supposedly removed
Aug  5 12:17:06 kernel: [72424.058438] ------------[ cut here ]------------
Aug  5 12:17:06 kernel: [72424.068265] WARNING: CPU: 0 PID: 2069 at sta_set_sinfo+0xc44/0xca0 [mac80211]
Aug  5 12:17:06 kernel: [72424.072870] Modules linked in: pppoe ppp_async nft_fib_inet nf_flow_table_inet batman_adv ath11k_ahb ath11k ath10k_pci ath10k_core ath pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_objref nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack mac80211 iptable_mangle iptable_filter ipt_REJECT ipt_ECN ip_tables cfg80211 xt_time xt_tcpudp xt_tcpmss xt_statistic xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_ecn xt_dscp xt_comment xt_TCPMSS xt_LOG xt_HL xt_DSCP xt_CLASSIFY x_tables slhc sch_cake qrtr_smd qrtr qmi_helpers nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c crc_ccitt compat sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact ifb tun sha512_generic seqiv jitterentropy_rng drbg
Aug  5 12:17:06 kernel: [72424.073151]  michael_mic hmac cmac leds_gpio xhci_plat_hcd xhci_pci xhci_hcd dwc3 dwc3_qcom qca_nss_dp qca_ssdk gpio_button_hotplug ext4 mbcache jbd2 aquantia hwmon crc32c_generic
Aug  5 12:17:06 kernel: [72424.166900] CPU: 0 PID: 2069 Comm: hostapd Not tainted 6.1.42 #0
Aug  5 12:17:06 kernel: [72424.182764] Hardware name: Xiaomi AX3600 (DT)
Aug  5 12:17:06 kernel: [72424.189011] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
Aug  5 12:17:06 kernel: [72424.193267] pc : sta_set_sinfo+0xc44/0xca0 [mac80211]
Aug  5 12:17:06 kernel: [72424.200036] lr : sta_set_sinfo+0xc40/0xca0 [mac80211]
Aug  5 12:17:06 kernel: [72424.205243] sp : ffffffc00d79b880
Aug  5 12:17:06 kernel: [72424.210275] x29: ffffffc00d79b880 x28: ffffff800448c800 x27: ffffffc00d79bdc8
Aug  5 12:17:06 kernel: [72424.213581] x26: ffffff8002072080 x25: ffffffc008cca3c0 x24: ffffffc000ca8000
Aug  5 12:17:06 kernel: [72424.220699] x23: ffffffc00d79ba70 x22: ffffff800ed84940 x21: ffffff8010250aa8
Aug  5 12:17:06 kernel: [72424.227817] x20: ffffff80067b0880 x19: ffffff8010250000 x18: 000000000000018f
Aug  5 12:17:06 kernel: [72424.234936] x17: 312076656476206e x16: 2032343a65353a36 x15: ffffffc008bc7298
Aug  5 12:17:06 kernel: [72424.242054] x14: 00000000000004ad x13: 000000000000018f x12: 00000000ffffffea
Aug  5 12:17:06 kernel: [72424.249172] x11: 00000000ffffefff x10: ffffffc008c1f298 x9 : ffffffc008bc7240
Aug  5 12:17:06 kernel: [72424.256290] x8 : 0000000000000024 x7 : ffffff8016b60000 x6 : 0000000000008e20
Aug  5 12:17:06 kernel: [72424.263407] x5 : ffffffc0173d0000 x4 : 0000000000000000 x3 : ffffff800448c800
Aug  5 12:17:06 kernel: [72424.270526] x2 : 0000000000000000 x1 : ffffff800448c800 x0 : 00000000ffffff94
Aug  5 12:17:06 kernel: [72424.277645] Call trace:
Aug  5 12:17:06 kernel: [72424.284753]  sta_set_sinfo+0xc44/0xca0 [mac80211]
Aug  5 12:17:06 kernel: [72424.287015]  sta_info_destroy_addr_bss+0x50/0x74 [mac80211]
Aug  5 12:17:06 kernel: [72424.291876]  ieee80211_color_change_finish+0x1ac8/0x1d70 [mac80211]
Aug  5 12:17:06 kernel: [72424.297261]  cfg80211_check_station_change+0x11b8/0x4c30 [cfg80211]
Aug  5 12:17:06 kernel: [72424.303509]  genl_family_rcv_msg_doit+0xb8/0x11c
Aug  5 12:17:06 kernel: [72424.309757]  genl_rcv_msg+0x108/0x230
Aug  5 12:17:06 kernel: [72424.314615]  netlink_rcv_skb+0x5c/0x130
Aug  5 12:17:06 kernel: [72424.318175]  genl_rcv+0x38/0x50
Aug  5 12:17:06 kernel: [72424.321820]  netlink_unicast+0x1e8/0x2d4
Aug  5 12:17:06 kernel: [72424.324947]  netlink_sendmsg+0x1a0/0x3d0
Aug  5 12:17:06 kernel: [72424.329114]  ____sys_sendmsg+0x1c8/0x270
Aug  5 12:17:06 kernel: [72424.333020]  ___sys_sendmsg+0x7c/0xc0
Aug  5 12:17:06 kernel: [72424.336924]  __sys_sendmsg+0x48/0xb0
Aug  5 12:17:06 kernel: [72424.340484]  __arm64_sys_sendmsg+0x24/0x30
Aug  5 12:17:06 kernel: [72424.344131]  invoke_syscall.constprop.0+0x5c/0x104
Aug  5 12:17:06 kernel: [72424.348038]  do_el0_svc+0x58/0x17c
Aug  5 12:17:06 kernel: [72424.352810]  el0_svc+0x18/0x54
Aug  5 12:17:06 kernel: [72424.356195]  el0t_64_sync_handler+0xf4/0x120
Aug  5 12:17:06 kernel: [72424.359235]  el0t_64_sync+0x174/0x178
Aug  5 12:17:06 kernel: [72424.363662] ---[ end trace 0000000000000000 ]---
Aug  5 12:17:06 kernel: [72424.368104] ath11k c000000.wifi: failed to send WMI_PDEV_BSS_CHAN_INFO_REQUEST cmd
Aug  5 12:17:06 kernel: [72424.371959] ath11k c000000.wifi: failed to send pdev bss chan info request
Aug  5 12:17:06 kernel: [72424.379804] ath11k c000000.wifi: failed to send WMI_PDEV_SET_PARAM cmd
Aug  5 12:17:06 kernel: [72424.386152] ath11k c000000.wifi: Failed to set beacon mode for VDEV: 3
Aug  5 12:17:06 kernel: [72424.392699] ath11k c000000.wifi: failed to send WMI_BCN_TMPL_CMDID
Aug  5 12:17:06 hostapd: phy1-ap1: STA aa:bb:cc:dd:ee:ff IEEE 802.11: deauthenticated due to local deauth request
Aug  5 12:17:07 kernel: [72424.668906] qcom-q6v5-wcss-pil cd00000.q6v5_wcss: start timed out
Aug  5 12:17:07 kernel: [72424.668957] remoteproc remoteproc0: can't start rproc cd00000.q6v5_wcss: -110
Aug  5 12:17:12 kernel: [72430.402821] ath11k_warn: 75 callbacks suppressed
Aug  5 12:17:12 kernel: [72430.402842] ath11k c000000.wifi: failed to send WMI_PDEV_BSS_CHAN_INFO_REQUEST cmd
Aug  5 12:17:12 kernel: [72430.406534] ath11k c000000.wifi: failed to send pdev bss chan info request
Aug  5 12:17:12 kernel: [72430.414238] ath11k c000000.wifi: failed to send WMI_PDEV_SET_PARAM cmd
Aug  5 12:17:12 kernel: [72430.420794] ath11k c000000.wifi: Failed to set beacon mode for VDEV: 3
Aug  5 12:17:12 kernel: [72430.427295] ath11k c000000.wifi: failed to send WMI_BCN_TMPL_CMDID
Aug  5 12:17:12 kernel: [72430.433802] ath11k c000000.wifi: failed to submit beacon template command: -108
Aug  5 12:17:12 kernel: [72430.439963] ath11k c000000.wifi: failed to update bcn template: -108
Aug  5 12:17:12 kernel: [72430.447157] ath11k c000000.wifi: failed to send WMI_VDEV_SET_PARAM_CMDID
Aug  5 12:17:12 kernel: [72430.453764] ath11k c000000.wifi: failed to set BA BUFFER SIZE 256 for vdev: 3
Aug  5 12:17:12 kernel: [72430.460449] ath11k c000000.wifi: failed to send WMI_VDEV_SET_PARAM_CMDID
Aug  5 12:17:18 kernel: [72436.473875] ath11k_warn: 70 callbacks suppressed
Aug  5 12:17:18 kernel: [72436.473898] ath11k c000000.wifi: failed to send WMI_PDEV_BSS_CHAN_INFO_REQUEST cmd
Aug  5 12:17:18 kernel: [72436.477590] ath11k c000000.wifi: failed to send pdev bss chan info request
Aug  5 12:17:18 kernel: [72436.485291] ath11k c000000.wifi: failed to send WMI_PDEV_SET_PARAM cmd
Aug  5 12:17:18 kernel: [72436.491851] ath11k c000000.wifi: Failed to set beacon mode for VDEV: 3
Aug  5 12:17:18 kernel: [72436.498351] ath11k c000000.wifi: failed to send WMI_BCN_TMPL_CMDID
Aug  5 12:17:18 kernel: [72436.504853] ath11k c000000.wifi: failed to submit beacon template command: -108
Aug  5 12:17:18 kernel: [72436.511018] ath11k c000000.wifi: failed to update bcn template: -108
Aug  5 12:17:18 kernel: [72436.518214] ath11k c000000.wifi: failed to send WMI_VDEV_SET_PARAM_CMDID
Aug  5 12:17:18 kernel: [72436.524820] ath11k c000000.wifi: failed to set BA BUFFER SIZE 256 for vdev: 3
Aug  5 12:17:18 kernel: [72436.531510] ath11k c000000.wifi: failed to send WMI_VDEV_SET_PARAM_CMDID
Aug  5 12:17:24 kernel: [72442.544950] ath11k_warn: 70 callbacks suppressed
Aug  5 12:17:24 kernel: [72442.544973] ath11k c000000.wifi: failed to send WMI_PDEV_BSS_CHAN_INFO_REQUEST cmd
Aug  5 12:17:24 kernel: [72442.548698] ath11k c000000.wifi: failed to send pdev bss chan info request
Aug  5 12:17:24 kernel: [72442.556379] ath11k c000000.wifi: failed to send WMI_PDEV_SET_PARAM cmd
Aug  5 12:17:24 kernel: [72442.562921] ath11k c000000.wifi: Failed to set beacon mode for VDEV: 3
Aug  5 12:17:24 kernel: [72442.569442] ath11k c000000.wifi: failed to send WMI_BCN_TMPL_CMDID
Aug  5 12:17:24 kernel: [72442.575920] ath11k c000000.wifi: failed to submit beacon template command: -108
Aug  5 12:17:24 kernel: [72442.582096] ath11k c000000.wifi: failed to update bcn template: -108
Aug  5 12:17:24 kernel: [72442.589305] ath11k c000000.wifi: failed to send WMI_VDEV_SET_PARAM_CMDID
Aug  5 12:17:24 kernel: [72442.595885] ath11k c000000.wifi: failed to set BA BUFFER SIZE 256 for vdev: 3
Aug  5 12:17:24 kernel: [72442.602583] ath11k c000000.wifi: failed to send WMI_VDEV_SET_PARAM_CMDID
Aug  5 12:17:31 kernel: [72448.611885] ath11k_warn: 70 callbacks suppressed
Aug  5 12:17:31 kernel: [72448.611904] ath11k c000000.wifi: failed to send WMI_PDEV_BSS_CHAN_INFO_REQUEST cmd
Aug  5 12:17:31 kernel: [72448.615597] ath11k c000000.wifi: failed to send pdev bss chan info request
Aug  5 12:17:31 kernel: [72448.623352] ath11k c000000.wifi: failed to send WMI_PDEV_SET_PARAM cmd
Aug  5 12:17:31 kernel: [72448.629859] ath11k c000000.wifi: Failed to set beacon mode for VDEV: 3
Aug  5 12:17:31 kernel: [72448.636361] ath11k c000000.wifi: failed to send WMI_BCN_TMPL_CMDID
Aug  5 12:17:31 kernel: [72448.642872] ath11k c000000.wifi: failed to submit beacon template command: -108
Aug  5 12:17:31 kernel: [72448.649037] ath11k c000000.wifi: failed to update bcn template: -108
Aug  5 12:17:31 kernel: [72448.656224] ath11k c000000.wifi: failed to send WMI_VDEV_SET_PARAM_CMDID
Aug  5 12:17:31 kernel: [72448.662832] ath11k c000000.wifi: failed to set BA BUFFER SIZE 256 for vdev: 3
Aug  5 12:17:31 kernel: [72448.669516] ath11k c000000.wifi: failed to send WMI_VDEV_SET_PARAM_CMDID
Aug  5 12:17:37 kernel: [72454.679267] ath11k_warn: 70 callbacks suppressed
Aug  5 12:17:37 kernel: [72454.679290] ath11k c000000.wifi: failed to send WMI_PDEV_BSS_CHAN_INFO_REQUEST cmd
Aug  5 12:17:37 kernel: [72454.682983] ath11k c000000.wifi: failed to send pdev bss chan info request
Aug  5 12:17:37 kernel: [72454.690661] ath11k c000000.wifi: failed to send WMI_PDEV_SET_PARAM cmd
Aug  5 12:17:37 kernel: [72454.697218] ath11k c000000.wifi: Failed to set beacon mode for VDEV: 3
Aug  5 12:17:37 kernel: [72454.703777] ath11k c000000.wifi: failed to send WMI_BCN_TMPL_CMDID
Aug  5 12:17:37 kernel: [72454.710256] ath11k c000000.wifi: failed to submit beacon template command: -108
Aug  5 12:17:37 kernel: [72454.716400] ath11k c000000.wifi: failed to update bcn template: -108
Aug  5 12:17:37 kernel: [72454.723635] ath11k c000000.wifi: failed to send WMI_VDEV_SET_PARAM_CMDID
Aug  5 12:17:37 kernel: [72454.730218] ath11k c000000.wifi: failed to set BA BUFFER SIZE 256 for vdev: 3
Aug  5 12:17:37 kernel: [72454.736889] ath11k c000000.wifi: failed to send WMI_VDEV_SET_PARAM_CMDID
Aug  5 12:17:39 kernel: [72457.149163] nss-dp 3a001400.dp3 lan1: PHY Link up speed: 1000
Aug  5 12:17:39 kernel: [72457.149248] br-vl177: port 1(lan1) entered blocking state
Aug  5 12:17:39 kernel: [72457.153906] br-vl177: port 1(lan1) entered forwarding state
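
For anyone reading the log above, the trailing negative numbers (-22, -108, -110) are Linux errno values. Below is a rough sketch of a decoder for them. The function name `decode_return_code` and the lookup table are my own, not part of any OpenWrt tool, and the table only covers the codes that appear in this thread. The -108 (ESHUTDOWN) results look consistent with the driver refusing commands once the firmware has crashed, but that is my reading of the log, not something confirmed here.

```python
import re

# Linux errno values (asm-generic numbering). Hardcoded on purpose so the
# result does not depend on the platform this script runs on.
LINUX_ERRNO = {
    19: ("ENODEV", "No such device"),
    22: ("EINVAL", "Invalid argument"),
    108: ("ESHUTDOWN", "Cannot send after transport endpoint shutdown"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

# Matches a negative return code at the end of a line, in the three shapes
# seen in the log: "ret -22", ": -108" and "(-108)".
CODE_RE = re.compile(r"(?:ret |: |\()-(\d+)\)?\s*$")


def decode_return_code(line):
    """Return (number, name, description) for a trailing errno, or None."""
    match = CODE_RE.search(line)
    if not match:
        return None
    number = int(match.group(1))
    name, text = LINUX_ERRNO.get(number, ("UNKNOWN", "not in table"))
    return number, name, text
```

Feeding it a saved log line by line and printing the non-None results gives a quick overview of which calls failed and why.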


I can confirm this issue on wrx36 with a snapshot build, with both firmware WLAN.HK.2.9.0.1-01862-QCAHKSWPL_SILICONZ-1 and WLAN.HK.2.9.0.1-01837-QCAHKSWPL_SILICONZ-1.

Very easy to test and reproduce.

I'm definitely seeing the same issue with WLAN.HK.2.9.0.1-01385-QCAHKSWPL_SILICONZ-1 as well.

I'm only changing the firmware files; my system image stays the same (snapshot r23636-ba7d6dddc7). @sppmaster If you are not seeing this with 01385, could there be other variables in your setup, for example a different build used with 01385 when you did not see the issue?

No, there are no other variables. I use the same NSS image and only changed the firmware files in /lib/firmware/IPQ8074, then rebooted.
I've tried with a regular OpenWrt (non-NSS) build too, and the same issue is present there.

The issue has been present for quite a long time. The line number in the assertion varies with the firmware version, but the rest is roughly the same (and naturally the underlying trigger is similar: a device leaving the AP).
Examples of the same:

This one is from WLAN.HK.2.9.0.1-01862-QCAHKSWPL_SILICONZ-1.

They all show the same call trace:

Aug  5 12:17:06 kernel: [72424.277645] Call trace:
Aug  5 12:17:06 kernel: [72424.284753]  sta_set_sinfo+0xc44/0xca0 [mac80211]
Aug  5 12:17:06 kernel: [72424.287015]  sta_info_destroy_addr_bss+0x50/0x74 [mac80211]
Aug  5 12:17:06 kernel: [72424.291876]  ieee80211_color_change_finish+0x1ac8/0x1d70 [mac80211]
Aug  5 12:17:06 kernel: [72424.297261]  cfg80211_check_station_change+0x11b8/0x4c30 [cfg80211]
Aug  5 12:17:06 kernel: [72424.303509]  genl_family_rcv_msg_doit+0xb8/0x11c
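
To compare crash reports across firmware versions when only the assertion line number differs, something like the following could pull the relevant fields out of a saved log. This is a minimal sketch: the function names `summarize_crash` and `same_assertion` are hypothetical, and the regular expressions are written against the log format posted in this thread, so other firmware builds may print the assertion differently.

```python
import re

# Patterns written against the crash log format shown in this thread.
VERSION_RE = re.compile(r"QC_IMAGE_VERSION_STRING=(\S+)")
ASSERT_RE = re.compile(r"(\w+\.c):(\d+) Assertion (\w+) failed")


def summarize_crash(log_text):
    """Extract firmware version and assertion details, or None if absent."""
    version = VERSION_RE.search(log_text)
    assertion = ASSERT_RE.search(log_text)
    if not (version and assertion):
        return None
    return {
        "firmware": version.group(1),
        "file": assertion.group(1),
        "line": int(assertion.group(2)),
        "condition": assertion.group(3),
    }


def same_assertion(a, b):
    """Treat two crashes as the same bug if file and condition match,
    ignoring the line number, which shifts between firmware versions."""
    return (a["file"], a["condition"]) == (b["file"], b["condition"])
```

Running `summarize_crash` on each saved log and comparing the results with `same_assertion` groups reports that differ only in firmware version and line number.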

I see that @Ansuel reported bugs for ath11k.
Unfortunately, I have a Xiaomi smartphone that frequently disconnects itself from the 5 GHz network, and its 2.4 GHz WLAN is so awful that I cannot use it. Every time it disconnects from the 5 GHz WLAN, all other devices connected to that WLAN are hit.
At least in my case the firmware doesn't crash.

Can you tell us what your test setup is? I've tested extensively once more, for more than 10 minutes, running iperf3 tests one after another on a laptop and a smartphone while disconnecting two other devices from the WLAN. I couldn't reproduce the issue with WLAN.HK.2.9.0.1-01385-QCAHKSWPL_SILICONZ-1.
Or at least it doesn't happen every time, as it does with both later versions of the firmware.
There are other users who reported that they don't see similar behaviour with the latest firmware version.
There are probably other, less obvious factors that may have an adverse effect too.

Default radio settings, channel 149 (HE80), psk2. Nothing fancy.
I have around 10 clients connected, mostly Apple devices (Macs and iPhones).

The test is to ping the router from a connected Mac client while toggling WiFi off on an iPhone.
Yes, it does not happen every time, but toggle WiFi enough and it will happen.

I have checked logs, nothing interesting. Ran hostapd with debug logging - also nothing of interest.

Also, just to be clear, I don't see interruptions as extreme as the ones you reported. The interruption is usually around 1 second long or less. I see pings running normally (under 10ms), then on a client disconnect one ping takes anywhere from 300ms to a timeout.
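
To make the ping test less dependent on watching the terminal, the captured ping output could be scanned for spikes and timeouts. This is only a sketch under assumptions: the function name `find_interruptions` is made up, the 300 ms default threshold is taken from the numbers above, and the patterns assume the macOS ping output format ("time=... ms" replies and "Request timeout for icmp_seq N" lines); Linux ping does not print the timeout lines by default.

```python
import re

# Reply and timeout patterns as printed by macOS ping (an assumption about
# the client; adjust for other ping implementations).
REPLY_RE = re.compile(r"icmp_seq=(\d+).*?time=([\d.]+) ms")
TIMEOUT_RE = re.compile(r"Request timeout for icmp_seq (\d+)")


def find_interruptions(ping_output, threshold_ms=300.0):
    """Return a list of (icmp_seq, rtt_ms) for slow replies and
    (icmp_seq, None) for timeouts, in the order they appear."""
    events = []
    for line in ping_output.splitlines():
        reply = REPLY_RE.search(line)
        if reply:
            seq, rtt = int(reply.group(1)), float(reply.group(2))
            if rtt >= threshold_ms:
                events.append((seq, rtt))
            continue
        timeout = TIMEOUT_RE.search(line)
        if timeout:
            events.append((int(timeout.group(1)), None))
    return events
```

Saving the ping output to a file while toggling WiFi on the phone, then passing the file contents to `find_interruptions`, gives the sequence numbers to correlate with the time of each disconnect.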

Oh, not sure if I mentioned it: this is a wrx36 device.

I am now using 1835 (DL-WRX36, 23.05) and it has run fine for a week without crashes or apparent interruptions.
However, I only have two modern Samsung phones, a Samsung tablet, and an LG TV using the wireless (5GHz), so my network is lightly taxed.

I downgraded the firmware on my wrx36 to WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1. It is very, very hard to reproduce the issue with this older firmware.

From a practical perspective this is definitely better than the 2.9.0.1 firmware. Reproducing the issue on 2.9.0.1 is very easy: just toggle WiFi on/off on an iPhone a couple of times.

2.7.0.1 also fixes broadcast/multicast bug for me.

As a reference to the issues in this thread and a workaround from this post I've tried the following.

I've just returned to the latest (at the moment) ath11k firmware, WLAN.HK.2.9.0.1-01862.
I rebooted the router and tested once again without the multicast-to-unicast option. This confirmed the loss of both ping and iperf3 traffic.

I then turned on the multicast-to-unicast option for the 5 GHz radio only.
In the wireless config this is option multicast_to_unicast_all '1'.
I tested once again with the option turned on.
There is no more loss of ping or iperf3 traffic when a client disconnects from the 5 GHz WLAN.
There is a long discussion of this issue.

I suggested in my earlier post that these issues were connected. I think that is now confirmed.

@egc, @asvio, @Catfriend1, @Pow, maybe you can try the workaround of enabling the multicast-to-unicast option in the wireless settings.

Maybe developers can find the real cause and resolve this.
@robimarko, @Ansuel, @nbd, @kirdes, @hnyman, @quarky

Hey @sppmaster
Since commit 549e710fc I've not had any more disconnection problems or network errors.
I think it's essentially due to the new implementation of the hostapd package but I'm not 100% sure.

I have done some tests enabling and disabling multi_to_unicast and in my case (nbg7815) I have not found any difference in performance and/or ping errors.

Since commit 549e710fc I've not had any more disconnection problems or network errors.

This.

I don't know what fixed this (the hostapd upgrade, the kernel upgrade, or kernel patches), but with the latest snapshot (r23763-46ed38adeb) I'm not seeing this issue anymore. At least it is not as easily reproducible as before.

I am also not seeing the IPv6 connectivity issue that previously required the multicast_to_unicast_all workaround or downgrading the firmware to 2.7.0.1. It is too early to call it on the multicast bug, as the group rekey interval for CCMP is a day, so I want to keep running this for an extended time to be certain.

One thing I noticed with the latest snapshot is that there are no ath11k errors in dmesg. Previously, running the wifi command would always produce these errors, something about flushing a ring; I can't recall the exact message.

Of course just as I posted this and ran wifi command a couple of times I got the dreaded error messages:

[ 4205.867960] ath11k c000000.wifi: failed to flush transmit queue, data pkts pending 1
[ 4210.987999] ath11k c000000.wifi: failed to flush transmit queue, data pkts pending 1

After that the traffic interruption bug was easy to repro just like before. Damn.

I've flashed a freshly compiled build with the latest commits (kernel 6.1.46).
For me the issue is still there and fully repeatable when multicast-to-unicast is not enabled.
The "workaround" option multicast_to_unicast_all '1' only masks the issue temporarily; at some point it stops making any difference.
I'll continue to monitor how this changes over time.

Update - reading this post.

I can confirm this behaviour too with today's snapshot, and evidently multicast-to-unicast is not even a workaround for the above issues.

Just an update. I have been running OpenWrt SNAPSHOT, r23763-46ed38adeb on wrx36 with ath11k firmware downgraded to WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1 for two months (uptime 56 days) without problems.

Dual radio - channel 149 HE80 with sae-mixed encryption, channel 6 HE20 with psk2+ccmp encryption.
