Xiaomi AX3600: ath11k firmware crash - qcom-q6v5-wcss-pil cd00000.q6v5_wcss: fatal error received:

Also testing this fw with the recent snapshot now...

WiFi crashed today after my computer disconnected from SSID 1 and joined SSID 2 shortly afterwards. Both SSIDs were running on the ath11k radio. They are "normal" WPA2-PSK SSIDs. I had to do a power-on-reset (POR) reboot, so I did not get any logs.

@robimarko I can reproduce the ath11k radio crash by creating two "common" SSIDs with WPA2-PSK, let's call them A and B.
When I use my Windows client (Intel AX210 NGBW PCIe Wireless Card) to connect to A, all is fine, and it stays fine as long as I remain on A. When I then select SSID "B" and connect to it, the radio driver crashes reproducibly.

It doesn't matter if I switch the Windows client from SSID A to B or vice versa.

Staying on one SSID since boot up does not trigger any problems and ath11k works fine "forever" on my AX3600.

I can support that claim.
I had no problems with my setup of two AX3600s. I then created a second SSID for all my IoT devices, and from then on the WiFi crashed once in a while and only a reboot helped.
I removed the second SSID again and have had no fatal crashes since.

The damn FW is just buggy, I can't believe they still have not fixed that

Again, a crash, this time on the latest firmware from yesterday's snapshot.

Is there a way to restart ath11k from the CLI without a reboot? I am currently using this watchcat user script to reboot in case a ping to an always-on WiFi device fails:

/root/watchcat.user.sh

#!/bin/sh
# Reboot if the syslog buffer contains a q6v5_wcss firmware crash report.
logread | grep -q "qcom-q6v5-wcss-pil .* fatal error received" && reboot
exit 0

This may be related: https://bugzilla.kernel.org/show_bug.cgi?id=216515

Given that the bug was reported by @Ansuel and was commented on by @robimarko then it isn't "related" so much as "triggered by this and other threads" :slight_smile:

I may try the 2.9 testing firmware at some point soon though.

Just a quick update:

I'm on an AX3600 running SNAPSHOT r23845-abc536f547. This one is very stable for me. It contains fw_build_id WLAN.HK.2.9.0.1-01862-QCAHKSWPL_SILICONZ-1.

But when I upgrade to a recent snapshot containing a newer version of the WiFi firmware (since https://git.openwrt.org/?p=openwrt/openwrt.git;a=commitdiff;h=eb8ddf5d5c9df9e3ca97c12ee2e7685ffc4c88be;hp=a9cc3708e0c3c4869711a9ba4b9c1437ed250721 ), both of my devices run out of memory within a few hours (4 to 8 h, probably depending on the network load) and the OOM killer starts killing processes. In the web UI, I can see the free RAM decreasing. Where before there was always more than 120 MB of RAM free, it now drops steadily, and at about 23 MB free the OOM killer kicks in, making the device unusable until reboot.
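
To put numbers on a leak like this instead of watching the web UI, the free memory can be sampled from the shell. The following is only a minimal sketch: the helper names `mem_available_kb` and `memwatch` and the 300 second default interval are made up for illustration, and it assumes the kernel exposes a `MemAvailable` line in /proc/meminfo.

```shell
#!/bin/sh
# mem_available_kb [FILE]: print the MemAvailable value (in kB) from a
# meminfo-style file; FILE defaults to /proc/meminfo.
mem_available_kb() {
    awk '/^MemAvailable:/ { print $2 }' "${1:-/proc/meminfo}"
}

# memwatch [INTERVAL]: hypothetical helper that writes the current value
# to syslog every INTERVAL seconds (default 300), so the trend can be read
# back with logread while the device is still responsive.
memwatch() {
    while :; do
        logger -t memwatch "MemAvailable: $(mem_available_kb) kB"
        sleep "${1:-300}"
    done
}
```

After sourcing the file, `memwatch 60 &` would log one sample per minute, and `logread -e memwatch` should then show how fast the value falls.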

Similar to this?

Yes that's the problem

Still the same problem with SNAPSHOT r24194 and firmware WLAN.HK.2.9.0.1-01890-QCAHKSWPL_SILICONZ-1

It works for some days but crashes when a device is leaving the network.

Crashdump:

Tue Oct 24 23:22:00 2023 daemon.notice hostapd: phy2-ap0: AP-STA-DISCONNECTED 38:80:df:xx:xx:xx
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201] qcom-q6v5-wcss-pil cd00000.q6v5_wcss: fatal error received:
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201] QC Image Version: QC_IMAGE_VERSION_STRING=WLAN.HK.2.9.0.1-01890-QCAHKSWPL_SILICONZ-1
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201] Image Variant : IMAGE_VARIANT_STRING=8074.wlanfw.eval_v2Q
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201]
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201] wal_peer_control.c:2904 Assertion is_graceful_to_handle failedparam0 :zero, param1 :zero, param2 :zero.
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201] Thread ID      : 0x00000060  Thread name    : WLAN RT1  Process ID     : 0
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201] Register:
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201] SP : 0x4bfd5a48
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201] FP : 0x4bfd5a50
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201] PC : 0x4b1080c4
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201] SSR : 0x00000008
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201] BADVA : 0x00020000
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201] LR : 0x4b107860
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201]
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201] Stack Dump
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201] from : 0x4bfd5a48
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201] to   : 0x4bfd62a8
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.228201]
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.277257] remoteproc remoteproc0: crash detected in cd00000.q6v5_wcss: type fatal error
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.299440] remoteproc remoteproc0: handling crash #1 in cd00000.q6v5_wcss
Tue Oct 24 23:22:01 2023 kern.err kernel: [79832.307670] remoteproc remoteproc0: recovering cd00000.q6v5_wcss
Tue Oct 24 23:22:01 2023 kern.info kernel: [79832.340355] remoteproc remoteproc0: stopped remote processor cd00000.q6v5_wcss
Tue Oct 24 23:22:01 2023 kern.warn kernel: [79832.625898] ath11k c000000.wifi: failed to find peer 38:80:df:xx:xx:xx on vdev 2 after creation
Tue Oct 24 23:22:01 2023 kern.warn kernel: [79832.625957] ath11k c000000.wifi: failed to find peer vdev_id 2 addr 38:80:df:xx:xx:xx in delete
Tue Oct 24 23:22:01 2023 kern.warn kernel: [79832.633421] ath11k c000000.wifi: failed peer 38:80:df:xx:xx:xx delete vdev_id 2 fallback ret -22
Tue Oct 24 23:22:01 2023 kern.warn kernel: [79832.642129] ath11k c000000.wifi: Failed to add peer: 38:80:df:xx:xx:xx for VDEV: 2
Tue Oct 24 23:22:01 2023 kern.warn kernel: [79832.651145] ath11k c000000.wifi: Failed to add station: 38:80:df:xx:xx:xx for VDEV: 2
Tue Oct 24 23:22:01 2023 daemon.notice hostapd: phy2-ap1: STA 38:80:df:xx:xx:xx IEEE 802.11: Could not add STA to kernel driver
Tue Oct 24 23:22:01 2023 kern.warn kernel: [79832.736256] ath11k c000000.wifi: failed to update rx tid queue, tid 0 (-108)
Tue Oct 24 23:22:01 2023 kern.warn kernel: [79832.736290] ath11k c000000.wifi: failed to update reo for rx tid 0: -108
Tue Oct 24 23:22:01 2023 kern.info kernel: [79832.742370] phy2-ap0: HW problem - can not stop rx aggregation for 38:80:df:xx:xx:xx tid 0
Tue Oct 24 23:22:01 2023 kern.warn kernel: [79832.749084] ath11k c000000.wifi: failed to update rx tid queue, tid 0 (-108)
Tue Oct 24 23:22:01 2023 kern.warn kernel: [79832.757141] ath11k c000000.wifi: failed to update reo for rx tid 4: -108
Tue Oct 24 23:22:01 2023 kern.info kernel: [79832.764330] phy2-ap0: HW problem - can not stop rx aggregation for 38:80:df:xx:xx:xx tid 4
Tue Oct 24 23:22:06 2023 kern.err kernel: [79837.595872] qcom-q6v5-wcss-pil cd00000.q6v5_wcss: start timed out
Tue Oct 24 23:22:06 2023 kern.err kernel: [79837.595923] remoteproc remoteproc0: can't start rproc cd00000.q6v5_wcss: -110
Tue Oct 24 23:22:06 2023 kern.warn kernel: [79837.835855] ath11k c000000.wifi: failed to flush transmit queue, data pkts pending 1
Tue Oct 24 23:22:06 2023 kern.warn kernel: [79837.835954] ath11k c000000.wifi: failed to send WMI_PEER_DELETE cmd
Tue Oct 24 23:22:06 2023 kern.warn kernel: [79837.842662] ath11k c000000.wifi: failed to delete peer vdev_id 1 addr 38:80:df:xx:xx:xx ret -108
Tue Oct 24 23:22:06 2023 kern.warn kernel: [79837.848683] ath11k c000000.wifi: Failed to delete peer: 38:80:df:xx:xx:xx for VDEV: 1
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79837.857696] ath11k c000000.wifi: Found peer entry 28:d1:27:xx:xx:xx n vdev 1 after it was supposedly removed
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79837.865468] ------------[ cut here ]------------
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79837.875297] WARNING: CPU: 1 PID: 1737 at sta_set_sinfo+0xcb4/0xd30 [mac80211]
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79837.879902] Modules linked in: pppoe ppp_async nft_fib_inet nf_flow_table_inet ath11k_ahb ath11k ath10k_pci ath10k_core ath wireguard pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_objref nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack mac80211 libchacha20poly1305 chacha_neon cfg80211 slhc qrtr_smd qrtr qmi_helpers poly1305_neon nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 libcurve25519_generic libcrc32c libchacha crc_ccitt compat ip6_udp_tunnel udp_tunnel sha512_generic seqiv jitterentropy_rng drbg michael_mic kpp hmac cmac leds_gpio xhci_plat_hcd xhci_pci xhci_hcd dwc3 dwc3_qcom qca_nss_dp qca_ssdk gpio_button_hotplug ext4 mbcache jbd2 aquantia hwmon crc32c_generic
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79837.942300] CPU: 1 PID: 1737 Comm: hostapd Not tainted 6.1.59 #0
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79837.964535] Hardware name: Xiaomi AX3600 (DT)
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79837.970782] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79837.975040] pc : sta_set_sinfo+0xcb4/0xd30 [mac80211]
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79837.981806] lr : sta_set_sinfo+0xcb0/0xd30 [mac80211]
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79837.987015] sp : ffffffc00d15b870
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79837.992047] x29: ffffffc00d15b870 x28: ffffff8002c33c00 x27: ffffffc00d15bdc8
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79837.995353] x26: ffffff8002070080 x25: ffffffc008cda400 x24: ffffff800ec24940
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.002470] x23: ffffff800f91aab8 x22: ffffff800ec24940 x21: ffffff80065d08a0
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.009589] x20: 0000000000000001 x19: ffffff800f91a000 x18: 0000000000000179
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.016706] x17: 312076656476206e x16: 2066613a64623a64 x15: ffffffc008bd7298
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.023826] x14: 000000000000046b x13: 0000000000000179 x12: 00000000ffffffea
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.030943] x11: 00000000ffffefff x10: ffffffc008c2f298 x9 : ffffffc008bd7240
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.038061] x8 : 0000000000000024 x7 : ffffff80029a5000 x6 : 0000000000008e20
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.045178] x5 : ffffffc0173d1000 x4 : 0000000000000000 x3 : ffffff8002c33c00
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.052297] x2 : 0000000000000000 x1 : ffffff8002c33c00 x0 : 00000000ffffff94
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.059416] Call trace:
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.066524]  sta_set_sinfo+0xcb4/0xd30 [mac80211]
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.068785]  sta_info_destroy_addr_bss+0x54/0x80 [mac80211]
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.073648]  ieee80211_color_change_finish+0x1c68/0x1f80 [mac80211]
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.079031]  cfg80211_check_station_change+0x1268/0x3530 [cfg80211]
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.085281]  genl_family_rcv_msg_doit+0xb8/0x11c
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.091527]  genl_rcv_msg+0x108/0x230
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.096386]  netlink_rcv_skb+0x5c/0x12c
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.099946]  genl_rcv+0x38/0x50
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.103590]  netlink_unicast+0x1e8/0x2d4
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.106717]  netlink_sendmsg+0x1a0/0x3d0
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.110884]  ____sys_sendmsg+0x1c8/0x270
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.114790]  ___sys_sendmsg+0x7c/0xc0
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.118696]  __sys_sendmsg+0x48/0xb0
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.122255]  __arm64_sys_sendmsg+0x24/0x30
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.125902]  invoke_syscall.constprop.0+0x5c/0x104
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.129810]  do_el0_svc+0x58/0x17c
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.134580]  el0_svc+0x18/0x54
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.137965]  el0t_64_sync_handler+0xf4/0x120
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.141006]  el0t_64_sync+0x174/0x178
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.145433] ---[ end trace 0000000000000000 ]---
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.149612] ath11k c000000.wifi: failed to send WMI_PDEV_BSS_CHAN_INFO_REQUEST cmd
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.153690] ath11k c000000.wifi: failed to send pdev bss chan info request
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.161289] ath11k c000000.wifi: failed to send WMI_PDEV_SET_PARAM cmd
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.167937] ath11k c000000.wifi: Failed to set beacon mode for VDEV: 0
Tue Oct 24 23:22:07 2023 kern.warn kernel: [79838.174447] ath11k c000000.wifi: failed to send WMI_BCN_TMPL_CMDID
Tue Oct 24 23:22:07 2023 daemon.info hostapd: phy2-ap0: STA 38:80:df:xx:xx:xx IEEE 802.11: deauthenticated due to local deauth request
Tue Oct 24 23:22:13 2023 kern.warn kernel: [79844.187083] ath11k_warn: 62 callbacks suppressed
Tue Oct 24 23:22:13 2023 kern.warn kernel: [79844.187105] ath11k c000000.wifi: failed to send WMI_PDEV_BSS_CHAN_INFO_REQUEST cmd
Tue Oct 24 23:22:13 2023 kern.warn kernel: [79844.190798] ath11k c000000.wifi: failed to send pdev bss chan info request
Tue Oct 24 23:22:13 2023 kern.warn kernel: [79844.198447] ath11k c000000.wifi: failed to send WMI_PDEV_SET_PARAM cmd
Tue Oct 24 23:22:13 2023 kern.warn kernel: [79844.205040] ath11k c000000.wifi: Failed to set beacon mode for VDEV: 0
Tue Oct 24 23:22:13 2023 kern.warn kernel: [79844.211572] ath11k c000000.wifi: failed to send WMI_BCN_TMPL_CMDID
Tue Oct 24 23:22:13 2023 kern.warn kernel: [79844.218061] ath11k c000000.wifi: failed to submit beacon template command: -108
Tue Oct 24 23:22:13 2023 kern.warn kernel: [79844.224216] ath11k c000000.wifi: failed to update bcn template: -108
Tue Oct 24 23:22:13 2023 kern.warn kernel: [79844.231430] ath11k c000000.wifi: failed to send WMI_VDEV_SET_PARAM_CMDID
Tue Oct 24 23:22:13 2023 kern.warn kernel: [79844.238026] ath11k c000000.wifi: failed to set BA BUFFER SIZE 256 for vdev: 0
Tue Oct 24 23:22:13 2023 kern.warn kernel: [79844.244703] ath11k c000000.wifi: failed to send WMI_VDEV_SET_PARAM_CMDID
Tue Oct 24 23:22:19 2023 kern.warn kernel: [79850.253516] ath11k_warn: 69 callbacks suppressed
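
Crash reports like this one are long, and for comparing them across firmware builds usually only two lines matter: the firmware version string and the failed assertion. A minimal sketch for pulling those out of logread output follows; the function name `crash_summary` is made up, and the patterns only assume the line layout seen in the logs in this thread.

```shell
#!/bin/sh
# crash_summary: read syslog text on stdin and print, for each q6v5_wcss
# crash report, the firmware build and the assertion that failed.
crash_summary() {
    sed -n \
        -e 's/.*QC_IMAGE_VERSION_STRING=\([^ ]*\).*/firmware: \1/p' \
        -e 's/.*] \([^ ]*\.c:[0-9]*\) Assertion \([^ ]*\) failed.*/assertion: \2 at \1/p'
}
```

On the router it would be used as `logread | crash_summary`. All other lines (registers, stack dump, ath11k warnings) are dropped.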

Hi, I think I am experiencing the exact same issue. I tried spinning up two separate WiFi networks. When connecting to and disconnecting from them, ath11k crashes. Here are my logs:

Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.697894] QC Image Version: QC_IMAGE_VERSION_STRING=WLAN.HK.2.9.0.1-01385-QCAHKSWPL_SILICONZ-1
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.697894] Image Variant : IMAGE_VARIANT_STRING=8074.wlanfw.eval_v2Q
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.697894]
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.697894] wal_peer_control.c:2870 Assertion is_graceful_to_handle failedparam0 :zero, param1 :zero, param2 :zero.
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.697894] Thread ID      : 0x00000060  Thread name    : WLAN RT1  Process ID     : 0
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.697894] Register:
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.697894] SP : 0x4bfd5938
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.697894] FP : 0x4bfd5940
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.697894] PC : 0x4b1080c4
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.697894] SSR : 0x00000008
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.697894] BADVA : 0x00020000
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.697894] LR : 0x4b107860
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.697894]
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.697894] Stack Dump
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.697894] from : 0x4bfd5938
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.697894] to   : 0x4bfd61b0
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.697894]
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.746886] remoteproc remoteproc0: crash detected in cd00000.q6v5_wcss: type fatal error
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.769132] remoteproc remoteproc0: handling crash #1 in cd00000.q6v5_wcss
Thu Oct 26 16:47:54 2023 kern.err kernel: [  177.777358] remoteproc remoteproc0: recovering cd00000.q6v5_wcss
Thu Oct 26 16:47:54 2023 kern.info kernel: [  177.810125] remoteproc remoteproc0: stopped remote processor cd00000.q6v5_wcss
Thu Oct 26 16:47:54 2023 kern.warn kernel: [  178.116592] ath11k c000000.wifi: failed to find peer 38:1f:8d:df:74:18 on vdev 2 after creation
Thu Oct 26 16:47:54 2023 kern.warn kernel: [  178.116649] ath11k c000000.wifi: failed to find peer vdev_id 2 addr 38:1f:8d:df:74:18 in delete
Thu Oct 26 16:47:54 2023 kern.warn kernel: [  178.124115] ath11k c000000.wifi: failed peer 38:1f:8d:df:74:18 delete vdev_id 2 fallback ret -22
Thu Oct 26 16:47:54 2023 kern.warn kernel: [  178.132817] ath11k c000000.wifi: Failed to add peer: 38:1f:8d:df:74:18 for VDEV: 2
Thu Oct 26 16:47:54 2023 kern.warn kernel: [  178.141833] ath11k c000000.wifi: Failed to add station: 38:1f:8d:df:74:18 for VDEV: 2
Thu Oct 26 16:47:54 2023 daemon.notice hostapd: phy2-ap1: STA 38:1f:8d:df:74:18 IEEE 802.11: Could not add STA to kernel driver
Thu Oct 26 16:47:54 2023 kern.warn kernel: [  178.186900] ath11k c000000.wifi: failed to send WMI_PDEV_BSS_CHAN_INFO_REQUEST cmd
Thu Oct 26 16:47:54 2023 kern.warn kernel: [  178.186938] ath11k c000000.wifi: failed to send pdev bss chan info request
Thu Oct 26 16:47:54 2023 kern.warn kernel: [  178.193560] ath11k c000000.wifi: failed to send WMI_PDEV_SET_PARAM cmd
Thu Oct 26 16:47:54 2023 kern.warn kernel: [  178.200249] ath11k c000000.wifi: Failed to set beacon mode for VDEV: 1
Thu Oct 26 16:47:54 2023 kern.warn kernel: [  178.206760] ath11k c000000.wifi: failed to send WMI_BCN_TMPL_CMDID

The firmware does not recover from this error, and a reboot is needed.
The router works fine running only one SSID. I have already read the thread mentioned by @Catfriend1. I guess we have to wait until it's fixed? Or has somebody already found a working solution? I have tried the different versions of OpenWrt for the AX3600, always with the same result.
I am new to OpenWrt; any help is appreciated.

Is the affected code in OpenWrt or in the ath11k firmware?

Hi Hugo,

I'm running into the same problem using another router (DL-WRX36), even with much newer ath11k firmware (WLAN.HK.2.9.0.1-01862-QCAHKSWPL_SILICONZ-1).

Someone suggested enabling Multi To Unicast, which LuCI shows on Advanced Settings of the Interface Configuration for each Wireless Network. (You could also add option multicast_to_unicast_all '1' to each wifi-iface in /etc/config/wireless.)
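
For those who prefer the command line, the same option can be set with uci. This is only a sketch: the function names are made up, the option name `multicast_to_unicast_all` is the one quoted above, and it assumes `uci show wireless` prints one `wireless.<section>=wifi-iface` line per wireless network.

```shell
#!/bin/sh
# wifi_iface_sections: read `uci show wireless` output on stdin and print
# the name of every wifi-iface section, one per line.
wifi_iface_sections() {
    sed -n 's/^wireless\.\([^.=]*\)=wifi-iface$/\1/p'
}

# enable_m2u_all: set the option on every wifi-iface, then commit the
# change and reload WiFi. Not called here; run it by hand on the router.
enable_m2u_all() {
    uci show wireless | wifi_iface_sections | while read -r s; do
        uci set "wireless.$s.multicast_to_unicast_all=1"
    done
    uci commit wireless
    wifi reload
}
```

Running `uci show wireless | wifi_iface_sections` first shows which sections would be touched before anything is changed.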

There are those who say this shouldn't make a difference, but I put it in place a week ago and I haven't seen this issue since. It could be coincidence, but maybe this is a half-decent workaround until Qualcomm fixes the firmware bug.

This is the OpenWrt syslog, as printed by the logread command.

cc @ansuel @robimarko

I have been debugging the ath11k firmware crash "when a wifi client disassociates" with my DL-WRX36. I have only been testing with the 5GHz radio, as that one has the more complex config.

I originally thought that the crash might be about WPA3 or 802.11r, but apparently not.

My experiments have given me a hypothesis about when the crash occurs. The evidence is thin so far (only a few weeks of real-life experimenting), so I hope that others can verify my thoughts:

  • The crash does not occur if there is just one SSID on the radio
  • The crash does not occur if the disassociating client is connected to the first SSID of the radio.
  • The crash may occur if the client is associated with the second or third SSID.

Reorganising the SSID order in my wifi config file has (so far) stopped the crashes, as the 5GHz clients are currently connected to the first SSID in the file.
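
"SSID order" here means the order of the wifi-iface sections in /etc/config/wireless. A quick way to see that order is sketched below; the function name `iface_order` is made up, and it assumes `uci show wireless` lists sections in file order with single-quoted values.

```shell
#!/bin/sh
# iface_order: read `uci show wireless` output on stdin and print, in file
# order, the radio (device) and SSID of each section that defines them.
iface_order() {
    sed -n \
        -e "s/^wireless\.\([^.=]*\)\.device='\(.*\)'$/\1 device=\2/p" \
        -e "s/^wireless\.\([^.=]*\)\.ssid='\(.*\)'$/\1 ssid=\2/p"
}
```

With `uci show wireless | iface_order`, the first section listed for a given radio would be the "first SSID" in the sense of the hypothesis above. Moving a section can be done by editing the file by hand; as far as I know `uci reorder wireless.<section>=<position>` followed by `uci commit wireless` does the same, but treat that as an assumption and check the result before reloading WiFi.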

This makes me think that there is a bug in the firmware, where somewhere in the code (in "color change" / "sta_info_destroy_addr_bss" / "sta_set_sinfo"?) the firmware erroneously always tries to handle the first SSID.

I will next try reorganising the SSIDs to a different order again, so that all moving clients are connected to the second SSID.

PS:
The earlier observations of @Catfriend1 seem to contradict my own evidence so far, so I am not yet really sure about my conclusions, but I still wanted to mention my thoughts.

I think we experience the same, regarding your hypothesis.

My ath11k radio runs these three SSIDs, in this order:

  • Mesh SSID
  • Client SSID A
  • Client SSID B

So mesh would be my "first" from the driver's point of view. A and B are thus second and third, and they show the disconnect and reconnect problems you've described above.

Hi @Catfriend1, if you're running one of the latest snapshots, could you please post the output of:

cat /var/run/hostapd-phy*.conf | grep he_bss_color

I just wanna see if anyone else got 128...
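
For context, hostapd documents `he_bss_color` as a value from 1 to 63, so 128 would be outside that range. A small sketch that flags such values in the generated config is below; the function name `check_bss_color` is made up, and it only assumes plain `key=value` lines as in hostapd config files.

```shell
#!/bin/sh
# check_bss_color: read hostapd config text on stdin and report whether
# each he_bss_color value lies inside the documented 1-63 range.
check_bss_color() {
    awk -F= '$1 == "he_bss_color" {
        status = ($2 >= 1 && $2 <= 63) ? "ok" : "outside 1-63"
        print $1 "=" $2 ": " status
    }'
}
```

Usage on the router would be `cat /var/run/hostapd-phy*.conf | check_bss_color`.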

What do you mean by reorganizing the WiFi?