IPQ806x NSS Drivers

@ACwifidude there is apparently a bug; is it in vanilla OpenWrt or in your build? My 5 GHz radio gets disabled... see the full dmesg log:


apparently something with ath drivers...


[19382.093612] ath10k_pci 0001:01:00.0: wmi: fixing invalid VHT TX rate code 0xff
[168813.440524] device wlan0 left promiscuous mode
[168813.440696] br-lan: port 3(wlan0) entered disabled state
[168813.484709] ath10k_pci 0000:01:00.0: mac flush null vif, drop 0 queues 0xffff
[168813.486047] wlan0: Destroyed NSS virtual interface
[168813.486466] ath10k_pci 0000:01:00.0: mac-vif-chan had error in htt_rx_h_vdev_channel, peer-id: 0  vdev-id: 0 peer-addr: 8c:3b:ad:ba:b5:60.
[168813.495615] ath10k_pci 0000:01:00.0: No VIF found for vdev 0
[168813.508140] ------------[ cut here ]------------
[168813.514097] WARNING: CPU: 0 PID: 0 at target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/ath10k-ct-regular/ath10k-ct-2021-09-22-e6a7d5b5/ath10k-5.10/htt_rx.c:1096 ath10k_htt_rx_pktlog_completion_handler+0x54c/0x1790 [ath10k_core]
[168813.518701] Modules linked in: ecm ath10k_pci ath10k_core ath wireguard mac80211 libchacha20poly1305 libblake2s ipt_REJECT curve25519_neon cfg80211 xt_time xt_tcpudp xt_tcpmss xt_statistic xt_state xt_recent xt_quota xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_addrtype xt_TCPMSS xt_REDIRECT xt_MASQUERADE xt_LOG xt_HL xt_FLOWOFFLOAD xt_DSCP xt_CT xt_CLASSIFY sch_cake ppp_async poly1305_arm nf_reject_ipv4 nf_log_ipv4 nf_flow_table_hw nf_flow_table nf_conntrack_netlink nf_conncount libcurve25519_generic libblake2s_generic iptable_raw iptable_nat iptable_mangle iptable_filter ipt_ECN ip_tables exfat crc_ccitt compat chacha_neon fuse sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_tcindex cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact qca_nss_qdisc qca_nss_pppoe pppoe pppox ppp_generic slhc ledtrig_usbport cryptodev
[168813.518799]  xt_set ip_set_list_set ip_set_hash_netportnet ip_set_hash_netport ip_set_hash_netnet ip_set_hash_netiface ip_set_hash_net ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink ip6table_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6t_NPT nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 nfsv4 nfsv3 nfs msdos bonding ifb ip6_udp_tunnel udp_tunnel sit qca_nss_drv qca_nss_gmac tunnel4 ip_tunnel tun xfrm_user xfrm_ipcomp af_key xfrm_algo vfat fat lockd sunrpc grace hfsplus hfs dns_resolver dm_mirror dm_region_hash dm_log dm_crypt nls_utf8 nls_iso8859_15 nls_iso8859_1 nls_cp850 nls_cp437 nls_cp1250 wp512 twofish_generic twofish_common tgr192 tea serpent_generic khazad cast6_generic cast5_generic cast_common camellia_generic blowfish_generic blowfish_common anubis xts crypto_user
[168813.605533]  algif_skcipher algif_rng algif_hash algif_aead af_alg sha1_generic md5 kpp gf128mul echainiv ecb des_generic libdes cbc authenc dm_mod dax uas usb_storage leds_gpio xhci_plat_hcd xhci_pci xhci_hcd dwc3 dwc3_qcom ohci_platform ohci_hcd phy_qcom_ipq806x_usb ahci fsl_mph_dr_of ehci_platform ehci_fsl sd_mod ahci_platform libahci_platform libahci libata scsi_mod ehci_hcd gpio_button_hotplug f2fs ext4 mbcache jbd2 crc32c_generic crc32_generic
[168813.732000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.171 #0
[168813.754182] Hardware name: Generic DT based system
[168813.760355] [<c030f974>] (unwind_backtrace) from [<c030b968>] (show_stack+0x14/0x20)
[168813.765044] [<c030b968>] (show_stack) from [<c0908dd8>] (dump_stack+0x94/0xa8)
[168813.773031] [<c0908dd8>] (dump_stack) from [<c031e944>] (__warn+0xb4/0xd0)
[168813.780143] [<c031e944>] (__warn) from [<c031e9b0>] (warn_slowpath_fmt+0x50/0x90)
[168813.787110] [<c031e9b0>] (warn_slowpath_fmt) from [<bfa02f84>] (ath10k_htt_rx_pktlog_completion_handler+0x54c/0x1790 [ath10k_core])
[168813.794845] [<bfa02f84>] (ath10k_htt_rx_pktlog_completion_handler [ath10k_core]) from [<bfa031a8>] (ath10k_htt_rx_pktlog_completion_handler+0x770/0x1790 [ath10k_core])
[168813.806846] [<bfa031a8>] (ath10k_htt_rx_pktlog_completion_handler [ath10k_core]) from [<bfa07a20>] (ath10k_htt_rx_proc_rx_frag_ind_hl+0xc08/0x1050 [ath10k_core])
[168813.821861] [<bfa07a20>] (ath10k_htt_rx_proc_rx_frag_ind_hl [ath10k_core]) from [<bfa085ec>] (ath10k_htt_txrx_compl_task+0x784/0x1144 [ath10k_core])
[168813.836168] [<bfa085ec>] (ath10k_htt_txrx_compl_task [ath10k_core]) from [<bfa65948>] (ath10k_pci_napi_poll+0x60/0x14c [ath10k_pci])
[168813.849621] [<bfa65948>] (ath10k_pci_napi_poll [ath10k_pci]) from [<c0795d34>] (__napi_poll+0x34/0x168)
[168813.861569] [<c0795d34>] (__napi_poll) from [<c0796088>] (net_rx_action+0xd8/0x21c)
[168813.871116] [<c0796088>] (net_rx_action) from [<c0302298>] (__do_softirq+0x130/0x2d4)
[168813.878842] [<c0302298>] (__do_softirq) from [<c0322d30>] (irq_exit+0xbc/0xe0)
[168813.886568] [<c0322d30>] (irq_exit) from [<c036fba0>] (__handle_domain_irq+0x6c/0xd0)
[168813.893780] [<c036fba0>] (__handle_domain_irq) from [<c05e053c>] (gic_handle_irq+0x5c/0xb8)
[168813.901759] [<c05e053c>] (gic_handle_irq) from [<c0301a8c>] (__irq_svc+0x6c/0x90)
[168813.910348] Exception stack(0xc0c01ee0 to 0xc0c01f28)
[168813.917734] 1ee0: 00000000 00009988 1ce4e000 dd98fa80 dcc13000 00000000 dd98ee30 00009988
[168813.922859] 1f00: 00009988 00000000 f3e46360 f3de59c0 00000015 c0c01f30 c0735460 c0735464
[168813.931098] 1f20: 80000013 ffffffff
[168813.939348] [<c0301a8c>] (__irq_svc) from [<c0735464>] (cpuidle_enter_state+0x94/0x498)
[168813.943080] [<c0735464>] (cpuidle_enter_state) from [<c07358ac>] (cpuidle_enter+0x30/0x4c)
[168813.951151] [<c07358ac>] (cpuidle_enter) from [<c034ae34>] (do_idle+0x1d8/0x240)
[168813.959222] [<c034ae34>] (do_idle) from [<c034b144>] (cpu_startup_entry+0x1c/0x20)
[168813.966867] [<c034b144>] (cpu_startup_entry) from [<c0b00e5c>] (start_kernel+0x4dc/0x4ec)
[168813.974387] ---[ end trace 96b0b29aad2b71ff ]---
[168813.982709] ath10k_pci 0000:01:00.0: No VIF found for vdev 0
[168813.987449] ath10k_pci 0000:01:00.0: No VIF found for vdev 0


any idea?

I restarted the 5 GHz Wi-Fi via LuCI and it works.

[237051.200176] device wlan0 left promiscuous mode
[237051.200286] br-lan: port 3(wlan0) entered disabled state
[237051.250738] wlan0: Destroyed NSS virtual interface
[237051.250971] ath10k_pci 0000:01:00.0: peer-unmap-event: unknown peer id 0
[237057.727068] ath10k_pci 0000:01:00.0: 10.4 wmi init: vdevs: 16  peers: 48  tid: 96
[237057.727096] ath10k_pci 0000:01:00.0: msdu-desc: 2500  skid: 32
[237057.809052] ath10k_pci 0000:01:00.0: wmi print 'P 48/48 V 16 K 144 PH 176 T 186  msdu-desc: 2500  sw-crypt: 0 ct-sta: 0'
[237057.809876] ath10k_pci 0000:01:00.0: wmi print 'free: 84920 iram: 13156 sram: 11224'
[237058.193616] ath10k_pci 0000:01:00.0: rts threshold -1
[237058.193925] ath10k_pci 0000:01:00.0: Firmware lacks feature flag indicating a retry limit of > 2 is OK, requested limit: 4
[237058.197908] debugfs: File 'virt_if' in directory 'stats' already present!
[237058.209207] wlan0: Created a NSS virtual interface
[237058.219449] br-lan: port 3(wlan0) entered blocking state
[237058.220470] br-lan: port 3(wlan0) entered disabled state
[237058.226147] device wlan0 entered promiscuous mode

But with behavior like this I would call the device unstable... ;/

any idea here on the previous post? thx

After the restart any further issues?

after restart it works - for now.

@ACwifidude I assume that since it happened once, it can happen again...?

@quarky @KONG
Seems like this commit breaks the 999-mac80211-NSS-support patch?
It doesn't apply anymore. It should be an easy fix, though, but I have no idea how it will impact the NSS code.

It should not affect the NSS flow. Just remove from the NSS patch the lines that the latest mac80211 patch removes.


@ACwifidude it happened again.
It's too unstable to be used properly.

Is that an issue with your image or with OpenWrt in general?

Never seen that myself...

Well, the logs are here as proof... so it's either a software or a hardware problem. No clue?

I've never debugged Linux warnings/oops, so I can't tell what that error actually is.
@quarky @ansuel Sorry to tag you, but any ideas?

First look appears to be related to the ath10k driver. Not sure if it’s related to the issue I’m encountering.

Agree with @quarky - looks like errors with ath (general OpenWrt problem). There is good discussion in the exploration thread on troubleshooting more recent wifi issues some people are having. Might be related or might be another bug.

I was tired of seeing NSS_TX_FAILURE_TOO_SHORT on wlan1 in the logs all the time (things seem to work fine despite that failure). This patch just stops the log spam; it doesn't fix the actual issue. I think I got the code right; it doesn't crash anyway :slight_smile:
Should be applied after the 999-mac80211-NSS-support.patch that's in @ACwifidude's repository.

--- a/net/mac80211/iface.c	2022-02-17 09:28:56.041204675 +0100
+++ b/net/mac80211/iface.c	2022-02-17 09:27:50.454300512 +0100
@@ -1206,8 +1206,8 @@
 		skb_push(skb, ETH_HLEN);
 		ret = nss_virt_if_tx_buf(sdata->nssctx, skb);
 		if (unlikely(ret)) {
-			if (net_ratelimit()) {
-				sdata_err(sdata, "NSS TX failed with error: %s\n",
+			if (net_ratelimit() && ret != NSS_TX_FAILURE_TOO_SHORT) {
+				sdata_err(sdata, "NSS TX failed with error: %s\n",
 			skb_pull(skb, ETH_HLEN);

Patch updated to what quarky mentioned below.

Your code will not work. You can't compare two strings in that way. You should still see the error if I'm not wrong. This should work:

if (net_ratelimit() && ret != NSS_TX_FAILURE_TOO_SHORT) {
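To make the intent of that condition concrete, here is a minimal user-space sketch, not the actual kernel code. It assumes `ret` is an integer status code rather than text, which is why a plain `!=` works on it. The `demo_*` enum and helpers are hypothetical stand-ins for the NSS driver's definitions, and `ratelimit_ok` stands in for the result of `net_ratelimit()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the NSS TX status codes. The real enum is
 * defined in the NSS driver headers; these names and values are for
 * illustration only. */
enum demo_tx_status {
	DEMO_TX_SUCCESS = 0,
	DEMO_TX_FAILURE_QUEUE,
	DEMO_TX_FAILURE_TOO_SHORT,
};

/* Hypothetical helper mapping a status code to a printable name. */
static const char *demo_tx_status_name(enum demo_tx_status status)
{
	switch (status) {
	case DEMO_TX_SUCCESS:
		return "SUCCESS";
	case DEMO_TX_FAILURE_QUEUE:
		return "FAILURE_QUEUE";
	case DEMO_TX_FAILURE_TOO_SHORT:
		return "FAILURE_TOO_SHORT";
	}
	return "UNKNOWN";
}

/*
 * Decide whether a TX result should be logged. ratelimit_ok stands in
 * for the kernel's net_ratelimit() result. The status code is compared
 * directly as an integer; comparing text would need strcmp() on the
 * name, not != on two char pointers.
 */
static bool demo_should_log(enum demo_tx_status ret, bool ratelimit_ok)
{
	if (ret == DEMO_TX_SUCCESS)
		return false;
	return ratelimit_ok && ret != DEMO_TX_FAILURE_TOO_SHORT;
}
```

With this shape, a too-short failure is never logged, while any other failure is still logged whenever the rate limiter allows it. The `skb_pull()` cleanup in the patch is outside this sketch and still has to run for every failure.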

The error actually disappeared from the logs, but I'm sure your code is more correct. It sure makes more sense than what I did :slight_smile: ret should contain the string of course

I can add that in the next build. Is it possible to fix the origin of the too-short error, or is this just log spam in your opinion? (@shelterx, I too don’t know what the error means and haven’t noticed any issues related to it.)

The issue seems to be a result of having mesh nodes in the network. Those nodes seem to send zero-length frames, so I think there’s nothing that can be done at the AP end.

The TX error on wlan1 has got to be from the only device I have connected to the 2.4 GHz Wi-Fi, which is a Yamaha receiver. All other devices are connected to the 5 GHz Wi-Fi.

I don't have any mesh networking going on... Just an extender that I use as a bridge, so I have Ethernet clients behind it. It doesn't actually extend the Wi-Fi; that function is turned off.

I can reproduce this Wi-Fi crash issue. It happens in exactly the same way on my R7800. The version I am using is OpenWrt 21.02-SNAPSHOT r16474+17-97b95ef8b9, from the image named R7800-20220127-Stable2012NSS-factory. Hope the information helps. The log is the same.