Wireless instability on ath10k radios after upgrade to 21.02.1

For the WRT3200ACM on 21.02.1, this bug is likely causing at least some of the trouble (it is already fixed in the 21.02 branch, but not in the older 21.02.1 release).

Thanks @hnyman. I'll keep my eye out for the 21.02.2 and give it another whirl.

Channel 2 overlaps with other channels on the 2.4 GHz band.

Select either channel 1, 6, or 11

i've switched to channel 1 and i'm still having this issue.

the problem i'm experiencing most consistently is that the device is still associated (i can see it in wireless section, excellent signal strength), it has a dhcp lease, but 100% packet loss to the router.

cycling the wifi on the client solves it, but sometimes it fails again as little as 5 minutes later. this is happening to several devices on the network.

absolutely nothing of interest in dmesg or the system logs. the routes look right. i've completely disabled the firewall on the client.

one of the clients is an Intel Comet Lake PCH-LP CNVi WiFi (iwlwifi)

Try changing your channel width from 40 to 20.

changing channel width did not help. i also notice that this is not happening to all devices at the same time. individual devices will independently stop being able to ping the router.

Couple of things to look at...

What settings are you using for wireless security?

Not all devices are WPA3 capable.

Any legacy devices?

Some may need 802.11b to work.

the settings are all the same as they were on 19.07.8. this is happening with modern devices like a Gen 8 Lenovo Thinkpad X1 Carbon (with the aforementioned iwlwifi card).

When you upgraded to 21.02.1, did you keep your 19.07.8 configs, or re-configure from scratch?

There were several changes associated with the move from swconfig to DSA.

From the 21.02.0 release announcement...

The following targets are using a switch managed with DSA in OpenWrt 21.02:

    ath79 (only TP-Link TL-WR941ND)
    mediatek (most boards)
    ramips (mt7621 subtarget only)

The Turris Omnia target is mvebu, which also uses DSA in 21.02.

Which means your configs were not upgradable from 19.07.8 to 21.02.1.

looking back at my old configuration (i have biweekly backups dating back to 2021-05), /etc/config/wireless from my earliest backup is completely identical to what it was prior to the changes we made in this thread.

i have a machine that just entered this broken state again. it is still associated with the router. it has a DHCP lease. it cannot ping the router by IP address. the router still has it in the list of active wireless clients.

A couple of minor changes to wireless...cell density is one.

Network and other configs have changed for DSA.

I use a file diff program to compare the backup tar.gz files.
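
For anyone without a graphical diff tool, here is a rough shell sketch of the same comparison. It assumes the backups are ordinary gzip-compressed tar archives and that tar, diff and mktemp are available, so it is probably easier to run on a desktop than on the router. compare_backups and the file names in the usage line are made-up names for illustration.

```shell
# Unpack two config backup archives into a scratch directory and print a
# recursive unified diff of their contents.
# Returns 0 if identical, 1 if they differ, 2 on extraction errors.
compare_backups() {
    old_tgz="$1"
    new_tgz="$2"
    workdir="$(mktemp -d)" || return 2
    mkdir -p "$workdir/old" "$workdir/new"
    tar -xzf "$old_tgz" -C "$workdir/old" || { rm -rf "$workdir"; return 2; }
    tar -xzf "$new_tgz" -C "$workdir/new" || { rm -rf "$workdir"; return 2; }
    # -N treats files missing on one side as empty, so added/removed
    # config files show up in full.
    (cd "$workdir" && diff -ruN old new)
    status=$?
    rm -rf "$workdir"
    return $status
}
```

Usage would be something like `compare_backups backup-19.07.8.tar.gz backup-21.02.1.tar.gz | less`.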

Use the 19.07.8 configs as a reference, reset, and re-configure from scratch.

First post, 21.02.0 release announcement...

something interesting. on the router, looking at the assoclist, the RX packets continue to increase, but TX is not:

# iwinfo wlan1 assoclist
<HWADDR>  -47 dBm / -95 dBm (SNR 48)  370 ms ago
	RX: 6.0 MBit/s                                152197 Pkts.
	TX: 130.0 MBit/s, MCS 15, 20MHz               124533 Pkts.
	expected throughput: 46.3 MBit/s
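
To watch those counters over time instead of eyeballing them, something like the following could work. It is only a sketch and assumes the assoclist output keeps the layout shown above (a station line containing "dBm", followed by RX: and TX: lines ending in "<count> Pkts."). assoc_counters is a made-up helper name.

```shell
# Read `iwinfo <iface> assoclist` output on stdin and print one line per
# station in the form: <station> rx_pkts=<n> tx_pkts=<n>
assoc_counters() {
    awk '
        /dBm/         { sta = $1 }          # station header line
        /^[ \t]*RX:/  { rx = $(NF - 1) }    # packet count precedes "Pkts."
        /^[ \t]*TX:/  { print sta, "rx_pkts=" rx, "tx_pkts=" $(NF - 1) }
    '
}
```

Running `iwinfo wlan1 assoclist | assoc_counters` twice, a few seconds apart, makes it easy to see which of the two counters has stopped moving for a given station.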

i'm not sure if i'm looking at the right /sys values on the client, but it seems like there is movement on /sys/class/net/wlp0s20f3/statistics/tx_packets but not rx_packets.

so it seems like the client is still transmitting and the router is still receiving, but nothing is going back out from the router to the client. yet somehow they remain associated.
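
A small sketch for checking the client-side counters a bit more reliably than reading the files by hand. It assumes a Linux client that exposes /sys/class/net/<iface>/statistics. counter_delta is a hypothetical helper, and the 5 second default interval is arbitrary.

```shell
# Print how many packets were received and sent during an interval.
# $1: statistics directory, e.g. /sys/class/net/wlp0s20f3/statistics
# $2: interval in seconds (defaults to 5)
counter_delta() {
    stats_dir="$1"
    interval="${2:-5}"
    rx0="$(cat "$stats_dir/rx_packets")"
    tx0="$(cat "$stats_dir/tx_packets")"
    sleep "$interval"
    rx1="$(cat "$stats_dir/rx_packets")"
    tx1="$(cat "$stats_dir/tx_packets")"
    echo "rx_delta=$((rx1 - rx0)) tx_delta=$((tx1 - tx0))"
}
```

For example, `counter_delta /sys/class/net/wlp0s20f3/statistics 10` while a ping to the router is running. A tx_delta that grows while rx_delta stays at zero would match the broken state described here.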

in terms of wireless config, here is the old:

config wifi-device 'radio1'
	option type 'mac80211'
	option hwmode '11g'
	option path 'soc/soc:pcie/pci0000:00/0000:00:01.0/0000:01:00.0'
	option country 'US'
	option htmode 'HT40'
	option channel '2'

and here is the new:

config wifi-device 'radio1'
	option type 'mac80211'
	option hwmode '11g'
	option path 'soc/soc:pcie/pci0000:00/0000:00:01.0/0000:01:00.0'
	option country 'US'
	option cell_density '0'
	option channel '1'
	option htmode 'HT20'

the (enabled) wifi-iface sections are identical.

reading through the release notes, should DSA really be impacting wireless? it seems like this would only affect ethernet ports?

i'm using WPA2-PSK for all of my networks.

i'm reading through https://openwrt.org/docs/guide-developer/debugging#wireless and have increased the radio log level in the hopes that will help me troubleshoot.

approximately 24 hours ago i ran the following:

# uci set wireless.radio1.log_level=1
# uci commit wireless
# wifi up

since then no issues. however, i suspect the issue starts to happen after some period of time.

i'll report here if it happens again and will try to grab a snapshot of the management frames or at least more verbose log output.
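
For pulling out just the relevant lines when it happens, a sketch of a filter that keeps only hostapd messages mentioning one station. It assumes hostapd lines carry the literal tag "hostapd:" in the system log. sta_events is a made-up helper name, and the MAC address in the usage line is a placeholder.

```shell
# Read syslog lines on stdin and print only hostapd lines that mention
# the given station MAC address (case-insensitive match).
sta_events() {
    awk -v sta="$1" '
        BEGIN { sta = tolower(sta) }
        index($0, "hostapd:") && index(tolower($0), sta)
    '
}
```

On the router this could be used as `logread -f | sta_events aa:bb:cc:dd:ee:ff` and left running in an SSH session until the next failure.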

finally had another failure today

there is nothing in the log at the time the issue started.

when i reassociate, it looks like any normal reassociation:

Fri Jan 21 03:53:12 2022 daemon.notice hostapd: wlan1: AP-STA-DISCONNECTED <HWADDR>
Fri Jan 21 03:53:12 2022 daemon.debug hostapd: wlan1: STA <HWADDR> WPA: event 3 notification
Fri Jan 21 03:53:12 2022 daemon.debug hostapd: wlan1: STA <HWADDR> IEEE 802.1X: unauthorizing port
Fri Jan 21 03:53:12 2022 daemon.debug hostapd: wlan1: STA <HWADDR> IEEE 802.11: deauthenticated
Fri Jan 21 03:53:12 2022 daemon.debug hostapd: wlan1: STA <HWADDR> MLME: MLME-DEAUTHENTICATE.indication(<HWADDR>, 3)
Fri Jan 21 03:53:12 2022 daemon.debug hostapd: wlan1: STA <HWADDR> MLME: MLME-DELETEKEYS.request(<HWADDR>)

based on some discussion in "TP-Link Archer c2600 running poorly from 21.02 and onwards" (#18 by dipswitch), it appears that some devices benefit from switching back from the -ct driver to the non-ct ath10k driver. 21.02.1 switched the default driver to the -ct. i'll test this out and report back.

i've performed the following to switch to the non-ct version of the driver and firmware for my Turris Omnia:

# opkg update
# opkg remove ath10k-firmware-qca988x-ct kmod-ath10k-ct && opkg install ath10k-firmware-qca988x kmod-ath10k
# reboot
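
After the reboot it may be worth confirming that no -ct packages are left behind. A minimal sketch, assuming `opkg list-installed` prints one "name - version" line per package. ct_packages_left is a hypothetical helper.

```shell
# Read `opkg list-installed` output on stdin, print the names of any
# ath10k-ct driver or firmware packages, and return 0 only if at least
# one was found.
ct_packages_left() {
    awk '
        $1 ~ /^kmod-ath10k-ct/ || $1 ~ /^ath10k-firmware-.*-ct/ {
            print $1
            found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

For example: `opkg list-installed | ct_packages_left && echo "-ct packages still installed"`.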

so far everything is working fine. i'll check back in a few days to report if the stability has improved.

stability has definitely improved since switching back to the non-ct version of the kernel driver and firmware. i'd recommend anyone having trouble with 21.02.1 on a Turris Omnia try this out first before any other troubleshooting.

I have a similar problem. Even tried the ct-htt firmware but it didn't help. I will try the non-ct version, although I really hope it's somehow fixable for the ct version as well.
