802.11s mesh stops working randomly

Hello folks,
I have an issue regarding 802.11s on my Archer C7 v2.
It's a small setup with the Archer C7 and 2 TL-WR841 (MT76) with a quite minimal build. I basically only have the WiFi stuff and dropbear running on them.
The issue is that the mesh network randomly stops working (after anything from a few minutes to several hours).
The WR841s still show as connected in LuCI, but I can't reach them: pings fail, and although clients can still connect to them, no traffic goes through at all.
The C7 is the router connected to LAN and WAN and is doing all the "heavy" work (DHCP, DNS...). The TL-WR841 are just dumb APs for range extension.

To get everything working again I have to restart the wireless interface (wifi down && wifi up) on the C7. Restarting the WR841 has no effect.

There are no related entries in system or kernel log and I am running out of ideas.
I am in no way a professional in networking stuff and maybe I just misconfigured something.

The configs look like this:

config wifi-iface 'wifinet0'
        option device 'radio0'
        option mode 'ap'
        option ssid '***'
        option key '***'
        option wpa_disable_eapol_key_retries '1'
        option network 'wireless'
        option dtim_period '2'
        option ft_psk_generate_local '1'
        option nasid 'a63f9cef'
        option mobility_domain '3fa6'
        option ieee80211r '1'
        option ft_over_ds '0'
        option encryption 'psk2+ccmp'
        option ieee80211w '0'

config wifi-iface 'mesh'
        option network 'mesh wireless'
        option device 'radio0'
        option mode 'mesh'
        option mesh_id '***'
        option mesh_fwding '0'
        option encryption 'sae'
        option key '***'
        option mesh_rssi_threshold '-80'

The WR841s have other nasids of course and bridged LAN and WiFi with STP enabled. Otherwise the config is the same.

Memory doesn't seem to be an issue.
C7:

root@OpenWrt:~# free
              total        used        free      shared  buff/cache   available
Mem:         124512       50644       55508        2688       18360       39896
Swap:             0           0           0

one of the WR841:

root@OpenWrt:~# free
              total        used        free      shared  buff/cache   available
Mem:          58648       13036       37148          52        8464       29772
Swap:             0           0           0

I am using 19.07 with wpad-mesh-wolfssl on all devices. Mesh-point is supported on all devices (according to iw).

Any ideas?

Try to use non ct ath10k drivers.

Unless something is crashing (you say there is no indication of this), in my experience the mesh links are probably breaking down due to interference or a drop in signal strength, for whatever reason.
You need to set:
mesh_fwding='1'
mesh_gate_announcements='1' on at least one device but all 3 would be fine.
mesh_rssi_threshold='-80'

The first two allow the mesh to reconfigure itself and self-heal its layer 2 "routing". On startup they are effectively set automatically, but only temporarily; if something goes wrong later and they have not been set permanently, a wireless restart is required.

The last one sets the minimum signal strength required to stay in the mesh. In a noisy environment you might have to tune it (-75, -70, ...) until the mesh is stable. This also affects range, of course.

These HAVE to be set using the iw utility though, at startup and after a wifi command or network restart. They do not take effect in the uci config as they have to be set AFTER the interface is up. They are conveniently put in the config so you can run a script to pick them up and set using iw. I just run a background process that checks every 30 seconds or so and sets if necessary.
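
A rough sketch of such a "pick them up from the config and set them with iw" script is shown here. It is only an illustration: the interface name `mesh0` is an assumption (check `iw dev` for the real one), the uci section `wireless.mesh` is taken from the config posted above, and `apply_mesh_params` is a made-up helper name.

```shell
#!/bin/sh
# Sketch: copy mesh parameters from the uci wireless config to the running
# interface with iw. MESH_IF and UCI_SECTION are assumptions, adjust them.
MESH_IF="${MESH_IF:-mesh0}"
UCI_SECTION="${UCI_SECTION:-wireless.mesh}"

apply_mesh_params() {
    for param in mesh_fwding mesh_gate_announcements mesh_rssi_threshold; do
        # Skip options that are missing or empty in the config.
        value="$(uci -q get "$UCI_SECTION.$param")" || continue
        [ -n "$value" ] || continue
        iw dev "$MESH_IF" set mesh_param "$param=$value"
    done
}
```

Adding a call to `apply_mesh_params` at the end of the file makes it usable after `wifi up`; wrapping that call in a loop with `sleep 30` gives the background-process variant described above.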

Of course you might have some other problem here, but..... works for me with what looks like the same symptoms.

I know the ct drivers can cause some issues, but that wasn't the case here.

Setting gate announcements via iw solved the problem for me.
I didn't know that those settings aren't applied automatically post-up, so, as suggested, I run a script every minute via cron. I think a post-up action or a hotplug.d script should also work, but cron was the simplest solution.
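
For anyone who wants to do the same, a cron-friendly check could look something like the sketch below. It only calls `iw ... set` when the running value differs from the wanted one. The interface name `mesh0` and the helper name `ensure_mesh_param` are assumptions for illustration, not what I actually run.

```shell
#!/bin/sh
# Sketch for a cron job: re-apply a mesh parameter only when the running
# value differs from the wanted one. MESH_IF is an assumption, adjust it.
MESH_IF="${MESH_IF:-mesh0}"

ensure_mesh_param() {
    param="$1"
    wanted="$2"
    # "iw ... get mesh_param" may append a unit (e.g. "-80 dBm"), so only
    # the first word of the first output line is compared.
    current="$(iw dev "$MESH_IF" get mesh_param "$param" 2>/dev/null | awk 'NR == 1 { print $1 }')"
    if [ "$current" != "$wanted" ]; then
        iw dev "$MESH_IF" set mesh_param "$param=$wanted"
    fi
}
```

With a line such as `ensure_mesh_param mesh_gate_announcements 1` appended, the script can be run from a crontab entry like `* * * * * /root/mesh-params.sh` (the path is hypothetical).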
Thank you so much for your help and suggestions, both of you.

Btw, have you been able to encrypt the mesh with the ct drivers? For me, mesh encryption was only doable with the non-ct drivers and WPA3-SAE.

I had it running using the ct drivers. Same config as above.
I already switched to non-ct drivers some weeks ago, because of the mesh breaking down (didn't help, obviously), but SAE was never a problem.

I can successfully configure mesh encryption through LuCI, but the mesh doesn't start up. The log says:

Thu Apr 29 16:36:47 2021 kern.warn kernel: [  663.095209] ath10k_pci 0000:00:00.0: 10.1 wmi init: vdevs: 16  peers: 127  tid: 256
Thu Apr 29 16:36:47 2021 kern.info kernel: [  663.112543] ath10k_pci 0000:00:00.0: wmi print 'P 128 V 8 T 410'
Thu Apr 29 16:36:47 2021 kern.info kernel: [  663.119029] ath10k_pci 0000:00:00.0: wmi print 'msdu-desc: 1424  sw-crypt: 0 ct-sta: 0'
Thu Apr 29 16:36:47 2021 kern.info kernel: [  663.127199] ath10k_pci 0000:00:00.0: wmi print 'alloc rem: 24984 iram: 38672'
Thu Apr 29 16:36:47 2021 kern.warn kernel: [  663.206527] ath10k_pci 0000:00:00.0: pdev param 0 not supported by firmware
Thu Apr 29 16:36:47 2021 kern.warn kernel: [  663.213856] ath10k_pci 0000:00:00.0: must load driver with rawmode=1 to add mesh interfaces

What am I missing here? It only works with the exact same configuration if I switch to NON-CT ath10k drivers.

Try setting the correct Country Code under "Device Configuration -> Advanced Settings"

Edit: If you already did that, try "driver default" (don't know if it's still legal to run in your area though. Do so at your own risk)

My country code is DE (Germany). I'll try driver default next.

Driver default did not help. I still get IF_UP_ERROR for the mesh interface when using the ct ath10k drivers. I switched to the non-ct ath10k drivers and the mesh started working again immediately.

The problem occurs when I'm using the ct ath10k drivers and wpad-mesh-wolfssl (instead of the default wpad-basic-wolfssl package) on 21.02rc.1. The hardware is a TP-Link Archer C7 v5. What should I do now?

I can't reproduce this on my C7 v2. If you want to get it running with the ct drivers, you should consider opening a new topic. I had it running using the ct drivers and wpad-mesh-wolfssl. Maybe it has something to do with the SoC/chipset of the v5? I don't know, sorry.

Here are my installed packages:

root@OpenWrt:~# opkg list-installed
adblock - 4.0.7-5
ath10k-firmware-qca988x - 2019-10-03-d622d160-1
base-files - 204.2-r11306-c4a6851c72
busybox - 1.30.1-6
ca-bundle - 20200601-1
cgi-io - 19
coreutils - 8.30-2
coreutils-sort - 8.30-2
curl - 7.66.0-3
dropbear - 2019.78-2
firewall - 2019-11-22-8174814a-3
fstools - 2020-05-12-84269037-1
fwtool - 2
getrandom - 2019-06-16-4df34a4d-3
hostapd-common - 2019-08-08-ca8c2bd2-5
ip6tables - 1.8.3-1
iptables - 1.8.3-1
iw - 5.0.1-1
iwinfo - 2019-10-16-07315b6f-1
jshn - 2020-05-25-66195aee-1
jsonfilter - 2018-02-04-c7e938d6-1
kernel - 4.14.221-1-b84a5a29b1d5ae1dc33ccf9ba292ca1d
kmod-ath - 4.14.221+4.19.161-1-1
kmod-ath10k - 4.14.221+4.19.161-1-1
kmod-ath9k - 4.14.221+4.19.161-1-1
kmod-ath9k-common - 4.14.221+4.19.161-1-1
kmod-cfg80211 - 4.14.221+4.19.161-1-1
kmod-gpio-button-hotplug - 4.14.221-3
kmod-hwmon-core - 4.14.221-1
kmod-ip6tables - 4.14.221-1
kmod-ipt-conntrack - 4.14.221-1
kmod-ipt-core - 4.14.221-1
kmod-ipt-ipset - 4.14.221-1
kmod-ipt-nat - 4.14.221-1
kmod-ipt-offload - 4.14.221-1
kmod-lib-crc-ccitt - 4.14.221-1
kmod-mac80211 - 4.14.221+4.19.161-1-1
kmod-nf-conntrack - 4.14.221-1
kmod-nf-conntrack-netlink - 4.14.221-1
kmod-nf-conntrack6 - 4.14.221-1
kmod-nf-flow - 4.14.221-1
kmod-nf-ipt - 4.14.221-1
kmod-nf-ipt6 - 4.14.221-1
kmod-nf-nat - 4.14.221-1
kmod-nf-reject - 4.14.221-1
kmod-nf-reject6 - 4.14.221-1
kmod-nfnetlink - 4.14.221-1
kmod-nls-base - 4.14.221-1
kmod-phy-ath79-usb - 4.14.221-1
kmod-ppp - 4.14.221-1
kmod-pppoe - 4.14.221-1
kmod-pppox - 4.14.221-1
kmod-slhc - 4.14.221-1
kmod-usb-core - 4.14.221-1
kmod-usb-ehci - 4.14.221-1
kmod-usb-ledtrig-usbport - 4.14.221-1
kmod-usb2 - 4.14.221-1
libblobmsg-json - 2020-05-25-66195aee-1
libc - 1.1.24-2
libcurl4 - 7.66.0-3
libevent2-7 - 2.1.11-1
libevent2-core7 - 2.1.11-1
libevent2-pthreads7 - 2.1.11-1
libgcc1 - 7.5.0-2
libgmp10 - 6.1.2-2
libip4tc2 - 1.8.3-1
libip6tc2 - 1.8.3-1
libiwinfo-lua - 2019-10-16-07315b6f-1
libiwinfo20181126 - 2019-10-16-07315b6f-1
libjson-c2 - 0.12.1-3.1
libjson-script - 2020-05-25-66195aee-1
liblua5.1.5 - 5.1.5-3
liblucihttp-lua - 2019-07-05-a34a17d5-1
liblucihttp0 - 2019-07-05-a34a17d5-1
libmbedtls12 - 2.16.9-1
libmnl0 - 1.0.4-2
libncurses6 - 6.1-5
libnetfilter-conntrack3 - 2018-05-01-3ccae9f5-2
libnettle7 - 3.5.1-1
libnfnetlink0 - 1.0.1-3
libnl-tiny - 0.1-5
libopenssl1.1 - 1.1.1j-1
libpcap1 - 1.9.1-2.1
libpthread - 1.1.24-2
libubox20191228 - 2020-05-25-66195aee-1
libubus-lua - 2019-12-27-041c9d1c-1
libubus20191227 - 2019-12-27-041c9d1c-1
libuci20130104 - 2019-09-01-415f9e48-4
libuclient20160123 - 2020-06-17-51e16ebf-1
libunbound-light - 1.13.1-1
libwolfssl24 - 4.7.0-stable-1
libxtables12 - 1.8.3-1
logd - 2019-06-16-4df34a4d-3
lua - 5.1.5-3
luci - git-21.044.30835-34e0d65-1
luci-app-adblock - git-21.079.58580-41ab871-1
luci-app-firewall - git-21.044.30835-34e0d65-1
luci-app-opkg - git-21.044.30835-34e0d65-1
luci-app-unbound - git-21.079.58580-41ab871-1
luci-base - git-21.044.30835-34e0d65-1
luci-compat - git-21.079.58580-41ab871-1
luci-lib-ip - git-21.044.30835-34e0d65-1
luci-lib-jsonc - git-21.044.30835-34e0d65-1
luci-lib-nixio - git-21.044.30835-34e0d65-1
luci-mod-admin-full - git-21.044.30835-34e0d65-1
luci-mod-network - git-21.044.30835-34e0d65-1
luci-mod-status - git-21.044.30835-34e0d65-1
luci-mod-system - git-21.044.30835-34e0d65-1
luci-proto-ipv6 - git-21.044.30835-34e0d65-1
luci-proto-ppp - git-21.044.30835-34e0d65-1
luci-theme-bootstrap - git-21.044.30835-34e0d65-1
mtd - 24
nano - 5.6.1-1
netifd - 2021-01-09-753c351b-1
odhcp6c - 2021-01-09-64e1b4e7-16
odhcpd - 2020-05-03-49e4949c-3
openssh-sftp-server - 8.0p1-1
openwrt-keyring - 2019-07-25-8080ef34-1
opkg - 2021-01-31-c5dccea9-1
ppp - 2.4.7.git-2019-05-25-3
ppp-mod-pppoe - 2.4.7.git-2019-05-25-3
procd - 2020-03-07-09b9bd82-1
rpcd - 2020-05-26-67c8a3fd-1
rpcd-mod-file - 2020-05-26-67c8a3fd-1
rpcd-mod-iwinfo - 2020-05-26-67c8a3fd-1
rpcd-mod-luci - 20201107
rpcd-mod-rrdns - 20170710
swconfig - 12
tcpdump-mini - 4.9.3-2
terminfo - 6.1-5
uboot-envtools - 2018.03-3.1
ubox - 2019-06-16-4df34a4d-3
ubus - 2019-12-27-041c9d1c-1
ubusd - 2019-12-27-041c9d1c-1
uci - 2019-09-01-415f9e48-4
uclient-fetch - 2020-06-17-51e16ebf-1
uhttpd - 2020-10-01-3abcc891-1
unbound-control - 1.13.1-1
unbound-daemon - 1.13.1-1
urandom-seed - 1.0-1
urngd - 2020-01-21-c7f7b6b6-1
usign - 2020-05-23-f1f65026-1
wireless-regdb - 2020.11.20-1
wpad-mesh-wolfssl - 2019-08-08-ca8c2bd2-7
zlib - 1.2.11-3

@Tekagi01 OK, I will do a comparison. Good list, thanks! I'm also using batman-adv and will check whether it works with the ct + mesh_fwding combination.

Ok my problem is exactly this one: Ath10k-ct with 802.11s - #8 by mustafan

batman-adv does not work in mesh mode with the ath10k-ct drivers! That user also got the SIOCSIFFLAGS "invalid" and "busy" errors in the log, like me, until he switched to the non-ct ath10k drivers to get it working.

Wave 1 devices, those with the QCA988x abbreviation, can't do (encrypted) 802.11s mesh with the ath10k-ct driver. (I seem to remember that unencrypted mesh was possible, but I wouldn't try it.)

Ok noted. Will the non-ct driver be part of the longterm OpenWrt future?

Yes, my first setup back in 2020 was with unencrypted mesh using the ct driver.

It does work with SAE in 19.07. (I only tried pure 802.11s, no batman-adv) Maybe it didn't in older releases, but now it does for sure.
I used the firmware which came with the official sysupgrade package.
I meshed with MT76 using the same SAE enabled config, so there should be no hidden encryption "downgrade" by the firmware.

I won't disagree; I was misleading in saying "ct-driver". I meant the ct-firmware, specifically the 10.1 ct-firmware, which usually comes together with the ct-driver and won't work with encrypted mesh. (In my experience there is no reason to mix them anyway; it gets worse than ct-only, and the hassle with rawmode=1 for unencrypted mesh is not really a gift.)
Which mt76 wifi adapter do you have? Have you disabled A-MSDU in the mt76 driver? I can't mesh an ath10k adapter with an mt7612 adapter without getting <56k modem speeds.

How to know which devices are Wave-1 or Wave-2?

https://www.candelatech.com/ath10k.php


* [Ath10k CT 10.1 Firmware (QCA9880, QCA9882, QCA9887 chipsets)](https://www.candelatech.com/ath10k-10.1.php)
* [Ath10k CT 10.4 Firmware (QCA9980, QCA9984, QCA9886, QCA9888 wave-2 chipsets)](https://www.candelatech.com/ath10k-10.4.php)

10.1 is for wave-1 and 10.4 is for wave-2.

Or check the device information for "Wi-Fi Standards: 802.11ac Wave 2,...".
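
If you want that lookup in script form, here is a small sketch that covers only the chipsets named in the two firmware links above; anything else is reported as unknown. The function name `ath10k_wave` is made up for this example.

```shell
#!/bin/sh
# Sketch: map an ath10k chipset name to its generation, using only the
# chipset lists from the two Candela Technologies firmware pages.
ath10k_wave() {
    case "$1" in
        QCA9880|QCA9882|QCA9887)
            echo "wave-1 (10.1 firmware)" ;;
        QCA9980|QCA9984|QCA9886|QCA9888)
            echo "wave-2 (10.4 firmware)" ;;
        *)
            echo "unknown" ;;
    esac
}
```

For example, `ath10k_wave QCA9880` prints `wave-1 (10.1 firmware)`.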
