In short, we have problems with the ath10k driver on 5 GHz when more than approximately 60 devices are connected to an AP based on the IPQ4019 SoC.
We tried the ath10k-ct driver and firmware. We did not see any problems there, but we noticed that this driver only allows up to 32 devices to connect.
You need to use the fwcfg logic to increase the number of peers. Keep active-peers == peers. Possibly decrease vdevs, since you probably don't need 16 and that will save some RAM. Check 'iram' and other firmware RAM usage with 'dmesg | grep iram' as you tune things, so that you know when you are getting close to the limits.
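As a rough illustration of the "keep active-peers == peers" advice, here is a minimal Python sketch that renders the text of a fwcfg file. The key names (`vdevs`, `peers`, `active_peers`) and the `key = value` syntax are assumptions on my part and should be checked against the ath10k-ct documentation; `render_fwcfg` is a made-up helper, not part of any driver tooling.

```python
def render_fwcfg(peers, vdevs=4, extra=None):
    """Render fwcfg text, keeping active_peers equal to peers.

    Key names and the 'key = value' layout are assumptions; verify them
    against the ath10k-ct documentation before using the output.
    """
    if peers < 1 or vdevs < 1:
        raise ValueError("peers and vdevs must be positive")
    settings = {"vdevs": vdevs, "peers": peers, "active_peers": peers}
    settings.update(extra or {})
    # Refuse configurations that break the advice from the thread.
    if settings["active_peers"] != settings["peers"]:
        raise ValueError("active_peers must equal peers")
    return "".join(f"{key} = {value}\n" for key, value in settings.items())


if __name__ == "__main__":
    # Example values only; the right numbers depend on firmware RAM.
    print(render_fwcfg(96, vdevs=4), end="")
```

The output would then be written to the fwcfg file for the radio, followed by a driver reload and a look at 'dmesg | grep iram' as described above.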
@greearb : as far as I understand, the actual limit on the number of peers is determined by ath10k-ct driver and firmware. Is there an easy way to tell how many peers my current device supports?
The main limit is RAM usage in the firmware, so if you decrease vdevs, then you can have more peers, etc. There is no straightforward way to calculate exact limits. If you have a specific target in mind and would like us to do the work to figure out the best configuration for you and test it, please contact me directly: greearb@candelatech.com
@greearb I tried what you suggested, but when I load the ath10k_pci kernel driver, the firmware crashes. This is what I have put into /lib/firmware/ath10k/fwcfg-ahb-a800000.wifi.txt
It is almost certainly crashing due to OOM in the firmware. You likely don't need 8 vdevs, so make that 4 perhaps? You can also decrease tx-descriptors. And if 64 works but 128 doesn't, then bisect to find what will load and run OK. You can grab a 'diet' version of the firmware from my page too; it compiles out a lot of unused cruft and saves RAM (grab the beta version if you do this).
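The "64 works but 128 doesn't, then bisect" step can be sketched in Python as a plain binary search. This assumes that if a given peer count loads, every smaller count loads too. The `works` callback is hypothetical: in practice it would mean writing the fwcfg file, reloading the driver, and checking by hand whether the firmware stays up.

```python
def largest_working(low, high, works):
    """Return the largest n in [low, high] for which works(n) is True.

    Assumes works(low) is True and that works() is monotonic: once a
    value fails, every larger value fails as well.
    """
    if not works(low):
        raise ValueError("the lower bound must be a known-good value")
    while low < high:
        # Round up so the loop always makes progress.
        mid = (low + high + 1) // 2
        if works(mid):
            low = mid
        else:
            high = mid - 1
    return low


if __name__ == "__main__":
    # Stand-in for real testing: pretend the firmware runs out of RAM
    # above 100 peers. The threshold is invented for the demo.
    def simulated(peers):
        return peers <= 100

    print(largest_working(64, 128, simulated))
```

With 64 and 128 as the bounds this needs at most six load attempts, which matters when every attempt is a driver reload.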
After upgrading to 19.07, half of my devices stopped working.
Despite the DHCP limit of 100, the driver now reports:
Thu Jan 30 23:14:33 2020 kern.warn kernel: [ 77.454824] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 kern.warn kernel: [ 77.456237] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA dc:4f:22:97:2d:a5 IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA 68:c6:3a:99:3b:18 IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 kern.warn kernel: [ 77.468893] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 kern.warn kernel: [ 77.472669] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA 60:01:94:70:1f:6f IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA dc:4f:22:db:d8:60 IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 kern.warn kernel: [ 77.495499] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA dc:4f:22:de:4e:4f IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 kern.warn kernel: [ 77.523805] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA 5c:cf:7f:a3:c3:e5 IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 kern.warn kernel: [ 77.526317] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA 84:f3:eb:c9:6e:06 IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 kern.warn kernel: [ 77.555084] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA 80:7d:3a:69:25:5d IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 kern.warn kernel: [ 77.630285] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA 84:f3:eb:c9:6e:f1 IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 kern.warn kernel: [ 77.644110] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA 2c:f4:32:17:88:e4 IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 kern.warn kernel: [ 77.700225] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
over and over. The previous version didn't have this issue. I'm using a R7800.
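For anyone wanting to see how many distinct stations are being turned away, a small Python sketch like the following could summarize a saved log. The two patterns are based only on the lines pasted above; `summarize` is a hypothetical helper, and other driver or hostapd versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns derived from the log excerpt in this post only.
REFUSED = re.compile(
    r"refusing to associate station: too many connected already \((\d+)\)"
)
REJECTED_STA = re.compile(
    r"hostapd: (\S+): STA ([0-9a-f:]{17}) IEEE 802\.11: "
    r"Could not add STA to kernel driver"
)


def summarize(lines):
    """Return (Counter of reported limits, set of rejected station MACs)."""
    limits = Counter()
    rejected = set()
    for line in lines:
        match = REFUSED.search(line)
        if match:
            limits[int(match.group(1))] += 1
            continue
        match = REJECTED_STA.search(line)
        if match:
            rejected.add(match.group(2))
    return limits, rejected


if __name__ == "__main__":
    sample = [
        "Thu Jan 30 23:14:33 2020 kern.warn kernel: [ 77.454824] ath10k_pci 0001:01:00.0: "
        "refusing to associate station: too many connected already (32)",
        "Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA dc:4f:22:97:2d:a5 "
        "IEEE 802.11: Could not add STA to kernel driver",
    ]
    limits, rejected = summarize(sample)
    print(dict(limits), sorted(rejected))
```

Feeding it the output of logread gives the limit the driver is enforcing and the list of MAC addresses that could not associate.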
This seems like a big step backwards to me.
EDIT: Fixed it by "myself".
The key is using another driver, probably the same one that was used by the previous version (18.XX). I was about to downgrade, but then found out that we can simply swap out drivers, hooray (I guess!?)
Waited for the command to complete without errors, then "reboot".
Done, all devices now connect again. The limit of 32 is gone. "Candela Technologies (ct)" should remove that limit, since only their driver has it; the "generic"/default one doesn't. Even if removing it comes with its own issues, maybe make one driver with the limit and one without. Make users' lives easier by letting them choose between them!
Change both the firmware and the kmod when switching to the mainstream ath10k driver.
Realize that the commercial incentive for @greearb CT development is targeting big systems where it is possible to optimize speed at the expense of RAM. It's great that Ben is making his work available to OpenWrt even though it may not be a perfectly optimized drop-in solution for our use case.
Look for the text: "9984 firmware, full htt-mgt build, configured as AP supporting 98 connected stations."
for example that supports 98 on the ath10k-ct default OpenWRT firmware.
It's 2020. With a few smart bulbs, phone, laptop, PC, and smart speakers, that's 30 alone. Then add in a wifi printer, vacuum, any other household smart lights, smart speakers, smart doorbells, smart thermostats, smart plugs, smart switches, computers, laptops, tablets, phones, eReaders, smart TVs, smart vehicles in the driveway, etc. and you're at 64 for a semi-regular big family when everybody is home. Have a bunch of guests over for Christmas or something and a single big "smart" house could go over 100!
Needless to say that they weren't "happy" with the 32 limit at all.
Together with all phones, notebooks in the house, Fire TV Sticks, Tablets, Gateways and other things I'd say that I am probably reaching 80 or even a little bit more.
Nice, thanks! 98 is better. I've also seen 173 there. What exactly is the difference, and what are the downsides? It mentions "trimmed", but what effect does this have in practice?
Also, why 2 less than specified inside "peers" and "active peers"?
Jeez…! I struggled for months to get WPA3 working on my Omnia (with two QCA9880 cards) because the firmware would crap out with 802.11w enabled, after a couple of days. I eventually tried ath10k-ct with the HTT firmware and it's rock-solid (thanks a bunch, @greearb).
And now I wonder if I could return to the non-HTT firmware by tweaking this a bit…
I'm quite lost with drivers and stuff. I have some knowledge of Linux, but when it comes to specific things, especially networking, I get lost pretty quickly.
What exactly is the difference between them? HTT, non-HTT, full htt-mgt build, trimmed htt-mgt build...?
greearb:
Look for the text: "9984 firmware, full htt-mgt build, configured as AP supporting 98 connected stations."
for example that supports 98 on the ath10k-ct default OpenWRT firmware.
Nice, thanks! 98 is better. I've also seen 173 there. What exactly is the difference, and what are the downsides? It mentions "trimmed", but what effect does this have in practice?
Also, why 2 less than specified inside "peers" and "active peers"?
Because internal firmware magic.
The trimmed build compiles out a bunch of cruft that I do not need for building test systems and/or stuff that is not supported in the ath10k driver anyway. This gives the firmware a lot more resources, so it can have more peers.
I'll fix up my scripts to make that available for OpenWrt when I get a chance.
Simplified, they are "touch sensitive relays with WiFi" that are run by an ESP chip, onto which you can flash custom firmware. I used ESPHome, configured it to do all sorts of nice stuff (short press, long press, etc.) and connected it wirelessly to "Home Assistant", which is an Open Source Home Automation Platform. This also allows me to turn all kinds of devices on/off with my phone or PC. It's really nice.
But since the house is quite big, counting all rooms, closets, restrooms, etc., the number of devices racks up quickly. There are Sonoff Basics which turn the water on/off for the garden, the pool and a lot of other stuff, ESP8266 devices which measure temperature and water-tank level, etc.
I installed your htt-mgt firmware some time ago (thank you for it!) but only now found the info about an incompatibility. How can I tell whether there is a compatibility issue? Would it just not work, or is it more subtle?
I am using hnyman's build, which comes with: kmod-ath10k-ct 4.19.98+2019-09-09-5e8cd86f-1