Ath10k-ct driver supports only 32 devices

Hi there,

I have posted a question to the OpenWRT mailing list:
http://lists.infradead.org/pipermail/openwrt-devel/2019-July/017937.html

In short, we have problems with the ath10k driver on 5 GHz when more than approximately 60 devices are connected to an AP based on the IPQ4019 SoC.
We tried the ath10k-ct driver and firmware. We did not see any problems there, but we noticed that this driver only allows up to 32 devices to connect.

Any reason for this?
Can this be increased?

BR,
Matej

Maybe @greearb knows

Hello,

You need to use the fwcfg logic to increase the number of peers. Keep active_peers == peers. Possibly decrease vdevs, since you probably don't need 16 and that will save some RAM. Check 'iram' and the other firmware RAM usage figures with 'dmesg | grep iram' as you tune things, so that you know when you are close to the limits.

https://www.candelatech.com/ath10k-10.4.php#config
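
As a rough illustration of the advice above, the fwcfg file could be generated by a small shell helper like the one below. This is a sketch, not part of ath10k-ct: the function name `write_fwcfg` is hypothetical, the file naming (`fwcfg-<bus>-<device>.txt` in the ath10k firmware directory) is assumed from the file names the driver requests in its kernel log, and the numbers in the usage comment are placeholders to tune for your hardware.

```shell
#!/bin/sh
# Sketch: write a fwcfg file for one radio (hypothetical helper).
# Usage: write_fwcfg DIR BUS DEVNAME VDEVS PEERS STATIONS
write_fwcfg() {
    dir=$1 bus=$2 dev=$3 vdevs=$4 peers=$5 stations=$6
    file="$dir/fwcfg-$bus-$dev.txt"
    {
        echo "vdevs = $vdevs"
        echo "peers = $peers"
        # Keep active_peers equal to peers, as advised above.
        echo "active_peers = $peers"
        echo "stations = $stations"
    } > "$file"
    # Print the path that was written so the caller can inspect it.
    echo "$file"
}

# Example on the router (placeholder values, tune for your device):
#   write_fwcfg /lib/firmware/ath10k ahb a800000.wifi 8 64 64
# Then reload the driver and check memory with: dmesg | grep iram
```

Each radio reads its own file, so a dual-radio device needs one file per device name.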

@greearb : as far as I understand, the actual limit on the number of peers is determined by ath10k-ct driver and firmware. Is there an easy way to tell how many peers my current device supports?

The main limit is RAM usage in the firmware, so if you decrease vdevs, you can have more peers, and so on. There is no straightforward way to calculate exact limits. If you have a specific target in mind and would like us to do the work to figure out the best configuration for you and test it, please contact me directly: greearb@candelatech.com
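
To watch those RAM figures while tuning, a small filter along these lines may help. It is only a sketch: `fw_mem` is a hypothetical helper name, and it assumes the firmware reports memory in a `wmi print 'free: N iram: N sram: N'` line in the kernel log, which is the format the ath10k-ct firmware uses in the log posted in this thread. Other firmware builds may print something different.

```shell
#!/bin/sh
# Sketch: extract firmware memory counters from kernel log text on stdin.
# Prints one "free=... iram=... sram=..." line per matching log line.
fw_mem() {
    sed -n "s/.*wmi print 'free: \([0-9]*\) iram: \([0-9]*\) sram: \([0-9]*\)'.*/free=\1 iram=\2 sram=\3/p"
}

# Example on the router:
#   dmesg | fw_mem
```

Comparing the output before and after each fwcfg change shows how much headroom a given vdevs/peers combination leaves.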

@greearb, thank you for the explanation.
I'll try what you suggested and will let you know how it worked.

Otherwise, I'll contact you by email and we can discuss how you can help us.

Thanks and BR,
Matej

@greearb I tried what you suggested, but when I load the ath10k_pci kernel driver, I get a firmware crash. This is what I put into /lib/firmware/ath10k/fwcfg-ahb-a800000.wifi.txt:

# cat /lib/firmware/ath10k/fwcfg-ahb-a800000.wifi.txt
vdevs = 8
peers = 128
active_peers = 128
stations = 128

And when I load the module, I get:

root@OpenWrt:/# modprobe ath10k_pci
[ 3018.479993] ath10k_ahb a000000.wifi: Direct firmware load for ath10k/fwcfg-ahb-a000000.wifi.txt failed with error -2
[ 3018.480335] ath10k_ahb a000000.wifi: Falling back to user helper
[ 3018.539443] firmware ath10k!fwcfg-ahb-a000000.wifi.txt: firmware_loading_store: map pages failed
[ 3018.540070] ath10k_ahb a000000.wifi: Direct firmware load for ath10k/QCA4019/hw1.0/ct-firmware-5.bin failed with error -2
[ 3018.547442] ath10k_ahb a000000.wifi: Falling back to user helper
[ 3018.628657] firmware ath10k!QCA4019!hw1.0!ct-firmware-5.bin: firmware_loading_store: map pages failed
[ 3018.629424] ath10k_ahb a000000.wifi: Direct firmware load for ath10k/QCA4019/hw1.0/ct-firmware-2.bin failed with error -2
[ 3018.637013] ath10k_ahb a000000.wifi: Falling back to user helper
[ 3018.704125] firmware ath10k!QCA4019!hw1.0!ct-firmware-2.bin: firmware_loading_store: map pages failed
[ 3018.704959] ath10k_ahb a000000.wifi: Direct firmware load for ath10k/QCA4019/hw1.0/firmware-6.bin failed with error -2
[ 3018.712447] ath10k_ahb a000000.wifi: Falling back to user helper
[ 3018.784404] firmware ath10k!QCA4019!hw1.0!firmware-6.bin: firmware_loading_store: map pages failed
[ 3018.787239] ath10k_ahb a000000.wifi: qca4019 hw1.0 target 0x01000000 chip_id 0x003b00ff sub 0000:0000
[ 3018.792373] ath10k_ahb a000000.wifi: kconfig debug 1 debugfs 1 tracing 0 dfs 1 testmode 0
[ 3018.806846] ath10k_ahb a000000.wifi: firmware ver 10.4b-ct-4019-fW-012-e8020273 api 5 features mfp,peer-flow-ctrl,txstatus-noack,wmi-10.x-CT,ratemask-CT,regdump-CT,txrate-CT,flush-all-d
[ 3018.860732] ath10k_ahb a000000.wifi: board_file api 2 bmi_id 0:20 crc32 bcebe54c
[ 3020.170720] ath10k_ahb a000000.wifi: 10.4 wmi init: vdevs: 16  peers: 48  tid: 96
[ 3020.170772] ath10k_ahb a000000.wifi: msdu-desc: 2500  skid: 32
[ 3020.221972] ath10k_ahb a000000.wifi: wmi print 'P 48/48 V 16 K 144 PH 176 T 186  msdu-desc: 2500  sw-crypt: 0 ct-sta: 0'
[ 3020.222380] ath10k_ahb a000000.wifi: wmi print 'free: 56576 iram: 23480 sram: 35968'
[ 3020.395706] ath10k_ahb a000000.wifi: htt-ver 2.2 wmi-op 6 htt-op 4 cal pre-cal-file max-sta 32 raw 0 hwcrypto 1
[ 3020.652698] ath10k_ahb a800000.wifi: fwcfg key: vdevs  val: 8
[ 3020.652744] ath10k_ahb a800000.wifi: fwcfg key: peers  val: 128
[ 3020.657471] ath10k_ahb a800000.wifi: fwcfg key: active_peers  val: 128
[ 3020.663268] ath10k_ahb a800000.wifi: fwcfg key: stations  val: 128
[ 3020.670005] ath10k_ahb a800000.wifi: Direct firmware load for ath10k/QCA4019/hw1.0/ct-firmware-5.bin failed with error -2
[ 3020.676074] ath10k_ahb a800000.wifi: Falling back to user helper
[ 3020.888815] firmware ath10k!QCA4019!hw1.0!ct-firmware-5.bin: firmware_loading_store: map pages failed
[ 3020.889590] ath10k_ahb a800000.wifi: Direct firmware load for ath10k/QCA4019/hw1.0/ct-firmware-2.bin failed with error -2
[ 3020.897141] ath10k_ahb a800000.wifi: Falling back to user helper
[ 3020.973741] firmware ath10k!QCA4019!hw1.0!ct-firmware-2.bin: firmware_loading_store: map pages failed
[ 3020.975021] ath10k_ahb a800000.wifi: Direct firmware load for ath10k/QCA4019/hw1.0/firmware-6.bin failed with error -2
[ 3020.982095] ath10k_ahb a800000.wifi: Falling back to user helper
[ 3021.051164] firmware ath10k!QCA4019!hw1.0!firmware-6.bin: firmware_loading_store: map pages failed
[ 3021.051664] ath10k_ahb a800000.wifi: qca4019 hw1.0 target 0x01000000 chip_id 0x003b00ff sub 0000:0000
[ 3021.059129] ath10k_ahb a800000.wifi: kconfig debug 1 debugfs 1 tracing 0 dfs 1 testmode 0
[ 3021.072188] ath10k_ahb a800000.wifi: firmware ver 10.4b-ct-4019-fW-012-e8020273 api 5 features mfp,peer-flow-ctrl,txstatus-noack,wmi-10.x-CT,ratemask-CT,regdump-CT,txrate-CT,flush-all-d
[ 3021.127350] ath10k_ahb a800000.wifi: board_file api 2 bmi_id 0:21 crc32 bcebe54c
[ 3022.440354] ath10k_ahb a800000.wifi: 10.4 wmi init: vdevs: 8  peers: 128  tid: 96
[ 3022.440465] ath10k_ahb a800000.wifi: msdu-desc: 2500  skid: 32
[ 3022.446994] ath10k_ahb a800000.wifi: firmware crashed! (guid n/a)
[ 3022.452636] ath10k_ahb a800000.wifi: qca4019 hw1.0 target 0x01000000 chip_id 0x003b00ff sub 0000:0000
[ 3022.458733] ath10k_ahb a800000.wifi: kconfig debug 1 debugfs 1 tracing 0 dfs 1 testmode 0
[ 3022.471382] ath10k_ahb a800000.wifi: firmware ver 10.4b-ct-4019-fW-012-e8020273 api 5 features mfp,peer-flow-ctrl,txstatus-noack,wmi-10.x-CT,ratemask-CT,regdump-CT,txrate-CT,flush-all-d
[ 3022.485300] ath10k_ahb a800000.wifi: board_file api 2 bmi_id 0:21 crc32 bcebe54c
[ 3022.504769] ath10k_ahb a800000.wifi: htt-ver 0.0 wmi-op 6 htt-op 4 cal pre-cal-file max-sta 128 raw 0 hwcrypto 1
[ 3022.513165] ath10k_ahb a800000.wifi: firmware register dump:
[ 3022.522284] ath10k_ahb a800000.wifi: [00]: 0x0000000B 0x000015B3 0x0A007089 0x00975B31
[ 3022.527909] ath10k_ahb a800000.wifi: [04]: 0x0A007089 0x00060530 0x0000001D 0x00000000
[ 3022.535650] ath10k_ahb a800000.wifi: [08]: 0x004137F0 0x0041DC64 0x0040564C 0x0041DB3C
[ 3022.543548] ath10k_ahb a800000.wifi: [12]: 0x00000009 0x00000000 0x009C2758 0x009C275C
[ 3022.551460] ath10k_ahb a800000.wifi: [16]: 0x00971238 0x009C1A4A 0x00000000 0x00000000
[ 3022.559335] ath10k_ahb a800000.wifi: [20]: 0x409C82E1 0x0040552C 0x00400000 0x00000000
[ 3022.567249] ath10k_ahb a800000.wifi: [24]: 0x80984C14 0x0040558C 0x004188B8 0xC09C82E1
[ 3022.575147] ath10k_ahb a800000.wifi: [28]: 0x809C892E 0x004055FC 0x00413808 0x0040563C
[ 3022.583042] ath10k_ahb a800000.wifi: [32]: 0x809805CC 0x0040562C 0x00405FE8 0x00400000
[ 3022.590944] ath10k_ahb a800000.wifi: [36]: 0x809C8AA3 0x0040571C 0x00405FE8 0x0041105C
[ 3022.598830] ath10k_ahb a800000.wifi: [40]: 0x8098011C 0x0040574C 0x00405FE8 0x00000007
[ 3022.606742] ath10k_ahb a800000.wifi: [44]: 0x809C8DDA 0x0040577C 0x00405FE8 0x00000007
[ 3022.614640] ath10k_ahb a800000.wifi: [48]: 0x8098072E 0x004057AC 0x00405FE8 0x00000000
[ 3022.622541] ath10k_ahb a800000.wifi: [52]: 0x809C9098 0x004057CC 0x00000000 0x0000A000
[ 3022.630439] ath10k_ahb a800000.wifi: [56]: 0x809B447A 0x004057FC 0x000E5ED8 0x00466A80
[ 3022.638329] ath10k_ahb a800000.wifi: Copy Engine register dump:
[ 3022.646250] ath10k_ahb a800000.wifi: [00]: 0x0004a000  14  14   3   3
[ 3022.652067] ath10k_ahb a800000.wifi: [01]: 0x0004a400   4   4  14  15
[ 3022.658643] ath10k_ahb a800000.wifi: [02]: 0x0004a800   1   1   0   1
[ 3022.665085] ath10k_ahb a800000.wifi: [03]: 0x0004ac00   2   2   3   2
[ 3022.671499] ath10k_ahb a800000.wifi: [04]: 0x0004b000   0   0  40   0
[ 3022.677910] ath10k_ahb a800000.wifi: [05]: 0x0004b400   0   0   0   0
[ 3022.684427] ath10k_ahb a800000.wifi: [06]: 0x0004b800   0   0   0   0
[ 3022.690812] ath10k_ahb a800000.wifi: [07]: 0x0004bc00   1   1   1   1
[ 3022.697202] ath10k_ahb a800000.wifi: [08]: 0x0004c000   0   0 127   0
[ 3022.703656] ath10k_ahb a800000.wifi: [09]: 0x0004c400   0   0   0   0
[ 3022.710028] ath10k_ahb a800000.wifi: [10]: 0x0004c800   0   0   0   0
[ 3022.716464] ath10k_ahb a800000.wifi: [11]: 0x0004cc00   0   0   0   0
[ 3022.723919] ath10k_ahb a800000.wifi: debug log header, dbuf: 0x418810  dropped: 0
[ 3022.730313] ath10k_ahb a800000.wifi: [0] next: 0x418828 buf: 0x414c20 sz: 1500 len: 52 count: 2 free: 0
[ 3022.737794] ath10k_ahb a800000.wifi: ath10k_pci ATH10K_DBG_BUFFER:
[ 3022.745975] ath10k: [0000]: 00000745 13FC0007 60002070 00000008 00000080 004E0001 00000745 17FC0001
[ 3022.752229] ath10k: [0008]: 0A007089 000015B3 000015B3 0040541C 91104569
[ 3022.761162] ath10k_ahb a800000.wifi: ATH10K_END
[ 3022.769112] ath10k_ahb a800000.wifi: [1] next: 0x418810 buf: 0x415210 sz: 1500 len: 0 count: 0 free: 0
[ 3027.470853] ath10k_ahb a800000.wifi: wmi unified ready event not received
[ 3027.491512] ath10k_ahb a800000.wifi: could not init core (-110)
[ 3027.491767] ath10k_ahb a800000.wifi: could not probe fw (-110)
[ 3027.541091] ath10k_ahb a800000.wifi: cannot restart a device that hasn't been started

If I decrease the numbers in fwcfg to 64 (all of them), it loads and initializes successfully.

If you need any more info let me know.
Also, should I move this thread to the ath10k-ct GitHub issues page?

Thanks,
Matej

It is almost certainly crashing due to OOM in the firmware. You likely don't need 8 vdevs, so perhaps make that 4? You can also decrease tx-descriptors. And if 64 works but 128 doesn't, bisect to find what will load and run OK. You can also grab a 'diet' version of the firmware from my page; it compiles out a lot of unused cruft and saves RAM (grab the beta version if you do this).
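
The bisection suggested here can be sketched as a generic shell loop. This is an illustration only: `bisect_max` is a hypothetical helper, and it assumes that success is monotonic (every value at or below the real limit loads, every value above it crashes) and that the low end of the range is already known to work, such as the 64 reported above.

```shell
#!/bin/sh
# Sketch: find the largest value in [LO, HI] for which CMD succeeds.
# Usage: bisect_max LO HI CMD   (CMD is called with the candidate value)
# Note: uses the global variables lo, hi, mid and cmd; CMD must not change them.
bisect_max() {
    lo=$1 hi=$2 cmd=$3
    while [ "$lo" -lt "$hi" ]; do
        # Round up so the loop always makes progress.
        mid=$(( (lo + hi + 1) / 2 ))
        if "$cmd" "$mid"; then
            lo=$mid            # mid works: the limit is mid or higher
        else
            hi=$(( mid - 1 ))  # mid fails: the limit is below mid
        fi
    done
    echo "$lo"
}

# Example: bisect_max 64 128 try_peers
# where try_peers is a function you write yourself that sets peers to "$1"
# in the fwcfg file, reloads the driver, and returns 0 only if the firmware
# comes up without crashing.
```

Between 64 and 128 this needs about six driver reloads instead of trying every value in turn.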

After upgrading to 19.07, half of my devices stopped working.

Despite the DHCP limit of 100, the driver now reports:

Thu Jan 30 23:14:33 2020 kern.warn kernel: [   77.454824] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 kern.warn kernel: [   77.456237] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA dc:4f:22:97:2d:a5 IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA 68:c6:3a:99:3b:18 IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 kern.warn kernel: [   77.468893] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 kern.warn kernel: [   77.472669] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA 60:01:94:70:1f:6f IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA dc:4f:22:db:d8:60 IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 kern.warn kernel: [   77.495499] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA dc:4f:22:de:4e:4f IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 kern.warn kernel: [   77.523805] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA 5c:cf:7f:a3:c3:e5 IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 kern.warn kernel: [   77.526317] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA 84:f3:eb:c9:6e:06 IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 kern.warn kernel: [   77.555084] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA 80:7d:3a:69:25:5d IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 kern.warn kernel: [   77.630285] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA 84:f3:eb:c9:6e:f1 IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 kern.warn kernel: [   77.644110] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)
Thu Jan 30 23:14:33 2020 daemon.notice hostapd: wlan1: STA 2c:f4:32:17:88:e4 IEEE 802.11: Could not add STA to kernel driver
Thu Jan 30 23:14:33 2020 kern.warn kernel: [   77.700225] ath10k_pci 0001:01:00.0: refusing to associate station: too many connected already (32)

over and over. The previous version didn't have this issue. I'm using an R7800.

This seems like a big step backwards to me.


EDIT: Fixed it by "myself".

The key is using another driver, probably the same one the previous version (18.XX) used. I was about to downgrade, but then found out that we can simply swap out drivers, hooray (I guess!?)

For anyone wondering, I logged in over SSH and typed:

root@OpenWrt:~# opkg remove ath10k-firmware-qca9984-ct && opkg install ath10k-firmware-qca9984

Waited for the command to complete without errors, then "reboot".

Done, all devices now connect again. The limit of 32 is gone. "Candela Technologies (ct)" should remove that limit, since only their driver has it; the "generic"/default one doesn't. Even if that comes with its own issues, maybe make one driver with and one without that limit. Make users' lives easier by letting them choose between them!

What did it cost? Everything.

Change both the firmware and the kmod when switching to the mainstream ath10k driver.
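
A minimal sketch of what swapping both packages might look like is shown below. The helper name `mainline_swap_cmds` is hypothetical, and the package names are assumptions based on OpenWrt's usual naming (`kmod-ath10k-ct` / `kmod-ath10k` and `ath10k-firmware-<chip>-ct` / `ath10k-firmware-<chip>`); check `opkg list-installed | grep ath10k` on your device first. The function only prints the commands so they can be reviewed before anything is removed.

```shell
#!/bin/sh
# Sketch: print the opkg commands for moving from the -ct driver and firmware
# to the mainline ath10k driver and firmware for a given chip.
# Usage: mainline_swap_cmds CHIP   (e.g. qca9984 for the R7800)
mainline_swap_cmds() {
    chip=$1
    echo "opkg remove kmod-ath10k-ct ath10k-firmware-$chip-ct"
    echo "opkg install kmod-ath10k ath10k-firmware-$chip"
}

# Review the output, then run it on the router, for example:
#   opkg update
#   mainline_swap_cmds qca9984 | sh
#   reboot
```

Doing this over a wired connection is safer, since removing the driver takes the Wi-Fi down until the replacement is installed.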

Realize that the commercial incentive behind @greearb's CT development is targeting big systems, where it is possible to optimize speed at the expense of RAM. It's great that Ben is making his work available to OpenWrt even though it may not be a perfectly optimized drop-in solution for our use case.

I was under the impression that the upstream driver had the same 32-station limit,
so I have not yet made any attempt to bump up the default number.

It is easily enough done with the fwcfg option:

http://www.candelatech.com/ath10k-10.4.php#config

Look for the text: "9984 firmware, full htt-mgt build, configured as AP supporting 98 connected stations."
That example, for instance, supports 98 stations on the default ath10k-ct OpenWRT firmware.

Thanks,
Ben

It's 2020. With a few smart bulbs, a phone, a laptop, a PC, and smart speakers, that's 30 alone. Then add in a Wi-Fi printer, a vacuum, any other household smart lights, smart speakers, smart doorbells, smart thermostats, smart plugs, smart switches, computers, laptops, tablets, phones, eReaders, smart TVs, smart vehicles in the driveway, etc., and you're at 64 for a semi-regular big family when everybody is home. Have a bunch of guests over for Christmas or something and a single big "smart" house could go over 100!

I, for example, have over 50 Sonoff/ESP devices.

Needless to say that they weren't "happy" with the 32 limit at all.

Together with all phones, notebooks in the house, Fire TV Sticks, Tablets, Gateways and other things I'd say that I am probably reaching 80 or even a little bit more.


Nice, thanks! 98 is better. I've also seen 173 there. What exactly are the differences or downsides? It mentions "trimmed", but what effect does this have in practice?

Also, why 2 fewer than specified in "peers" and "active peers"?


How so?

Jeez…! I struggled for months to get WPA3 working on my Omnia (with two QCA9880 cards) because the firmware would crap out with 802.11w enabled, after a couple of days. I eventually tried ath10k-ct with the HTT firmware and it's rock-solid (thanks a bunch, @greearb).
And now I wonder if I could return to the non-HTT firmware by tweaking this a bit…

I'm quite lost with drivers and stuff. I have some knowledge of Linux, but when it comes to specific things, especially networking, I get lost pretty quickly.

What exactly is the difference between them? HTT, non-HTT, full htt-mgt build, trimmed htt-mgt build...?

Thanks in advance!

I'm just curious: What are you doing with over 50 Sonoff/ESP devices?

Nice, thanks! 98 is better. I've also seen 173 there. What exactly are the differences or downsides? It mentions "trimmed", but what effect does this have in practice?

Also, why 2 fewer than specified in "peers" and "active peers"?

Because internal firmware magic.

The trimmed build compiles out a bunch of cruft that I do not need for building
test systems and/or stuff that is not supported in the ath10k driver anyway.

This gives a lot more resources for the firmware to have more peers.

I'll fix up my scripts to make that available for OpenWRT when I get
a chance.

Thanks,
Ben

Every room has a couple of Sonoff T1 devices.


Simplified, they are "touch-sensitive relays with WiFi" that are run by an ESP chip, onto which you can flash custom firmware. I used ESPHome, configured it to do all sorts of nice stuff (short press, long press, etc.) and connected it wirelessly to "Home Assistant", which is an open-source home automation platform. This also allows me to turn all kinds of devices on and off with my phone or PC. It's really nice.

But since the house is quite big, counting all rooms, closets, restrooms, etc., the number of devices racks up quickly. There are Sonoff Basics which turn the water on and off for the garden, the pool and a lot of other stuff, ESP8266 devices which measure temperature and water-tank level, and so on.

A quick peek, not yet 100% working, but still.


Anyways, I'm getting off topic, but basically that's it.


Haha, okay.

This basically means that for the normal/average user (me?) there isn't any benefit? If so, I'll switch to the trimmed version, once available.

Nice!

Thanks a lot!

I installed your htt-mgt firmware some time ago (thank you for it!) but only now found information about an incompatibility. How can I tell whether there is a compatibility issue? Would it just not work, or is it more subtle?
I am using hnyman build that comes with: kmod-ath10k-ct 4.19.98+2019-09-09-5e8cd86f-1

Please be specific. HTT-MGT should work great with ath10k-ct driver, but it will not work with stock driver.
Is that what you mean?