TLDR: https://github.com/yanko-yankulov/openwrt-mr80x/commit/ee7747f48999cae66e686d464dceaf838398309e works for me
Hi there, and thanks for all the work.
I bought two "Mercusys MR80X v2.20" ( ipq5018 256MiB + qcn6122 ) by mistake, and decided to have a go at figuring out what the actual issue is with the memory usage of the upstream ath11k driver versus the vendor one.
After some tracing, the main consumer turned out to be the skbs allocated for the RX ring(s). The driver, at least in my case, allocates three rings of 4096, 1024 and 4096 skbs of a little over 2 KiB each. Of those three rings only the first is an actual RX ring; I have no idea what the second is actually for, and the third is probably for "monitor" mode. So far I've only seen the first one get used and replenished.
It turned out that just changing the constants is enough to properly configure the rings, and this immediately resulted in a usable system. The patch above does just that. I also ran the vendor firmware with wifi_3_0.ko verbose logging and found only a single ring allocation, with a size of 512.
Currently, one of the routers is used as Wi-Fi "extender" - all the interfaces bridged together, no NAT or any other processing. Both wifi interfaces are enabled and there are a few clients connected:
```
root@OpenWrt:~# free
              total        used        free      shared  buff/cache   available
Mem:         184112       94260       41328         276       48524       40880
Swap:             0           0           0
```
The free memory will fluctuate between 30-something and 40-something MiB, but otherwise it stays pretty stable.
The system has the default LuCI collection installed and working. I've been doing some preliminary tests and haven't noticed any stability issues or memory decrease so far.
Performance-wise, there's also no significant change. Over the 5G wifi it will push ~600 Mbit/s of raw traffic ( `iperf .. -u -b 700M -l 1200` ) and 70-80 kpps with small packets. The bandwidth test was basically identical to the vendor firmware. The vendor's pps were of course way higher, but that is most likely due to the offloads. I didn't notice any significant difference with the driver running with a 4K RX ring size.
Now, my "test" setup is pretty small; e.g. I haven't tested performance with a larger number of clients attached. Still, this looks very promising, and it covers exactly what I bought the routers for.
I also added another patch (the next one in the repo) to track how empty the ring was on each replenish. Both the 512- and 4096-sized rings get completely empty from time to time under extreme load, so this didn't bring much valuable information, but at least the ring empties and nothing crashes.
Not sure where I am going with this patch. It is obviously not suitable for upstreaming in its current state, but it seemed best to share it here.
Cheers