I am not having the best of the results after a while. On my MT6000 after hours I get 'persistent Wireless's slowdowns.
Dumping the stats shows this:
cat /sys/kernel/debug/ieee80211/phy*/aql*
:
1
AC AQL pending
VO 0 us
VI 0 us
BE 4960 us. <---- this has become stuck, does not change after multiple dumps
BK 4 us
BC/MC 0 us
total 4964 us
24000
AC AQL limit low AQL limit high
VO 5000 5000
VI 5000 5000
BE 5000 5000
BK 5000 5000
BC/MC 50000
I have a 802.11s BATMAN + AP setup on SNAPSHOT, r26740.
I believe the tight AQL limits in my environment is exposing something.
I will revert to AQL defaults to monitor behaviour.
How come the TUF 6000 is performing significantly better? It’s not a secret that I’m a TUF 6000 fan-boy, but now I’m actually curious? Compare the equivalent graph for the MY-6000 and it’s a BIG difference in performance or am I missing something?
I think this needs to be revisited. I'm not at all trying to put down the TUF AX6000 or TUF AX4000, but the graph @Gingernut posted doesn't seem to me to indicate a significant performance increase over the MT6000.
If you look at some of the more recent graphs I posted in another MT6000 thread for @pesa1234's build, we have been working hard to tune some AQL settings and achieved some pretty respectable results:
The graph that @Gingernutposted here seems to indicate some significant latency struggles especially in the combined test (third test on the graph). Those latency spikes have largely been smoothed out with tuning in @pesa1234's build.
Thought #1 - I should read the OpenWrt forums more often.
Thought #2 - what a difference a clean channel (100?) makes!!, and also how hardware retries muck with things
Thought #3 - I wish crusaders graphs were as directly comparable as flent's. In particular I care that the AQM get tested and this test does not do that. (well, the slope of the crusader staggered start test sort of does). A packet capture can help too, with the RTT plot from Wireshark (tcptrace -G, and an xplot.org also)
But to me the lowered AQL values seem to indicate goodness across the board.
Regrettably I don't have the mental energy to understand all else that under test? If you care (as I do) that a gaming wifi station do better with multiple stations present, flent's rtt_fair test is good, and anything consistently less than 20ms is good.
Using 2500/5000 again now but on updated SNAPSHOT, r26912 for the MT6000. Looks better & stable now and nothing seems to be getting stuck after 16 hours.
Yea I've been keeping low the same and messing with larger high settings. Currently doing AQL of 2500/8000 for a little more peak throughput with nice results. Running the latest snapshot on 6.6.38 with some various tweaks.
You know @phinn, I have enjoyed this thread and it turns out one can go beyond just the AQL hi/lo limits.
I see in my environment cases where AQL tracking hit out of memory and stop for short bursts impacting wireless bufferfloat, the live stats can be dumped further with:
grep . /sys/kernel/debug/ieee80211/phy*/aqm
You will see fq_overmemory and fq_overlimit counters.
Given the beast this MT6000 is I have settled on a lil more complex override now such as this:
They are for the wireless AQM specifically, check this commit for more info about what it does.
:
This patch implements an Airtime-based Queue Limit (AQL) to make CoDel work
effectively with wireless drivers that utilized firmware/hardware
offloading. AQL allows each txq to release just enough packets to the lower
layer to form 1-2 large aggregations to keep hardware fully utilized and
retains the rest of the frames in mac80211 layer to be controlled by the
CoDel algorithm.
:
This is independent of the SQM you do on WAN, and it helps with wireless bufferfloat by tons.
Nice link added to that to the MT6000 doc page. It's interesting to see the original TXQ limits and threadhold are essentially unchanged since the initial commit. Of course they are sensible values for throughput so not knocking it, but we can clearly do a little better with latency.