That's great feedback--thanks! Any time you do extra testing and can drop the results here, it will help the cause.
2.4 GHz:
Before: https://www.waveform.com/tools/bufferbloat?test-id=9974d900-12c7-47db-a012-cfae7ab43e1f
After: https://www.waveform.com/tools/bufferbloat?test-id=5886f290-f90e-45c7-bcc2-270c28f149c0
So perhaps the proposed change is a bit drastic. 3000 3000 looks like a useful middle ground: https://www.waveform.com/tools/bufferbloat?test-id=5d1cf942-c01f-4ec6-bc8c-df16e0cf1031
5 GHz:
Before: https://www.waveform.com/tools/bufferbloat?test-id=91da24bb-1023-4ffa-9be8-ed5c7df9a75d
After: https://www.waveform.com/tools/bufferbloat?test-id=d0652b0e-52d4-4f6c-bc1f-42e9eb1d62b4
In other words, the 300 Mbps line is the limiting factor, so the test is not conclusive.
Thanks for the feedback! Interesting results, indeed!
I wish I could convince more people to try the patch I put in this post (based on suggestions from @dtaht):
I think you could reach a wider audience if you somehow converted this into a binary patch that changes the values used in the assignments, so that people running only a stock build could try them too. I can try to work on this if you want.
Absolutely! Anything you are willing to take on to make this easier to try would be fantastic!
@_FailSafe I think you might have switched the snapshot and 22.03 commands (phy* vs. wl*) in your first comment. Maybe you could edit it.
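Since the debugfs naming differs between builds (snapshot exposes phy*, 22.03 exposes wl*, per the mix-up mentioned above), a small helper that globs for both can sidestep the problem. A rough sketch, with the base directory parameterized so it can be exercised off-router too:

```shell
#!/bin/sh
# Find AQL debugfs entries under a base directory; snapshot builds name
# them phy*, 22.03 builds name them wl* (naming assumed from this thread).
find_aql_dirs() {
    base="${1:-/sys/kernel/debug/ieee80211}"
    for d in "$base"/phy* "$base"/wl*; do
        if [ -e "$d/aql_txq_limit" ]; then
            echo "$d"
        fi
    done
}
```

Then `for d in $(find_aql_dirs); do cat "$d/aql_txq_limit"; done` works the same on either build.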
I tried
cat /sys/kernel/debug/ieee80211/phy*/aql_txq_limit
for ac in 0 1 2 3; do echo $ac 1500 1500 > /sys/kernel/debug/ieee80211/phy1/aql_txq_limit; done
on my D-Link DAP-X1860 Wi-Fi repeater, running snapshot.
Topology: internet > Fritzbox 7490 (stock OEM) > Wi-Fi 802.11ac (80 MHz) > D-Link DAP-X1860 > Wi-Fi 802.11ax (80 MHz) > Windows client with Intel AX200.
My ISP provides 63 Mbps, which I shape to 58 Mbps via the Fritzbox 7490, but throughput is sometimes slower than that.
Before: 5000 / 12000 (default settings):
- https://www.waveform.com/tools/bufferbloat?test-id=b524da22-0656-4784-ac37-38737106e04b
- https://www.waveform.com/tools/bufferbloat?test-id=9a774c14-e57b-4154-b719-8554c0486b9e
After: 800 / 1000: https://www.waveform.com/tools/bufferbloat?test-id=c50c1c6d-9c9a-4323-99d6-8dcb4a363dbd
After 800 / 1500: https://www.waveform.com/tools/bufferbloat?test-id=ae4cce64-78df-4a86-9e52-a93a6f0e8624
After 1500 / 1500: https://www.waveform.com/tools/bufferbloat?test-id=aa29c0b2-4ef3-4f24-b223-aa69f38023e5
After 1500 / 3000: https://www.waveform.com/tools/bufferbloat?test-id=8e0c7743-2669-4f9d-9c57-5512a8755365
After 3000 / 8000: https://www.waveform.com/tools/bufferbloat?test-id=4f093f2b-2045-45f0-99d1-b55babaf5c69
After 3000 / 12000: https://www.waveform.com/tools/bufferbloat?test-id=7e7374c4-0d4f-4ba5-a14d-bcbe733df052
Don't be shocked by some high max-latency packets even at low tx queue length values. My Wi-Fi has to cross three walls in total, and I can't control the Fritzbox's stock OEM firmware. There is high variance in my tests: even with the same settings, multiple runs yield different results.
Edit: I have not tried the other patch, because I have yet to compile OpenWrt on my own. I am currently using images provided by the firmware selector.
Bumping this to see if anyone else has interest in testing and reporting results for the good of the community. Thanks!
I can run this test for you.
My setup is an RPi4 router with an RT3200 access point, connected through a UE300 adapter.
On which device do you want me to run the commands?
These would be for the RT3200. Just looking for some benchmarks prior to testing these changes, then after as well. Thanks!
Does it matter that my RT3200 is set up as a dumb access point? So no native SQM...
Not at all. This has nothing to do with SQM directly.
Looks better. But still no A+?
MacBook Air, 5 GHz, before the config change
Ran with your config (no reboot):
root@OpenWrt:~# cat /sys/kernel/debug/ieee80211/phy*/aql_txq_limit
AC AQL limit low AQL limit high
VO 5000 12000
VI 5000 12000
BE 5000 12000
BK 5000 12000
AC AQL limit low AQL limit high
VO 1500 1500
VI 1500 1500
BE 1500 1500
BK 1500 1500
root@OpenWrt:~#
I have noticed my MacBooks introduce large amounts of latency even directly on Ethernet. Is that normal?
Does AQL still work if WED (Wireless Ethernet Dispatch, downlink-only on this router) is used, or is it similar to hardware offload vs. SQM, where only one of the two can be used?
Adding my own 2 cents here - I'm seeing essentially no difference with the 1500 AQL settings vs default.
This is on the RT3200 with AX at 160 MHz, using WED + bridger on 23.05-RC3 in dumb AP mode. I was tweaking the settings from 5000 to 1500 down to 500, and I saw basically no difference in throughput or latency in my Waveform, flent, and crusader tests. I can provide graphs if people are interested, but they're rather boring, since there are no major differences between any of them.
I'm getting A+ Waveform tests running default: https://www.waveform.com/tools/bufferbloat?test-id=e8314684-0a52-498e-90a3-0ad9ff66bbba
root@OpenWrt:~# cat /sys/module/mt7915e/parameters/wed_enable
Y
root@OpenWrt:~# cat /sys/kernel/debug/ieee80211/wl*/aql_txq_limit
AC AQL limit low AQL limit high
VO 5000 12000
VI 5000 12000
BE 5000 12000
BK 5000 12000
AC AQL limit low AQL limit high
VO 5000 12000
VI 5000 12000
BE 5000 12000
BK 5000 12000
Now, it could be that these settings have an effect with WED disabled. I'd have to run another round of tests to confirm.
Sorry for the late response. That's a great question and I would love to get a definitive answer to this myself. I'm not sure who can best answer it. Perhaps @tohojo, @nbd, or @daniel?
This is the first time I'm hearing about WED, so don't really know anything about how it works in detail. But since it's a hardware flow offload thing, my guess would be that it does indeed bypass AQL...
@Anteus I was finally able to dedicate some time to testing this yesterday, and I concur with Toke. There is a stark difference in latency with WED on vs. WED off.
With WED off, there is an unquestionable (and reasonably predictable) latency-vs-bandwidth trade-off as aql_txq_limit is modified. However, with WED enabled, I am unable to see any real impact on latency or bandwidth regardless of the aql_txq_limit values.
Thank you for testing!
For anyone still following along and interested in this, I have been testing with the following instead of the 1500 I had previously been using for aql_txq_limit:
aql_txq_limit=2500
for ac in 0 1 2 3; do echo $ac $aql_txq_limit $aql_txq_limit > /sys/kernel/debug/ieee80211/wl0/aql_txq_limit; done
for ac in 0 1 2 3; do echo $ac $aql_txq_limit $aql_txq_limit > /sys/kernel/debug/ieee80211/wl1/aql_txq_limit; done
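The two loops above can be folded into one helper that applies the same limit to every access category on each radio passed in. A sketch (the wl0/wl1 paths are taken from the commands above; they may be phy0/phy1 on your build):

```shell
#!/bin/sh
# Write the same low/high AQL limit for all four access categories
# (0=VO, 1=VI, 2=BE, 3=BK) to each radio's aql_txq_limit debugfs file.
set_aql() {
    limit="$1"; shift
    for dev in "$@"; do
        for ac in 0 1 2 3; do
            echo "$ac $limit $limit" > "$dev/aql_txq_limit"
        done
    done
}
# e.g.: set_aql 2500 /sys/kernel/debug/ieee80211/wl0 /sys/kernel/debug/ieee80211/wl1
```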
Here's why I landed on that number, based on some testing I've been doing. My base Wi-Fi latency to my router is ~3 ms, and the figures below include that base latency, so latency increase under load = latency avg - 3 ms. For example, at a limit of 12000, the net latency increase under load is 14 ms.
Running iPerf3 for 60 seconds per test with my router as the sender and my MacBook (at 5 GHz) as the receiver:
| aql_txq_limit | latency avg (ms) | avg throughput (Mb/s) | retries |
|---|---|---|---|
| 12000 | 17 | 650 | 4 |
| 5000 | 13.4 | 652 | 120 |
| 3500 | 13.4 | 597 | 268 |
| 2500 | 9.9 | 650 | 316 |
| 2000 | 9.8 | 598 | 470 |
| 1500 | 8.8 | 565 | 557 |
| 500 | 6.4 | 419 | 1382 |