AQL and the ath10k is *lovely*

how do you use native fq_codel instead of sqm?

This is wifi driver level. Forget sqm/QoS you're just confusing yourself.

@dtaht is this different to airtime fairness? I don't think I've followed this one.

ATF regulates airtime between stations, but needs short queues to work properly. AQL keeps the firmware from overbuffering, and more important is the fq_codel algorithm on top of that, which keeps queues short and intermixes packets better. This was the last major piece of the make-wifi-fast solution before it ran out of time, volunteers and money.

When the Bufferbloat team first started using these tools, it took me a while to understand what these "RRUL Charts" were showing, exactly how bad the first plot was, and how it compares to the second (good) one. Let me give more detail about what's going on. (There's lots more detail about RRUL charts at bufferbloat.net.)

  • These charts plot two things: ping/latency between the endpoints, and the data rates of each of several data connections.
  • Each test starts with five seconds of idle, then 60 seconds of full-rate transfer for the connections, followed by another five seconds of idle, for a total test time of 70 seconds.

The first chart shows bad performance: latency like this would make the link basically unusable.

  • The ping times (green) start low for the first 5 seconds, then ramp up to over 4000 msec (4 seconds) within 20 seconds. They plummet again at 40 seconds, then spike up again. At 65 seconds, the data transfer stops, so by the end, ping latency is low again.
  • But worse than that... The plots of the individual data connections show wildly chaotic rates. After the initial five-second interval, most of the transfers can't continue, or are very slow. One connection, the blue plot, ultimately manages to use about half the 12 mbps link.
  • Anyone trying to use this network would think that "the internet was broken."

The second plot shows way better control, with way better performance, even when hammering the wi-fi link with multiple full-rate transfers.

  • Pings (green plot again) vary between 20 and 60 msec.
  • Each of the data connections hums along around 4 mbps. This "good sharing" means that each connection gets a fair share of the bandwidth.
  • Using this network would be a pleasure.

Can I get this just by building a snapshot? Or is there more code that I would have to patch in? Thanks. Shout out to dtaht, I have been following your work for years now. Some good stuff.

Use a snapshot, with either ath10k-ct 5.4 or vanilla ath10k. Of course, this currently only works on ath10k (and maybe mt76?).

Or...check if your WiFi driver has the following flag:

wiphy_ext_feature_set(ar->hw->wiphy, NL80211_EXT_FEATURE_AQL);
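If you have a kernel or backports source tree at hand, one rough way to check for that flag is to grep the driver directory for it. This is only a sketch: driver_sets_aql is a made-up helper name, and the directory you pass in is up to you. A match only tells you the driver advertises AQL, not that everything else is wired up correctly.

```shell
# Sketch: list source files under a driver directory that mention the AQL flag.
# driver_sets_aql is a hypothetical helper name, not part of any tool.
driver_sets_aql() {
    src_dir="$1"
    # -r: recurse into subdirectories, -l: print only the names of matching files
    grep -rl 'NL80211_EXT_FEATURE_AQL' "$src_dir"
}
```

For example, from the top of a kernel tree: driver_sets_aql drivers/net/wireless/ath/ath10k. No output (and a non-zero exit status) means the flag was not found there.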

The snapshots have this now. (or go ahead, build one. :slight_smile: )

ath9k and mt76 have had the fq_codel and ATF stuff for a long time. However we needed to invent "AQL" in order to make things work well with the ath10k. (the lwn link pointed to our first successful version but it went through a few iterations on the way to mainline).

It is my hope that this new API can be easily applied to other drivers such as the intel ax200 and older iwl chips, and perhaps even marvell and broadcom, one day. I'm spinning up a project to tackle the ax200 in particular...

It's a bit more complicated than just adding wiphy_ext_feature_set(ar->hw->wiphy, NL80211_EXT_FEATURE_AQL); to the driver, but...

But along the way we had all kinds of trouble with the ath10k firmware itself, which made it difficult to test. I hope we've got it right for all the ath10k hw now (the -ct firmware has come a long way, h/t to everyone that's worked on it), but there's so much hw that uses it, so many variants of that chipset out there, and so many routers today exhibiting the kind of broken behavior I pointed to above...

So far I've tested the ubnt uap mesh and mesh pro, and am going to do up an archer c7v2 as soon as I make a few other long-desired tweaks to the algorithm (lowered codel target on 5ghz, some other stuff that I tested long ago)... I'm setting up my old testbed... so I'd love it if more people checked "their stuff".

What about ath11k? It was branched off ath10k as far as i understand so in theory the same principles apply?

I sure as f**k hope the ath11k folk are paying attention. I have not reviewed their code. Keep hoping someone from there will send me a couple chips (and a contract)

Wifi 6 brings new challenges. Ideally an API for those has per-station scheduling in the firmware itself and exposes that API to Linux. It's been the only sane way to deal with DU I can think of. (Without having hardware to play with)

Entries in the hardware table that support 19.07.2 and use ath10k

May I ask how?

We do kind of need to document how we are going to go about adding it to more drivers!!! First up is an attempt on the ax200 chip.

@dtaht what does your /sys/kernel/debug/iee*/phy*/aqm look like?
I see:

root@turris:~# cat /sys/kernel/debug/ieee80211/phy0/aqm 
access name value
R fq_flows_cnt 4096
R fq_backlog 0
R fq_overlimit 0
R fq_overmemory 0
R fq_collisions 0
R fq_memory_usage 0
RW fq_memory_limit 16777216
RW fq_limit 8192

But I have a hunch that this OpenWrt 19.07.2 based turris omnia does not even have the AQL patches installed. How should this look?
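For comparing two routers it can help to pull single counters out of that file instead of eyeballing the whole table. A minimal sketch, assuming the three-column "access name value" layout shown above (aqm_value is a made-up helper name):

```shell
# Sketch: print the value of one named field from an aqm debugfs file.
# Assumes the "access name value" layout shown above; aqm_value is a
# hypothetical helper name.
aqm_value() {
    aqm_file="$1"
    field="$2"
    # Skip the header row, then match on the second column (the field name)
    awk -v name="$field" 'NR > 1 && $2 == name { print $3 }' "$aqm_file"
}
```

For example: aqm_value /sys/kernel/debug/ieee80211/phy0/aqm fq_backlog. As comes up later in the thread, this file existing does not by itself show that AQL is present.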

Yes, I can confirm that AQL patches are not included in OpenWrt 19.07. You might want to ask @nbd, who committed it to OpenWrt master, if he would consider backporting it to OpenWrt 19.07.

@nbd given the more variable timeframe for OpenWrt 20, is there a chance of getting the two AQL patches (for ath10k and ath10k-ct) into 19.07 somehow? These are truly nice improvements that Dave showed in the first post, so I am eager to get this into the hands of our users (myself included :wink: ) soonish.

It's really nice to see good things about OpenWrt after all the bashing it has got over that damn opkg bug!

Yay Dave! (and hardworking crew)

Been waiting for this for a long time. I hopefully will be one of your early C7V2 testers, once I screw up enough courage to try a bleeding edge snapshot on the single family AP, and have a window of time to make the switch...

So, with yesterday's trunk build (in my case on a UniFi AC Pro and an Archer C7 v2), is this patch included?

E.g., is the following enough to prove it's enabled?

find /sys/kernel/debug/iee*/phy* -name aqm

and

cat /sys/kernel/debug/ieee80211/phy0/aqm
access name value
R fq_flows_cnt 4096
R fq_backlog 0
R fq_overlimit 0
R fq_overmemory 0
R fq_collisions 1
R fq_memory_usage 0
RW fq_memory_limit 16777216
RW fq_limit 8192
RW fq_quantum 300

Does this implement a "kind of" Airtime Fairness (or memory buffering?) on both 2.4GHz and 5GHz?
And does the change itself come from the patch incorporated in e.g. kmod-ath10k, kmod-ath10k-ct and kmod-ath10k-smallbuffers, or from somewhere else (e.g. firmware-ath10k-firmware-any)?

Sorry if those are silly questions :wink:
Thank you.

Not AQM, it would be:

cat /sys/kernel/debug/ieee80211/phy0/netdev:wlan0/stations/*/aql

And only for ath10k for now.
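To check every radio and interface at once, the same path can be globbed. A minimal sketch, assuming debugfs is mounted at /sys/kernel/debug and that the per-station aql files live where the cat above points (list_aql_files is a made-up helper name):

```shell
# Sketch: list every per-station aql file under the mac80211 debugfs tree.
# The debugfs mount point can be passed as $1 and defaults to /sys/kernel/debug.
# list_aql_files is a hypothetical helper name.
list_aql_files() {
    debugfs_root="${1:-/sys/kernel/debug}"
    for f in "$debugfs_root"/ieee80211/phy*/netdev:*/stations/*/aql; do
        # An unmatched glob stays literal, so only print paths that exist
        [ -e "$f" ] && echo "$f"
    done
    return 0
}
```

Running list_aql_files with at least one station associated and getting no output suggests the AQL patches are not in that build.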

Nitpick - '4.19 master' isn't a thing. Either 'master' or '4.19 branch', where the branch has split off FROM master and hence is (in theory) more stable/less active from the continuing development work on master.

There are fairly continuous builds based on master (wherever it happens to be), called snapshots that contain the latest (b)leading edge work BUT they're not even guaranteed to boot or be available for long.