AQL and the ath10k is *lovely*

I did not explicitly set tcp_bbr, but I do have the kmod-tcp-bbr package included in the image, and it looks like it is setting BBR as the default via sysctl: https://github.com/openwrt/openwrt/blob/openwrt-22.03/package/kernel/linux/files/sysctl-tcp-bbr.conf

What should net.ipv4.tcp_congestion_control value be?

net.ipv4.tcp_allowed_congestion_control = reno cubic bbr
net.ipv4.tcp_available_congestion_control = reno cubic bbr

There's also an iPhone SE (1st Gen), an iPhone 8 Plus and an Apple Watch (not sure which exact model). I don't think AWDL can even be disabled on those devices.

Openwrt switched to BBR????? As sort of a grand experiment in wifi, I figure that would trigger powersave, at least, and the lack of any reasonable response to packet loss... oy, vey...

net.ipv4.tcp_congestion_control=cubic

Not base OpenWrt by itself, but the sysctl config file included in kmod-tcp-bbr module package switches to BBR.
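A minimal sketch of how to work out which value ends up in effect, assuming sysctl-style `key = value` files are applied in order and a later assignment overrides an earlier one. `effective_value` is a hypothetical helper, and the two sample texts are made up for illustration; they are not the actual contents of the package file or of anyone's /etc/sysctl.conf.

```python
def effective_value(conf_texts, key):
    """Return the last value assigned to `key` across sysctl-style texts.

    Assumption: the texts are applied in the order given and a later
    assignment overrides an earlier one.
    """
    value = None
    for text in conf_texts:
        for line in text.splitlines():
            line = line.strip()
            # Skip blank lines and comment lines ('#' or ';').
            if not line or line[0] in "#;":
                continue
            name, sep, val = line.partition("=")
            if sep and name.strip() == key:
                value = val.strip()
    return value


# Illustrative inputs only: a package drop-in selecting bbr, followed by
# a local override like the one added to /etc/sysctl.conf in this thread.
package_conf = "net.ipv4.tcp_congestion_control=bbr\n"
local_conf = "# local override\nnet.ipv4.tcp_congestion_control = cubic\n"

print(effective_value([package_conf, local_conf],
                      "net.ipv4.tcp_congestion_control"))
```

On a live router the authoritative answer is still whatever sysctl net.ipv4.tcp_congestion_control reports.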

No way!!! That's not what I meant. I was just comparing with the previous version of the driver. It's more than clear that it works, Lynx. And this is under a lot of stress on a specific device, so there might be something else at play here.

Let me rephrase, @dtaht; clearly I made a poor choice of words. I'm blaming it on English being my third language and my lack of mastery of it. I'm in no way saying this is "meh" in general, just that it performs a little bit worse compared with previous iterations. I'm more than happy with the current results, honestly.

I thought the file names inside the shared folder were self-explanatory.

Local == Test Server is Router/AP
Remote == Test Server is mumbai.starlink.taht.net

Doing a before-and-after comparison plot is feasible now with the data you have (in flent-gui: Data -> "Add other open files"). How many ms of "meh" do we have on a CDF plot, for example? I am actually going to take this as a slogan - "22.03.1 - Much less meh!" "Less meh, more multiplexing!" :slight_smile:
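For anyone who would rather have numbers than a plot, here is a rough sketch of "ms of meh" as the difference between matching percentiles of two latency sample sets, which is what the horizontal gap between two CDF curves represents. The function names and sample values are invented for illustration; real samples would have to be exported from the flent data files first.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a list of latency samples (ms)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Nearest rank: smallest value with at least pct% of samples <= it.
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


def meh_delta(before, after, pcts=(50, 90, 99)):
    """Latency improvement (ms) at each percentile; positive = less meh."""
    return {p: percentile(before, p) - percentile(after, p) for p in pcts}


# Made-up latency samples in ms, NOT taken from the tests in this thread.
before = [10, 12, 15, 20, 40, 80, 120, 200, 30, 25]
after = [9, 11, 12, 14, 18, 22, 30, 45, 16, 13]

print(meh_delta(before, after))  # {50: 11, 90: 90, 99: 155}
```

The tail percentiles are usually the interesting ones here, since that is where bufferbloat shows up.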

But to me we have a few other problems... Ideally, getting a couple more APs running the current codebase would be good (could the up/down disparity be hardware specific?). Seeing that 9000s test not fade as it did...

I actually have a mikrotik hap ac lite still at my PO box (that I ordered last week)

btw, so far as I know, an RPM of 3500 over wifi is about 700 RPM better than what ubnt achieves.

Let me redo it. BTW, I think I've got a small RE450 v2 (3x3 MIMO) around, which I reckon is an ath10k-ct device; I might be able to repurpose it temporarily for testing. And we can measure how much less "meh" it has. :wink:

Ref: AQL and the ath10k is *lovely* - #737 by ka2107

Belkin RT3200 updated to (includes AQL multicast fix):

OpenWrt 22.03-SNAPSHOT r19541-ec9f82fa18 / LuCI openwrt-22.03 branch git-22.167.28394-8a4486a

/etc/sysctl.conf in Belkin RT3200

net.ipv4.tcp_congestion_control = cubic
## 5 GHz
% /System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport -I
     agrCtlRSSI: -73
     agrExtRSSI: 0
    agrCtlNoise: -101
    agrExtNoise: 0
          state: running
        op mode: station 
     lastTxRate: 39
        maxRate: 144
lastAssocStatus: 0
    802.11 auth: open
      link auth: wpa2-psk
          BSSID:
           SSID: XXXX_5GHz
            MCS: 4
  guardInterval: 800
            NSS: 1
        channel: 36
## 5 GHz - AirDrop and AWDL - ENABLED
% networkQuality -v
==== SUMMARY ====                                                                                         
Upload capacity: 8.420 Mbps
Download capacity: 5.024 Mbps
Upload flows: 20
Download flows: 20
Responsiveness: Low (57 RPM)
Base RTT: 42
Start: 14/Jul/2022, 05:55:57
End: 14/Jul/2022, 05:56:17
OS Version: Version 12.4 (Build 21F79)
## 5 GHz - AirDrop and AWDL - DISABLED
% networkQuality -v
==== SUMMARY ====                                                                                         
Upload capacity: 12.409 Mbps
Download capacity: 22.101 Mbps
Upload flows: 12
Download flows: 12
Responsiveness: Medium (212 RPM)
Base RTT: 41
Start: 14/Jul/2022, 05:57:04
End: 14/Jul/2022, 05:57:15
OS Version: Version 12.4 (Build 21F79)
## 2.4 GHz
% /System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport -I
     agrCtlRSSI: -68
     agrExtRSSI: 0
    agrCtlNoise: -92
    agrExtNoise: 0
          state: running
        op mode: station 
     lastTxRate: 145
        maxRate: 144
lastAssocStatus: 0
    802.11 auth: open
      link auth: wpa2-psk
          BSSID:
           SSID: XXXX_2.4GHz
            MCS: 15
  guardInterval: 800
            NSS: 2
        channel: 11
## 2.4 GHz - AirDrop and AWDL - ENABLED
% networkQuality -v                                                                           
==== SUMMARY ====                                                                                         
Upload capacity: 30.151 Mbps
Download capacity: 23.092 Mbps
Upload flows: 20
Download flows: 12
Responsiveness: Medium (533 RPM)
Base RTT: 42
Start: 14/Jul/2022, 06:00:36
End: 14/Jul/2022, 06:00:52
OS Version: Version 12.4 (Build 21F79)
## 2.4 GHz - AirDrop and AWDL - DISABLED
% networkQuality -v
==== SUMMARY ====                                                                                         
Upload capacity: 30.746 Mbps
Download capacity: 25.620 Mbps
Upload flows: 16
Download flows: 16
Responsiveness: Medium (604 RPM)
Base RTT: 41
Start: 14/Jul/2022, 06:01:47
End: 14/Jul/2022, 06:02:01
OS Version: Version 12.4 (Build 21F79)
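To put numbers on the AWDL effect, the SUMMARY blocks above can be parsed and compared. This is only a sketch: `parse_summary` is a hypothetical helper that assumes the `networkQuality -v` text layout shown in this post, and the two samples are the 5 GHz results copied from above.

```python
import re


def parse_summary(text):
    """Extract capacities (Mbps), RPM and base RTT from a SUMMARY block."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # e.g. the "==== SUMMARY ====" banner
        key, value = key.strip(), value.strip()
        if key in ("Upload capacity", "Download capacity"):
            fields[key] = float(value.split()[0])
        elif key == "Responsiveness":
            match = re.search(r"\((\d+) RPM\)", value)
            if match:
                fields["RPM"] = int(match.group(1))
        elif key == "Base RTT":
            fields[key] = int(value)
    return fields


awdl_enabled = parse_summary("""\
Upload capacity: 8.420 Mbps
Download capacity: 5.024 Mbps
Responsiveness: Low (57 RPM)
Base RTT: 42
""")
awdl_disabled = parse_summary("""\
Upload capacity: 12.409 Mbps
Download capacity: 22.101 Mbps
Responsiveness: Medium (212 RPM)
Base RTT: 41
""")

# Ratio of RPM with AWDL disabled vs enabled on 5 GHz.
print(round(awdl_disabled["RPM"] / awdl_enabled["RPM"], 1))  # 3.7
```

These are single runs, so the ratio is indicative rather than a measurement with error bars.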

@dtaht tcp_nup (4 streams) and tcp_ndown (4 streams) tests re-run

  • with AQL multicast fix
  • TCP congestion control algorithm set to CUBIC (not BBR) on the router
  • AirDrop and AWDL DISABLED

Still something majorly funky every 10s. Does sysctl -a | grep tcp_congestion show cubic?

To set it by hand:

sysctl -w net.ipv4.tcp_congestion_control=cubic

On the upload it is also periodic (no BBR there?), so this spike is coming from somewhere else. Can you do a packet capture of a single tcp_nup?
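One rough way to confirm the periodicity from exported latency samples, rather than by eyeballing the plot: flag samples well above the median as spikes and look at the spacing between spike onsets. This is a sketch with an arbitrary threshold (`factor` is an assumption, as is the helper name), and the series below is synthetic, not data from these tests.

```python
from statistics import median


def spike_period(times, values, factor=3.0):
    """Estimate the spacing (s) between latency spikes.

    A sample counts as a spike when it exceeds `factor` times the median;
    consecutive spike samples are merged into a single event.
    Returns None when fewer than two spikes are found.
    """
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    threshold = factor * median(values)
    onsets = []
    in_spike = False
    for t, v in zip(times, values):
        if v > threshold and not in_spike:
            onsets.append(t)  # first sample of a new spike
        in_spike = v > threshold
    if len(onsets) < 2:
        return None
    gaps = [b - a for a, b in zip(onsets, onsets[1:])]
    return median(gaps)


# Synthetic series: 20 ms baseline with a two-sample spike every 10 s.
times = list(range(40))
values = [200 if t % 10 in (5, 6) else 20 for t in times]

print(spike_period(times, values))  # 10
```

A stable gap on both upload and download would point at something periodic outside the TCP stack, such as a scan or a timer.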

btw the flent logging facility could also be causing you problems and you can turn that off (this is via rdp?).

I am done for the day, see y'all tomorrow.

Yes, it is set to "cubic".

I will do it tomorrow U.S. time.

Over SSH. Not connected over VNC or RDP.

thx!! gnight! thx for testing so hard

This should only matter for TCP flows that terminate on the router... If the tests actually go from a station to a server that does not sit on the router (e.g. for Apple's RPM test), the router's CC algorithm should not matter, no?

Flattering, but the bash script is pretty much your baby, started by your post and by your ideas. It was fun to ping-pong ideas and theories, but this is not my conception (call it a team effort if you must, as everything on that thread, even the lua implementation, somehow contributed; but without someone (aka you) taking these ideas, implementing and testing them, nothing would have happened at all).

My limited understanding is that the complication is that, e.g., the newer ath WiFi chips want to do more inside their own firmware and less in the driver, making it much harder to do proper airtime accounting and balancing from the driver side (and the firmware does not offer that itself). Not sure about mt76?

Trying to understand what happens in the driver, firmware, and the mac80211 code wrt ATF and AQL is complex (for me) and depends on what device and drivers/firmware are in use. @Lynx: nbd's comments embedded in the mac80211 patches (recent and older) help a lot, so read those to get an idea about the issues with the virtual time-based scheduler (VTBS) vs the round-robin scheduler (RRS).

WRT ATF and the ath10k-ct driver/firmware, my (layman's and incomplete) understanding is that airtime measurements are made in the driver (with some stats info taken from the firmware) and then supplied to the mac80211 system, where the airtime is utilized.

For ath10k (non-ct), the airtime is calculated in the firmware and then the (non-ct) driver supplies that to the mac80211 system. If the ath10k-ct driver is using the non-ct firmware (this is possible), the ath10k-ct driver takes the airtime measurement from the (non-ct) firmware.

And just to make this all a little more complicated, my r7500v2 using the ath10k-ct driver does not natively support ATF (it does support AQL; the non-ct driver/firmware does not support ATF, not sure about AQL). In this case, I believe the r7500v2 uses the RRS and not the VTBS, which I think is why I never observed quarky's symptoms.

But thanks to a comment by the ct dev greearb and a patch from castiel652 to the ath10k-ct driver, the r7500v2 can get ATF. This patch apparently uses mac80211 code to calculate the airtime with data supplied from the ath10k-ct driver. My understanding is that this is analogous to how older ath9k/mt76 devices implement ATF.
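To give a feel for why airtime (rather than bytes or packets) is the thing being accounted, here is a deliberately simplified back-of-the-envelope sketch. It is not the mac80211 calculation: it ignores preambles, inter-frame spacing, ACKs, aggregation and retries, and `payload_airtime_us` is a hypothetical helper. The 39 and 145 Mbps figures are simply the lastTxRate values from the airport output earlier in the thread, used here as example PHY rates.

```python
def payload_airtime_us(frame_bytes, phy_rate_mbps):
    """Microseconds needed to send the payload bits at a given PHY rate.

    Simplification: 1 Mbps equals 1 bit per microsecond, so the result is
    just bits / rate. All per-frame overhead is ignored.
    """
    if phy_rate_mbps <= 0:
        raise ValueError("PHY rate must be positive")
    return frame_bytes * 8 / phy_rate_mbps


slow = payload_airtime_us(1500, 39)    # station at 39 Mbps
fast = payload_airtime_us(1500, 145)   # station at 145 Mbps

print(round(slow, 1), round(fast, 1))  # 307.7 82.8

# If both stations send the same number of bytes, the slow one takes
# roughly this share of the total airtime:
print(round(slow / (slow + fast), 2))  # 0.79
```

Airtime fairness aims to even out that share, which is why it matters whether the driver, the firmware or mac80211 is the one doing the measuring.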

EDIT: Even before all the recent changes, I've been trying to evaluate castiel652's patch. Not sure if it's still relevant given the recent changes, but I think it is. As soon as I can, I'll load a build with nbd's latest patch and report back (this might take a few days). In the meantime, I'm looking at using iperf to generate multicast traffic to see if I can reproduce amteza's symptoms. I'm not ready to put my router on a spit and use dtaht's rotisserie of death tool (rtod) :wink:.

Fun stuff.

Thank you for summarizing all that. Does anyone (@nbd? Know anyone?) have a grip on how the ath11k and mt79 are attempting to implement mu-mimo and DU?

Yes thank you @anon98444528. Slightly intimidating.

If ever you want to read 16,000 pages of 802.11 specs, let me know. In terms of ATF, QoS, and packet scheduling, it's actually only a few hundred, but they are written so that each succeeding standard references the prior ones.
