AQL and the ath10k is *lovely*

@amteza: did you mean 21.02.1?

AFAIK, 21.02.1 was the last version using the original round-robin scheduler for ATF. From 21.02.2 until 22.03.1 RC4, ATF used the virtual-time-based airtime scheduler, about which many people complained of excessive latency. Starting with 22.03.1 RC5, @nbd changed it back to the round-robin scheduler with some other enhancements. I asked nbd to commit the fix to the 21.02.x branch as well so 21.02.4 will be good again, but so far no such commit has been made to the 21.02.x branch.
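For anyone who hasn't followed the scheduler changes, here is a toy sketch of the two selection strategies being discussed. This is my illustration of the core idea only, not the actual mac80211/ath10k code:

```python
# Toy contrast between the two ATF scheduling styles mentioned above.
# NOT the mac80211/ath10k implementation, just the core idea of each.

def round_robin_next(queue):
    """Round robin: serve the head of the cycle, then requeue it at the tail."""
    sta = queue.pop(0)
    queue.append(sta)
    return sta

def virtual_time_next(stations):
    """Virtual time: serve the station furthest behind in weighted airtime."""
    return min(stations, key=lambda s: s["airtime_us"] / s["weight"])

stations = [
    {"name": "sta1", "airtime_us": 9000, "weight": 1},
    {"name": "sta2", "airtime_us": 2000, "weight": 1},
]
queue = [s["name"] for s in stations]

print(round_robin_next(queue))              # serves sta1; sta2 is next in the cycle
print(virtual_time_next(stations)["name"])  # sta2: it has used the least airtime
```

The round-robin version ignores how much airtime each station has consumed; the virtual-time version always picks the station that is furthest behind, which is fairer on paper but was the variant people reported latency problems with.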

But based on your finding, it looks like the additional enhancements may have caused a problem when UPnP is enabled.

Thanks for the clue about multicast causing the latency spikes, I think I understand the issue now. I will post a fix shortly.

Yes, I keep doing it, right? @nbd just replied. Looks like he's got what's going on. I'm warming up my compile engines. :wink:

You're very welcome, @anon11662642 is the man!

Please try this commit along with multicast traffic:
https://git.openwrt.org/?p=openwrt/staging/nbd.git;a=commit;h=a01864235ecddbaa56b0c998c0ccb3b1925a2d16

So far, so good! Looks like it's working. I need to do some more testing, but that will have to wait a few hours. I'll keep this new version with the commit running.

Back to normal in basic tests:

@reaper$ ➜  ~ networkQuality -v
==== SUMMARY ====
Upload capacity: 36.681 Mbps
Download capacity: 215.913 Mbps
Upload flows: 20
Download flows: 12
Responsiveness: High (3428 RPM)
Base RTT: 11
Start: 13/7/2022, 5:14:19 pm
End: 13/7/2022, 5:14:33 pm
OS Version: Version 12.4 (Build 21F79)
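For context, networkQuality reports responsiveness in RPM (round trips per minute under working conditions), so the 3428 RPM figure above corresponds to roughly this per-round-trip delay under load (my arithmetic, not part of the tool's output):

```python
# networkQuality's responsiveness is RPM: round trips per minute under load.
# Converting 3428 RPM to an approximate per-transaction delay:
rpm = 3428
delay_ms = 60_000 / rpm  # 60,000 ms in a minute
print(f"{delay_ms:.1f} ms")  # ~17.5 ms per round trip while the link is loaded
```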

Continuous ping every 200 ms to IPv4 and IPv6 router addresses:

192.168.1.1       : [7838], 64 bytes, 3.70 ms (3.52 avg, 0% loss)
fd57:11da:b11c::1 : [7838], 64 bytes, 3.52 ms (3.18 avg, 0% loss)
^C
192.168.1.1       : xmt/rcv/%loss = 7839/7838/0%, min/avg/max = 2.50/3.52/63.3
fd57:11da:b11c::1 : xmt/rcv/%loss = 7839/7839/0%, min/avg/max = 1.98/3.18/113
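As a sanity check on those counters, 7839 probes at one every 200 ms works out to about 26 minutes of sustained pinging with essentially zero loss:

```python
# Quick check on the ping summary above: 7839 probes, one every 200 ms.
probes = 7839
interval_s = 0.2
duration_min = probes * interval_s / 60
print(f"{duration_min:.1f} minutes")  # ~26.1 minutes of continuous pinging
```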

Note: Network under heavy use: YouTube on a mobile device, a Citrix session encapsulated in an SSH tunnel, a Teams video conference, a FaceTime call, and a Roblox game at full steam. And UPnP active and working (I can see Parsec entries). No hiccups!

That could just mean that the station and AP are not acquiring airtime equally aggressively. The side with the higher throughput seems to have an edge in getting airtime... the same thing I saw in my old RRUL test over WiFi, where the MacBook was eating the AP's lunch in spite of both using all four ACs...

I saw this happening in another test where the connectivity was:
Macbook Pro --1 Gbps USB Ethernet dongle--> NanoHD --4x4 MIMO WiFi connection--> NanoHD --1 Gbps ethernet--> Router

I might be able to run a rrul_be in this scenario a bit later today.

Out of curiosity, and to check whether the patch I published on the forum works, I once again applied patches 330-333, but together with that patch, instead of patches 330-337.
Results:

  • Speed at a greater distance is not very different from that with the latest patches.
  • It does not block clients after a while.
  • Pings do jump around a bit more, but it is not that annoying either.
  • It does not throw errors that block or degrade the link.

Edit: For the record, the @nbd patches (330-333) alone blocked client access after a short time; with the additional patch they no longer do.
In contrast, the latest ones, i.e. 330-337, work fine for me.

Hrm, how does this look when you bridge via an ethernet cable instead of over the WiFi link? The question is: is that imbalance caused by the WiFi link, or does macOS maybe have some sort of prioritization going on for marked egress packets?

Sorry, I don't understand. Would you mind elaborating a bit further?

BTW, @amteza, does this just mean performance is back to the level it was before the airtime fairness change, or may it now actually beat what was there before? And hats off to you and everyone here for the tenacity!

Sure, instead of measuring over:
Macbook Pro --1 Gbps USB Ethernet dongle--> NanoHD --4x4 MIMO WiFi connection--> NanoHD --1 Gbps ethernet--> Router

measure over:
Macbook Pro --1 Gbps USB Ethernet dongle--> ethernet cable--> Router

so take the WiFi link out of the equation.

This is the tool we used to validate airtime fairness, but it requires an aircap (an over-the-air packet capture).

If the two stations show equal airtime, then the MCS rate being used by the AP is probably lower than the client's. With an aircap in hand, we can see the negotiated MCS rates.

There are other possible issues in packing aggregates at the AP: achieving the same MCS rate but not the same airtime. The number of "won elections" for airtime can also be pulled from that capture.
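To make the "same throughput but not the same airtime" point concrete, here is a back-of-the-envelope airtime model (my simplification, not the validation tool's method): airtime scales with bytes sent divided by PHY rate, so a station moving the same payload at a lower negotiated rate eats far more air.

```python
def airtime_ms(bytes_sent, phy_rate_mbps):
    """Crude airtime estimate: payload bits / PHY rate.
    Ignores preambles, block ACKs, retries and contention overhead."""
    return bytes_sent * 8 / (phy_rate_mbps * 1000)

# Two stations each moving the same 1 MB of payload:
fast = airtime_ms(1_000_000, 400)  # negotiated ~400 Mb/s -> 20 ms of air
slow = airtime_ms(1_000_000, 100)  # negotiated ~100 Mb/s -> 80 ms of air
print(fast, slow)
```

The 400 Mb/s and 100 Mb/s rates are made-up illustrative numbers; the point is the 4x airtime gap for identical payloads.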

I am very pleased to see your network that stable. Hopefully hundreds, then thousands, then millions more.

(But I'd settle for an ath9k and ath10k result. Anyone got an off-the-shelf Broadcom product worth benching as well?)

And a huge thx to @nbd. Very elegant solution!

One "feature" of all this code is that latency with more than one active station tends to get better for a while, as we service the sparsest stations first. Due to hardware limitations we had to have "one TXOP in the hardware" and "one ready to go", which means (so long as only the BE queue is in use) a maximum of 11.4 ms of latency buried in the 802.11 stack that cannot be FQed.

With one machine running the rrul test at full throttle and another just doing VoIP, we can end up with one small TXOP in the hardware (say, 250 µs) and one large one "ready to go" (5.7 ms) for the rrul machine. With three stations, you might have two 250 µs TXOPs outstanding, so the rrul test experiences just that latency in that round. Most normal traffic does not drive WiFi as hard as these tests do, but the revolution (dare I say it) of the FQ ATF scheduler is getting packets to the stations that require the lowest latency first.
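The 11.4 ms worst case quoted above follows directly from the "one TXOP in hardware, one ready to go" constraint combined with the maximum BE TXOP of 5.7 ms:

```python
# Worst-case unFQable latency from the two-TXOP hardware pipeline above.
txop_be_ms = 5.7   # maximum BE TXOP duration
slots = 2          # one TXOP in the hardware + one "ready to go"
worst_case_ms = txop_be_ms * slots
print(worst_case_ms)  # 11.4 ms buried below the FQ layer
```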

One old technique that I would like us to try, to further reduce jitter and improve multiplexing in the future, is tightening the announced WMM parameters in the beacon, as well as on the AP, to as low as 1.3 ms.

Here you go, Belkin USB-C adapter (F2CU040) connected to the onboard RPi4 Ethernet.
flent rrul_be -H openwrt.lan -l 300

Note: Not the best USB-C dongle, as I can see, but good enough to rule it out as the problem. :wink:

Nope, under stress (rrul_be) I seem to face the same disparity between upload and download, and latency is still a bit meh:

MCS info from the MacBook Pro:
[image]

MCS info from the AP router connection:
[image]

@dtaht, @moeller0, I guess we can rule out MCS differences.

@amteza Where in macOS Monterey can I see this info?

Click on the wireless status bar icon while holding the Option key.

Any idea how to do this over VNC with a Windows QWERTY keyboard?

Sure, press the Alt key if it is a Windows-logo keyboard.
