Reducing multiplexing latencies still further in wifi

I do not think 40 is AC_VI. As far as I understand, flent accepts and prints TOS values, so 40 would be TOS 40, i.e. DSCP 40/4 = 10, which still maps to AC_BE... However, flent might print decimal DSCP values while requiring decimal TOS values for configuration, so you might have done the right thing and I am just confused...

But the fact that all flows get the same throughput indicates that the marking might not be as intended.

(Why is TOS = DSCP * 4? Because multiplying by four is essentially a left shift by two bits, getting from the 6-bit DSCP to the 8-bit TOS byte with the two ECN bits zeroed by default.)
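
The bit-shift relationship above can be sketched in a few lines (the helper names here are my own, not flent's):

```python
def dscp_to_tos(dscp):
    """Convert a 6-bit DSCP value to the 8-bit TOS byte (ECN bits zero)."""
    assert 0 <= dscp < 64
    return dscp << 2  # same as dscp * 4

def tos_to_dscp(tos):
    """Recover the DSCP value from a TOS byte, dropping the two ECN bits."""
    return tos >> 2

# TOS 40 corresponds to DSCP 10, while DSCP 40 (CS5) would be TOS 160.
print(dscp_to_tos(40))  # 160
print(tos_to_dscp(40))  # 10
```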

You are right! The lack of sleep is clearly affecting me. Going to redo them. Argh.

Update: it should be fixed now. I'm going to mind my Saturday and have another coffee to see if I can wake up properly!

Thank you, especially for showing how badly the BE queue performs under contention vs VI. BK ought to be worse! There is a lot of traffic mismarked as CS1 out there, in the vain hope that "background" actually means what L3 protocol designers meant by background; 4 seconds of delay with only one station on the air is well beyond what we meant by background. Trying to push TCP through there, with its typical timeouts at essentially 250ms, 1s, and 2s, means we end up sending more packets in a somewhat futile manner. If we could impress upon application designers the idea that the BK queue might be delayed tens of seconds, and restrict its usage to just those apps that can tolerate it, that would be great.

Back when we were thinking about 802.11e, the problem as seen then (2003!) was that VOIP really, really, really wanted a 10ms interval (now it's 20ms), we didn't have good jitter buffers, and ulaw and gsm encodings were the law of the land. So a limited number of VOIP phones on an AP worked better - ship it! (And again, this was a client option at the time, not so much an AP one. The APs were supposed to figure out how to schedule responses, and many (enterprise) APs actually did do some of the right things here...)

VI ended up as a bucket for where videoconferencing was to go. It seemed to make sense... to some...

But the complexities of 802.11e's bus arbitration don't make a lot of sense, period. IMHO.

After 802.11n showed up with aggregation which was vastly superior in terms of fitting packets into a txop (if you managed the queues right)... and atheros sold out to qcomm... most of the detailed AP knowledge began to fade from the field.

I turned off mappings via qos-map almost entirely (EF-only) years ago, and have in general not looked back. WMM is required to work to pass the Wi-Fi Alliance's tests, and thus it's on by default for nearly everybody else still; as for the effect on real traffic, well... I'm in general thankful that so few applications have tried to use it to date. Used carefully, from certain kinds of STAs, it still seems to be a decent idea. Note "carefully". There are a few wifi joystick-game controllers that use VI or VO....

Despite my opinion, I'd never gotten sufficient data from real-world usage to convince enough people I was right.

With enough data, perhaps we can convince the openwrt folk to obsolete qos-map into the BK queue, at least. The VI queue isn't looking all that good either. Scheduling smartly, and intelligently reducing txop size under contention, seemed the best strategy to me (in 2016).

I've sometimes hoped we could find another use for the 4 hardware queues. Or that they would work better in a mu-mimo situation. I keep hoping we find a benchmark that shows a demonstrable benefit for some form of real world traffic for the VI and VO queues for some generation of wifi.

I should probably note also that there are all sorts of other possible sources for the 4-second spikes on icmp seen here...

Hence my principled objection to the harebrained idea of making "NQB" inhabit AC_VI... clearly nobody in the IETF WG bothers to look at actual data...

I thought 802.11e was finalized in 2005?

You convinced me; am I not enough people? :wink:

Oh, I think there is, but it requires that you have <= 4 different priority levels of traffic and are willing to accept that higher-priority traffic, if not rate-limited sufficiently*, will severely choke lower-priority traffic.

*) Not that rate limiting on a variable-rate link like WiFi is conceptually all that "simple"...
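
The starvation point above can be shown with a toy sketch of my own (this is plain strict priority, not WMM's actual EDCA arbitration): if the high-priority queue is never empty, the lower queues get nothing at all.

```python
from collections import deque

def serve(queues, slots):
    """Serve `slots` transmissions, always picking the highest non-empty queue.

    queues[0] is the highest priority; returns (priority, packet) pairs.
    """
    served = []
    for _ in range(slots):
        for prio, q in enumerate(queues):
            if q:
                served.append((prio, q.popleft()))
                break
    return served

queues = [deque(f"hi{i}" for i in range(10)),  # high-priority backlog
          deque(["lo0", "lo1"])]               # low priority waits... forever
out = serve(queues, 5)
# All 5 slots went to queue 0; the low-priority packets are still queued.
```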

I'd worked on wifi from 1998 to 2005 - http://the-edge.blogspot.com/2010/10/who-invented-embedded-linux-based.html - as well as on various voip products like asterisk and the Sofia SIP stack. I tapered off after 2005. So I was aware that what became 802.11e was kind of a brain-damaged idea, except for voip. I didn't really grok the real damage of 802.11n packet aggregation until 2012? 2013? All I really understood was that sometime around 2008 or so, wondershaper had stopped working worth a darn. Looking back in history (now), txqueuelens had grown to 1000 packets and GSO and GRO had become a thing, and nobody else had noticed either (and I was still doing things like SFQ by default and vegas, not realizing nobody else was doing that; I didn't get out much).

Even after I believed jim enough to repeat his experiments in 2010? 2011?, I didn't get how big the problem was for everyone, either. I just thought it was my tin cans and string connecting me to the internet.

Anyway, here's a little more data on VI vs BE: just a BE flow competing with a high-rate irtt -i3ms --dscp 160 - we really need a test that integrates that sort of thing directly into flent - plotting irtt loss and marks.

My hope was that a test downloading via the VI queue exclusively would have, oh, no more than 4-8ms observed latency on this chipset. 20ms seems really excessive, and must be coming from... AQL? The hardware? Don't know.

A great deal of the testing I'd wanted to do on this thread took place over here: AQL and the ath10k is *lovely* - #859 by dtaht

I'd prefer to try and close out the AQL and ath10k discussion over there and move over here.

So, @dtaht, what feedback do you have about that ath10k bug?

My ath10k is in a storage unit 200 miles from here as are the remains of my lab. On my little boat I am using an ath9k/lte device, and recently picked up a starlink. I'm tempted to hack into the starlink and fix it ( https://www.youtube.com/watch?v=c9gLo6Xrwgw ) Anyway, the best I can do is help analyze tests, at the moment, until I find a cheap place to have a lab on land... or get a bigger boat.

Yeah, all right, I interpreted your post incorrectly. I'll test rrul_be later. I reckon it was mostly fine. Most of my "toys" are 18,000 km from here too. :wink:

@dtaht, see below a quick rrul_be test with the new topology, as promised in the ath10k thread (this is VHT80).

I'll do any future tests on HT20. I think it will help with connection stability, right?

Really lovely. A 4x1 bandwidth disparity - I'm really puzzled by this in general. After looking over 802.11ac and later, I felt in 2015 that ack-filtering was going to be needed, but never got around to it: https://github.com/dtaht/sch_cake/blob/master/sch_cake.c#L1254

If generic rrul blows up, you can fix it by quashing the qos_map. I take limited joy in seeing it blow up, but...

Do you mean by this porting it from cake to fq_codel?

I'll do it a bit later, during lunchtime, as my network is under heavy use right now. By the way, I'm using the following qos_map_set on my network, i.e., re-mapping UP 6 and UP 7 to UP 5:

option iw_qos_map_set '1,1,8,1,18,3,20,3,22,3,24,4,26,4,28,4,30,4,32,4,34,4,36,4,38,4,40,5,44,5,46,5,48,5,56,5,0,63,255,255,255,255,255,255,255,255,255,255,255,255,255,255'
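
For anyone decoding that string: per my reading of the hostapd QoS Map element format, the last 16 numbers are the DSCP low/high range for UP 0 through UP 7 (255,255 marks a UP range as unused), and everything before them is a list of (DSCP, UP) exception pairs. A small sketch (the parse function is my own illustration):

```python
def parse_qos_map(s):
    """Split a qos_map_set string into DSCP exceptions and UP 0..7 ranges."""
    vals = [int(v) for v in s.split(",")]
    ranges = vals[-16:]                                   # 8 (low, high) pairs
    exceptions = dict(zip(vals[:-16:2], vals[1:-16:2]))   # {dscp: up}
    up_ranges = [(ranges[2 * up], ranges[2 * up + 1]) for up in range(8)]
    return exceptions, up_ranges

qmap = ("1,1,8,1,18,3,20,3,22,3,24,4,26,4,28,4,30,4,32,4,34,4,36,4,"
        "38,4,40,5,44,5,46,5,48,5,56,5,0,63,255,255,255,255,255,255,"
        "255,255,255,255,255,255,255,255")
exc, ups = parse_qos_map(qmap)
# DSCP 48 (CS6) and 56 (CS7), normally UP 6 and UP 7, land on UP 5 here;
# any unlisted DSCP falls into the UP 0 catch-all range (0, 63).
```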

cake was in parallel development with fq_codel. The intent was to try out some ideas in cake first, and then compare with fq_codel, and merge the best of them. Over time cake grew to eat wayyyy too much cpu to want to port over to the wifi stack, and fq_codel runs faster on gigE and higher interfaces.

So it may be we go nuts and try to port most of cake over to the wifi stack (which might solve the 802.11e problems), or pieces of it, but until the last 6 months, most of our efforts were directed at very different stuff, like the L4S vs SCE fight in the IETF. In my case politics, outreach, and mikrotik, apple, and now libreqos eat most of my time.

the ack-filter port would be so much easier if the fq_codel implementation hadn't grown hairy include files.

Another thing we needed in wifi was the drop-batch facility that's in the main qdisc... or cobalt... and increasingly GSO splitting seems sane. I'd come up with an fq_codel that was saner in a couple respects over here:

https://lists.bufferbloat.net/pipermail/cake/2018-September/004345.html

before getting sucked into the sce thing.

Other wifi factors besides ack-filtering dominate: as we've discussed, powersave, multicast, rssi, and buggy drivers have been most of the real latency- and jitter-inducing problems we've faced.

Just want to say thank you all for all the effort you guys put into this. Upgraded my access point and router to 22.03 and everything seems to be performing even better than before.

thx!

Me being totally OCD, I sit here and obsess over where the three outliers on the upload come from. P99.99999 would be nice. Not there yet...

Do you have an osx netperf binary you could upload somewhere for this person?

In the waveform test I routinely see outliers that I do not see in e.g. flent rrul tests, so I argue that these might be artifacts of using a browser to perform these measurements*, especially after all the side-channel mitigation efforts that decrease the temporal resolution of browser time measurements.

*) In support of this, I see considerably fewer outliers when using firefox than safari (both under macOS), so the browser can add its own delay in this test.

Sidenote: with the relatively low number of latency probes, P99.99999 will be essentially identical to the reported maximum. And you can download a CSV file of each session, which IIRC contains the individual RTTs, so one can look at different distribution parameters or generate a CDF/PDF.
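
To make the sidenote concrete: a nearest-rank percentile at P99.99999 needs on the order of 1/(1 - 0.9999999) = 10 million samples before it can differ from the plain maximum. A small sketch with made-up RTTs (any CSV parsing and column names would be guesses, so they are omitted):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest value >= fraction p of the samples."""
    xs = sorted(samples)
    rank = math.ceil(p * len(xs))
    return xs[max(rank - 1, 0)]

rtts = [12.1, 13.0, 11.8, 45.2, 12.4, 12.9, 80.5, 12.2]  # ms, made up
print(percentile(rtts, 0.50))       # 12.4
print(percentile(rtts, 0.9999999))  # with only 8 samples: just the max, 80.5
```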

Are we, like, measuring the TLS exchange? That would cost some time...

Can you share details on your setup? ISP type, router & AP model? SQM settings?

I actually do not know; I guess we/you could invite Waveform's maker to another RPM meeting and ask :wink: