Bufferbloat, it's not just for WAN connections anymore

Most of his network is stuff to bring a cable to an AP and other devices, so yeah, everyone who does not live in a tiny house like me, where I cover the house and the whole garden with a single AP sitting above the server rack, may face similar problems.

You are really missing the point by a country mile here. The issue is not the equipment but the cables.
Can you easily rip out walls or install eth cables all over the building? Do you have the money to do it properly? Can you actually do it (do you own the property, or are you renting)?

In many countries it's not as easy as in the US, where you can literally break "walls" with a strong kick because they are wooden frames with gypsum panels over them, nor are houses as short-lived as in the US, where you need to tear down everything every 50 years.

In many countries the walls are made of brick or concrete, and if your home's existing internal tubes for cables (if you even have them) are already filled with electrical wire there is no other way: either you run external cables (and go for the "industrial look") or use powerline. Nobody is breaking hundreds of meters of wall (and sometimes floors) to install a new cable tube for ethernet.

And for some puzzling reason, I see a lot of very new homes that still lack any kind of provisioning for ethernet, so again people are stuck just like those in older homes. That's one of the reasons why Mesh wifi setups are so popular with consumers nowadays: you don't need to work on the walls to pull wire all over.

Because if you can just throw gigabit cables all over the place his point is indeed moot, you just rewire everything to reach a single large switch in the center of the house and boom, done. No more problems. But this is not easy in many cases as I said.

I mean, one of the main reasons 10Gbit isn't more popular is that it requires ripping out and replacing A LOT of existing cables that were installed years and years ago.

+1 and credit for using expression: 'by a country mile'. I totally don't get why people are so dismissive about use of WiFi extension means and blasé about installing cables. A ton of users will use WiFi extenders, WDS, mesh and the like because it is more practical.

Also the best WiFi rates I can manage with my WDS setup are around 350 to 450 Mbit/s, but it varies between say 50 Mbit/s and that maximum depending on where I am located in the house, how many concurrent WiFi connections there are, etc. Say if I am sitting on a bench in the garden with my laptop perhaps it fluctuates between 50 Mbit/s and 250 Mbit/s. And then I move to another bench or the outdoor table. Then it changes again.

Have only one question. Do you talk about US mile or Swedish mile or some other mile?

I don't understand how a test is decent if it's inconsistent and doesn't really test what you're trying to prove?
There are quite a few claims here but very little to actually back them up in terms of hard numbers. I'm sure there is some, but to the extent where it matters / makes a practical difference?

What I think flygarn12 is getting at is that some of these theoretical setups are quite far-fetched, at least if you're going to target your average user. They most likely exist, but more likely as a minority than a majority, and does bufferbloat matter in those cases?

Well, in my experience the dslreports speedtest for example is mostly self-consistent when used with the same browser in similar conditions, and will for example show a clear increase in latency under load when I disable my traffic-shaper/AQM versus enabling that (shaped 100/36 versus unshaped 116/37). Since these increases are well reproducible I tend to trust that test, but I also do not put too much stock in its bufferbloat "grade" but always look at the time-resolved latency plots and compare those between different conditions. I also took care to properly configure that test in the first place.

Well my question is still open:

Sorry, I do not understand that question, could you maybe rephrase that?

I am sure I have no clue about the average user's set-up, so I do not mind threads like these that are tailored to some specific settings. At least over here PLC adapters are (unfortunately) quite popular and so the painted scenario does not sound outlandish to me. Sure, I would assume (not know) that many more users are bottlenecked by their WiFi link or WiFi mesh, but that still leaves the true bottleneck inside their home network. Sometimes these bottlenecks are properly debloated already (OpenWrt with fq_codel in the WiFi stack and/or airtime fairness patches applied), sometimes these are stock proprietary solutions that range from competently done and bloat-free to painfully over-buffered and under-managed.

As I said in the second post, quite early on bufferbloat has been detected in switches and WiFi, which is not too surprising, as queueing will happen wherever there are speed transitions in a network, and unless these queues are properly sized and managed bufferbloat will show up if that link becomes the relevant bottleneck.
Whether bufferbloat matters or not is mostly a policy question each network admin needs to decide for herself, but I think it important to inform folks about bufferbloat and its consequences so these can be educated decisions.

@amteza can you post data from script run on your mesh setup and we can actually graph what happens? Maybe this might help? I think it might be helpful to show the effect of bufferbloat originating from a local bottleneck, and ideally also the effect of addressing that either with fixed bandwidth or variable bandwidth CAKE. Incidentally, LTE connections are set to increase to 1 Gbit/s too. Everyone wants more bandwidth!
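To make "variable bandwidth CAKE" a bit more concrete, here is a minimal sketch of the kind of control rule such a script might apply: back the shaper rate off when latency under load rises past a threshold, and creep back up otherwise. This is not the actual script referred to above; the function name, the thresholds and the step factors are all assumptions picked for illustration.

```python
def next_shaper_rate(current_kbit, baseline_rtt_ms, loaded_rtt_ms,
                     min_kbit=5_000, max_kbit=400_000,
                     delay_threshold_ms=15.0,
                     decrease_factor=0.9, increase_factor=1.02):
    """Return the next shaper rate in kbit/s (hypothetical control rule).

    All default values are illustrative assumptions, not tuned settings.
    """
    # How much the RTT has risen above the unloaded baseline.
    delta_ms = loaded_rtt_ms - baseline_rtt_ms
    if delta_ms > delay_threshold_ms:
        # Queue is building up somewhere: reduce the rate.
        candidate = current_kbit * decrease_factor
    else:
        # Latency looks fine: probe for a little more bandwidth.
        candidate = current_kbit * increase_factor
    # Keep the result inside the configured floor and ceiling.
    return round(max(min_kbit, min(max_kbit, candidate)))


if __name__ == "__main__":
    print(next_shaper_rate(100_000, baseline_rtt_ms=10, loaded_rtt_ms=40))
    print(next_shaper_rate(100_000, baseline_rtt_ms=10, loaded_rtt_ms=12))
```

A fixed-bandwidth setup corresponds to never calling this at all; in a variable setup the returned value would be handed to the shaper on every measurement interval.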

Latency is only slowly being picked up upon. One of my friends has children that keep downloading stuff and gaming whilst he is trying to use Zoom. He imagines just getting more bandwidth will be the solution to the problem.

I think Joe Bloggs understands bandwidth, but not latency, and certainly not bufferbloat. This always makes me smile:

@moeller0 I am curious about the similarities and differences between bufferbloat originating say with multiple LTE towers serving simultaneous connections and multiple WiFi access points also serving simultaneous connections. Clearly in the home environment we have more control over what is going on.

Mmmh, LTE uses a central controller instance (in the base station I believe) that arbitrates/schedules transmissions to the remote user equipment, and that also grants these remote stations permission to send (so is aware of and in control over who can transmit at what time), while WiFi in home environments is typically without a central scheduling instance but relies on listen-before-you-send and collision avoidance (CSMA/CA). So at least in this dimension WiFi offers less control than LTE (from the LTE operator's point of view; for end users there is not much one can change on an LTE link as far as I know). How these differences translate into the individual rate allotted to a specific station and how quickly that rate changes, I have no strong prediction. But this topic might be better discussed in another thread, since this is more about "wired" connections (including re-/ab-using existing power lines), no?

It's still wrong to assume that the "average user" has a single router with wifi and that's all we should aim for, imho. There are plenty of users that ask about setups with one or multiple APs or mesh in this forum.

In the pure consumer space I've seen a staggering number of non-tech-savvy people I know buying proprietary Mesh wifi systems, just because they are very easy to set up (in most cases they have buttons to push for pairing or an app, or something like that) and work so_much_better than older gen wifi repeaters, both for wifi roaming (for devices) and bandwidth/latency.

Powerline systems have not disappeared either. There are offerings from all major manufacturers and they are all faster than older gens, so it's fair to assume they do sell enough to justify their existence and the R&D to improve the next gen.

Is there a decent way to test this? Imho the first step in these things is to create a good test suite or test procedure.
Without that, this is all hearsay and idle chat.

Well, according to my experience tools like:

  1. a properly configured dslreports speedtest
  2. waveform's bufferbloat test
  3. or a manual combination of fast.com (set to at least 60 seconds test time) and concurrent mtr
  4. netperf/flent
  5. iperf2 and concurrent mtr
  6. iperf3 and concurrent mtr

will all work (the last three suffer from the issue that one needs to find proper servers on the internet that are publicly accessible and capable of saturating the bottleneck link). All of these obviously need some testing and control measurements to confirm that they can differentiate between known bloated and known unbloated conditions. None of this is "rocket science", including the validation, but it is not a single-click-and-forget kind of affair either.
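Whichever tool produces the numbers, the comparison at the end is the same: RTT samples taken while the link is idle versus RTT samples taken while it is saturated. A minimal sketch of that comparison follows; the function name and the sample values are made up for illustration, and the RTT lists are assumed to come from something like a concurrent mtr or ping run.

```python
from statistics import median


def latency_under_load(idle_rtts_ms, loaded_rtts_ms):
    """Summarize how much the RTT rises when the bottleneck is saturated."""
    if not idle_rtts_ms or not loaded_rtts_ms:
        raise ValueError("need at least one sample for each condition")
    idle = median(idle_rtts_ms)
    loaded = median(loaded_rtts_ms)
    return {
        "idle_median_ms": idle,
        "loaded_median_ms": loaded,
        # The induced latency is the number to compare between conditions.
        "increase_ms": loaded - idle,
        # Spikes matter for interactive traffic, so keep the worst case too.
        "worst_loaded_ms": max(loaded_rtts_ms),
    }


if __name__ == "__main__":
    # Made-up samples: a known unbloated run and a known bloated run.
    print(latency_under_load([10, 11, 12], [12, 13, 15]))
    print(latency_under_load([10, 11, 12], [50, 60, 200]))
```

Running the same summary on a known bloated and a known unbloated configuration is one way to do the control measurement: if the two results do not differ clearly, the test is not telling you much.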

If concerned about local bottleneck can't you just ping main router from local access point and observe increase associated with saturation of that local connection between main router and access point? So everything just local? I thought that is what @amteza did on his mesh with the script to vary CAKE bandwidth.
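For the purely local variant, the raw material is just the output of ping run against the main router while the local link is being loaded. A small sketch of pulling the RTT values out of that output, assuming the usual `time=… ms` fields that both the iputils and BusyBox ping print on each reply line; the function name and the sample output below are illustrative, not taken from anyone's actual run.

```python
import re

# Matches the per-reply field, e.g. "time=1.23 ms" or "time<1 ms".
_TIME_RE = re.compile(r"time[=<]([0-9.]+)\s*ms")


def rtts_from_ping_output(text):
    """Return the per-reply RTTs (in ms) found in captured ping output."""
    return [float(m.group(1)) for m in _TIME_RE.finditer(text)]


if __name__ == "__main__":
    # Illustrative output only; address and values are made up.
    sample = """\
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=1.23 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=48.7 ms
64 bytes from 192.168.1.1: icmp_seq=3 ttl=64 time=95.0 ms
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
"""
    print(rtts_from_ping_output(sample))
```

Capturing one run with the link idle and one with it saturated gives the two lists needed to see whether the local hop adds delay under load.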

Does it need to be >1 WiFi access point / powerlan, etc.? I am still getting to grips with much of this but say you have a 1Gbit internet connection and just one router with a WiFi that can offer 400Mbit/s. If one client downloads a file whilst another client seeks to use Zoom, will the saturation of that 400Mbit/s link lead to bufferbloat? If not, why not? I must be missing something pretty fundamental because this seems to be taken as a given from the above. So I am keen to learn!

I read that an RJ45 ethernet cable can also cause high latency because it slows things down. Is this true?

Probably if you measure it enough times you will get a result worse than you want.

But there are a lot of different cables also.
But almost all of them that you will find today are a lot faster than the fastest wifi.

And you can only run it for 150m.

Yes, but you still need to saturate the bottleneck link. Depending on your networking chops and devices in the home network, one can do bufferbloat testing purely within one's own network, but the entry hurdles are IMHO higher than directing a browser towards dslreports' or waveform's test sites.

Yes, purely local is possible, but not necessarily trivial enough to have uncle Herbert or other layman relatives perform that on their own without guidance.

If both share that link, yes it can result in latency issues for the videoconference (but it does not need to). This is exactly where competent AQM can shine, in this case in the AP, but if our hypothetical bulk transfer goes in the other direction, so a 1Gbps upload, things become trickier, because now each client that might upload the data needs to be considerate about airtime access. (This is where careful prioritization of the VC data can help, as the higher wifi ACs have a better chance to acquire a tx-opportunity than the lower ones; the devil is however in the details.)

For example if the AP is a recent OpenWrt device with either an ath9k, ath10k or mt76 radio it might already employ fq_codel in the wifi stack and that (in the case of ath10k together with AQL) can already help to make the bulk download not clobber the VC. Note that for uploads that is much harder as there is no real information about potential other uploads available to the stations (this is where LTE has an advantage: the central controller (in the base station?) also assigns upload slots to the stations, so can also arbitrate uploads, plus LTE typically uses different frequencies for uploads and downloads, unlike WiFi, where an overly aggressive upload can pummel concurrent downloads noticeably).

Well, queueing delay can happen whenever there are transitions from fast to slow and that can happen with wires or wireless.
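A back-of-the-envelope sketch of that fast-to-slow transition, using the 1 Gbit/s in and 400 Mbit/s out figures from the question above. The 5 MB buffer size is purely an assumption for illustration, and the model is a simple fluid approximation with no AQM, not a description of any particular device.

```python
def queue_delay_ms(queued_bytes, drain_rate_mbit):
    """Time a newly arriving packet waits behind queued_bytes, in ms."""
    return queued_bytes * 8 * 1000 / (drain_rate_mbit * 1_000_000)


def simulate_backlog(arrival_mbit, drain_mbit, buffer_bytes,
                     duration_s, step_s=0.001):
    """Fluid model of a tail-drop queue; returns backlog in bytes at the end."""
    backlog = 0.0
    for _ in range(round(duration_s / step_s)):
        # Bytes arriving minus bytes leaving during this time step.
        backlog += (arrival_mbit - drain_mbit) * 1_000_000 / 8 * step_s
        # The queue cannot go negative and cannot exceed the buffer.
        backlog = min(max(backlog, 0.0), buffer_bytes)
    return backlog


if __name__ == "__main__":
    buffer_bytes = 5_000_000  # assumed buffer size, for illustration only
    full = simulate_backlog(1000, 400, buffer_bytes, duration_s=1.0)
    print(full, queue_delay_ms(full, 400))
    # If the sender is slower than the link, no standing queue forms.
    print(simulate_backlog(300, 400, buffer_bytes, duration_s=1.0))
```

With these assumed numbers the buffer fills within a fraction of a second and every packet behind it then waits about 100 ms, wired or wireless; a smaller or actively managed queue is what keeps that delay down.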

Mmmh, I was under the impression that the limit for an individual ethernet segment was <= 100m (depending on the combination of speed and cable type). What am I missing?

I agree, AP manufacturers often hide this fact by advertising the maximum gross rates a device can achieve (even though with wifi the throughput is often in the range of 50-70% of the gross rate) and, to add insult to injury, they often also simply add up the maximal rates of all radios, so a dual band device is advertised by the sum of its highest 2.4 GHz rate and its highest 5 GHz rate, even though it is pretty unlikely that a single client will ever see this aggregate rate in reality.
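The arithmetic behind that marketing number is easy to spell out. The sketch below uses a hypothetical dual band device with 450 Mbit/s gross on 2.4 GHz and 1300 Mbit/s gross on 5 GHz, assumes a client sits on one radio at a time, and applies the rough 50-70% efficiency range mentioned above; the function name and the figures are illustrative only.

```python
def advertised_vs_realistic(gross_rates_mbit, efficiency=(0.5, 0.7)):
    """Compare the summed marketing rate with what one client might see.

    gross_rates_mbit maps a radio label to its maximum gross rate.
    efficiency is the assumed (low, high) fraction of gross rate that
    ends up as throughput.
    """
    marketing_sum = sum(gross_rates_mbit.values())
    # A single client is assumed to use only one radio at a time.
    best_radio = max(gross_rates_mbit.values())
    low, high = efficiency
    return {
        "marketing_sum_mbit": marketing_sum,
        "best_single_radio_mbit": best_radio,
        "realistic_range_mbit": (round(best_radio * low),
                                 round(best_radio * high)),
    }


if __name__ == "__main__":
    # Hypothetical device, figures chosen for illustration.
    print(advertised_vs_realistic({"2.4GHz": 450, "5GHz": 1300}))
```

Under these assumptions the box would say 1750, while a single well-placed client would be doing well to see roughly 650 to 910 Mbit/s, and typically less once distance and other stations come into play.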

Unless you have a particularly good implementation of WiFi with queue management built in, then yes, you'll get delays and packet loss. This is probably 99% of people these days, since the vast majority of people are still using either something given by their ISP or something like an EERO mesh or a wifi extender coupled to an ISP router, etc.

I know of at least one family where the mother taught medical school anatomy students and was very much in danger of being fired due to her lack of ability to effectively stream 2 streams of video (her and her microscope camera) + 1 audio to the students, this from the fastest available Xfinity cable connection. The WAN wasn't the issue. I spent several days on the phone reorganizing their network step by step. In the end they had basically 100Mbps of buffering-free networking everywhere in the house, and didn't lose half their family's income. This can be a very serious issue for people.

On the other hand, my own kids' teachers had TERRIBLE home network setups during the "remote video schooling" of 2020 and early 2021 and were resistant to any sort of help (afraid I was "hacking" them, and would only take instructions from official school district technicians, who were both overloaded and not competent to handle this issue), and the result was that my kids just didn't attend their video classes and spent their days on Khan Academy. It was too painfully bad to force them to submit to that garbage.

I typically don't even try to help unless asked, as a) my area of expertise is pretty limited (and that only covers a subset of what the rest of my family would require) and b) I find that it is hard to make people understand the core of the issue unless they are already "mentally ready" (aka suspecting that something is not how it should and could be).

Sure can, I will do it a bit later today as it's an early start for me today and I cannot disable AQL/airtime fairness in my WiFi setup as I'll be back to back in Zoom/Teams calls my whole day.

Note: @moeller0, I got fed up with latency in my mesh and ended up using a Swiss knife to open the NanoHDs to install OpenWrt utilising a serial cable.

And that improved the latency under load? Respect, with a Swiss knife (and maybe some duct tape for closing them up again), I would say you MacGyver'd it :wink: