CAKE w/ Adaptive Bandwidth [October 2021 to September 2022]

Just keep in mind that by changing our shaper we only indirectly affect the flows actually generating the load, so we must not overdo things by changing too quickly or in too big steps (mostly downwards), to avoid creating oscillations/resonances. A slower/dampened control loop is IMHO less likely to experience such effects (says my gut feeling).

agreed... unless other metrics are factored into said changes...

Happy to implement this if you think it would be worthwhile for some. Might another use case be those rather weak LTE connections where more latency has to be sacrificed to get bandwidth, so that decaying down to the minimum would mean too little bandwidth?

So, if I understand this correctly, it will decay down to this value and stay there until load is detected, at which point it will increase the rate as usual? So if this rate is set at maximum, the effect would be steady state unless, say, the ISP becomes exceptionally congested, and then it will scale down to address that. What happens when the load drops after that scaling down following an RTT spike? It decays upwards rather than downwards?

That sounds painful, given that POSIX shell has no real arrays... But I guess one could do the array manipulations in AWK and construct strings as inputs and outputs, so that the shell only ever sees single strings... but that would still require some nasty code to massage said strings (as these would need to carry both AWK's result and the last state of the array).

Well, that depends: if the last set rate was below the default it will technically not "decay" but grow, but always with the same long-term alpha....

Exactly, and if it is set to the minimum it will give the current behaviour.

Yes, it will either decay or grow depending on where it was when the load ceased, we could/should implement two independent alphas here....

Not just a case of alpha sign flipping but also the rate? Like, would we want to decay up and down towards steady state at the same rate or at different rates?

Flipping sign would be easy I suppose and maybe you still want to return to steady state slowly. So rate same?

Mmmh, you are right, let's start simple....
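
To make the "decay or grow towards a base rate" idea above concrete, here is a minimal Python sketch. It is not taken from the actual autorate script; the function names, the kbit/s units and the alpha values are all assumptions for illustration. It shows that with an exponential step towards the base rate no explicit sign flip is needed, and that a second alpha can be bolted on later without changing the simple case.

```python
def relax_rate(rate_kbps, base_kbps, alpha_down=0.1, alpha_up=None):
    """Move the shaper rate a fraction alpha of the way towards base_kbps.

    Hypothetical helper, not part of any existing script.
    The sign of (base_kbps - rate_kbps) decides whether the rate decays
    or grows, so the same formula covers both directions.
    If alpha_up is None, one alpha is used for both directions
    (the "start simple" variant).
    """
    if alpha_up is None:
        alpha_up = alpha_down
    alpha = alpha_down if rate_kbps > base_kbps else alpha_up
    return rate_kbps + alpha * (base_kbps - rate_kbps)


def simulate(start_kbps, base_kbps, steps, alpha_down=0.1, alpha_up=None):
    """Return the rate after each of `steps` idle intervals, start included."""
    rate = start_kbps
    history = [rate]
    for _ in range(steps):
        rate = relax_rate(rate, base_kbps, alpha_down, alpha_up)
        history.append(rate)
    return history
```

For example, with a base rate of 10000 and alpha 0.5, a rate of 20000 relaxes to 15000 and then 12500, while a rate of 5000 grows to 7500. Setting the base rate equal to the minimum rate gives a plain decay to the minimum.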

Hehe, ingress shaping has gotten rather good I'll admit, but on low bandwidth links like the ADSL line we had, CAKE on ingress couldn't handle things like Steam downloads that fire up multiple connections without RTT going through the roof. I likely could've gotten ingress mode to work alright, but that would mean sacrificing a big percentage of the bandwidth.

They did, and they also switched automatically between fastpath and interleave modes depending on line conditions, so when that happened I could lose/gain a fair bit of bandwidth. I should add that we're 4.5 km from the central office, so it was a miracle really that it worked as well as it did.

Could very well work, if it does it'd save me a lot of time figuring out another way : )

I do note that as long as I'm not limited by radio conditions the PGW does a pretty alright job shaping the link to the subscription rate, obviously not a FIFO there, so as long as the signal is good, I could probably get away with just running CAKE on ingress. Should be easier if I could get a modem that doesn't mess me about by changing LTE bands by itself.

Is that when using TCP? Vodafone UK applies 10Mbit/s throttling on port 443 and otherwise I think many LTE providers employ various forms of traffic shaping in times of congestion. Since I don't want to be subjected to restrictions like the 10Mbit/s on port 443 I use a VPN (NordVPN). But WireGuard is UDP only. So I think I am in this weird situation where I need to use VPN to circumvent overly restrictive / dubious throttling practices by my ISP and then my own SQM to address increased bufferbloat associated with UDP and thus not benefitting from TCP's congestion control? That stated, I think I'd still want to use SQM anyway because although the latency increase without VPN was not as bad, it was still there.

How do you signal to your sender the bandwidth for it to use?

Also why don't you try sprout?

No I haven't tested that. I currently run everything over WireGuard, so my ISP only sees UDP.
I would've done more testing without a tunnel, but I've had some weird issues where connections start failing after a while if I do.

You'd still benefit from TCP congestion control though, no? Sure the encapsulated packets won't directly, but everything using TCP inside the tunnel still would. I might be wrong, but my understanding is that it shouldn't act any different than if it wasn't encapsulated.

But you are of course subject to any buffering WireGuard might do, not sure if it does anything that could affect latency.

If you mean the rate for CAKE on the VPS, I currently do not.
Since I'm mostly hitting the limit for the PGW shaper on the ISP side, I can just set it to the subscription rate and it'll for the most part work fine.

The problem I'm seeing is mostly that the rate can be too low for it to ramp up speed sometimes, so sqm-autorate should be great for this, but I'd have to change it to allow for egress only. And of course, ideally it should be able to see whether it's the download or the upload rate that needs changing; I haven't read the whole discussion on that.

I might've missed it, but I haven't seen anything that's "ready to use"?
Is one of the examples in https://github.com/keithw/sprout/tree/master/src/examples something worth testing? Didn't see any documentation anywhere on what's what.

@moeller0 any thoughts? I found on my connection by not going over WireGuard I didn't get such high bufferbloat. It is true that my ISP seems to do a lot of funky management like the 10Mbit/s throttling per 443 stream, which presumably could actually help with bufferbloat for those individual streams (not that I am happy with that because it limits downloads and I think even OneDrive activity). But I think there was more to it than that which I do not understand. Maybe the traffic management that ISP puts in has a good component and an evil component, and that by removing the evil component by using VPN I lose out on the good component.

I am not sure. It is just that the MIT paper and video looked promising. But it might just be me being gullible. As @moeller0 points out, why didn't sprout get widely adopted, and why has it stayed as obscure as it has? Then again it could still be amazing. They set MIT students the task of addressing this problem and found a frontier of latency/bandwidth tradeoff, with sprout sitting in a nice place.

not really... as discussed...;

  • no multiple interface handling
  • no sqm iface awareness (up, down, rate change etc)

feel free to unpack my (well, based on your) tar.gz and give it a try... it has enough comments and such and most of the foundation logic to work around all these things...

be aware tho' it is far from finished... but it's enough to grapple with what's involved...

This is a case where @Lynx's autorate script should work very well, since you only need your own home router as ICMP reflector and that you can control well (disable rate-limiting)...

Simple hack, just configure the min and max rate for ingress statically to your desired rate...

Again since you are testing against your home network you should be able to get reliable one-way delay measurements going which would allow to assess bufferbloat for each direction individually.

Maybe this can help?

Throttling individual TCP flows to 10 Mbps will certainly help to keep bufferbloat lower, albeit at a throughput cost.

Yes, definitely one of the advantages of doing it this way.

Not sure we're talking about the same thing. If I run @Lynx's autorate script on the VPS, there won't be any download (no IFB or veth) device for it to set rates on. I could just create a dummy interface of course, and set sqm-autorate to use that for download, thus avoiding having to modify the script.

I probably misunderstood earlier posts; I thought changes to the script were discussed so that it could better detect whether it's egress or ingress causing bufferbloat.

Thanks!
So this repo is what we need to look at: https://github.com/anirudhSK/alfalfa
I see there's also a fork with some examples for how to use it added to the readme as well: https://github.com/HenkPoley/alfalfa

Not surprisingly, it doesn't compile at the moment. Need to update it to compile with newer OpenSSL versions at the very least.

make[3]: Entering directory '/home/lochnair/Downloads/alfalfa/src/crypto'
  CXX      base64.o
base64.cc: In function 'bool base64_decode(const char*, size_t, char*, size_t*)':
base64.cc:48:40: error: invalid conversion from 'const BIO_METHOD*' {aka 'const bio_method_st*'} to 'BIO_METHOD*' {aka 'bio_method_st*'} [-fpermissive]
   48 |   BIO_METHOD *b64_method = BIO_f_base64();
      |                            ~~~~~~~~~~~~^~
      |                                        |
      |                                        const BIO_METHOD* {aka const bio_method_st*}
base64.cc: In function 'void base64_encode(const char*, size_t, char*, size_t)':
base64.cc:101:40: error: invalid conversion from 'const BIO_METHOD*' {aka 'const bio_method_st*'} to 'BIO_METHOD*' {aka 'bio_method_st*'} [-fpermissive]
  101 |   BIO_METHOD *b64_method = BIO_f_base64(), *mem_method = BIO_s_mem();
      |                            ~~~~~~~~~~~~^~
      |                                        |
      |                                        const BIO_METHOD* {aka const bio_method_st*}
base64.cc:101:67: error: invalid conversion from 'const BIO_METHOD*' {aka 'const bio_method_st*'} to 'BIO_METHOD*' {aka 'bio_method_st*'} [-fpermissive]
  101 |   BIO_METHOD *b64_method = BIO_f_base64(), *mem_method = BIO_s_mem();
      |                                                          ~~~~~~~~~^~
      |                                                                   |
      |                                                                   const BIO_METHOD* {aka const bio_method_st*}

Edit: Didn't take much to get it to compile: https://github.com/Lochnair/alfalfa/commit/514ae63267b46610bb6fb8e451f6e7fdbf2c4196

I'm excited...

Ah, okay, clearly the script needs to learn to enable interfaces individually.

Yes, one-way delay measurements are under discussion. In the generic case this is hard because there are only few nodes out there that will give reasonably precise time measurements that can be used for OWD measurements, but in your case you can set one up in your internal network, so that should work better....
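
As a rough illustration of how per-direction delay could be judged from timestamps taken at both ends, here is a small Python sketch. It is only a sketch under assumptions: the sample layout, the function names and the 15 ms threshold are made up for illustration, and it assumes the clock offset between the two hosts stays constant over the window, so that it cancels out when each direction is compared against its own minimum.

```python
def owd_deltas(samples):
    """Per-direction one-way delay increase over the observed minimum.

    samples: list of (t_sent, t_remote_rx, t_remote_tx, t_received) in ms,
    where the two middle timestamps come from the remote host's clock.
    Raw differences include the unknown clock offset, so only the increase
    over each direction's own minimum is meaningful.
    """
    out = [rx - sent for sent, rx, _tx, _recv in samples]
    back = [recv - tx for _sent, _rx, tx, recv in samples]
    out_base, back_base = min(out), min(back)
    return [(o - out_base, b - back_base) for o, b in zip(out, back)]


def bloated_directions(samples, threshold_ms=15):
    """Flag which direction exceeds the (assumed) threshold in the latest sample."""
    out_delta, back_delta = owd_deltas(samples)[-1]
    return {"egress": out_delta > threshold_ms,
            "ingress": back_delta > threshold_ms}
```

Here "egress" and "ingress" are from the point of view of the host sending the probes. With a remote clock running 1000 ms ahead, the samples (0, 1010, 1011, 21) and (100, 1160, 1161, 171) give deltas of (0, 0) and (50, 0), so only the egress direction would be flagged.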

Most of the issues that you may experience with cable/DOCSIS (Virgin Media, UPC) are caused by the shared access infrastructure. If you're in an overloaded CMTS cluster, there's not much you can do. Of course you can have a small cable cluster that lets you always hit your full bandwidth no problem. Then you're lucky, but that won't happen often. And that's the nature of it: it was never designed to transmit individual users' internet data.

Neither was phone wire, but with xDSL (FTTC/H/B) you only have shared core network, which rarely is a problem. With dedicated twisted pair you are better off when it comes to stability and reliability of the internet access imho.

@Lynx ISPs do cache things like YouTube and Netflix and can limit bandwidth per session. I have noticed that Zen (UK ISP) limits these to about 20 Mbit/s, even at 4K.

Living in the Scottish Highlands I don't have much choice. It's just a 6 Mbit/s max copper ADSL line or LTE. Happily I live next to a Vodafone cell tower that doesn't seem to be terribly loaded and get healthy enough bandwidth such that with SQM I can get a decent amount that is low latency.

The sqm-autorate script seems to do a pretty good job for my connection in terms of recovering a lot of otherwise lost bandwidth for my file transfers whilst keeping latency low. I now have it running all the time from a service file and it seems to work fine on my RT3200.

Still room for a lot of improvement. And my hope is that ultimately everyone with variable connections wanting to use CAKE can benefit from an adaptive solution that just works; whether that is based on the shell script or something else, I don't mind. It's just that for long enough there has not been one go-to solution (only a few sketchy DIY hacks), which is why I started this thread.

About Zen: whatever happened to net neutrality? At least you can purchase a VPN for not very much.

But now given @Lochnair's posts above I am rather curious about using a VPS.

And @Lochnair I am extremely curious to see if you get sprout to work. Please keep us posted.

No idea. I think they either limit it to 20 Mbit/s to stop it overloading their CDN servers, or they figure that someone streaming Netflix doesn't need full speed since it just streams and buffers in the background. Not an issue with Akamai, as that runs my VDSL connection flat out.

Mobile/Cell data can be VERY hard to do! Peak times especially