CAKE w/ Adaptive Bandwidth [October 2021 to September 2022]

moeller0 · November 8, 2021, 8:58pm

VoIP typically employs some de-jitter buffers at the receiver and will tolerate quite some jitter... the buffers, however, cause additional latency, and the longer the latency between mouth (on one side) and ear (on the other side), the less fun conversation is going to be. According to the ITU, people tolerate one-way mouth-to-ear (m2e) delays up to 150ms reasonably well (the smaller the more natural it feels), but somewhere higher (IIRC ~500ms?) people become irritated and start talking over each other (at that latency one needs HAM in-band signaling code words, like "over", but people are not used to talking like this)...

Lynx · November 8, 2021, 9:01pm

Thanks for that. I found that without CAKE on my LTE connection, any time a download started I would literally see freezes on my Teams/Zoom calls (RTT increase of 100ms or more). So to avoid that I set CAKE to the worst-case scenario. If the script were to default to max, every new load starting from idle would interrupt calls, e.g. Netflix/Prime/phone update - dire. Rather, the safe thing to do is to ramp up cautiously as the demand for load appears and increases, since with the script it seems possible to ramp up with only small RTT increases, e.g. <=10ms.

It's like the turbo function of a CPU. You don't set turbo on unless it is needed!

dlakelan · November 9, 2021, 12:48am

Just for historical reasons, it's useful to know that the original person who wanted the erlang script was on a slow connection and preferred to get as much bandwidth as possible while tolerating a period of bufferbloat, which was why it had that design.

On the other hand, the original script decayed the bandwidth exponentially, and I believe this script does so linearly? So convergence should be faster with exponential decay. This is something you could test out with this script. Rather than a linear ramp here, do something exponential

            next_rate=$( echo "scale=10; $cur_rate - $cur_rate_adjust_RTT_spike * ($cur_max_rate - $cur_min_rate)" | bc )

such as:

downscale=90
next_rate=$(echo "scale=10; $cur_rate * $downscale / 100" | bc)

Though you might want to round off to the nearest say 10kbps or something.

moeller0 · November 9, 2021, 7:09am

So the rationale for linear decay is that it is faster.... for the same initial scaling factor....

Say we use 80%:
linear: 100, 80, 60, 40, 20, 0
exponential: 100, 80, 64, 51, 41, 33, ...
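
The two sequences above can be reproduced in plain shell arithmetic. This is only a sketch: the function names are made up for illustration, and the exponential variant rounds to the nearest integer at each step so that it matches the numbers quoted above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the autorate script.

# linear_decay START STEP N: subtract a fixed STEP, N times (floored at 0).
linear_decay() {
    rate=$1; out=$1; i=0
    while [ "$i" -lt "$3" ]; do
        rate=$((rate - $2))
        if [ "$rate" -lt 0 ]; then rate=0; fi
        out="$out $rate"
        i=$((i + 1))
    done
    echo "$out"
}

# exponential_decay START PERCENT N: multiply by PERCENT/100, N times,
# rounding to the nearest integer at each step.
exponential_decay() {
    rate=$1; out=$1; i=0
    while [ "$i" -lt "$3" ]; do
        rate=$(( (rate * $2 + 50) / 100 ))
        out="$out $rate"
        i=$((i + 1))
    done
    echo "$out"
}

linear_decay 100 20 5        # prints: 100 80 60 40 20 0
exponential_decay 100 80 5   # prints: 100 80 64 51 41 33
```

With the same first step, the linear version reaches zero after five steps while the exponential one is still at about a third of the starting rate.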

I guess the question is how fast does the true bottleneck rate fluctuate and in what increments, that is what time constant for the decay is acceptable....

I am not saying the exponential decay might not be okay as well, but I do not see that it will be clearly superior.... and we actually can do integer subtractions in shell (I want to reduce the calls to bc somewhat; the fewer binaries the script needs to call, the less load it will introduce and the shorter the minimal cycle time can be).

dlakelan · November 9, 2021, 9:50am

I guess that's one way to look at it. The other way: suppose you have a max speed of 70Mbps and a min of 10, and your linear ramp has, say, 20 steps of 3Mbps each. Suppose you're at 70 and need to drop below 30 to stop bufferbloat. With a factor of 0.91 per step (which also takes about 20 steps to get to 10Mbps) you will get there in 9 steps, but the 3Mbps linear ramp needs 14 steps.
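
The step counts in this example can be checked with a short shell sketch. The helper names are hypothetical, and rates are expressed in kbps so that integer truncation does not distort the result.

```shell
#!/bin/sh
# Hypothetical helpers; rates are in kbps so integer math stays precise.
# Both loops are capped at 1000 iterations as a guard against bad input.

# steps_linear START STEP TARGET: count fixed-size reductions until rate < TARGET.
steps_linear() {
    rate=$1; n=0
    while [ "$rate" -ge "$3" ] && [ "$n" -lt 1000 ]; do
        rate=$((rate - $2))
        n=$((n + 1))
    done
    echo "$n"
}

# steps_exponential START PERCENT TARGET: count multiplications by PERCENT/100
# until rate < TARGET.
steps_exponential() {
    rate=$1; n=0
    while [ "$rate" -ge "$3" ] && [ "$n" -lt 1000 ]; do
        rate=$((rate * $2 / 100))
        n=$((n + 1))
    done
    echo "$n"
}

steps_linear 70000 3000 30000      # prints: 14
steps_exponential 70000 91 30000   # prints: 9
```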

moeller0 · November 9, 2021, 10:11am

Yes, IMHO we should scale down aggressively, so in at most 8-10 steps, maybe even just 5... not sure what the current number of steps is, but I guess more than I like...

But my main goal is to get rid of as much bc as possible, at least in the "fast path". Which with fixed reduction steps is a piece of cake (pun intended)
And I would love if sh had real arrays*.... (not a big fan of posix sh, its restrictions are probably justified, but not being a CS person I have a hard time working around the limitations)

*) A number of your proposals, like keeping ~1 day memory of average rates per quarter hour, would be easy with arrays. But for the use case "run on any OpenWrt router" we need to make do with the small things... and say what you will, posix shell is ubiquitous...

dlakelan · November 9, 2021, 11:35am

Agreed that shell is both a least common denominator and a pain in the ass.

I do think most of the math can actually be done in shell using $((A + B/C)) type notation. It will be integer arithmetic so to do things like multiply by 0.91 you approximate by $((A*91/100))
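
As a minimal sketch of that idea (helper names are invented here, and the 10 kbps rounding is the one suggested earlier in the thread), the 0.91 factor and the rounding can both be done without bc:

```shell
#!/bin/sh
# Hypothetical helpers using only POSIX shell integer arithmetic.

# scale_percent RATE PERCENT: integer approximation of RATE * PERCENT / 100.
# Multiplying before dividing keeps precision; the result is truncated.
scale_percent() {
    echo $(( $1 * $2 / 100 ))
}

# round_to RATE QUANTUM: round RATE to the nearest multiple of QUANTUM.
round_to() {
    echo $(( ($1 + $2 / 2) / $2 * $2 ))
}

cur_rate=63707                                    # kbps, example value
next_rate=$(round_to "$(scale_percent "$cur_rate" 91)" 10)
echo "$next_rate"                                 # prints: 57970
```

Because the multiplication happens first, the intermediate product can overflow on a shell that only offers 32-bit arithmetic if the rates get very large; for rates in kbps and percent-sized factors this should not be a practical concern.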

dlakelan · November 9, 2021, 11:48am

Hey I just noticed that the chicken scheme interpreter is available and about 1.2MB so that's a fairly nice alternative...

But yeah, it's hard to get away from the shell.

moeller0 · November 9, 2021, 11:58am

Yes, switching from ratios to percent or per mille, and for the RTT maybe from milliseconds to microseconds, might make the truncation/rounding errors acceptable... (for rates I simply hope that nobody with single/double digit kbps links tries this script)
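
One way the millisecond-to-microsecond switch could look in practice is sketched below. It assumes RTT values arrive as decimal millisecond strings such as "12.345" (as ping tools commonly print them); the helper name is hypothetical.

```shell
#!/bin/sh
# ms_to_us MS: convert a decimal millisecond string (e.g. "12.345") to
# integer microseconds without calling bc. Hypothetical helper.
ms_to_us() {
    case $1 in
        *.*) int=${1%%.*}; frac=${1#*.} ;;
        *)   int=$1; frac="" ;;
    esac
    # Pad or truncate the fraction to exactly three digits.
    frac=$(printf '%.3s' "${frac}000")
    # Prefix a 1 so leading zeros are not parsed as octal, then subtract it again.
    echo $(( ${int:-0} * 1000 + 1$frac - 1000 ))
}

ms_to_us 12.345   # prints: 12345
ms_to_us 0.089    # prints: 89
ms_to_us 7        # prints: 7000
```

Once everything is an integer number of microseconds, comparisons and differences can stay inside `$(( ))` and `[ ... -gt ... ]` with no loss below the microsecond.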

moeller0 · November 9, 2021, 12:00pm

I guess on truly small flash routers 1.2 MB is already prohibitive, and on routers without flash scarcity something else might be more appropriate...

dlakelan · November 9, 2021, 12:19pm

Since Lua is still installed on most routers, it is probably the best choice for something that isn't shell. But this shell script is doing pretty well. Keeping a history might be doable in a file, using awk or something to rummage through it... I don't even know, is there a busybox awk? Or is that a separate install?

moeller0 · November 9, 2021, 12:34pm

OpenWrt defaults to busybox awk as far as I can see, and ISTR that awk is used by some system scripts as well, so should be pretty reliably available.
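
For what it is worth, the file-plus-awk history idea could look roughly like this. The log format (one "unix_time rate_kbps" pair per line) and the function name are assumptions made up for this sketch; nothing like it exists in the script yet.

```shell
#!/bin/sh
# quarter_hour_averages FILE: print "<slot> <average_rate>" for every quarter
# hour of the day (slot 0..95, UTC) that has at least one sample in FILE.
# Assumed log format: "<unix_time> <rate_kbps>" per line.
quarter_hour_averages() {
    awk '{
        slot = int(($1 % 86400) / 900)   # seconds into the day / 900 s
        sum[slot] += $2
        n[slot]++
    }
    END {
        for (s = 0; s < 96; s++)
            if (s in n) printf "%d %d\n", s, sum[s] / n[s]
    }' "$1"
}

log=$(mktemp)
printf '%s\n' "0 1000" "100 3000" "900 5000" > "$log"
quarter_hour_averages "$log"
rm -f "$log"
```

The script would append one line per cycle and only run the awk pass when it needs the averages, so the fast path stays free of extra process calls.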

That said, implementing history is not really urgent (and I am talking just about what I want to play with, @Lynx will have his own ideas for the official version of this script).

yelreve · November 9, 2021, 8:16pm

@Lynx, @moeller0 I've heavily refactored the methods used in the lua service script, first to incorporate the ping baseline comparisons discussed above. I've also made some substantial changes to how the steady state metrics are used to determine the probability of reducing egress/ingress (this differs from your step-based approach). I've found this works better than the steps approach under loaded conditions at maximising bandwidth in each direction whilst keeping on top of bufferbloat-induced ping spikes.

The script monitors the output of a single oping process rather than instigating a new process for each reading; as a result, it takes the minimum ping of all hosts pinged during each 0.5s interval. This seems to work well enough and avoids ping bursts that might get flagged by your reflector.

@Lynx @moeller0 I've added in the ability to log the output both to file and user console in the same format as your shell script, it has 4 additional columns (rx rate, tx rate, rx baseline comparison, rx decrease chance, tx baseline comparison, tx decrease chance).

Below is a graph plotting logs taken during a DSL Reports SpeedTest after a cold start of the service - note that this is worst case conditions where the SQM rates for ingress and egress have been configured much higher than stable achievable bandwidth.
The cold start test scored a B, with an A achievable under loaded conditions once the service has calibrated its steady state metrics.

With all that said, I'm keen to prototype a direct lua implementation of @Lynx's shell script so we can compare performance of the two under the same language.
If that prototype proves better then it would be easy to utilise the existing config handling code, hotplug and init scripts for starting and stopping the service for multiple wan interfaces.

I would really appreciate it if you can both test and compare to this under your conditions alongside the current shell implementation.

Note that you don't need to configure min/max bandwidth or other parameters aside from an ingressDevice if it isn't the default ifb4 for the interface.
Optionally you can tweak the starting rtt, ingressTarget and egressTarget ratios (please start with the defaults for the last two!).
The rest is self determined by the service from steady state variables.

The rtt config setting is used to set the initial baseline which helps calibrate the service upon start-up (it will self adjust using the same EWMA method found in the shell script).
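
To illustrate the kind of update meant here, below is a generic integer EWMA in shell. It is a sketch only: the smoothing factor, the microsecond units and the function name are assumptions, and neither script necessarily uses exactly this formula.

```shell
#!/bin/sh
# ewma_update BASELINE SAMPLE ALPHA_PERMILLE
# new = (ALPHA * SAMPLE + (1000 - ALPHA) * BASELINE) / 1000, in integer math.
# ALPHA_PERMILLE is the weight of the new sample in per mille (100 = 0.1).
ewma_update() {
    echo $(( ($3 * $2 + (1000 - $3) * $1) / 1000 ))
}

baseline_us=50000      # e.g. seeded from a configured starting rtt of 50 ms
baseline_us=$(ewma_update "$baseline_us" 60000 100)
echo "$baseline_us"    # prints: 51000
```

A small weight makes the baseline follow slow path changes while largely ignoring short spikes caused by load.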

Log just adjustments to the user prompt:

lua /usr/sbin/wanmonitor.lua -i wwan -c

Log all intervals to the user prompt:

lua /usr/sbin/wanmonitor.lua -i wwan -c -v

Log all intervals to the user prompt and log file:

lua /usr/sbin/wanmonitor.lua -i wwan -c -v -l /tmp/wanmonitor.log

Make sure you have a section for the interface in your wanmonitor config file:

config interface 'wwan'
	option enabled '1'
	option ingressDevice 'br-lan'
	option reconnect '1'
	option autorate '1'
	option rtt '50'
	option egressTarget '0.8'
	option ingressTarget '0.7'

jeverley/wanmonitor: A lua service providing WAN interface monitoring and SQM autoscaling for OpenWrt routers. (github.com)

dlakelan · November 9, 2021, 9:06pm

This looks outstanding!

Lynx · November 9, 2021, 9:33pm

Terrific work. I hope to do so soon, and also to think further about your comments and input above. I am sorry that I have not done so properly until now. I've been swamped with my patent work and am longing for some more free time.

I would encourage readers to try @yelreve's script. He has clearly put time and effort into this, and the more testing we can get the better.

anon89577378 · November 10, 2021, 12:55am

A question...

Is this script used in conjunction with a queue setup script (piece of cake, etc.) or as a replacement?

dlakelan · November 10, 2021, 12:57am

In conjunction. The SQM setup will create the queues etc. This will simply modify the bandwidth settings of those queues.
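
Concretely, changing the rate of an existing CAKE qdisc comes down to a `tc qdisc change` call. The sketch below only builds the command string so it can be inspected safely; the interface name `wan` and the helper name are placeholders.

```shell
#!/bin/sh
# cake_rate_cmd DEV RATE_KBPS: build (but do not run) the tc command that
# changes the shaper rate of an existing CAKE root qdisc on DEV.
cake_rate_cmd() {
    echo "tc qdisc change root dev $1 cake bandwidth ${2}Kbit"
}

cake_rate_cmd wan 20000
# prints: tc qdisc change root dev wan cake bandwidth 20000Kbit
#
# To apply it on a router (needs root and a CAKE qdisc already set up by SQM):
#   $(cake_rate_cmd wan 20000)
```

For the download direction, SQM normally attaches the qdisc to an IFB device rather than to the WAN interface itself, so the device name would differ there.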

anon89577378 · November 10, 2021, 1:03am

Is there a recommended queue?

Any congestion control change needed?

Currently, the Archer C7 v2 is using Cubic, but Reno is available.

dlakelan · November 10, 2021, 1:06am

I think it has to be CAKE.

It's a misunderstanding to think that the router's congestion control settings affect forwarded traffic. They only affect traffic to/from the router itself, e.g. when it's downloading a package or similar.

anon89577378 · November 10, 2021, 1:09am

Right.

What I was asking is what queue setup script?

piece_of_cake.qos
simple.qos
simplest_tbf.qos
layer_cake.qos
simplest.qos