CAKE w/ Adaptive Bandwidth [August 2022 to March 2024]

Lynx · December 5, 2022, 6:41pm

@moeller0 and @patrakov one more change before the version change - just to output delta_ewmas for plotting. That's coded up now too. Just need to verify it's working.

Does this look OK to you @patrakov and @moeller0? The idea is to facilitate tracking and plotting of the deltas per reflector using ewma and alpha=0.095.

moeller0 · December 5, 2022, 7:16pm

Just one observation, when adding/changing fields in the middle all log files need descriptive headers. If you want forward compatibility you need to add new fields always at the end. Personally I like the "always with header" approach better.
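
To illustrate the "always with header" point: a reader that looks columns up by name from the header line keeps working when a field is inserted in the middle. This is only a rough sketch. The field names, the ';' separator and the `get_field` helper are made up for illustration and are not the actual cake-autorate log format.

```shell
#!/bin/sh
# Hypothetical log format: one header line, then ';'-separated records.
# Field names are illustrative only.

# Print every value of the named field, locating its column via the header.
get_field() {
    field_name=$1
    awk -F';' -v name="$field_name" '
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == name) col = i
            if (!col) exit 1      # field not present in this log
            next
        }
        { print $col }
    '
}

old_log='TIMESTAMP;REFLECTOR;DELTA_US
1;1.1.1.1;500
2;9.9.9.9;700'

# Same data, with a new field inserted in the middle.
new_log='TIMESTAMP;REFLECTOR;DELTA_EWMA_US;DELTA_US
1;1.1.1.1;450;500
2;9.9.9.9;620;700'

# Both calls print the same DELTA_US values despite the changed column order.
printf '%s\n' "$old_log" | get_field DELTA_US
printf '%s\n' "$new_log" | get_field DELTA_US
```

A purely positional reader (say, "take column 3") would silently pick up the wrong field from the second log.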

Lynx · December 5, 2022, 7:25pm

Yes me too - I prefer to keep the order logical albeit at the expense of compatibility. Any chance you could code up plotting of the delta ewmas for version release? I don't actually know if the delta ewma code is correct or not. But I think it'd be great to see the delta ewmas?

patrakov · December 5, 2022, 7:33pm

There were changes in the # comments (not only ###) - you haven't picked them up. Is this intentional?

Lynx · December 5, 2022, 7:41pm

Ah, didn't see those. I will add them in.

Otherwise @moeller0 and @patrakov the delta ewma tracking works I think - see:

moeller0 · December 5, 2022, 8:35pm

Not a big deal, which branch produced that data? Adding to the timecourse plot should be quite easy...

Done, see PR, but since I did not have a logfile with the EWMA fields the PR is not fully tested. If you would share a logfile, I could do a quick sanity check...

Lynx · December 5, 2022, 10:03pm

Should the pull request not be on 'testing'? Since ultimately that's all about to get merged and assigned a new version.

BTW I wonder: is the only significance of the delta ewma in comparing reflectors? At the very least it looks cool in the graph below:

We see nice and clear spikes breaching our adjusted threshold.

This is with alpha set to 0.095 (20 point SMA). Not sure how good a default that is. Maybe we should rather have 0.048 (more like 40 point SMA).
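
For reference, the usual rule of thumb relating an EWMA to an N-point SMA is alpha = 2 / (N + 1), which gives about 0.095 for N = 20 and about 0.049 for N = 40. Below is a rough integer-only sketch of the update in shell. The function names and the scaling of alpha by 1e6 are assumptions for illustration, not necessarily how cake-autorate does it.

```shell
#!/bin/sh
# Integer EWMA sketch; names and the 1e6 scaling are illustrative assumptions.

# alpha (scaled by 1e6) for an N-point SMA equivalent: alpha = 2 / (N + 1)
alpha_for_span() {
    echo $(( 2000000 / ($1 + 1) ))
}

# new_ewma = alpha * sample + (1 - alpha) * old_ewma, in integer microseconds
# arguments: old_ewma_us sample_us alpha_e6
ewma_update() {
    echo $(( ($3 * $2 + (1000000 - $3) * $1) / 1000000 ))
}

alpha_20=$(alpha_for_span 20)   # 95238, i.e. about 0.095
alpha_40=$(alpha_for_span 40)   # 48780, i.e. about 0.049

# Feed a step from 0 to 10000 us and see how quickly each EWMA follows.
ewma_20=0
ewma_40=0
for i in 1 2 3 4 5; do
    ewma_20=$(ewma_update "$ewma_20" 10000 "$alpha_20")
    ewma_40=$(ewma_update "$ewma_40" 10000 "$alpha_40")
done
echo "after 5 samples: span20=${ewma_20}us span40=${ewma_40}us"
```

The smaller alpha reacts more slowly to the same step, so spikes in the plot would look lower and wider.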

moeller0 · December 5, 2022, 10:32pm

Good question, did we add plotting code unique for testing so far? This should be pretty self contained since it only touches (at least should only touch) the octave code. I tested with old logfiles where the results look okay....

I can have a look tomorrow at your logfile, so maybe wait with pulling this?

Lynx · December 5, 2022, 11:15pm

Thanks @moeller0. Maybe you could pull on testing since the revised code is all there and that will all get merged to main.

Lynx · December 5, 2022, 11:25pm

Hey everyone, I wrote a very simple new utility: 'cake-qos-simple' to:

a) classify DSCPs on upload;
b) conntrack restore DSCPs on download; and
c) set up instances of cake on upload (wan egress) and download (wan ingress).

So this way DSCPs can be set by LAN clients and/or by router on upload and then automatically applied on download for each tracked connection, and the cake instances will see the DSCPs thereby leveraging the diffserv functionality in cake.

cake-qos-simple includes:

an init.d service file: 'cake-qos-simple'; and
an nftables script 'cake-qos-simple.nft'.

It is very simple and lightweight in that it only requires the following additional packages in 22.03.2:

tc-tiny
kmod-ifb
kmod-sched
kmod-sched-core
kmod-sched-cake
kmod-sched-ctinfo

It is available here:

And of course can be used in conjunction with cake-autorate.

@rb1 this will provide the DSCP capability you asked about above.

patrakov · December 6, 2022, 2:58am

Shouldn't this be split in its own topic?

moeller0 · December 6, 2022, 6:38am

Did that, but I think that git allows you to do essentially the same from your end, pull into a new or existing branch. Or pull into main and then pull main from testing. However since plotting changes are self contained and orthogonal to your changes all of these options work equally well.

moeller0 · December 6, 2022, 6:40am

Question, does procd automatically add hotplug capability? If no, maybe add a hotplug script as well, otherwise stuff like daily renegotiated PPPoE links will need to be restarted manually every day....

Lynx · December 6, 2022, 7:14am

That's been provided too:

github.com

lynxthecat/cake-qos-simple/blob/master/11-cake-qos-simple

#!/bin/sh

[ -n "$DEVICE" ] || exit 0

[ "$ACTION" = ifup ] && /etc/init.d/cake-qos-simple enabled && /etc/init.d/cake-qos-simple start

[ "$ACTION" = ifdown ] && /etc/init.d/cake-qos-simple enabled && /etc/init.d/cake-qos-simple stop

Great thanks - I have merged the pull request now. Did it work with my log file?

So I just need to further add @patrakov's comments that I missed.

And otherwise perhaps this is cake-autorate 1.2.

So many changes. Wasn't our plotting capability added between 1.1 and 1.2? And adjustment to the delta calculation. And multiple instance handling. And ping prefix string. And delta ewma tracking and plotting. And reflector randomisation.

Three questions:

a) @patrakov doesn't like return to base functionality and presumably has to perpetually rewrite that code part. So why don't we just make that a toggle. Would this be useful for you @patrakov?
b) Shall we make reflector randomisation togglable as well?
c) We initialise baselines to 1s and delta ewmas to 1.5s. Is that OK?

moeller0 · December 6, 2022, 7:40am

Great!

Don't know as I did not test that yet.... I just saw your request to change the pull destination branch and implemented that (unlike testing I can play with pull requests on my phone).

Sure, I think a new version has been overdue for some time....

Puzzled, all he needs to do is to set it to the same value as either the Minimum or the Maximum... we could make a special case: if set to 0, just leave the rate as is, that is, only increase or decrease if the specific condition triggers. Personally I would not use such a mode but it might work for @patrakov? (However that is very similar to simply setting a loooong interval for the baseline steps)

And resort to round-robin replacement instead? I think conceptually random replacement is better so would not make this a toggle, users can already reduce the full set to the number of concurrent delay probes to opt-out of replacement, at which point random order is essentially irrelevant, no?
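
A rough sketch of what random replacement amounts to: when a reflector goes bad, pick its replacement at random from the spares instead of taking the next one in order. The `pick_random` helper and the spare addresses are hypothetical and only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print one randomly chosen entry from the arguments.
pick_random() {
    [ "$#" -gt 0 ] || return 1                        # no spares left
    rand=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
    shift $(( rand % $# ))                            # skip a random number of entries
    printf '%s\n' "$1"
}

# Illustrative spare set; with no spares the function fails and nothing is replaced.
replacement=$(pick_random 9.9.9.9 8.8.8.8 1.0.0.1)
echo "replacing bad reflector with: $replacement"
```

If the full set is no larger than the number of concurrent pingers there are no spares, so the order of the set no longer matters.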

Lynx · December 6, 2022, 7:58am

Just an easy toggle enabled by default. I quite liked knowing that the top four would be a certain set and that the spillover would be another one.

Yes I've never entirely understood why he doesn't like it. But yes I think he just wants the previous rate to be maintained on cessation of load, presumably on the basis that that was the last known good rate. Now I don't think that's ever a good idea for links with a big variable rate span because that would mean coasting at say 70 Mbit/s on mine, which would be bad on step load later down the line. In my case dropping back to 10 Mbit/s is much safer - the boat returns to the safe harbour.

moeller0 · December 6, 2022, 8:05am

I think all he needs to do is either fix the alpha for baseline adjustments to either 1 or zero (not sure which without looking at the code) or set the base adjustment interval to something like 1 day... my point is the existing toggles seem sufficiently expressive to allow his desired configuration already, so no need for an additional toggle.

moeller0 · December 6, 2022, 8:13am

For this I liked the idea of having two sets: an optional fixed order set and a second set for random replacement. But I am also fine with always just starting with the first N in the set and simply picking replacements randomly...

moeller0 · December 6, 2022, 8:28am

So here is the plot, the darkest brown/teal are the EWMAs. As you can see over all reflectors there is some inter-reflector variability that hides what we expect to see. I guess it is maybe time to also plot per reflector timecourses (but with our larger default set of reflectors that is going to be a lot of plots or one really large plot). ATM it seems hard to tell whether the plot is correct or not...

Here just for 1.1.1.1:

This pretty much looks like it is doing the right thing...

patrakov · December 6, 2022, 9:00am

Regarding disabling the return-to-base functionality, it can already be disabled by setting shaper_rate_adjust_down_load_low=1.0 and shaper_rate_adjust_up_load_low=1.0. What I have to patch out is the return-to-minimum functionality after sleep:

--- a/cake-autorate.sh
+++ b/cake-autorate.sh
@@ -1105,19 +1105,19 @@ do
 
 			if (( $t_start_us > ($t_connection_stall_time_us + $global_ping_response_timeout_us - $stall_detection_timeout_us) )); then 
 		
-				(($debug)) && log_msg "DEBUG" "Warning: Global ping response timeout. Enforcing minimum shaper rate and waiting for minimum load." 
+				(($debug)) && log_msg "DEBUG" "Warning: Global ping response timeout. Stopping pingers and waiting for minimum load." 
 				break
 			fi
 	        done	
 
 	else
-		(($debug)) && log_msg "DEBUG" "Connection idle. Enforcing minimum shaper rates and waiting for minimum load."
+		(($debug)) && log_msg "DEBUG" "Connection idle. Waiting for minimum load."
 	fi
 	
-	# conservatively set hard minimums and wait until there is a load increase again
-	dl_shaper_rate_kbps=$min_dl_shaper_rate_kbps
-	ul_shaper_rate_kbps=$min_ul_shaper_rate_kbps
-	set_shaper_rates
+	## conservatively set hard minimums and wait until there is a load increase again
+	#dl_shaper_rate_kbps=$min_dl_shaper_rate_kbps
+	#ul_shaper_rate_kbps=$min_ul_shaper_rate_kbps
+	#set_shaper_rates
 
 	# Initiate termination of ping processes and wait until complete
 	kill $maintain_pingers_pid 2> /dev/null
@@ -1135,8 +1135,8 @@ do
 		t_start_us=${EPOCHREALTIME/./}	
 		get_loads
 
-		if (($dl_load_percent>$medium_load_thr_percent || $ul_load_percent>$medium_load_thr_percent)); then
-			(($debug)) && log_msg "DEBUG" "dl load percent: $dl_load_percent or ul load percent: $ul_load_percent exceeded medium load threshold percent: ${medium_load_thr_percent}. Resuming normal operation."
+		if (($dl_achieved_rate_kbps>$connection_active_thr_kbps || $ul_achieved_rate_kbps>$connection_active_thr_kbps)); then
+			(($debug)) && log_msg "DEBUG" "dl achieved rate: $dl_achieved_rate_kbps or ul achieved rate: $ul_achieved_rate_kbps exceeded idle threshold. Resuming normal operation."
 			break 
 		fi
 		sleep_remaining_tick_time $t_start_us $reflector_ping_interval_us

The second change is necessary because the load percent no longer makes sense - it was meant to be the percentage of the minimum rate, and now there is no invariant that the connection is shaped at the minimum rate.
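
A small worked example of that point, assuming load percent is simply achieved rate divided by the current shaper rate (my assumption here, as are all the numbers): the same traffic crosses a percentage threshold when shaped at the minimum, but not when the shaper is left at a high rate, whereas an absolute threshold behaves the same in both cases.

```shell
#!/bin/sh
# Assumed definition: load as a percentage of the current shaper rate.
# arguments: achieved_rate_kbps shaper_rate_kbps
load_percent() {
    echo $(( 100 * $1 / $2 ))
}

medium_load_thr_percent=75       # illustrative value
connection_active_thr_kbps=500   # illustrative value
achieved_rate_kbps=8000

# Shaper at an assumed minimum of 10000 kbps: 80% load, above the threshold.
echo "load at minimum rate: $(load_percent "$achieved_rate_kbps" 10000)%"

# Shaper left at an assumed last rate of 70000 kbps: 11% load, below the
# threshold, although the traffic is identical.
echo "load at last rate: $(load_percent "$achieved_rate_kbps" 70000)%"

# The absolute test does not depend on where the shaper was left.
if [ "$achieved_rate_kbps" -gt "$connection_active_thr_kbps" ]; then
    echo "connection active"
fi
```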

Of course I agree that this (except maybe the second change) should be made conditional.
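
Something along these lines is what making it conditional could look like. This is only a sketch: `enforce_min_rate_on_sleep` and `prepare_for_sleep` are hypothetical names, not existing cake-autorate options or functions, the rates are example values, and `set_shaper_rates` is replaced by a stand-in.

```shell
#!/bin/sh
# Hypothetical toggle; not an existing cake-autorate option.
enforce_min_rate_on_sleep=1

# Example values only.
min_dl_shaper_rate_kbps=10000
min_ul_shaper_rate_kbps=2000
dl_shaper_rate_kbps=70000
ul_shaper_rate_kbps=15000

# Stand-in for the real set_shaper_rates function.
set_shaper_rates() {
    echo "shaper rates: dl=${dl_shaper_rate_kbps} ul=${ul_shaper_rate_kbps}"
}

# Hypothetical wrapper around the block that the patch comments out.
prepare_for_sleep() {
    if [ "$enforce_min_rate_on_sleep" -eq 1 ]; then
        # conservatively set hard minimums and wait until there is a load increase again
        dl_shaper_rate_kbps=$min_dl_shaper_rate_kbps
        ul_shaper_rate_kbps=$min_ul_shaper_rate_kbps
        set_shaper_rates
    fi
    # with the toggle off, the last shaper rates are left untouched
}

prepare_for_sleep
```

With the toggle enabled by default the current behaviour is preserved, and setting it to 0 gives the patched behaviour without having to carry a local diff.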

P.S. It would also help if you set the executable bit on all scripts in the git repository. Then I would be able to just run the code from a git checkout, even with local modifications, without getting a "mode conflict" from git pull.