CAKE w/ Adaptive Bandwidth [August 2022 to March 2024]

moeller0 · December 6, 2022, 9:24am

patrakov:

+		if (($dl_achieved_rate_kbps>$connection_active_thr_kbps || $ul_achieved_rate_kbps>$connection_active_thr_kbps)); then
+			(($debug)) && log_msg "DEBUG" "dl achieved rate: $dl_achieved_rate_kbps or ul achieved rate: $ul_achieved_rate_kbps exceeded idle threshold. Resuming normal operation."

I think I like this change. We drop into sleep when going below an absolute threshold, so we might use the same logic for getting out of sleep as well. We can haggle over whether to use $connection_active_thr_kbps as is or, say, 1.5 * $connection_active_thr_kbps, but the general idea seems cleaner...
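The wake-up check with an optional scaling factor could be sketched roughly as below. This is only an illustration, not the patch itself: the function name should_resume and its percentage argument are hypothetical, and only the comparison against $connection_active_thr_kbps comes from the quoted diff.

```shell
# Hypothetical helper, not part of cake-autorate: succeeds (exit status 0) when
# either achieved rate exceeds the wake-up threshold.
# $1 = dl achieved rate (kbps), $2 = ul achieved rate (kbps)
# $3 = connection_active_thr_kbps
# $4 = assumed scaling factor in percent (100 = threshold as is, 150 = 1.5x)
should_resume()
{
    wake_thr_kbps=$(( $3 * ${4:-100} / 100 ))
    [ "$1" -gt "$wake_thr_kbps" ] || [ "$2" -gt "$wake_thr_kbps" ]
}
```

Called as `should_resume "$dl_achieved_rate_kbps" "$ul_achieved_rate_kbps" "$connection_active_thr_kbps" 150`, it would keep the script asleep until one direction exceeds 1.5 times the idle threshold.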

That certainly is a policy decision. I see three options:

1) current pessimistic but safe default: start from minimum
2) consistent start from base-rate after
3) proposed optimistic start from last shaper rate

Personally I would only ever set 1) or 2) but such policy decisions are subjective and I can see why someone would opt for "last shaper rate"...
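If the choice between the three options were exposed as a setting, the selection could look something like the sketch below. The policy labels (minimum, base, last) and the function name are made up for illustration and do not correspond to existing cake-autorate config variables.

```shell
# Hypothetical sketch: pick the shaper rate to start from after waking up.
# $1 = policy label (assumed names: minimum | base | last)
# $2 = minimum rate, $3 = base rate, $4 = last shaper rate (all in kbps)
select_wake_rate_kbps()
{
    case "$1" in
        minimum) echo "$2" ;;   # option 1: pessimistic but safe
        base)    echo "$3" ;;   # option 2: start from base rate
        last)    echo "$4" ;;   # option 3: optimistic, resume where we left off
        *)       echo "unknown policy: $1" >&2; return 1 ;;
    esac
}
```

For example, `select_wake_rate_kbps base 5000 20000 35000` prints 20000.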

Never noticed the mode conflicts, but I do not pull from my router and so far kept the modes in the repository as pulled. sqm-scripts went a different route and offers a makefile that allows a make install which takes care of the final path and mode bits.

Lynx · December 6, 2022, 9:39am

@patrakov so you'd like a toggle that disables the return-to-base logic, and also to switch to checking against the idle threshold for resuming?

And how do I set x bit for files on GitHub?

And there's your comments I need to introduce too.

Anything else you'd like in?

@moeller0 any way we can make the graphs bigger like the bottom right graph below so that more space is utilised:

And choice over time span would be super nice too. It would be good if somehow we had a plot from which we could compare and rate reflectors. Any idea how we can do that?

Out of curiosity @patrakov how are you managing diffserv? I'd have thought for your LTE connection it might be relevant at low bandwidth.

moeller0 · December 6, 2022, 9:58am

We need to think over the timecourse plots some more. I just added a PR that allows specifying a subset of reflectors to plot.

That already exists:

	% octave -qf --eval 'fn_parse_autorate_log("./SCRATCH/cake-autorate.log.20221001_1724_RRUL_fast.com.log", "./output.tif", [10, 500], {"1.1.1.1"})'
	% symbolically: octave -qf --eval 'fn_parse_autorate_log("path/to/the/log.file", "path/to/the/output/plot.format", [starttime endtime], {selected_reflector_subset})'
	%	supported formats for the optional second argument: pdf, png, tif.
	% 	the optional third argument is the range to plot in seconds after log file start

This example plots seconds 10 to 500 from ./SCRATCH/cake-autorate.log.20221001_1724_RRUL_fast.com.log, but only delay data for reflector id 1.1.1.1.

No, that brings us back to the idea of pre-qualifying reflectors with an external script

moeller0 · December 6, 2022, 10:06am

That is, I currently look at the CDF plots: if a reflector has a large raw deviation I remove it from the set, and when the sigmoid curve for the low-load condition looks very asymmetric (with a long and relatively heavy tail to the right) I consider throwing it out as well. Not sure how to evaluate reflectors from looking at the timecourse plots though.
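A rough numerical stand-in for eyeballing the right tail of the low-load CDF might be a percentile ratio, sketched below. This heuristic is an assumption for illustration only (it is not what the Octave plotting code computes), and the function name and the choice of the 5th/50th/95th percentiles are arbitrary.

```shell
# Hypothetical heuristic: ratio of the upper spread to the lower spread.
# stdin : one idle RTT sample (ms) per line, for a single reflector
# stdout: (p95 - p50) / (p50 - p5); values well above 1 hint at a heavy right tail
tail_asymmetry()
{
    LC_ALL=C sort -n | LC_ALL=C awk '
        # nearest-rank style lookup of the pct-th percentile in the sorted samples
        function pick(pct) { return v[int((pct * (NR - 1) + 50) / 100) + 1] }
        { v[NR] = $1 + 0 }
        END {
            if (NR < 3) { print "nan"; exit }
            lo = pick(5); mid = pick(50); hi = pick(95)
            if (mid == lo) { print "inf"; exit }
            printf "%.2f\n", (hi - mid) / (mid - lo)
        }
    '
}
```

Any cut-off applied to this ratio would itself be a judgment call, much like the visual inspection it approximates.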

Lynx · December 6, 2022, 10:09am

Ah great - no matter from timecourse if it can be done with CDF. Could you elaborate on your explanation above with a picture or two? We could perhaps add something in the readme about this.

patrakov · December 6, 2022, 10:23am

I don't. Maybe I should.

Lynx · December 6, 2022, 10:30am

From my experience it only becomes an issue at bandwidth well below 10 Mbit/s, so e.g. at 5 Mbit/s, and where my work computer say is on a Zoom call and I download a PDF or upload a powerpoint presentation to OneDrive (and probably with a Netflix stream to a TV in the background). I saw the odd stutter in this situation and since using diffserv even at this very low bandwidth I no longer see the odd stutter.

Simple recipes: cake-qos-simple and cake-dual-ifb.

VPN use makes the situation far more complicated, and if you use cake-dual-ifb one would need to ascertain no conflict with mwan fwmarks. But basically nftables can be used to set the DSCPs and store DSCPs to conntrack on upload and you use tc to restore on download using e.g.:

# For each wan ingress packet conditionally restore DSCP from conntrack if available and mirror to ifb-wan
tc filter add dev wan parent ffff: protocol ip matchall action ctinfo dscp 63 128 action mirred egress redirect dev ifb-wan

Lynx · December 6, 2022, 11:05am

@moeller0 / @patrakov how should we adjust the default reflector set here:

reflectors=(
"1.1.1.1" "1.0.0.1"  # Cloudflare
"8.8.8.8" "8.8.4.4"  # Google
"9.9.9.9" "9.9.9.10" "9.9.9.11" # Quad9
"94.140.14.15" "94.140.14.140" "94.140.14.141" "94.140.15.15" "94.140.15.16" # AdGuard
"64.6.65.6" "156.154.70.1" "156.154.70.2" "156.154.70.3" "156.154.70.4" "156.154.70.5" "156.154.71.1" "156.154.71.2" "156.154.71.3" "156.154.71.4" "156.154.71.5" # Neustar
"208.67.220.2" "208.67.220.123" "208.67.220.220" "208.67.222.2" "208.67.222.123" # OpenDNS
"185.228.168.9" "185.228.168.10" "185.228.169.11" "185.228.169.9" "185.228.169.168" # CleanBrowsing

I have the feeling some of these do not belong in the default list and should be cut out.

moeller0 · December 6, 2022, 11:20am

I bet that some are not anycasted or only have anycast locations that are not well-distributed globally, but that implies that the acceptable subset depends very much on the location of the autorate instance we want to select reflectors for....

As I mentioned before I ran the following script

bash-3.2$ cat ./test_reflector_set.sh 
#! /bin/bash

# source the config file so we get access to the defined reflectors
. ./cake-autorate_config.sh
no_reflectors=${#reflectors[@]} 


# how diligently do we want to probe
n_delay_samples=10

# where to store the output
out_file="./reflector_test.txt"
echo "Tested reflector candidates" > ${out_file}

for (( reflector=0; reflector<${no_reflectors}; reflector++ ))
    do
	cur_reflector=${reflectors[$reflector]}
	echo "Current reflector: ${cur_reflector}"
	ping -c ${n_delay_samples} -q ${cur_reflector} >> ${out_file}
    done

exit 0

which just collected the response of our default set to 10 samples. I then looked at loss (no problem at all) and average delay. Since my best RTTs are around 10 ms, and western Europe is not that large and is densely networked, I excluded all reflectors with an average RTT > 30 ms (or minRTT + 20 ms), simply because 20 ms allows roughly for an RTT over 2000 km (that is, the remote site can be up to 2000 km of fiber away). My rationale is that for my location I expect to be able to find oodles of decent reflectors closer than 30 ms, so this should still allow decent reflector diversity. If I set the threshold to, say, 10 ms + 1 ms, I would mostly get reflectors really close to my ISP's PoP, which I fear would result in many of them sharing considerable portions of the network path (while the goal is to get a set of reflectors that have minimal overlap in network path, ideally only shared up to and including the bottleneck*).
I also added my ISP's own DNS servers, resulting in the following set. It seems OK on my link, but I am pretty confident these will not be universally acceptable for all locations:

# removed all reflectors with idle RTT > 30ms
reflectors=(
"62.109.121.1" "62.109.121.2"   # O2/Telefonica DNS
"1.1.1.1" "1.0.0.1"             # Cloudflare
"8.8.8.8" "8.8.4.4"             # Google
"9.9.9.9" "9.9.9.10" "9.9.9.11" # Quad9
"94.140.14.15" "94.140.14.140" "94.140.14.141" "94.140.15.15" "94.140.15.16"            # AdGuard
"64.6.65.6" "156.154.70.1" "156.154.71.2" "156.154.71.3" "156.154.71.4" "156.154.71.5"  # Neustar
"208.67.220.2" "208.67.220.123" "208.67.220.220" "208.67.222.2" "208.67.222.123"        # OpenDNS
"185.228.168.9" "185.228.168.10" "185.228.169.11"                                       # CleanBrowsing
)

*) Yepp, that is not realistically achievable but as a goal that seems still desirable.
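The arithmetic behind the 20 ms margin: light in fiber covers roughly 200 km per millisecond, so a 20 ms round-trip budget is 10 ms each way, or about 2000 km. The "minRTT + margin" cut could be scripted along the lines below. This is a sketch under assumptions: the function name is hypothetical, the input is taken to be one "reflector average-RTT" pair per line, and "minRTT" is interpreted here as the lowest average RTT among the candidates.

```shell
# Hypothetical sketch: keep reflectors whose average RTT is within a margin
# of the best (lowest) average RTT seen in the input.
# stdin : lines of "<reflector> <avg_rtt_ms>"
# $1    : margin in ms (defaults to 20, the value used in the post above)
filter_reflectors_by_rtt()
{
    LC_ALL=C awk -v margin_ms="${1:-20}" '
        NF >= 2 {
            id[++n] = $1
            rtt[n] = $2 + 0
            if (n == 1 || rtt[n] < min) min = rtt[n]
        }
        END {
            for (i = 1; i <= n; i++)
                if (rtt[i] <= min + margin_ms) print id[i]
        }
    '
}
```

A very small margin would reproduce the problem described above, leaving only reflectors clustered near the ISP's PoP.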

Lynx · December 6, 2022, 11:27am

Nice script - should we include it in the repository? I started the following:

root@OpenWrt:~/cake-autorate# cat select_reflectors.sh
#!/bin/bash

initial_reflector_set_url=https://raw.githubusercontent.com/tievolu/timestamp-reflectors/main/reflectors-europe.csv

select_reflectors()
{
        for ((retries=0; retries<3; retries++))
        do
                wget -O /tmp/initial_reflector_set "$initial_reflector_set_url"
                [[ $? -eq 0 ]] && break
                sleep 5
        done

        declare -A reflector_rtt

        while read -r reflector
        do
                # Match a dotted-quad IPv4 address (dots escaped, every octet one or more digits)
                [[ $reflector =~ ([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+) ]] || continue
                reflector=${BASH_REMATCH[1]}
                echo "testing $reflector"
                # Store the minimum RTT in microseconds; fall back to 1000 when no RTT can be parsed
                [[ $(ping -q -c 5 -i 0.1 "$reflector" | tail -1) =~ ([0-9.]+)/ ]] && printf -v "reflector_rtt[$reflector]" %.0f "${BASH_REMATCH[1]}e3" || reflector_rtt[$reflector]=1000
                echo "${reflector_rtt[$reflector]}"
        done</tmp/initial_reflector_set
}

select_reflectors

But didn't advance it beyond that. Should we?

moeller0 · December 6, 2022, 11:48am

I think scripting something is a decent idea, my script really just collects statistics but looks horrible:

Tested reflector candidates
PING 1.1.1.1 (1.1.1.1): 56 data bytes

--- 1.1.1.1 ping statistics ---
10 packets transmitted, 10 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 10.108/10.468/10.767/0.193 ms
PING 1.0.0.1 (1.0.0.1): 56 data bytes
--- 1.0.0.1 ping statistics ---
10 packets transmitted, 10 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 10.261/10.587/11.429/0.343 ms

it clearly would make more sense to reduce this to something like:

# reflector_id; n_sent; n_received; minRTTms; avgRTTms; maxRTTms; stddevRTTms
1.1.1.1; 10; 10; 10.108; 10.468; 10.767; 0.193
1.0.0.1; 10; 10; 10.261; 10.587; 11.429; 0.343

(maybe without the header line) which then could be read in as array for further selection.
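The reduction to one record per reflector could be done with a small awk filter, sketched here. The function name is hypothetical, and the parsing is only checked against the ping output format quoted in this post; other ping implementations (for example BusyBox, which prints no stddev) may need adjustments. Reflectors that return no replies produce no record at all.

```shell
# Hypothetical sketch: condense `ping -q` output into
#   reflector_id; n_sent; n_received; minRTTms; avgRTTms; maxRTTms; stddevRTTms
# stdin: concatenated ping summaries, e.g. the contents of reflector_test.txt
summarize_ping()
{
    awk '
        /^PING /              { id = $2 }               # "PING 1.1.1.1 (1.1.1.1): ..."
        /packets transmitted/ { sent = $1; recv = $4 }  # "10 packets transmitted, 10 ..."
        /^(round-trip|rtt) /  {                         # "... = min/avg/max/stddev ms"
            split($4, r, "/")
            printf "%s; %s; %s; %s; %s; %s; %s\n", id, sent, recv, r[1], r[2], r[3], r[4]
        }
    '
}
```

With the script above this would be used as `summarize_ping < ./reflector_test.txt`.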
Having different options for getting the start set seems great, but I am not sure I would package the wget calls into such a script (I dislike the idea of doing largish downloads without full user consent and opt-in).

I would argue to collect at least 10 samples and for interval I would use either 1 or 0.2 seconds (I also think we should keep the per-reflector interval >= 0.2 seconds, but I think by default we already do*).

*) If we use "normal" ping intervals, autorate traffic will raise fewer alarms, simply by not being unusual.

Lynx · December 6, 2022, 11:51am

Ah yes, and it now came back to me about sorting on the different dimensions - we had all that fleshed out above. I just need the time to code it up.
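Sorting on the different dimensions can be sketched with plain sort, assuming the semicolon-separated record format proposed earlier in the thread (field 4 = minRTT, 5 = avgRTT, 6 = maxRTT, 7 = stddev). The function name is hypothetical, and a sort implementation with -t and -k support is assumed.

```shell
# Hypothetical sketch: sort reflector records numerically by one column.
# stdin: records "reflector_id; n_sent; n_received; min; avg; max; stddev"
# $1   : 1-based field number to sort on (defaults to 5, the average RTT)
sort_reflectors()
{
    # drop an optional "#" header line, then sort ascending on the chosen field
    grep -v '^#' | LC_ALL=C sort -t ';' -k "${1:-5},${1:-5}n"
}
```

For example, `sort_reflectors 7 < summary.txt | head -n 10` would list the ten candidates with the lowest RTT jitter, where summary.txt is a placeholder file name.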

moeller0 · December 6, 2022, 11:55am

Let's get the new version out, and give it some time for eventual users to find untested areas that might need some love. In parallel we can explore reflector pre-qualification; as long as we target an independent script, that will be just fine. (And I think an independent script has several advantages, if only that it can easily be run on a different machine than the router, which for storage-limited devices will be helpful if we want to process large lists.)

patrakov · December 6, 2022, 12:26pm

This is not possible through GitHub web interface. You have to clone the repository.

On Linux (including OpenWRT), the change is as simple as:

chmod +x script.sh
git add script.sh

On Windows, you can do this instead:

git update-index --chmod=+x script.sh

Then commit and push as usual.
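To confirm that the executable bit actually made it into the index before committing, the mode column of `git ls-files -s` can be inspected: 100755 marks an executable file, 100644 a regular one. The helper name below is hypothetical; it only parses that output.

```shell
# Hypothetical helper: succeed if the staged mode is 100755 (executable).
# stdin: output of `git ls-files -s -- <file>`, i.e. "<mode> <object> <stage>\t<path>"
index_is_executable()
{
    awk '{ found = 1; ok = ($1 == "100755") } END { exit ((found && ok) ? 0 : 1) }'
}
```

Usage inside a clone: `git ls-files -s -- script.sh | index_is_executable && echo "x bit is staged"`.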

Lynx · December 6, 2022, 1:34pm

As silly as this sounds it did not occur to me to install 'git' on the RT3200 and just push commits from that. This might have made pushing commits easier at times, albeit on the other hand I quite like reviewing through the edits in the online text editor before finally committing, and I think that just wouldn't look as nice from my RT3200.

moeller0 · December 6, 2022, 2:32pm

How much storage does your RT3200 have? My old WNDR3700v2 with only 16MB would not allow such usage. And even on my multi-GB Turris Omnia I am not doing this: /root lives on hard-to-replace eMMC storage, which I try to keep avoidable writes off of (I was not always that careful, so I have already accumulated more writes than I am comfortable with). Not having the scripts executable by default has the advantage that I will not accidentally try to run them on my MacBook (where they would fail anyway).

Lynx · December 6, 2022, 2:43pm

The RT3200 has 128MB flash storage, 512MB ram and a two-core MediaTek MT7622BV @ 1350MHz. I think this feature set is why, given that OpenWrt 'just works' on this device, this device has been so popular.

I must admit I manually edit the files using 'vi' from the router. Perhaps I am contributing to early problems, but I have three of these and could always rotate them to give even more wear levelling. I suppose I could keep cake-autorate on a USB stick and keep on writing to that instead, but I recall USB sticks getting very hot and so perhaps the thermal cycling would also contribute to a lower life expectancy. I use ondemand to keep CPU cycles down low, together with cake-autorate sleep, so that it just ticks along without melting away overnight.

moeller0 · December 6, 2022, 2:57pm

I am not that hardcore, I use vi only if I absolutely must, on my router I default to nano or even the built-in editor of mc (which does some rudimentary syntax highlighting)...

For the normal wear and tear of a normal low-frequency repository I do not expect many issues; it is just that on my Omnia I have a hunch that I already used up some of the 9 lives of this cat, and so I tread carefully.

dlakelan · December 6, 2022, 3:12pm

f2fs and/or external USB drive might be worth considering.

Lynx · December 6, 2022, 3:21pm

Am I using that on my RT3200 already?

root@OpenWrt:~# df -h -T
Filesystem           Type            Size      Used Available Use% Mounted on
/dev/root            squashfs        5.0M      5.0M         0 100% /rom
tmpfs                tmpfs         241.5M     15.0M    226.5M   6% /tmp
/dev/ubi0_5          ubifs          80.5M     34.4M     42.0M  45% /overlay
overlayfs:/overlay   overlay        80.5M     34.4M     42.0M  45% /
tmpfs                tmpfs         512.0K         0    512.0K   0% /dev
OneDrive:Scanned\040Documents
                     fuse.rclone
                                    1.0T      1.0T         0 100% /tmp/run/OneDrive
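The filesystem type backing a given mount point can be picked out of this kind of listing with a one-line awk filter. In the output above, /overlay is reported as ubifs. The function name is hypothetical, and the parsing assumes the column layout shown here, including rows that wrap when the device name is long.

```shell
# Hypothetical helper: print the filesystem type for one mount point.
# stdin: output of `df -T` (or `df -h -T`)
# $1   : mount point to look up, e.g. /overlay
fs_type_of()
{
    # complete rows have 7 fields; wrapped continuation lines have fewer and are skipped
    awk -v mp="$1" 'NF >= 7 && $NF == mp { print $2 }'
}
```

Usage: `df -T | fs_type_of /overlay`.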

BTW I think I can free up space if I want to by running 'auc -f' and getting a firmware image that already includes the packages I installed on top of the base 22.03.2 release.

But that frustrates me a bit, because if you want to start afresh you then need to reflash the firmware: 'firstboot' will just take you back to the flashed firmware with all your added packages included, some of which you probably don't need anymore.

I can always use my multi-terabyte OneDrive if I need more storage space!