CAKE w/ Adaptive Bandwidth [August 2022 to March 2024]

Ah yes, I see what @Lynx was asking now :slight_smile:

When detecting bufferbloat, the decision for each OWD is binary - each one is either "good" or "bad". I don't think it really matters how "bad" the OWDs are in my implementation, because the OWD results themselves do not determine the magnitude of bandwidth reductions.

However, I do things differently when increasing bandwidth in response to load while the OWD is below the "bad" threshold. There I calculate an increase that's proportional to the difference between the average "loaded" OWD and the threshold.
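
To illustrate (this is not my actual code; the variable names and numbers below are made up), the idea is roughly:

# Illustrative sketch only - all names and values are made up. The further the
# average loaded OWD sits below the "bad" threshold, the larger the step up.
avg_loaded_owd_ms=12   # average OWD measured under load
owd_thr_ms=50          # "bad" OWD threshold
cur_rate_kbps=20000    # current shaper rate
max_step_kbps=2000     # cap on any single increase

headroom_ms=$(( owd_thr_ms - avg_loaded_owd_ms ))
step_kbps=$(( max_step_kbps * headroom_ms / owd_thr_ms ))
cur_rate_kbps=$(( cur_rate_kbps + step_kbps ))
echo "new rate: ${cur_rate_kbps} kbps (+${step_kbps} kbps)"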


Can any bash expert help me understand why the redirection here:

nevertheless results in an error message for @gadolf when the read fails because the fping_fifo was deleted?

Does that redirection not apply given the global redirection:

Or is there something else that I'm missing? @colo?
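
For reference, here's a toy example (made-up paths, not the cake-autorate code) of the behaviour I'd naively expect:

#!/bin/bash
exec 2>/dev/null                  # global: the shell's stderr goes to /dev/null

ls /no/such/path                  # error suppressed (inherits fd 2 from the exec)

ls /no/such/path 2>errors.log     # per-command redirection overrides the exec'd
                                  # fd 2, but only for this one command

read -r line < /no/such/fifo      # a failed redirection on a builtin is reported
                                  # by the shell itself on its own stderr, so I'd
                                  # expect the exec above to swallow it too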

I adapted the script to my needs, that is, to output a CSV-like file.
It is inefficiently coded, but it works (at least for the first reflectors from my big list).
This way, I can easily import the results into Calc and sort them quickly.
If you want to save the results to a file, just redirect the script's output.

Requirements: jq and the ping_to_json.sh helper script (the reflectors come from your cake-autorate config file).


#!/bin/bash

# Source the config file so we get access to the defined reflectors.
. ./cake-autorate_config.primary.sh # adjust to match your config file
no_reflectors=${#reflectors[@]}

# How diligently do we want to probe (pings per reflector).
n_delay_samples=10

for (( reflector=0; reflector<no_reflectors; reflector++ ))
do
	cur_reflector=${reflectors[$reflector]}
	delay=$(/usr/bin/ping -qc "${n_delay_samples}" "${cur_reflector}" 2> /dev/null \
		| /etc/snmp/ping-to-json/ping_to_json.sh 2> /dev/null \
		| jq '.rtt_statistics.avg.value' 2> /dev/null \
		| tr -d '"' 2> /dev/null)
	# Note: $? reflects the last command in the pipeline (tr), so this test is
	# almost always true; unresolvable reflectors show up with an empty delay.
	if [ $? -eq 0 ]; then
		echo "${cur_reflector},${delay}"
	fi
done

exit 0

EDIT:

  1. Where it shows /etc/snmp/ping-to-json/ping_to_json.sh, substitute the path to your own ping_to_json.sh.

  2. The output format is: IP address, comma, average delay. If the average delay is blanked out, it means ping couldn't resolve the host:
1.1.1.1,11.711
1.0.0.1,12.575
8.8.8.8,11.270
8.8.4.4,12.365
badfa272.virtua.com.br,
speedflashnet.flashnetprovedor.com.br,119.365
dns2.cruiser.com.br,128.352
45-225-76-62.linkten.com.br,
186-229-124-49.ded.intelignet.com.br,
messalina.netway.psi.br,13.719
zeus.netway.psi.br,13.331

@gadolf maybe you can help me with my question above? I'm trying to understand why the redirect didn't prevent the error message you saw. Does a redirect set up with exec take precedence over a subsequent local redirect?

Honestly, my scripts are based on Internet copy/paste (just look at the example above), not real bash knowledge.
But let me see if I can spot something wrong.


Fixed here (I think):

Now let's try and optimize your parameters. What do your plots look like now, I wonder? Could you paste them on here?

Here are mine with new settings I am experimenting with.

Timecourse:

Raw CDFs:

Delta CDFs:


Many thanks, and sorry for not having helped you with something I was using in my own script... :face_with_open_eyes_and_hand_over_mouth:

Here are mine.
Timecourse:

Raw CDFs:

Delta CDFs:

Four Ookla speed tests (speedtest-cli), the first two run simultaneously with a 4K YouTube video.

EDIT: Around timeframe 350 I restarted the service to use your fixed version. The log still shows the error, but only from before the restart.

Immediate impressions are:

  • what kind of connection is this again?

  • some reflectors probably need to be taken out

In the future we are going to add automatic reflector rotation to take out reflectors with a baseline much higher than the others or with a higher delta (if I am remembering correctly). We've fleshed it all out in the posts above, but I haven't implemented it yet.

@moeller0 we discussed the above here:

For the record, please could you propose concrete criteria for rotating? As a starting point to tweak, how about the following (roughly sketched in shell below):

  • every 1 min rotate a reflector if either:
    -- baseline > (min baseline + 20ms)
    -- EWMA of delta_delay > 0.9*delta_thr
  • every 10 minutes randomly choose a reflector and rotate it
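
Very roughly, something like this is what I have in mind (made-up variable and function names, not working cake-autorate code):

# Sketch only: baselines_ms[] and delta_ewma_ms[] are assumed per-reflector
# state, delta_thr_ms the configured delta threshold, and rotate_reflector a
# hypothetical helper that swaps the given reflector for one from the spare pool.

# call this from a 60 s timer
check_reflectors()
{
	# find the lowest baseline among the active reflectors
	local reflector min_baseline_ms=${baselines_ms[0]}
	for reflector in "${!baselines_ms[@]}"; do
		(( baselines_ms[reflector] < min_baseline_ms )) && min_baseline_ms=${baselines_ms[reflector]}
	done

	# rotate any reflector that violates either criterion
	for reflector in "${!baselines_ms[@]}"; do
		if (( baselines_ms[reflector] > min_baseline_ms + 20 )) ||
		   (( 10*delta_ewma_ms[reflector] > 9*delta_thr_ms )); then   # EWMA > 0.9*delta_thr
			rotate_reflector "${reflector}"
		fi
	done
}

# call this from a 10 minute timer
rotate_random_reflector()
{
	rotate_reflector "$(( RANDOM % ${#baselines_ms[@]} ))"
}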

Cable

Right, working on that.

Ah yes, so totally different to LTE. So the defaults (configured for LTE) will need a lot of adjusting. Looks like you have already set a low delta, so that looks OK. The RTT spikes look pretty bad in those graphs above. If you set the CAKE bandwidth manually to, say, 5 Mbit/s, are those removed? And maybe you could post your config file here? I imagine @moeller0 might be able to advise better settings than I can, because he will be far more familiar with your connection type.

My config is in here, in the subfolder with today's date.

As for the spikes, I'll clean up the reflectors list, then do a new test and, depending on the results, tweak the cake bandwidth as you suggested.
Interestingly, during video playback only 2 frames were dropped, each during one of the speed tests. Given the bad spikes, I'd have expected the video experience to be laggier, but it wasn't.

After using only reflectors with an average RTT of up to 50 ms, I think this is a much better result.

Timecourse

Raw CDFs

Delta CDFs

Same test case as before.

Now, one (almost non-)issue I saw is that it took three speed tests (the last ones, with nothing else consuming bandwidth) to get close to 480 Mbps, which is cake's default DOWNLINK setting.
EDIT: Ookla's results:


EDIT1: In the last test, the higher latency-under-load measurement doesn't seem to be reflected in the graphs...

@Lynx Since I cleaned up the reflectors list and sorted it ascending by RTT, I switched off randomize_reflectors.
To my surprise, things got bad again:

I switched it on and things got back on track:

Log files here (subfolder randomize on off).

Can we accumulate the deltaDelay EWMA only for loads below, say, 50%, so we are not judging reflectors when we know they will have increased delay?
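
Something like this is what I mean (sketch only, made-up names, scaled integer math):

# Update the per-reflector delta EWMA only when the load is below 50%, so that
# reflectors are not penalised for delay that is expected while the shaper runs
# at capacity. delta_ewma_ms[] is an assumed per-reflector array.
update_delta_ewma()
{
	local reflector=$1 delta_ms=$2 load_percent=$3

	(( load_percent >= 50 )) && return 0   # skip samples taken under high load

	# EWMA with alpha = 0.1, i.e. new = 0.1*sample + 0.9*old
	delta_ewma_ms[reflector]=$(( ( 100*delta_ms + 900*delta_ewma_ms[reflector] ) / 1000 ))
}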

The 1 minute schedule seems OK; for the slow rotation I would say let's make this even slower, say 30 or even 60 minutes?

This is obviously winging it, so we get starting values that we can then tweak.


Mmmh, that means either the randomization code is not perfect (or rather the fallback to round-robin replacement), or, IMHO more likely, one or more of the early reflectors in the list is problematic. If my hunch is going in the right direction, it should be one of the reflectors that is only in the round-robin set but not in the shuffled set...

It seems that 9.9.9.10 is the problem. Here are plots for 9.9.9.10 only:

As long as it is in the set, we see RTT spikes in excess of 200 ms.

The CDFs reveal that this is not a generally elevated RTT or variance, but that ~10% of samples are delayed for a long time (the "kink" in the line at ~90%, where the mostly vertical line turns mostly horizontal).

@Lynx, it looks like when rotating reflectors we need to look not only at the average delay; your intuition was right, and we should also look more closely at the distribution...

Not sure how to track this easily. Maybe your idea of keeping a count of the total number of samples per reflector, as well as a count of samples above threshold, could be leveraged by, say, recycling a reflector quickly-ish if its ratio of over-threshold/n_samples is, say, 5 times larger than the minimum over-threshold/n_samples ratio for the current set of reflectors (and probably only including reflectors with >= 100 samples in this calculation). But this all feels rather complex; maybe a better idea is to instruct users to occasionally look at the CDFs and weed out problematic reflectors by hand.
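
For concreteness, that ratio check could look roughly like this (illustrative only; n_samples[], n_over_thr[] and rotate_reflector are made-up names for the per-reflector counters and the replacement helper):

recycle_bad_reflectors()
{
	local reflector min_num=-1 min_den=1

	# find the smallest over-threshold ratio among reflectors with >= 100 samples
	# (ratios are compared by cross-multiplication to stay in integer math)
	for reflector in "${!n_samples[@]}"; do
		(( n_samples[reflector] < 100 )) && continue
		if (( min_num < 0 || n_over_thr[reflector]*min_den < min_num*n_samples[reflector] )); then
			min_num=${n_over_thr[reflector]}
			min_den=${n_samples[reflector]}
		fi
	done
	(( min_num < 0 )) && return 0   # no reflector has enough samples yet

	# recycle any reflector whose ratio exceeds 5x the minimum ratio
	# (note: if the best reflector has zero over-threshold samples, any single
	# over-threshold sample elsewhere trips this, so a small floor may be needed)
	for reflector in "${!n_samples[@]}"; do
		(( n_samples[reflector] < 100 )) && continue
		if (( n_over_thr[reflector]*min_den > 5*min_num*n_samples[reflector] )); then
			rotate_reflector "${reflector}"
		fi
	done
}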

P.S.: My best guess is that either:
a) @gadolf's 9.9.9.10 instance is loaded enough that rate-limiting/de-prioritization of ICMP kicks in, or
b) the network path to 9.9.9.10 has some fluctuating congestion (far away from cake-autorate and hence outside of what we can realistically control for).

The remedy, IMHO, is to manually exclude 9.9.9.10 from @gadolf's reflector set.


Nice work in identifying the issue. And this bad reflector detection stuff is tricky. I suppose that with a big enough list of good reflectors, plus the reflector rotation logic discussed above, we offer some resilience.

I had a thought last night that I'd like to run past you.

Right now we retain an array of 1s and 0s based on whether the delta threshold was exceeded. So let's say we have a window size of four; then if all reflectors are below the threshold by 1ms, we end up with:

0 0 0 0,

which seems a bit dumb.

So I have been thinking about a good way to trigger a detection in such a situation without one bad reflector nevertheless triggering a detection under the same logic, because you could have one reflector 5x over the threshold just because it is a bad reflector.

So what if we retain an array of deltas, with each element capped at the delta threshold?

So for example, say the threshold is 50ms and you have:

49ms, 25ms, 49ms, 5ms

Under the old regime this would not trigger because it would give 0, 0, 0, 0.

Under the new regime you end up with window array:

49ms, 25ms, 49ms, 5ms

You then take the sum total and see whether factor * threshold has been exceeded. This factor plays the role of X out of the last Y. So taking the factor 2.0 for our example above, we need the sum total to be greater than 2.0*50ms = 100ms.

And indeed 49ms + 25ms + 49ms + 5ms = 128ms, which is greater than 100ms, so we trigger.

Other examples:

25ms, 25ms, 25ms, 25ms

This would also trigger (the sum is exactly 100ms, so this assumes the comparison is "greater than or equal").

2000ms, 0ms, 0ms, 0ms

After capping, we end up with:

50ms, 0ms, 0ms, 0ms

So we do not trigger.

What do you think? This seems to make more sense, given that the per-reflector cliff edge can distort the picture when several reflectors are very close to the threshold but just not quite over it - the sum total tips the balance, even though no individual reflector tips its own.

There might be another better solution that makes more sense, but it feels like there may be some fruit here.
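
In shell the check would be something like this (illustrative only, made-up names):

# Each delta is capped at the threshold before summing, so one wild outlier
# cannot trigger a detection on its own.
delta_thr_ms=50
trigger_factor=2               # plays the role of "X out of the last Y"
window_deltas_ms=(49 25 49 5)  # the example above

sum_ms=0
for delta_ms in "${window_deltas_ms[@]}"; do
	(( delta_ms > delta_thr_ms )) && delta_ms=${delta_thr_ms}   # cap at threshold
	(( sum_ms += delta_ms ))
done

if (( sum_ms > trigger_factor*delta_thr_ms )); then
	echo "bufferbloat detected: capped sum ${sum_ms} ms > $(( trigger_factor*delta_thr_ms )) ms"
fi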

Why? If you have a tight threshold set, then all reflectors falling just below that threshold is not necessarily diagnostic of congestion. In my case I set the thresholds to 7.5ms, but if a cake shaper is handling traffic at capacity, the average delay easily falls into the 5-10ms range for each direction, which is sort of the best-case scenario: cake running at capacity without accumulating undue bufferbloat.

Yes, this is another advantage of the consensus method to classify "congestion": we become tolerant of a minority of outliers (depending on the policy set in the config file).

This assumes that our hypothesis holds true that individual delay samples can be interpreted as a graded signal of congestion. The problem here is not only whether delay includes information about congestion magnitude, but also whether that information is robust and reliable, or whether we need to average to get at it. I think we should first see whether we can establish that before making big changes...

I would argue that if that situation, with enough reflectors sitting just below the threshold, is common enough, that is a sign that the threshold is simply set too high, no?

But the point for me is more that setting a threshold is somewhat arbitrary. It doesn't make sense to be forced to set a specific delay where 1ms above it counts as evidence of bufferbloat and 1ms below it counts as no evidence of bufferbloat.

It makes more sense to think in terms of a generalised increase across multiple reflectors. The situation is more grayscale.

See what I mean? So having some way to make it less binary, allowing more of a collective contribution rather than an individual cliff edge, seems to make sense?

Like it doesn't make sense to me to be forced to specify a threshold of 50ms and then even:

49, 49, 49, 49, 49, 49

wouldn't trigger.

Isn't there a way we can address this to be less localised and cliff-edgy, and more collective and synergistic?

Well, my mental model is that we have two delay distributions per reflector, one with and one without congestion. These distributions likely overlap (here is a link to a widget that can be used to visualize what I describe for gaussians: https://demonstrations.wolfram.com/SignalDetectionTheory/), so the challenge is to decide whether a given sample belongs to the congested or uncongested distribution; the less overlap, the easier this gets, and the more overlap, the trickier it becomes. IMHO a simple threshold is a decent enough tool here. And I aim to set it not arbitrarily but based on actual detection performance. So if all we have is the uncongested distribution, we can select a threshold such that we get a selectable false congestion detection rate; if we also have the actual congested delay distribution, we can decide which false positives and false negatives we are willing to accept.
So I argue that a threshold (not that our current eye-balled thresholds do not already work reasonably well :wink: ) can be put on more solid footing than "thin air" if we so desire.
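
As a toy example of what I mean (uncongested_deltas.txt is a made-up file with one idle-link delta sample in ms per line), one could take a high percentile of the uncongested distribution and accept the corresponding per-sample false-positive rate:

# Derive a delta threshold from the uncongested delay distribution. Taking the
# 99th percentile gives roughly a 1% per-sample false "congested" rate.
n_samples=$(wc -l < uncongested_deltas.txt)
rank=$(( (99*n_samples + 99) / 100 ))          # ceiling of 0.99*n_samples
delta_thr_ms=$(sort -n uncongested_deltas.txt | sed -n "${rank}p")
echo "suggested delta threshold: ${delta_thr_ms} ms"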

Well, that is the nature of the beast: either we go for categorical congestion detection (what we are doing right now), where we get a binary classification, or we go for a graded/proportional controller, but for that we need a reliable measure of congestion magnitude that is quickly available. Because even if we try to use a more complex consensus method, in the end we will have a situation where 1ms more or less in one of the reflectors decides whether we classify as "congested -> reduce rate" or as "non-congested -> keep or increase rate".

Sure, but we are forcing a binary classification at the end anyway, no?

That is a question I cannot answer without data and analysis, similar to the follow-up question: even if such a graded approach would work better than the current approach, would it be worth the effort?

Why that? This is exactly what our threshold plus X out of Y method predicts. Your formulation IMHO just redefines "threshold" to some degree (modulo the "capping" idea), in that the effective threshold will be lower than the nominal threshold, something I dislike because it will make it harder to figure out whether the code works as expected...

As I said, for this to be useful we need to convince ourselves first that sub-threshold delay increases carry a robust and reliable "pre-congestion/congestion" signal. Once we do that, we can think about some proportional control law that tries to match estimated congestion magnitude with shaper reduction magnitude somehow. And whether that complexity is actually worth it (but for that we still need to test it).


Fair enough regarding the above. Makes sense to me.

At present the monitor reflector response instances don't have visibility of the shaper rates and loads. We do write out achieved rates to files that could be read in, though. Might not judicious selection of the alpha for the EWMA be enough to address the problem of an increasing delta EWMA?
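
As a rough guide for that choice (illustrative only), the weight an old sample carries in the EWMA halves after about ln(0.5)/ln(1-alpha) samples, so alpha directly sets how much transient, load-related delay can drag the delta EWMA up versus how quickly it tracks a genuinely deteriorating reflector:

# Illustrative only: map the EWMA weight alpha onto an effective memory, i.e.
# the number of samples after which an old sample's weight has halved.
for alpha in 0.5 0.1 0.01 0.001; do
	awk -v a="${alpha}" 'BEGIN { printf "alpha %-6s -> half-life ~%.0f samples\n", a, log(0.5)/log(1-a) }'
done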