CAKE w/ Adaptive Bandwidth [August 2022 to March 2024]

And here is a tiny further release to fix the documentation URL for the setup script:

To install this version just issue:

wget -O /tmp/cake-autorate_setup.sh https://raw.githubusercontent.com/lynxthecat/cake-autorate/v3.0/setup.sh
sh /tmp/cake-autorate_setup.sh

For any who just want to try cake-autorate, an uninstaller has now also been provided to remove the few files that are added by the installer.

There is some code to migrate a previous 2.0.0 install and config, but I'd recommend backing up any previous config(s) before installing version 3.0.1.

There is now rather strict config file validation that checks any entry against defaults.sh to verify: firstly, that the key relates to a configurable parameter; and secondly, that the value is the correct type out of integer, float, string, array, etc. Guidance is given to help the user identify and fix any problematic entries.

The values set for dl_delay_thr and ul_delay_thr now need to be of type float. Just add '.0' to the end of any entries for these from previous configs in which these were set as integers. So, for example, '30' would become '30.0'.
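
As a rough illustration of that conversion, here is a minimal shell sketch. The helper name to_float is hypothetical and not part of cake-autorate; it only appends '.0' to a value that contains no decimal point and leaves existing floats alone.

```shell
#!/bin/sh
# Hypothetical helper (not part of cake-autorate): turn an integer-looking
# threshold value into a float by appending '.0'; values that already
# contain a decimal point are printed unchanged.
to_float() {
    case "$1" in
        *.*) printf '%s\n' "$1" ;;
        *)   printf '%s.0\n' "$1" ;;
    esac
}

to_float 30      # prints 30.0
to_float 12.5    # prints 12.5
```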

I know this might be a bit off topic from the previous post, but I've just recently discovered that my ISP has a variable overhead of 7-14% of the packet size at any given time. (I'm not entirely sure what they meant exactly; it is word of mouth, and I don't know whether they're just trying to get me to drop the issue of the excessive ping spikes that even cake can't seem to manage. Those are up to around 3000ms now. They also appear to "DNS hijack", so setting a custom DNS does not seem to help, as there is still a DNS leak to the ISP-provided servers.)

This is with an FTTH-GPON-MAP-E connection (with draft-ietf-softwire-map-00).

My speed appears to sometimes range from 128kbps up to even 1.5Gbps. Would it be possible for the script to set the overhead as a percentage of the rate set by the autorating, to see if that would improve the latency a bit more?

I'll defer to @moeller0 here.

That is not really how that works: the per-packet overhead tends to be independent of packet size, so the percentage overhead increases when packet size decreases... That said, it is quite possible that you will see larger overhead for IPv4 versus IPv6.

Now, I admit, I lack first-hand experience with both GPON and MAP, so your ISP might well be correct (except the part that overhead is a constant fraction of packet size).

That said, are your ping spikes in the outgoing/egress or the incoming/ingress direction?
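
To put numbers on that, here is a small sketch of the arithmetic. The 40-byte overhead is an arbitrary example value, not a measurement of any real link, and overhead_tenths_pct is a made-up helper name. It uses integer arithmetic only, so results are truncated to tenths of a percent.

```shell
#!/bin/sh
# Percentage overhead for a fixed per-packet overhead, in tenths of a
# percent (integer arithmetic only, so the result is truncated).
# $1 = per-packet overhead in bytes, $2 = packet size in bytes.
overhead_tenths_pct() {
    echo $(( $1 * 1000 / $2 ))
}

# Example value only: 40 bytes of overhead added to every packet.
for size in 1500 500 100; do
    t=$(overhead_tenths_pct 40 "$size")
    echo "packet ${size} B: overhead $(( t / 10 )).$(( t % 10 ))%"
done
```

With these example numbers the same 40 bytes comes out as 2.6% of a 1500-byte packet but 40.0% of a 100-byte packet, which is why a fixed percentage does not describe per-packet overhead well.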

Maybe post the content of:
cat /etc/config/sqm # assuming you use sqm-scripts
tc -s qdisc
as well as a screenshot of a speedtest at:

I will try to help but am traveling ATM away from real computers so my responses will likely be delayed and not well researched....

I wish I could say that the results I've gotten on the speedtest hold 24/7, but sadly they don't, and when it spikes to 3000ms it's... not fun.

config queue 'eth1'
        option enabled '1'
        option interface 'eth1'
        option download '450000'
        option upload '450000'
        option qdisc 'cake'
        option script 'piece_of_cake.qos'
        option qdisc_advanced '1'
        option ingress_ecn 'ECN'
        option egress_ecn 'ECN'
        option qdisc_really_really_advanced '1'
        option itarget 'auto'
        option etarget 'auto'
        option linklayer 'ethernet'
        option debug_logging '0'
        option verbosity '5'
        option squash_dscp '0'
        option squash_ingress '0'
        option iqdisc_opts 'nat dual-dsthost diffserv4 ingress ack-filter'
        option eqdisc_opts 'nat dual-srchost diffserv4 egress ack-filter'
        option overhead '56'
        option linklayer_advanced '1'
        option tcMTU '2047'
        option tcTSIZE '128'
        option tcMPU '84'
        option linklayer_adaptation_mechanism 'default'

Now, it's normally 3000ms on the ingress... it's variable depending on the time of day.

I would not use ACK filter for the download direction, but this certainly is not related to your issues...

About 3 second spikes, sure, that is poison for anything more interactive than correspondence chess....

I'm not familiar with fiber, but from what little I've heard about it, is it possible that the endpoint's receiver is taking too long to separate the channels, regardless of PPPoE/IPoE connections? (They claim that PPPoE causes the latency spikes, however they appear to be there with both IPoE and PPPoE... IPoE just has faster, more consistent speeds.)
Without ack-filter on the ingress, I get weird errors related to... gaming (like stuttering, weird error corrections... I'm not entirely sure if the cause is ACKs or something else).

That would be a broken receiver... For download, all CPE/ONTs/modems receive the same packets, but only the correct/targeted ONT will have the correct key to decrypt the data. For upload, each ONT needs to first request transmit slots from the OLT before it can send data (the request/grant traffic to request those timeslots is organized as a second logical channel using the same frequency, IIRC).
Sure, PPPoE is more demanding than IP as it essentially is a tunneling protocol that requires transformation of data frames on send and receive, but that is something that can be 'solved' by using a router with powerful enough CPU(s) and/or PPPoE offloading capabilities.

In my limited experience PPPoE, if configured competently enough, will not cause reliable latency spikes in the multi-second range by itself.

ACK filter really should not help noticeably on ingress; if it does, something else seems off.... Maybe it is time to get some packet captures?

@moeller0 curious issue here.

We have two televisions: an LG and a Samsung. Both are connected via the 2.4GHz guest network provided by wirelessly connected extension APs, and from testing it seems that provides circa 60Mbit/s download for clients.

Now for some reason, when cake with autorate is enabled and we watch one particular series on Prime that requires higher than normal bandwidth (bursts of circa 40Mbit/s), Prime Video on the LG reduces the streaming quality. (I'm not sure yet whether autorate is relevant, but I think not, since I didn't see throttling kicking in and cake set to 60Mbit/s download seems problematic.) This doesn't happen on the Samsung. The WiFi signal strengths of both the LG and Samsung are excellent. When cake is disabled the series reliably streams at full quality on the LG.

Is there some kind of interaction between cake and WiFi when the bandwidths of both are around the same level? Is there some kind of buffering issue that's relevant here?

To be clear, it seems when the episode starts there is an initial sort of testing of the connection whereupon the viewing quality for the remainder of the session is determined. On the LG cake seems to interfere with this such that the quality is downgraded. But this doesn't happen on the Samsung!

Any idea what's happening here? An obvious thing to try seems to be to switch the LG to the much higher bandwidth 5GHz WiFi and see if that makes the issue go away. But if it did I wouldn't understand why since no cake on 2.4GHz works fine.

I can only speculate... The way modern adaptive streaming mostly works is that the receiving device requests some segment of data (at a given bitrate and quality) and then monitors how long it takes for that data to arrive in its buffers. Based on the low buffer state and potentially the temporal dynamics of the transfer, the receiver then decides what bitrate/quality to request for the next segment. There seems to be no standard for the actual numerical criteria used by endpoints to decide what bitrate to request next, so this might simply be a more cautious configuration by LG compared to Samsung, but I have no way to confirm this hypothesis... (I assume that both streams are marked with the same DSCPs and hence use the same WiFi access class.)

My gut feeling is that this question might get more responses if posed in a new thread compared to being placed at the end of a huge thread, especially since I agree that it does not really look like autorate is involved?

Hey, new here and really interested in getting this script going.
Has there been any discussion (I haven't read the 3000 messages in this topic) about making this script and its dependencies available via the package manager, with perhaps a simple LuCI interface to manage the main settings? I think it would reach a lot more people and make this super easy (a few clicks) to get going.

I admit I'm completely ignorant on how to make that happen, but I know other scripts have this kind of capability.

For long enough cake-autorate has been so rapidly changing that it didn't seem apt. I kept adding new features, altering existing ones and restructuring based on my own testing and feedback from others.

Only very recently have things settled down and with version 3.0.1 things seem very stable now. I'm pretty content with this latest version of cake-autorate; I have it running 24/7 in the background and it works well.

So packaging cake-autorate up might be warranted now. But until a LuCI interface is added, the benefit of doing so seems somewhat limited, given the efficacy of the installer script and that by its nature a fair degree of understanding and configuration on the part of the user is necessitated.

Therefore it seems an important next step might be to work on a LuCI interface and possibly integration with the UCI system. But I have no familiarity with either and don't feel particularly motivated to learn. So development in this area would likely hinge on others stepping up and tackling these aspects.

If anyone out there has any interest in working on a LuCI interface for cake-autorate or integrating it with the UCI system, or has any thoughts on the same, please chip in.

I haven't ever developed a LuCI interface but I can start looking at it. I also have to become familiar with this tool, which I still have to get installed on one of my devices. I'm glad I wrote the comment and read your reply because the github page says the present version is 1.2... I'll need to read up on the install instructions for 3.0.1 and dependencies...

It will take me a while before I have anything useful... :frowning:
A lot to learn...

@moeller0 what's the optimal monitor_achieved_rates_interval_ms? I've been experimenting with changing this value from 200ms to 400ms for my connection and the rate changes seem more stable in the sense that multiple HD/4K streams work without stutters.

Hard to say... and it depends on the actual achievable rate (to some degree). A larger interval will smooth over the achieved rate peaks and valleys a bit, so it can result in a less 'trigger_happy' control loop. But if this is too large the control loop can become sluggish, and if it is too small it does not average enough. For the latter, think of an interval smaller than the time it takes to transfer a single full-MTU packet: the value will then mostly flip between 0 and 100% (with 100% being roughly:
1000 ms/sec * full-MTU-in-Bits / monitor_achieved_rates_interval_ms).
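
A quick sketch of that formula, assuming a 1500-byte MTU purely as an example. The function name single_packet_rate_bps is made up for illustration and does not exist in cake-autorate.

```shell
#!/bin/sh
# Rate (in bit/s) that a single full-MTU packet represents when it is the
# only packet counted in one sampling interval:
#   1000 ms/sec * MTU-in-bits / interval-in-ms
# $1 = MTU in bytes (1500 is an assumed example), $2 = interval in ms.
single_packet_rate_bps() {
    echo $(( 1000 * $1 * 8 / $2 ))
}

single_packet_rate_bps 1500 200   # 60000 bit/s
single_packet_rate_bps 1500 1     # 12000000 bit/s
```

So with a 200 ms interval one packet is only worth 60 kbit/s, while with a 1 ms interval each sample jumps between 0 and 12 Mbit/s depending on whether a packet happened to land in it.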

My gut feeling is that we might want to continue to sample at a high rate and add a configurable EWMA to the achieved rate as well if we want some smoothing here. I currently do not see a nice closed form formula to express a suitable, let alone an optimal monitor_achieved_rates_interval_ms, sorry.

I wonder whether the 'without stutter' part comes from the fact that with a longer window the achieved rate likely stays higher, so we do not drop down as hard or as aggressively, and the stream's buffering absorbs the shock of the short excursions to lower rates well. The upshot likely is that you sacrifice latency under load, especially if a rate drop is not just transient but stays for a longer duration, but that is speculation...

I always so enjoy reading and thinking about your insights.

Please can you elaborate on this? I can't quite follow it.

Yes I think this is spot on. What I've been observing is that on bufferbloat the sampling of achieved rate can often result in what seems to be an artificially low bandwidth.

OK yes this makes sense to me.

How about this:

That is, keep an achieved_rates_ewma_kbps[dl/ul] based on a configurable alpha_achieved_rate, and use this instead of the achieved rate in update_shaper_rate(). And also print out the ewmas in the log lines.
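
To make the idea concrete, here is a minimal sketch of one EWMA step in integer shell arithmetic. It is not the actual cake-autorate code: ewma_update is a hypothetical helper, alpha is passed in thousandths to avoid floats, and it assumes alpha is the weight given to the NEW sample (the opposite convention would swap the two weights).

```shell
#!/bin/sh
# One EWMA step in integer arithmetic (sketch only, not cake-autorate code).
# $1 = previous EWMA (kbps), $2 = new achieved-rate sample (kbps),
# $3 = alpha in thousandths; ASSUMPTION: alpha weights the new sample.
ewma_update() {
    echo $(( ($3 * $2 + (1000 - $3) * $1) / 1000 ))
}

# Example values only: start at 60000 kbps and feed in three samples.
ewma=60000
for sample in 30000 30000 60000; do
    ewma=$(ewma_update "$ewma" "$sample" 300)
    echo "sample=${sample} ewma=${ewma}"
done
```

With alpha at 0.3 the smoothed rate moves 60000, 51000, 44700, 49290 for these samples, so a short dip is only partly followed, and the recovery is equally damped.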

What would a good alpha be?

Taking:

λ=1−2/(n+1), where λ is the ewma λ and n is the SMA number of samples

If we set n to '2' then alpha would be around 0.3? I say '2' because from my empirical testing 400ms seemed to work well for the interval. So this way I stick with 200ms, but increase the window by a factor of 2 (virtual window size of 400ms?) for making shaper rate adjustments on bufferbloat.

Actually from testing this ewma cuts both ways. On load ascent it can result in a reduced bandwidth getting chosen. Maybe this helps with smoothing over bursty traffic.

Once the sample interval is below the duration required to receive a full packet, then samples will likely 'catch' either zero or the full packet size (as packets are received as units and not bit by bit). This is part of the reason why we need some form of aggregation...
However, the larger the aggregation interval (our sampling interval, as we essentially calculate the average rate over that), or the more an EWMA is tuned towards past values, the less sensitive we become towards real rate changes... and the less conceptual sense is left in using the achieved rate as the new shaper rate target...

Yes, the control loop gets more sluggish. Depending on the frequency and magnitude of bottleneck link rate changes, that can result in smoother performance... but it also results in larger/longer-duration latency spikes when the true rate drops. I wonder whether we do not already have enough toggles to make the controller less reactive?

Also, I wonder whether there is not an experiment waiting there: running the log with the shaper setting disabled and a relatively short achieved rate interval with some organic loading of the link, and then looking at the distribution of the achieved rates to get an idea about the 'normal' rate dynamics (obviously this only works for the download direction).

Do you think we should use two alphas or just one alpha? How about a guess for a good alpha value?

Also I just realised that the default threshold for classifying load as high (and thus for increasing the shaper rate on load) was too high - it was set as 0.75, but really 0.5 seems more appropriate. Do you agree?
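
For illustration, a small sketch of how the two thresholds classify the same sample. This is not the cake-autorate implementation: is_high_load is a hypothetical helper, the threshold is passed as a percentage to stay in integer arithmetic, and the rates are example values.

```shell
#!/bin/sh
# Sketch only: treat load as high when achieved rate / shaper rate is at
# or above the threshold.
# $1 = achieved rate (kbps), $2 = shaper rate (kbps), $3 = threshold in %.
is_high_load() {
    [ $(( $1 * 100 )) -ge $(( $2 * $3 )) ]
}

# Example values: 36000 kbps achieved against a 60000 kbps shaper (60%).
for thr in 75 50; do
    if is_high_load 36000 60000 "$thr"; then
        echo "threshold ${thr}%: high load"
    else
        echo "threshold ${thr}%: not high load"
    fi
done
```

At 60% utilisation the 0.75 threshold does not count this as high load, while 0.5 does, so the lower threshold lets the shaper rate start climbing earlier.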