CAKE w/ Adaptive Bandwidth [October 2021 to September 2022]

Would you mind providing an outline summary of how it works? I personally liked the simplicity of @Bndwdthseekr's bash script and the overall concept in the ICMP approach written by @dlakelan, albeit the latter seemed a little complicated.

I don't think I really care about inconsistent ICMP performance, because any 8.8.8.8 inconsistency is surely peanuts compared to the huge bufferbloat issues I see on my LTE connection. I am not looking for perfection, just something that works to a sufficient degree. I don't mind some bufferbloat creeping in, so long as pings stay below 100ms from a baseline of 50ms, since in my experience everything then works fine. The problem comes when pings shoot up to more like 500ms and beyond, which is what happens when I disable SQM entirely.

Hopefully this will allow users like me to claim back some more bandwidth rather than having to unduly sacrifice bandwidth to get CAKE to work properly.

Update: I have looked into @dlakelan's script more now. Actually, it doesn't seem as complicated as I thought. My RT3200 already has all the Erlang dependencies. How would I go about using this script? Is it as simple as running something like:

erlang sqmfeedback.erl

After editing the lines at the bottom:

monitor_ifaces([{"tc qdisc change root dev eth0.2 cake bandwidth ~BKbit diffserv4 dual-srchost overhead 34 ", 4000, 6000, 8000},
		    {"tc qdisc change root dev ifb4eth0.2 cake bandwidth ~BKbit diffserv4 dual-dsthost nat overhead 34 ingress",15000,30000,35000}],
    ["dns.google.com","one.one.one.one","quad9.net","facebook.com",
     "gstatic.com","cloudflare.com","fbcdn.com","akamai.com","amazon.com"]),

No point in me reinventing the wheel. The more I look at this code the more I like it.

My Erlang script was at least as much about me learning Erlang as about solving the problem, the price of me doing it in my spare time... I do think it works reasonably well, and it doesn't require burning tons of bandwidth.

If you can run it on your router, then go for it.


Cool, thanks. I got it up and running. Not what I'd call a piece of cake but I'm glad I got there in the end.

For the benefit of others, you need to install the 'erlang', 'erlang-compiler' and 'iputils-ping' packages.
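On a stock build that should just be something like the following (package names as above; availability may vary by release):

```shell
opkg update
opkg install erlang erlang-compiler iputils-ping
```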

I see:

root@OpenWrt:/etc/init.d/erl# erl
Erlang/OTP 23 [erts-11.0] [source] [64-bit] [smp:2:2] [ds:2:2:10] [async-threads:1]

Eshell V11.0  (abort with ^G)
1> c(sqmfeedback).
{ok,sqmfeedback}
2> sqmfeedback:main().
ping cloudflare.com with results: [48.9,50.7,52.5,52.7,53.4]
ping akamai.com with results: [49.4,49.4,50.4,57.5,57.6]
ping one.one.one.one with results: [56.4,56.4,56.5,56.8,57.5]
ping google.com with results: [73.2,74.2,74.2,74.3,82.9]
ping gstatic.com with results: [66.7,68.6,69.5,69.8,70.1]
ping amazon.com with results: [113.0,116.0,117.0,118.0,119.0]
ping quad9.net with results: [185.0,186.0,187.0,187.0,187.0]
ping facebook.com with results: [265.0,266.0,267.0,267.0,268.0]
ping fbcdn.com with results: [271.0,272.0,273.0,276.0,282.0]
ping cloudflare.com with results: [51.7,52.2,52.6,55.2,55.3]
ping akamai.com with results: [45.4,49.9,52.7,52.9,55.3]
ping quad9.net with results: [179.0,180.0,187.0,188.0,189.0]
ping gstatic.com with results: [64.4,64.5,66.4,66.5,67.1]
ping one.one.one.one with results: [45.5,46.5,47.4,54.5,56.5]
ping amazon.com with results: [117.0,124.0,125.0,126.0,126.0]
ping akamai.com with results: [44.9,45.3,45.8,46.3,47.7]
ping google.com with results: [60.0,69.0,70.0,70.3,77.0]
ping fbcdn.com with results: [260.0,261.0,268.0,270.0,272.0]
ping facebook.com with results: [268.0,269.0,270.0,270.0,270.0]
ping cloudflare.com with results: [44.7,46.6,47.7,48.1,57.0]
Checking up on things: 1634056326
Full Delayed Site List: [{[102,98,99,100,110,46,99,111,109],14.0,1634056315},{[103,111,111,103,108,101,46,99,111,109],12.900000000000006,1634056315},{[111,110,101,46,111,110,101,46,111,110,101,46,111,110,101],10.100000000000001,1634056310}]
Recent Delayed Site List: [{[102,98,99,100,110,46,99,111,109],14.0,1634056315},{[103,111,111,103,108,101,46,99,111,109],12.900000000000006,1634056315},{[111,110,101,46,111,110,101,46,111,110,101,46,111,110,101],10.100000000000001,1634056310}]
tc qdisc change root dev wan cake bandwidth 29237Kbit flows nonat nowash no-ack-filter split-gso rtt 50ms noatm overhead 70
tc qdisc change root dev veth-lan cake bandwidth 29237Kbit besteffort triple-isolate nonat wash no-ack-filter split-gso rtt 50ms noatm overhead 70 ingress

Also @dlakelan for now I have placed the .beam file in /etc/init.d/sqmfeedback-erl/ and created a new sqmfeedback service:

root@OpenWrt:/etc/init.d# cat sqmfeedback
#!/bin/sh /etc/rc.common
# Copyright (C) 2007 OpenWrt.org

export PATH=/usr/sbin:/usr/bin:/sbin:/bin

START=51
STOP=4

start() {
        erl -pa /etc/init.d/sqmfeedback-erl -eval 'sqmfeedback:main().' -noshell -detached
}

stop() {
        pgrep erl | xargs kill -9
}

Does this seem sensible?

On initialisation, it unloads the link (stops traffic) and starts to transmit pings to the nominated server. After 5 seconds (a stabilising period) it measures those pings for 10 seconds to find an average. The average returned latency is referred to as the "link entitlement" and is a measure of what your link can achieve under the best circumstances. After the 15 seconds have elapsed, the ping is terminated and traffic is resumed on the link.
From memory the link entitlement has a small amount added to it for hysteresis, and also the user can specify that they want to increase the entitlement by Xms to target greater utilisation as a tradeoff for latency.
The maximum bandwidth entered by the user is stored as the max link rate, and 75% of this value is used as the initial fair link rate.

When the link goes above 10% utilisation, the pinger is turned on and the latency is actively monitored.
As long as the latency is under the target, and there is more demand for the link, the fair link limit is increased towards 100% of the line rate. If the latency goes above the target, the fair link limit is reduced towards 15% (but no lower).
The algorithm it uses to adjust the bandwidth drops more aggressively the further the latency has exceeded the target. For recovering back towards the line rate it steps up by some amount (can't remember exactly what).

There is also a realtime mode which reduces the latency target to the raw entitlement value (no additions) and certain traffic flows can trigger this mode.
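To make the control step above concrete, here is a rough shell sketch of one iteration. This is my own paraphrase, not the actual ACC code: the names, step sizes and 35Mbit example rate are all made-up assumptions for illustration.

```shell
# Hypothetical sketch of the control step described above, NOT the real
# ACC code.  Rates and step sizes are illustrative assumptions.

MAX_RATE=35000                       # user-entered max link rate (kbit/s)
MIN_RATE=$((MAX_RATE * 15 / 100))    # floor: 15% of the line rate
fair_rate=$((MAX_RATE * 75 / 100))   # initial fair link rate: 75% of max

# One control iteration: compare the measured ping against the target
# (entitlement + headroom) and nudge the fair rate up or down.
adjust_rate() {
    ping_ms=$1
    target_ms=$2
    if [ "$ping_ms" -le "$target_ms" ]; then
        # latency OK: step up towards 100% of the line rate
        fair_rate=$((fair_rate + MAX_RATE / 20))
        [ "$fair_rate" -gt "$MAX_RATE" ] && fair_rate=$MAX_RATE
    else
        # latency exceeded: drop harder the further over target we are
        over=$((ping_ms - target_ms))
        fair_rate=$((fair_rate - over * MAX_RATE / 200))
        [ "$fair_rate" -lt "$MIN_RATE" ] && fair_rate=$MIN_RATE
    fi
    echo "$fair_rate"
}

adjust_rate 25 30   # under target: rate rises
adjust_rate 80 30   # 50ms over target: rate drops sharply
```

In the real implementation the adjusted value would then be handed to the shaper (e.g. via a `tc qdisc change ... bandwidth` call).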

If you want more explanation than that, I'd encourage you to read the source which I linked previously. It is actually well commented (a nice surprise).

This is what my link looks like at idle (two of us working from home, 1 on a video call at the moment)
image

I then started a download and got this
image
The latency is below 56ms so it didn't shape the link at all.
I'm not willing to actually soak the line at the moment (lest I be yelled at :smiley: )

Note that my link is actually capable of 9ms of latency but I've added 20ms to the entitlement. I found that when the ACC was targeting 9ms I wasn't able to utilise my full speed, and for my use cases speed was better than latency. Plus I don't mind gaming at 29ms of latency; perfectly manageable.


@lantis1008 many thanks indeed. That seems rather different to @dlakelan's script and it is very helpful to see what yours does because of the many positive reports on this forum about it.

@dlakelan how do I get inbound traffic destined for 192.168.1.1 to go through my veth-lan so that it gets prioritised by CAKE? I think the outgoing ping request is caught on upload by CAKE on wan, but the incoming response is not caught by CAKE on veth-lan. I don't know how to route traffic destined for the router itself through veth-lan :frowning:

These IP rules catch all inbound LAN traffic but unfortunately not traffic destined for 192.168.1.1:

14000:  from all iif wan lookup veth-lan
14000:  from all iif vpn lookup veth-lan

Even setting priorities to '0' doesn't catch traffic to 192.168.1.1.

How do I fix this? Without this, I presume the ping responses are not going to get prioritised as they should and so this will give a bleaker picture than reality?

I think this may explain why your script is overly throttling my download down to the minimum level I set even though I know it can handle a larger bandwidth without bufferbloat using CAKE.

@shm0, sorry, I should have referenced your script too:

@shm0 do you presently use one of these rate adjusting scripts? Did you tweak yours any further than as posted above? I'm keen to try it too.

The trouble is that, with the exception of @lantis1008's solution (which has received a lot of positive reports but would require a lot of work to adapt for OpenWrt), there seems to be no tried and tested solution to this problem readily available on OpenWrt, and yet a lot of users seem to be very much in need of one.

I wonder why @lantis1008's solution has received so much positive feedback, and whether this means it is the horse that should be backed in terms of getting something that can be made to work well for users on OpenWrt.

I don't think you can, as there is no further routing of packets destined for the device. However this should mean that the ping is LESS delayed, after all cake can only delay or drop a packet. So I don't think this is the reason for your issue.


Thanks a lot for your response. Pity on the one hand, since it means I can't also shape traffic to router, but that's no biggie. For your script - ah, good point. So by bypassing CAKE it should get top priority anyway. Albeit if CAKE knew about it then would it not reduce the priority of everything else? Maybe that's irrelevant.

Any idea why, when running your script with a download, it just throttles down to the minimum bandwidth I set (20Mbit/s), even though if I disable the script and restart SQM (so it gets set to 30Mbit/s) I still get an A+ on the Waveform bufferbloat test?

root@OpenWrt:~# tail /etc/init.d/sqmfeedback-erl/sqmfeedback.erl

    monitor_ifaces([{"tc qdisc change root dev wan cake bandwidth ~BKbit", 25000, 30000, 35000},
                    {"tc qdisc change root dev veth-lan cake bandwidth ~BKbit",20000,30000,50000}],
    ["google.com","one.one.one.one","quad9.net","facebook.com",
     "gstatic.com","cloudflare.com","fbcdn.com","akamai.com","amazon.com"]),
    receive
        _ -> true
    end.

Should I set the ping difference to higher than 10ms or the number of delays to higher than 2? Any suggestion what to try?

Delay > 10.0 -> 
Rpid ! {delay,Name,Delay,erlang:system_time(seconds)};

if length(RecentSites) > 2 ->

On the Waveform bufferbloat test I think I get good performance so long as the 95th percentile is less than 80ms. My normal unloaded ping to my ISP is between 45-55ms.

As an aside the change commands only need to specify bandwidth, not all the arguments since they are already set, right?

If you want to be less stringent then yes, try 15 or 20ms, and maybe 3 sites


Just as a datapoint under saturating load cake will allow 5ms standing queue per direction, so when both legs of your link are running at capacity you can expect that your RTT (going through cake) increases by 10ms at the very minimum. In my observations often the average/median RTT increase is closer to 20ms under such conditions, so I would certainly try to set something larger than 10ms...


Having set delay to 20.0ms and number of sites to > 3, things are looking rather better now. The script has converged on 31Mbit/s upload and 44Mbit/s download. This seems reasonable. And waveform reports A+ bufferbloat despite very heavy traffic.

So so far so good.

Shortly I expect my ISP to get congested, so hopefully I will see these converge down to perhaps 25Mbit/s download/upload or so. Fingers crossed.

Will report back.

@moeller0 in the meantime, do you see any issue with the upload part of the ping going through CAKE on wan but the response bypassing CAKE on veth-lan? Also, should I have those pings go through the VPN interface since most of my traffic uses VPN? Or would you say just go through WAN?

As far as I can tell the congestion you want to measure happens on the LTE link, so it should not matter much where you inject the ICMP probe packets into the link. IMHO bypassing cake (for something as small as a few ICMP packets every few seconds) seems about the right thing to do; after all, you are concerned about the LTE link and not what happens inside of cake or your VPN.

FYI, with no load it scales up to max and then when I start a waveform bufferbloat test it dials back quickly - see the difference between this:

root@OpenWrt:/etc/init.d# tc qdisc ls
qdisc noqueue 0: dev lo root refcnt 2
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1518 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64
qdisc noqueue 0: dev lan1 root refcnt 2
qdisc noqueue 0: dev lan2 root refcnt 2
qdisc noqueue 0: dev lan3 root refcnt 2
qdisc noqueue 0: dev lan4 root refcnt 2
qdisc cake 8011: dev wan root refcnt 2 bandwidth 35Mbit besteffort flows nonat nowash no-ack-filter split-gso rtt 50ms noatm overhead 70
qdisc noqueue 0: dev br-lan root refcnt 2
qdisc cake 800b: dev veth-lan root refcnt 2 bandwidth 50Mbit besteffort triple-isolate nonat wash ingress no-ack-filter split-gso rtt 50ms noatm overhead 70
qdisc noqueue 0: dev veth-br root refcnt 2
qdisc noqueue 0: dev vpn root refcnt 2
qdisc noqueue 0: dev wlan0 root refcnt 2
qdisc noqueue 0: dev wlan1 root refcnt 2
qdisc noqueue 0: dev wlan1-1 root refcnt 2

And this:

root@OpenWrt:/etc/init.d# tc qdisc ls
qdisc noqueue 0: dev lo root refcnt 2
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1518 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64
qdisc noqueue 0: dev lan1 root refcnt 2
qdisc noqueue 0: dev lan2 root refcnt 2
qdisc noqueue 0: dev lan3 root refcnt 2
qdisc noqueue 0: dev lan4 root refcnt 2
qdisc cake 8011: dev wan root refcnt 2 bandwidth 29868Kbit besteffort flows nonat nowash no-ack-filter split-gso rtt 50ms noatm overhead 70
qdisc noqueue 0: dev br-lan root refcnt 2
qdisc cake 800b: dev veth-lan root refcnt 2 bandwidth 42668Kbit besteffort triple-isolate nonat wash ingress no-ack-filter split-gso rtt 50ms noatm overhead 70
qdisc noqueue 0: dev veth-br root refcnt 2
qdisc noqueue 0: dev vpn root refcnt 2
qdisc noqueue 0: dev wlan0 root refcnt 2
qdisc noqueue 0: dev wlan1 root refcnt 2
qdisc noqueue 0: dev wlan1-1 root refcnt 2

I am pretty happy with this 'impulse response', since this is a brutal test, namely going from zero load to maximum load, and seeing how quickly it reacts. Here it reacts quickly enough to act before the end of the waveform test.

From my observation Netflix traffic is rather spikey. I wonder how well this script will deal with that, i.e. traffic oscillating between 2Mbit/s for an active zoom call and then max bandwidth for each spike of Netflix traffic. In the worst case each spike would result in some periodic lag on the zoom call. I need to protect against that. Not sure how the script would handle this situation. Any thoughts?

I guess each spike will hopefully induce delays caught by the script, which will dial down the bandwidth, but it depends on the relationship between the spikes and the periodicity of the pings. Maybe I need very frequent pings and monitoring. Can we dial those up or is it already about maximum?

BTW on the RT3200 this + SQM + VPN is fine. Plenty of spare ram and CPU.

Yes, most streaming services do not send at a fixed bit rate with nicely spaced packets, but instead send data for, say, a second of play-out as quickly as possible, only to fall silent until the next burst transmission. By monitoring their internal buffers they know when to request the next batch, and from the relative fill and consumption rates whether to request higher or lower quality (higher quality correlates with a higher data rate). So most streaming services are spikey.

The script should not care if configured correctly... that is, you will have the same issue users on fixed links have when the shaper sits downstream of the true bottleneck.


I'm guessing it would be better than without the script, but not as good as if you weren't running Netflix. One good thing is at least in the UPSTREAM direction the other meeting participants shouldn't have choppy results from your video/audio. It's mostly that you will have some choppy experience of others.


Interesting point about choppy downstream but smooth upstream, since how I come across is actually what I really care about for business meetings. Thanks for that observation.

With the modification to 20ms / 3 sites the script seems to be working well for me. Still testing, but so far so good. It seems to have coped with some congestion this evening. Delighted to be able to reclaim some bandwidth and still avoid huge bufferbloat.

Anyone else out there using this solution today?

@dlakelan I have some thoughts about your script.

Having tested it plenty so far, my experience is that its convergence under load works very well, so I think that problem has been solved. But I think the approach is incomplete owing to its load blindness. Specifically: increasing bandwidth based on an absence of ping delays under no load seems flawed, because the ping data in this condition is not indicative of spare capacity.

Perhaps with the modification below this script could offer a good solution for the many OpenWrt users wanting autorate-ingress functionality?

To illustrate the problem, the maximum bandwidth my LTE connection ever sees is 70Mbit/s. So I tried increasing the max bandwidth to 60Mbit/s to see what would happen.

Under no load your script will tend to revert to the maximum set upload and download. But the ping data that led there is invalid, because no delays would ever be experienced anyway since there is no load. And thus any impulse load following no load will tend to result in massive bufferbloat until the script reacts.

This does not seem optimal.

How about only converge upwards so long as load is greater than a set threshold? After all, there is no need to provide extra bandwidth if there is no demand for that extra bandwidth, right? I think @lantis1008's script sensibly factors this in.

By providing the extra bandwidth even when it is not even called for under no load (greed), aren't we setting up the system for a bufferbloat-related fall once the load happens?

Wouldn't it be better to:

  • [the cautious approach] cautiously increase bandwidth in response to demand, while it is safe to do so (so under no load, the system sits at min bandwidth)

rather than:

  • [the greedy approach] greedily increase bandwidth even when there is no demand for it, and then have to react to increased ping upon load (so under no load, the system sits at max bandwidth)

How about only allowing an increase in bandwidth when the load is greater than, say, 50%? And perhaps also converging downwards when the load is less than 50%?

So the driving forces for opening and closing would then be:

  • load > 50% AND no delay encountered → increase bandwidth
  • load < 50% OR delay encountered → reduce bandwidth

Here is an illustration of the above:

This would mean that under no load the system is low latency and ready to react to an increased demand for load. It will deal with impulse loads better. Also I think we are always reacting to meaningful ping data.
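For what it's worth, the driving forces above could be sketched in shell like this. This is purely illustrative, not part of the existing script: `load_pct` and `delayed` are hypothetical inputs that would come from byte-counter polling and the existing ping monitor.

```shell
# Illustrative sketch of the proposed load-gated rule -- hypothetical,
# not part of @dlakelan's script.

next_step() {
    load_pct=$1   # current utilisation as % of the shaped rate
    delayed=$2    # 1 if recent pings exceeded the delay threshold
    if [ "$load_pct" -gt 50 ] && [ "$delayed" -eq 0 ]; then
        echo "increase"   # real demand and clean pings: open up
    else
        echo "decrease"   # idle or bloated: converge down towards the floor
    fi
}

next_step 80 0   # busy, no delay
next_step 80 1   # busy but delayed
next_step 10 0   # idle
```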

Obviously keeping things simple seems highly desirable, but the modification proposed herein seems simple enough and potentially of significant benefit.

What do you think?

Would there be an easy way to take into account load in this way (or similar) in this part of your script:

monitor_delays(RepPid, Sites) ->
    receive 
	{delay,Site,Delay,Time} ->
	    NewSites=[{Site,Delay,Time}|Sites],
	    monitor_delays(RepPid,NewSites);
	{timer,_Time} -> 
	    io:format("Checking up on things: ~B\n",[erlang:system_time(seconds)]),
	    Now=erlang:system_time(seconds),
	    RecentSites = [{Site, Del, T} || {Site,Del,T} <- Sites, T > Now-30],
	    io:format("Full Delayed Site List: ~w\n",[Sites]),
	    io:format("Recent Delayed Site List: ~w\n",[RecentSites]),

	    %% use random scaling factors, but make sure down followed
	    %% by up averages slightly less than 1, based on
	    %% simulations this averages around .98... this ensures
	    %% we don't grow too fast.

	    if length(RecentSites) > 2 ->
		    Factor = rand:uniform() * 0.15 + 0.85,
		    RepPid ! {factor,Factor};
	       true ->
		    Factor = rand:uniform() * 0.12 + 1.0,
		    RepPid ! {factor,Factor}
	    end,
	    monitor_delays(RepPid,RecentSites)
    end.

Is there an already existing stock system command for obtaining instantaneous upload/download bytes transferred per second that could be called? Or perhaps rx/tx bytes transferred would need to be polled and compared with a timer?

@moeller0 and @lantis1008 I'd also be very interested in your thoughts on this.

Yes, you'd have to constantly poll the current bytes transferred. The thing is, I'd be afraid the script and TCP's bandwidth-finding algorithm might interact, leading to all sorts of weird oscillations. TCP already searches for the right bandwidth; it doesn't just jam packets as fast as possible. Of course both speed-test and streaming sites are tuned to do something different from standard TCP, so that they jam packets pretty quickly.

But what about UDP traffic, e.g. for a VPN, or other UDP flows?

So I see the script already determines the present time using:

Now=erlang:system_time(seconds),

So how about simply recording the rx/tx bytes since the previous iteration and dividing by the time between iterations, then using that to determine whether the load is above or below 50% and proceeding that way? Would that work?

I'm in favor of you doing some testing. I think it should work fine. There's some /sys/ file you can read to see the total transfer on an interface. here's for my desktop's ethernet which is named "lan"

/sys/class/net/lan/statistics/tx_bytes

also rx_bytes
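Turning those counters into a rate is just two reads and a division. A minimal sketch (the `/sys/class/net/wan/...` path is my assumption; substitute your own device):

```shell
# Sketch: estimate throughput by polling the kernel byte counters.

bytes_to_kbps() {
    # $1 = bytes at t0, $2 = bytes at t1, $3 = interval in seconds
    echo $(( ($2 - $1) * 8 / 1024 / $3 ))
}

sample_kbps() {
    # Read a counter twice over an interval: $1 = counter file, $2 = seconds
    b0=$(cat "$1"); sleep "$2"; b1=$(cat "$1")
    bytes_to_kbps "$b0" "$b1" "$2"
}

# On the router (path assumed), e.g.:
#   sample_kbps /sys/class/net/wan/statistics/rx_bytes 1
# then compare the result against the current shaper rate for % load.
```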

It's been 2 or almost 3 years since I worked on this script. The best approach is for you to try it out and submit some pull requests I think. I spent a lot of time on the Erlang docs site: https://www.erlang.org/docs

Erlang is a pretty sweet language for doing this kind of thing, but I already had a love for Lisp and Prolog before starting :slight_smile:
