Mmmh, but many interactive use cases have longish download only phases, until you e.g. have read the page and click a link, which you expect to open quickly, no? Could be that they special case different upload traffic mixes, but my hunch is this is simply driven by rate...
I've been looking through some papers to see if there might be an LTE-based explanation for the behaviour. But aside from having been reminded about just how complicated anything LTE is, I have not found anything entirely concrete that might help explain such hefty delays.
I found this interesting:
Even if my cake-autorate-generated ICMPs are delayed during saturating download (without any upload), presumably this doesn't necessarily mean that the ACKs associated with the saturating download are delayed (since I think ACKs are handled separately in LTE), right?
I suppose we can't tell whether the ACKs themselves are delayed? Is such information discernible using a particular testing facility?
You would need to capture the TCP traffic at the other end, so you can see the temporal order of the ACK stream as well... but for that you need your own server... (while I am willing to help, my gross uplink rate is limited to 35 Mbps, which might not be enough to reliably saturate your link*)
*) In case that might be enough, we could think about how to structure such an experiment including packet captures... either flent or crusader might be suitable tools, combined with packet captures at both ends...
Newb here, trying to make it all happen on an Edgerouter X. Just hoping that I can get some helpful answers as I have 2 x Edgerouters. One is converted to OpenWRT 22.03 from EdgeOS and I've tried the SQM QoS package with little benefit, as we are on Australia's NBN Fixed Wireless with considerable speed variations throughout the day and the usual bogging down in the evening. Cake w/ Adaptive Bandwidth sounds like the closest thing we will get to manage the fluctuations throughout the day. I have fumbled and fudged my way through, being relatively new to Linux and SSH, scripts, CLIs, etc.
So my questions are as follows:
1. What is the ideal OpenWRT version for this script? I read somewhere that 22.03 is not supported by the Devs for this script, yet it installed OK after much frustration. Is it functioning in this version or not?
2. I will try to convert the second Edgerouter to whatever version is recommended in answer to question 1. Is there anything I need to be mindful of?
3. It would be much easier to install in EdgeOS but I presume that is not an option.
4. Our speeds range from best 65down/5up to worse (mainly mid evening) 4down/1up with the ping around the 35ms range but severe bufferbloat from time to time. What would be the suggested numbers for entry into the SQM auto-rate configuration?
5. Should I install the luci-app-sqm or just this one?
Thanks in advance for any worthwhile suggestions.
This script should not really care; only sqm-scripts/luci-app-sqm will drag in iptables unless you install iptables-nft first (or so I heard, I am still on OpenWrt 21 so cannot test this).
Probably possible, but you would be on your own, realistically I would start with OpenWrt if only to learn how it is supposed to work before translating to EdgeOS.
Your "best" numbers are good candidates for the maximum rates for the script, so it does not bother searching for rates well above the possible... the "worse" numbers are decent candidates for the minimum rates; you will need to decide what to put as base numbers, essentially the rates the controller converges on without any load.
The ping thresholds depend on the measurement method, if you use fping or ping (and hence RTT instead of tsping's OWDs) start out by setting each threshold to say 20ms and then collect a logfile and look at it in detail to check the thresholds...
I would install luci-app-sqm so you can configure your desired cake options and then have cake-autorate swoop in and adjust cake's shaper rates dynamically at run-time...
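As a rough starting point, the advice above could translate into a config fragment like the one below. The variable names are the ones used in cake-autorate configs posted in this thread; the min/max values come straight from the reported worst and best speeds, while the base rates are placeholder guesses of mine (assumptions, not recommendations) that you would tune after looking at a log.

```bash
# Sketch of a cake-autorate config fragment for a link that ranges from
# roughly 4/1 Mbit/s (worst) to 65/5 Mbit/s (best).

min_dl_shaper_rate_kbps=4000    # worst observed download (Kbit/s)
base_dl_shaper_rate_kbps=30000  # ASSUMPTION: placeholder steady-state rate, tune to taste
max_dl_shaper_rate_kbps=65000   # best observed download (Kbit/s)

min_ul_shaper_rate_kbps=1000    # worst observed upload (Kbit/s)
base_ul_shaper_rate_kbps=3000   # ASSUMPTION: placeholder steady-state rate, tune to taste
max_ul_shaper_rate_kbps=5000    # best observed upload (Kbit/s)

dl_delay_thr_ms=20              # starting threshold, check against a logfile
ul_delay_thr_ms=20              # starting threshold, check against a logfile
```

Whatever base rates you pick, they should sit between the minimum and maximum for each direction.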
Appreciate your feedback and suggestions, will have a play over the next few days and see where it goes.
Cheers...Earle - Waussie.
@moeller0 I just thought of an issue. Right now we rotate out reflectors based on whether the baseline of any reflector exceeds the minimum baseline by 10ms. Folding the rollover compensation into the baseline is then problematic in that sense. How should we deal with that?
Mmmh, not sure this makes all that much sense for icmp timestamps....
Enforcing a max baseline deviation will essentially bias towards reflectors with correct roll-over time, but that might throw out too many useful reflectors... we really only want to get rid of reflectors with unusual RTT, so maybe the reflector replacement code needs to operate on that RTT by combining the two OWD baselines first?
I agree. I also wondered about separating the compensation from the baseline. But wondered if there was another way.
Seems like you have something in mind:
But I don't quite follow - how do you mean?
The OWD baselines can now be huge or even negative. Their combination presumably could also be huge or even negative.
We could track yet a further ewma of the RTTs?
As a reminder, right now we rotate a reflector if:
a) baseline is greater than minimum by 10ms
b) delta ewma is greater than minimum by 10ms
And only rotate max one per round.
And we also randomly rotate every so often.
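For anyone following along, the rule as described can be sketched roughly like this. This is only an illustration with made-up names (`pick_reflector_to_rotate`, `rotation_threshold_ms`), not the actual cake-autorate code, and it leaves out the occasional random rotation.

```bash
#!/usr/bin/env bash
# Illustrative sketch only: hypothetical names, not the cake-autorate implementation.

rotation_threshold_ms=10

# Arguments are triples: name baseline_ms delta_ewma_ms
# Prints the first reflector that violates rule a) or b); at most one per call.
pick_reflector_to_rotate() {
    local -a names=() baselines=() deltas=()
    while (( $# >= 3 )); do
        names+=("$1"); baselines+=("$2"); deltas+=("$3")
        shift 3
    done
    if (( ${#names[@]} == 0 )); then
        return 1
    fi

    # Find the minimum baseline and minimum delta EWMA across all reflectors.
    local min_baseline=${baselines[0]} min_delta=${deltas[0]} i
    for i in "${!names[@]}"; do
        if (( baselines[i] < min_baseline )); then min_baseline=${baselines[i]}; fi
        if (( deltas[i] < min_delta )); then min_delta=${deltas[i]}; fi
    done

    for i in "${!names[@]}"; do
        # a) baseline above minimum by threshold, or b) delta EWMA above minimum by threshold
        if (( baselines[i] > min_baseline + rotation_threshold_ms )) ||
           (( deltas[i] > min_delta + rotation_threshold_ms )); then
            printf '%s\n' "${names[i]}"
            return 0   # only rotate one reflector per round
        fi
    done
    return 1
}

# reflector_c has a baseline 15 ms above the minimum, so it is the one picked.
pick_reflector_to_rotate reflector_a 20 2 reflector_b 21 3 reflector_c 35 1
```

The function returns non-zero when no reflector qualifies, so a caller can simply skip rotation for that round.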
Sure, but their sum should still roughly equal an RTT EWMA, since we essentially do:
up: remote_receive - local_send
down: local_receive - remote_send
rtt: up + down = remote_receive - local_send + local_receive - remote_send
Now remote_receive is mostly almost equal to remote_send, so the two remote terms cancel and this effectively ends up being real close to:
local_receive - local_send
note how this is truly independent of the clock offset as long as roll-over does not happen within a send/receive pair.
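To make the cancellation concrete, here is a small sketch (my own illustration, not code from cake-autorate; the function name `owd_sum_ms` and all timestamps are made up). The remote clock is modelled as the local clock plus a fixed offset, and the reflector is assumed to reply immediately.

```bash
#!/usr/bin/env bash
# Illustrative sketch: hypothetical helper and invented timestamps (all in ms).

# Prints "up down rtt" for one probe.
owd_sum_ms() {
    local local_send=$1 local_receive=$2 remote_receive=$3 remote_send=$4
    local up=$(( remote_receive - local_send ))
    local down=$(( local_receive - remote_send ))
    printf '%d %d %d\n' "$up" "$down" "$(( up + down ))"
}

# True path: sent at 1000, arrives 15 ms later, reply takes 20 ms back.
# The remote timestamps are shifted by an arbitrary clock offset.
for offset in 0 5000000 -86000000; do
    owd_sum_ms 1000 1035 $(( 1015 + offset )) $(( 1015 + offset ))
done
```

The individual up and down values come out huge or negative depending on the offset, but the third column is 35 in every case.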
I've been running cake-autorate on an Edgerouter X running OpenWrt for maybe a year? I'm currently on 22.03.3. I think I tried getting it to work on EdgeOS at the very beginning but had some reason I switched to OpenWrt, I can't remember now what the problem was though.
Ah sweet - @moeller0 to the rescue - I see what you mean now. So instead of comparing the dl and ul baselines with their minimums separately, we should compare the signed sum of the two baselines with its minimum.
Should we still consider the ewma of the deltas separately? I don't think those will be affected by the rollover issue.
Have not thought about this yet, and I do not have enough brain cycles left today to think over it
Hey @gba are you using a fairly recent version of cake-autorate now? How's performance these days on Starlink? Could you share your latest config since I want to update the readme for Starlink users. Are you bothering with the satellite switching compensation? We never truly bottomed that out. Maybe that should be deleted if it's not offering any benefit now.
You might like to try the tsping iteration since it uses OWDs and will give higher performance for mixed download and upload. On the other hand, I'd understand: if it ain't broke don't fix it!
I think we should reconstitute the RTT for all delay sources and operate on the RTTs here (so one method works for true OWD sources like tsping and fake OWD (really RTT/2) sources like fping)...
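Something along these lines, as a sketch only. The helper names are hypothetical and the 10 ms default just mirrors the rotation threshold mentioned earlier in the thread; values are in microseconds so that halving an fping RTT loses as little as possible to integer division.

```bash
#!/usr/bin/env bash
# Illustrative sketch: hypothetical helpers, values in microseconds.

# Sum the two OWD baselines back into an RTT baseline.
reconstitute_rtt_us() {
    local dl_owd_us=$1 ul_owd_us=$2
    printf '%d\n' "$(( dl_owd_us + ul_owd_us ))"
}

# fping-style sources only know the RTT, so each "OWD" is really RTT/2.
fake_owd_us_from_rtt() {
    printf '%d\n' "$(( $1 / 2 ))"
}

# True (exit status 0) if the RTT baseline exceeds the minimum by the threshold.
rtt_baseline_too_high() {
    local rtt_us=$1 min_rtt_us=$2 threshold_us=${3:-10000}
    (( rtt_us > min_rtt_us + threshold_us ))
}

# tsping-like source: OWD baselines skewed by a clock offset still sum to 35 ms.
tsping_rtt_us=$(reconstitute_rtt_us -4999980000 5000015000)

# fping-like source: a 36 ms RTT split into two fake OWDs and summed again.
half_us=$(fake_owd_us_from_rtt 36000)
fping_rtt_us=$(reconstitute_rtt_us "$half_us" "$half_us")

printf '%s %s\n' "$tsping_rtt_us" "$fping_rtt_us"

if rtt_baseline_too_high 50000 "$tsping_rtt_us"; then
    echo "a reflector with a 50 ms RTT baseline would be rotated out"
fi
```

That way the replacement check sees the same kind of number regardless of whether the delays came from tsping or fping.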
I think the deltas should be immune, after all we remove the constant (changing) baseline exactly so we look at something more stable... BTW I am still amazed how well that heuristic works
Which, the baseline one? That does work well. The reflector convergence in general seems to work well too. I guess the fundamentals of cake-autorate haven't really changed very much for some time now. Although of course actually using OWDs is a big change.
Nah, mostly just looking at the deltas to abstract over path differences between different reflectors... (not that the replacement heuristic doesn't also seem to do the right thing.)
I am slowly warming to your idea about trying to make the shaper change somewhat proportional to the amount of delay increase... (one precondition is that all/most reflectors show pretty similar deltaDelay behaviour/CDFs).
Yes, that is quite a big item! Now we need to get tsping integrated as a normal part of the OpenWrt packages (after some more testing that is...)
I have been running the development commit on March 6 since then. And I have continued to use the Starlink compensation functionality. Although I haven't done enough testing to really determine how useful that is or not. Unfortunately (or fortunately?) I haven't really had many loads lately that are very latency sensitive so cake-autorate probably hasn't been strictly necessary for me. Case in point, I just logged in to check cake-autorate after not having done so in perhaps a month and I see that the script isn't running right -- it got in that state where all the pingers weren't working but the script was continuing to run. I have no idea how long it had been doing that, so I may not have actually been using cake-autorate for some time.
I just upgraded to the latest version and will monitor over the next couple days to make sure it stays stable. I'm just running fping for now. I will plan on switching to tsping once it has a package in OpenWRT but I don't have time to compile it myself at the moment.
Thanks for your work!
I guess one thing I notice right away is that I see in the changelog "significantly reduced CPU consumption" so I wanted to check that. I did a speed test while cake-autorate is running on the new version.
When the download is going and gets up to about 75 Mbps, my router's CPU is pretty close to pegged. I can see in htop that all 4 cores are around 85-95%. This is on the EdgeRouter X, which isn't a super fast modern router by any means. But this is mostly cake-autorate, as if I stop cake-autorate and do the speed test, I'm getting around 85 Mbps down with 20-40% CPU usage.
Do you have any suggestions of anything to try to adjust? My config is pretty straightforward right now:
# *** STANDARD CONFIGURATION OPTIONS ***
### For multihomed setups, it is the responsibility of the user to ensure that the probes
### sent by this instance of cake-autorate actually travel through these interfaces.
### See ping_extra_args and ping_prefix_string
dl_if=ifb4eth0 # download interface
ul_if=eth0 # upload interface
# Set either of the below to 0 to adjust one direction only
# or alternatively set both to 0 to simply use cake-autorate to monitor a connection
adjust_dl_shaper_rate=1 # enable (1) or disable (0) actually changing the dl shaper rate
adjust_ul_shaper_rate=1 # enable (1) or disable (0) actually changing the ul shaper rate
min_dl_shaper_rate_kbps=10000 # minimum bandwidth for download (Kbit/s)
base_dl_shaper_rate_kbps=100000 # steady state bandwidth for download (Kbit/s)
max_dl_shaper_rate_kbps=200000 # maximum bandwidth for download (Kbit/s)
min_ul_shaper_rate_kbps=2000 # minimum bandwidth for upload (Kbit/s)
base_ul_shaper_rate_kbps=10000 # steady state bandwidth for upload (Kbit/s)
max_ul_shaper_rate_kbps=30000 # maximum bandwidth for upload (Kbit/s)
dl_delay_thr_ms=40 # (milliseconds)
ul_delay_thr_ms=40 # (milliseconds)
I don't remember the CPU going so high on the older version but I'm not 100% sure, maybe it always did.
Does that use:
MediaTek MT7621A Wi-Fi SoC contains a powerful 880 MHz MIPS® 1004KEc™ dual-core CPU
And is that with the scaling governor set to 'performance' and with 'irqbalance' installed and enabled? The former ensures that the CPU is not being scaled down (rendering CPU utilisation statistics not so meaningful), whilst the latter supposedly spreads processes out over cores.
I would have thought that the CPU utilisation of cake-autorate shouldn't scale with download rate because the computation relates to processing ping responses, the rate of which is not altered by downloading or not downloading.
Calls to 'tc' may well be excessive and there would be more calls to that associated with rates increasing. If you set 'adjust_dl_shaper_rate' and 'adjust_ul_shaper_rate' to 0, does that alter the CPU usage significantly? If so we should limit the rate of 'tc' calls.
Otherwise, the easiest way to reduce CPU utilisation of cake-autorate is simply to reduce the frequency of ICMP responses. The default is 20Hz (6 pingers with 0.3s spacing each).
Maybe @moeller0 has some other ideas.
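For reference, the 20Hz figure is just the number of pingers divided by the per-pinger spacing. A quick sketch of that arithmetic (the helper `aggregate_ping_rate_hz` is a made-up name for illustration, and the spacing is given in milliseconds to stay within integer maths):

```bash
#!/usr/bin/env bash
# Illustrative sketch: hypothetical helper, not part of cake-autorate.

# Aggregate ping response rate in Hz for a given number of pingers,
# each sending one probe every interval_ms milliseconds.
aggregate_ping_rate_hz() {
    local pingers=$1 interval_ms=$2
    printf '%d\n' "$(( pingers * 1000 / interval_ms ))"
}

aggregate_ping_rate_hz 6 300   # default quoted above: 20
aggregate_ping_rate_hz 6 600   # doubling the spacing: 10
aggregate_ping_rate_hz 3 300   # halving the pingers: 10
```

So either doubling the spacing or halving the number of pingers should roughly halve the number of responses the script has to process.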