CAKE w/ Adaptive Bandwidth [August 2022 to March 2024]

Since I sent you the data, the base is 7000.
I have an external directional antenna (12 dBi), as I have a direct view of the antenna. The stats are as follows: ...
{"type":"lte","rssi":-56,"rsrq":-8,"rsrp":-85,"snr":216}
:face_with_peeking_eye: seems to be more than good.

What would you say if I told you that the whole town has a bad internet connection at the same time?
I think that the backbone link between the antenna and the core network is stretched too thin.

So I will lower the base to 1000.

Those indeed seem like strong stats - yes presumably the cell tower has a weak backhaul. No other cell towers / providers you can switch to?

There is a relationship between RSRQ and cell tower congestion - see here:

Just as an example, here is the same speedtest during non-collapsed hours, today: https://pastebin.com/MJAf6H7D

@Lynx other cell towers, no :sob: Other ISPs, yes, but they all have the same problem. They almost certainly share the same link.

Thanks for everything. I now understand all the important parameters of cake-autorate better, and maybe this will help others :wink:

I will look at the signal data during collapsed hours to see if RSRP goes up.


I won't be able to put in as much time as you on debugging this thing when it's running on other people's routers, but in the spirit of openness and sharing I agree that it makes sense to make it publicly visible, so here it is:

It's pretty amateurish in many ways, but hopefully fairly readable. Unlike the Lua implementation there's no clever maths involved. I'm quite sure there are parts where I'm not using perl in the most efficient way possible, and there are fundamental limitations with the latency checking implementation which I'm not completely happy with, but a full rewrite of the latency checking code would be required to address them. It also doesn't use UCI for configuration and I haven't written a nice install script or even any instructions, but this should be sufficient to try it out:

  1. wget https://raw.githubusercontent.com/tievolu/sqm-autorate/main/sqm-autorate.pl
  2. wget https://raw.githubusercontent.com/tievolu/sqm-autorate/main/sqm-autorate.conf
  3. chmod 755 sqm-autorate.pl
  4. opkg update; opkg install perl perlbase-attributes perlbase-list perlbase-posix perlbase-socket perlbase-threads perlbase-time
  5. Edit sqm-autorate.conf and set the bandwidth / interface properties for your connection. You can also set the logfile location, but you'll need to handle rotation using logrotate.
  6. Download a list of ICMP type 13 reflectors from https://github.com/tievolu/timestamp-reflectors and update the reflectors_csv_file property in sqm-autorate.conf with its location. (You can of course create your own list of reflectors if you prefer.)
  7. Run in the foreground: ./sqm-autorate.pl or in the background: (./sqm-autorate.pl >/dev/null 2>&1)&

I have the script running as a service on my router using an init script, but it doesn't quite work - for some reason it refuses to start automatically on boot, despite being "enabled". I think I'm probably missing something about how init scripts work on OpenWrt and I need to spend some more time figuring it out. It's not a big issue for me to start it manually because my router reboots so infrequently.
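For readers who want to try the same thing, here is a minimal sketch of a procd-style OpenWrt init script. It is not the author's script, and whether it avoids the boot problem described above is untested. The file name `/etc/init.d/sqm-autorate` and the install path in `PROG` are assumptions.

```sh
#!/bin/sh /etc/rc.common
# Hypothetical /etc/init.d/sqm-autorate (illustrative, not from the repository).

START=99        # start late in the boot sequence
STOP=10
USE_PROCD=1

PROG=/root/sqm-autorate.pl   # assumed install location of the script

start_service() {
    procd_open_instance
    procd_set_param command "$PROG"
    procd_set_param respawn      # restart the script if it exits
    procd_close_instance
}
```

After making the file executable, `/etc/init.d/sqm-autorate enable` creates the boot-time symlink in /etc/rc.d, and `/etc/init.d/sqm-autorate start` starts it immediately.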


Yes this looks nicer under less cell tower loading - see here:

Base is high at 5Mbit/s but this is a decision you need to make. If you set this too high there is a greater likelihood of bufferbloat in a step load. But setting it low means potentially sacrificing more bandwidth for latency, and bandwidth is tight here. Well not so much on upload - I mean this plot shows that your connection coped fine up to 20Mbit/s. But your previous plot showed that it struggled even with much lower upload bandwidth.

I think I would decrease the bases though. Perhaps your download base should be 1Mbit/s and your upload base 2Mbit/s.

Here are the columns in the code by the way.

Main ones:

It could be worth increasing the number of reflectors or reducing the ping interval, or both, to see if a higher ping response frequency helps, since the download and upload capacity seem to vary very rapidly.
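As a rough illustration of how those two settings combine, the sketch below computes the aggregate ping response frequency, assuming every reflector is pinged once per interval and every ping is answered. The function names and the example values are hypothetical and are not cake-autorate settings.

```sh
#!/bin/sh
# Illustrative arithmetic only; names and values are assumptions.

# Aggregate responses per minute.
# $1 = number of reflectors, $2 = per-reflector ping interval (ms)
responses_per_minute() {
    echo $(( $1 * 60000 / $2 ))
}

# Average gap between consecutive responses (ms).
# $1 = number of reflectors, $2 = per-reflector ping interval (ms)
mean_gap_ms() {
    echo $(( $2 / $1 ))
}

echo "4 reflectors @ 1000 ms: $(responses_per_minute 4 1000)/min, gap $(mean_gap_ms 4 1000) ms"
echo "8 reflectors @ 500 ms:  $(responses_per_minute 8 500)/min, gap $(mean_gap_ms 8 500) ms"
```

Doubling the reflectors and halving the interval quadruples the response frequency, at the cost of more CPU time and more ICMP traffic.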

For the next test please can you instead start an ISO download or two and leave it downloading for a while - I want to see how the code behaves in your connection over say 30s or 1 min of loading.

@tievolu (and others familiar with the history underlying this thread) I am delighted that you chose to share with all. Thank you very much indeed! For those not familiar with the historical thread, @tievolu had already come up with many of the cool concepts that were only later introduced or experimented with. The code base is Perl so nice and snappy compared to CPU hungry bash.

@rosbeef you might like to give @tievolu's code a try there - looks pretty simple to test out. BTW what device do you have? You may want to keep an eye on CPU just to make sure the bash approach isn't eating up too much. On my RT3200 it's about 3-5% so no problem, but if you increase the ping response frequency the CPU usage will increase. Just something to keep an eye on.

@colo also started a pretty cool and simple awk-based implementation - @colo would you like to share that for people to test too? Last time I tested it it seemed to work. Again I think it was also easy to test out.


I updated the OP to provide links to the Bash, Lua, and Perl implementations.

@tievolu - I skimmed your github repo. It looks good, except that it would be helpful to include the installation instructions from (CAKE w/ Adaptive Bandwidth - #45 by tievolu) in the README.md file. Thanks!


If Xfinity has a DOCSIS 3.1 plant, then they have solved their bufferbloat issues using RED at the CMTS and modem level. But this script should also work with cable internet if you think peak usage in your neighborhood slows everybody down. But I suspect even if you just use normal SQM you'll be sitting pretty. It just depends how stable your VDSL setup is and if you are happy with it.

If I were you I would go with cable, as it generally has lower latency than VDSL. To me that is king, but other factors like price might dissuade you.


Good point! Done :+1:


Thanks for keeping my approach in mind! I think it needs quite a bit of attention before it could be ready for general testing (the configuration suggester needs some love to make getting started more streamlined and easy at a minimum), and I would also like to implement upstream shaper control before positioning it as an alternative to the existing and well-working solutions in the problem space :slight_smile:


My friends who have cable seem to have better latency than me, but I've always preferred consistent bandwidth. So if adaptive SQM can mitigate that, it would win out for me.

Thanks for your input and advice.

OK, so I changed the base to 1000 for DL and 2000 for UL.
I also changed the minimum to 500 for both.

Data from an ISO download during free hours (maybe more than you need):

Data from an ISO download during collapsed hours:

Please be patient, my server has 1Gb of bandwidth but just 2G of memory :stuck_out_tongue:


A Turris Omnia (armv7l, 2 GB RAM) with an EC25 modem.
So maybe I have to increase the ping frequency, to be more responsive.

@Lynx pinging me in here motivated me to resume work on my fping/awk-based implementation. Its benefits are a small number of dependencies and a measure-/control-loop that's supposedly rather gentle on the CPU - so I guess it could be a good solution for weaker OpenWrt devices that do not have plenty of compute and flash resources to spare.

Today I got the "configuration wizard" component close to finished state. Now, I would need testers, ideally under many different and diverse kinds of circumstances :slight_smile:

If you are willing to participate, all you should need is an OpenWrt device with SQM cake already set up (that is, you picked a device and set a rate to shape to via luci-app-sqm) and the fping package installed. Once you have that, please download https://johannes.truschnigg.info/code/sqm_lagthrottle/sqm_lagthrottle_suggest_config.sh and give it a whirl.

I would very much appreciate your feedback on the "user experience" of navigating the wizard, as well as reports on its success/failure and the configuration it generated for you, along with a short summary of your connection type/specs (bandwidth, best-case latency for domestic connections) and country of residence. If you could post that kind of information here, replying to this post, I'd be very grateful! :slight_smile:

The other components that evaluate the generated configuration are available for download at https://johannes.truschnigg.info/code/sqm_lagthrottle/ - upstream bandwidth shaping still isn't implemented, but if there is sufficient interest in my approach, I will make that happen :wink:


Maybe it is good on powerful devices too, for the planet: each instruction consumes power.

I can test under different and diverse circumstances. But I'm not sure I understand what your script is supposed to do. Does it work on top of cake-autorate?

Sorry that my request/posting wasn't clear enough! :slight_smile:

I wrote a configuration wizard for my dynamic SQM rate adjuster, and I want to collect data points to see if the configuration it generates for different users and devices makes sense.

To help me do that, please download sqm_lagthrottle_suggest_config.sh and run it, like so:

wget 'https://johannes.truschnigg.info/code/sqm_lagthrottle/sqm_lagthrottle_suggest_config.sh'
chmod +x ./sqm_lagthrottle_suggest_config.sh
./sqm_lagthrottle_suggest_config.sh

After the script is done, you will find a file in the current working directory, named something like sqm_lagthrottle.conf.SOMEHIGHNUMBER. Please copy its contents and post it here while also describing your connection (LTE/5G, cable/DSL, what latency you consider good, etc.), and if you found the questions and options presented understandable and easy to use. Please also report any failures!

If you want to actually test the program that re-configures SQM cake to deal with increasing latency under variable-bandwidth conditions according to that configuration file, you will have to download two other, additional files: https://johannes.truschnigg.info/code/sqm_lagthrottle/sqm_lagthrottle.sh and https://johannes.truschnigg.info/code/sqm_lagthrottle/__sqm_lagthrottle.awk

Place them in the same directory as the generated configuration file (sqm_lagthrottle.conf.SOMEHIGHNUMBER), rename the configuration file to sqm_lagthrottle.conf, run chmod +x sqm_lagthrottle.sh to mark it executable, and then run ./sqm_lagthrottle.sh - that will load the generated configuration file and steer downstream bandwidth according to the measured latency on your Internet uplink.


@colo I have not tried your work, but have a non-zero motivation to start my own project on this, due to disagreements with the existing ones. Since the heuristic controlling the choice of which direction to blame for the bufferbloat is one of my disagreements with @Lynx, and you are going to experiment anyway, let me just share my considerations.

The root of the problem is that, given only regular pings, one cannot be sure whether the bloat is in the upstream or downstream direction. @Lynx's implementation, upon encountering bufferbloat, resets the shaper speed in both directions to 90% of the achieved rate at the moment, under the mistaken but apparently necessary assumption that the other direction would be quick to recover. Yes I understand that nobody has suggested anything better except OWD.

Here is a bad sequence of events. Assume that the minimum speed is set to 200 kbps (because it is that bad in the Philippines around 9PM) in both directions. The base rate is 500 kbps, but even 10000 kbps is sometimes achievable. There are 4 reflectors, and the reflector ping interval is 4 seconds, so that there is only 1 ping per second total. With such low speeds, resetting to the minimum just because "there is no other idea" is a painful event that takes a lot of time to recover from, and must be avoided, even at the cost of some bloat getting through.

  1. Watch a few YouTube videos on a good day, see how the DL shaper rate is steadily increasing to, let's say, 4200 kbps, while the UL is still 500 kbps.

  2. Upload a few photos to Google Drive. Let's say that 3000 kbps upload is achieved before there is bufferbloat.

  3. The script detects bufferbloat. At this point, the achieved rate is around 3000 kbps up and something less than 200 kbps down. Result: the script will restrict the upload to 2700 kbps and download to the miserable worst-case 200 kbps.

This result (that performing an upload totally kills the well-earned download speed) is the reason why I patch @Lynx's script to never try resetting the shaper rate in the idle direction (but still reset the epoch).

This works well for a one-user scenario, because there is never simultaneous uploading and downloading, but this will fail if, during the upload of photos, there is indeed a concurrent slow download (e.g. internet radio).
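To make the sequence above concrete, here is a minimal shell sketch of the two behaviours being compared: cutting both directions to 90% of the achieved rate (floored at the minimum), and the patched variant that leaves an idle direction alone. It is an illustration only, not code from cake-autorate. The function names, the 300 kbps idle threshold, and the 150 kbps figure standing in for "something less than 200 kbps" are assumptions.

```sh
#!/bin/sh
# Illustrative only: hypothetical helpers, not taken from cake-autorate.

# New shaper rate after bufferbloat: 90% of the achieved rate,
# but never below the configured minimum.
# $1 = achieved rate (kbps), $2 = minimum rate (kbps)
rate_after_bloat() {
    local new=$(( $1 * 90 / 100 ))
    if [ "$new" -lt "$2" ]; then new=$2; fi
    echo "$new"
}

# Patched variant: keep the current shaper rate if this direction is idle.
# $1 = achieved rate, $2 = current shaper rate, $3 = minimum rate,
# $4 = idle threshold (assumed parameter); all in kbps
rate_after_bloat_skip_idle() {
    if [ "$1" -lt "$4" ]; then
        echo "$2"
    else
        rate_after_bloat "$1" "$3"
    fi
}

# Scenario from the post: 3000 kbps achieved up, about 150 kbps down,
# download shaper at 4200 kbps, minimum 200 kbps.
echo "UL:          $(rate_after_bloat 3000 200)"
echo "DL, default: $(rate_after_bloat 150 200)"
echo "DL, patched: $(rate_after_bloat_skip_idle 150 4200 200 300)"
```

With these numbers the default behaviour drops the download shaper from 4200 kbps to 200 kbps, while the patched variant keeps it at 4200 kbps. The concurrent slow download case described above is exactly where the idle test would misjudge the situation.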

I tried investigating some ideas in my head, but so far, without OWD, have nothing better. One of the ideas investigated was to restrict the ratios of the rates. E.g., given that we achieve 3000 kbps up, assume that we have at least 1500 kbps down, and do not set the down shaper below that. Similarly, if we achieve 4200 kbps down, assume that we have at least 2100 kbps up, and don't set the shaper below that. Obviously this needs to be generalized to break the symmetry - we need two achieved-to-assumed rate ratios, one for going from achieved upload rate to assumed-to-be-achievable download rate, and the other one for the other way round. But @Lynx's approach already has too many tweakable parameters, that's why I am not really fond of this idea.
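The ratio idea can be sketched as follows, using the figures from the paragraph above. The two percentage parameters are the "achieved-to-assumed" ratios; their names and the 50% values are assumptions chosen to reproduce the 3000 -> 1500 and 4200 -> 2100 examples, not settings from any existing script.

```sh
#!/bin/sh
# Illustrative only: hypothetical names and ratios.

UL_TO_DL_PERCENT=50   # achieved upload   -> assumed achievable download
DL_TO_UL_PERCENT=50   # achieved download -> assumed achievable upload

# Floor for the opposite direction.
# $1 = achieved rate (kbps), $2 = ratio in percent
assumed_floor() {
    echo $(( $1 * $2 / 100 ))
}

# Never let a proposed shaper rate drop below the floor.
# $1 = proposed rate (kbps), $2 = floor (kbps)
clamp_to_floor() {
    if [ "$1" -lt "$2" ]; then echo "$2"; else echo "$1"; fi
}

# 3000 kbps achieved up: the download shaper may not go below 1500 kbps,
# even though the bufferbloat response proposed 200 kbps.
dl_floor=$(assumed_floor 3000 "$UL_TO_DL_PERCENT")
echo "DL shaper: $(clamp_to_floor 200 "$dl_floor")"

# 4200 kbps achieved down: the upload shaper may not go below 2100 kbps.
ul_floor=$(assumed_floor 4200 "$DL_TO_UL_PERCENT")
echo "UL shaper: $(clamp_to_floor 500 "$ul_floor")"
```

Setting the two percentages to different values breaks the symmetry, which is where the two extra tunables mentioned above come from.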

EDIT: on a good day such as today, here is what's achieved without SQM: https://www.waveform.com/tools/bufferbloat?test-id=42ad86e8-3041-4a4c-aef5-737b9aff8167. Also please take into account that this LTE connection is only a backup. My primary connection is using GPON.


I tried the sqm_lagthrottle_suggest_config.sh script. Result:

awk: cmd. line:30: Division by zero

@patrakov working with OWD is desirable, it is just that as far as I am aware there is no good existing solution in OpenWrt for that.

For ICMP type 13, nping offers OWD functionality but is broken, hping3 seemed good but isn't an official OpenWrt package, I lack the skills to make it one, and there is insufficient motivation on the part of those with the skills to do so to make it one. If you do then would you be willing to? @Lochnair already made a makefile that works for building an OpenWrt package in a build environment.

ICMP type 13 is also a little unreliable. Not all reflectors support it and some do weird stuff with the offsets. It seems to be not very well liked, perhaps in part because at one time it was used to try to gauge reflector CPU load. I don't mean to write it off, I mean it could be fine - it's just that it's not without issues.
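For context, an ICMP timestamp reply carries three timestamps (originate, receive, transmit), each in milliseconds since midnight UT. Here is a minimal sketch of how per-direction delay changes could be derived from them, assuming the timestamps have already been extracted by some tool. The function names and sample values are hypothetical. Because the local and reflector clocks are not synchronised, the raw differences contain an unknown offset, so only the change against a baseline is meaningful.

```sh
#!/bin/sh
# Illustrative only: hypothetical helpers operating on already-parsed timestamps.

MS_PER_DAY=86400000

# Signed difference b - a in ms, normalised to +/- half a day so that
# a wrap at midnight does not produce a huge value.
# $1 = a, $2 = b
ts_diff() {
    echo $(( ( ( $2 - $1 ) % MS_PER_DAY + MS_PER_DAY + MS_PER_DAY / 2 ) % MS_PER_DAY - MS_PER_DAY / 2 ))
}

# Raw uplink value: reflector receive time minus local originate time.
raw_up()   { ts_diff "$1" "$2"; }
# Raw downlink value: local arrival time minus reflector transmit time.
raw_down() { ts_diff "$1" "$2"; }

# Delay increase relative to an unloaded baseline.
# $1 = current raw value, $2 = baseline raw value
owd_delta() {
    echo $(( $1 - $2 ))
}

# Example: the reflector clock is 30 ms behind the local clock.
base_up=$(raw_up 1000 970)       # -30 (clock offset dominates)
base_down=$(raw_down 970 1055)   #  85
cur_up=$(raw_up 5000 5010)       #  10
cur_down=$(raw_down 5010 5097)   #  87

echo "uplink delta:   $(owd_delta "$cur_up" "$base_up") ms"
echo "downlink delta: $(owd_delta "$cur_down" "$base_down") ms"
```

In this example the uplink delta is 40 ms and the downlink delta is 2 ms, which would point at the upload direction. Reflectors that fill in non-standard values would break the baseline, which is the kind of issue mentioned above.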

Working with ntp packets also looks cool, but again there is no tool to do that save for one that allows one to be sent at a minimum interval of one second.

Is there an OpenWrt tool we can jump over to, short of crafting the packets ourselves?

If we can find a way I'd be delighted to jump over to that but I haven't yet seen any viable option.

I have also wondered about knocking down the direction with highest load, see if that works, and if not knock down both rates. Alternatively maintain an ntp baseline in the background, and, upon detection of bufferbloat, query ntp to work out the direction and punish that direction.
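The first of those ideas could look roughly like the sketch below: on the first detection, reduce only the direction with the highest load, and if bufferbloat persists, reduce both. Here "load" is taken to be the achieved rate as a percentage of the current shaper rate, which is an assumption, as are the function names and the sample rates.

```sh
#!/bin/sh
# Illustrative only: hypothetical helpers, not part of any existing script.

# Load as a percentage of the current shaper rate.
# $1 = achieved rate (kbps), $2 = shaper rate (kbps)
load_percent() {
    echo $(( $1 * 100 / $2 ))
}

# Decide which direction(s) to reduce.
# $1 = DL achieved, $2 = DL shaper, $3 = UL achieved, $4 = UL shaper,
# $5 = consecutive bufferbloat detections so far (1 = first)
directions_to_reduce() {
    local dl_load ul_load
    dl_load=$(load_percent "$1" "$2")
    ul_load=$(load_percent "$3" "$4")
    if [ "$5" -gt 1 ]; then
        echo "both"          # first attempt did not help
    elif [ "$ul_load" -gt "$dl_load" ]; then
        echo "ul"
    else
        echo "dl"
    fi
}

# Upload-heavy example: 150 of 4200 kbps down, 3000 of 3300 kbps up.
echo "first detection:  $(directions_to_reduce 150 4200 3000 3300 1)"
echo "second detection: $(directions_to_reduce 150 4200 3000 3300 2)"
```

The trade-off is that a wrong first guess costs one extra detection cycle of bufferbloat before both rates are reduced.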

Also the lua approach works with OWDs - did you try that? I think at the moment you have to build a custom package in the build environment for that to work.

@colo maybe you have some ideas here?

Oh my, how embarrassing :sweat_smile: Can you check if there's still leftover input files from the stats collection phase in /tmp, please? Full pathnames should match /tmp/fping_*.out* - if you could upload/nopaste them somewhere and shoot me a message with the link, that would be great and help me nail down the root of the problem. Thanks a lot for spending time on my stuff!

Concerning your other remarks: I am afraid that with such a low feedback rate (only one measurement per second, as you described) and so little minimum bandwidth (down to 200Kbps), it's going to be VERY hard to arrive at a satisfactory result with a solution that tried to be useful for the "average case" of variable link speed that I've known from LTE links around here... That said, my implementation does not shape upstream at all yet, but features some knobs for users to tweak to affect its decisions at runtime. Now that I've read your concerns, I'm pretty sure it needs a third one, which would affect granularity of the bandwidth slot ladder that is computed between the adjustable minimum and maximum bandwidth that the shaper will cycle through...
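As an illustration of what such a granularity knob would control, the sketch below builds a ladder of evenly spaced bandwidth slots between a minimum and a maximum. The linear spacing, the function name, and the slot counts are assumptions; sqm_lagthrottle may compute its ladder differently.

```sh
#!/bin/sh
# Illustrative only: a hypothetical linear ladder, not sqm_lagthrottle's code.

# Print one slot per line, from minimum to maximum.
# $1 = minimum (kbps), $2 = maximum (kbps), $3 = number of slots (>= 2)
bandwidth_ladder() {
    local i=0
    while [ "$i" -lt "$3" ]; do
        echo $(( $1 + ( $2 - $1 ) * i / ( $3 - 1 ) ))
        i=$(( i + 1 ))
    done
}

# Coarse ladder for a 200..10000 kbps link: 5 slots.
bandwidth_ladder 200 10000 5
```

With 5 slots the steps are 2450 kbps apart (200, 2650, 5100, 7550, 10000). A finer ladder gives smaller steps near the low end, which matters when the link can drop to 200 kbps, but takes more iterations to climb back up.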
