CAKE w/ Adaptive Bandwidth [August 2022 to March 2024]

cake-autorate adjusts the bandwidth of cake instances, which must already have been instantiated - see:

There are always separate instances for download and upload, and so setting the download and upload interface to the same interface does not make sense.

So two questions arise.

Firstly, have you properly instantiated cake instances for download and upload?

Secondly, what are the interfaces on which the cake instances have been applied?

The answer to both questions can be determined by running 'tc qdisc ls'. If you could run that command and paste the output in a response on this thread then we can advise further.
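As a rough aid for reading that output, the sketch below pulls out the interfaces that carry a cake qdisc. The function name `list_cake_ifaces` is made up for illustration, and it assumes the usual `qdisc <kind> <handle> dev <interface> ...` line layout that `tc qdisc ls` prints.

```shell
# Sketch: read 'tc qdisc ls' output on stdin and print each interface
# that has a cake qdisc. list_cake_ifaces is a hypothetical helper name.
list_cake_ifaces() {
    awk '$1 == "qdisc" && $2 == "cake" {
        # the interface name is the field that follows "dev"
        for (i = 3; i < NF; i++) if ($i == "dev") print $(i + 1)
    }'
}

# Demo on sample lines (no root or tc needed):
printf '%s\n' \
    'qdisc noqueue 0: dev lo root refcnt 2' \
    'qdisc cake 808f: dev wan root refcnt 2 bandwidth 20Mbit' \
    'qdisc cake 8090: dev ifb-wan root refcnt 2 bandwidth 20Mbit' |
    list_cake_ifaces
```

On a router this would be used as `tc qdisc ls | list_cake_ifaces`. Two different interface names in the result suggest that separate upload and download instances exist.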

That is because Linux will only instantiate a qdisc (like cake) on an egress interface (which for WAN would only cover the upload traffic). So to be able to attach qdiscs to ingress traffic we need to somehow convert what is ingress traffic into some sort of egress traffic (as far as the kernel is concerned, ingress and egress only matter with respect to a given interface, so internet download traffic is ingress for WAN, but egress for LAN).
There are three more or less common methods to achieve that:
A) use the kernel's intermediate functional block (IFB) device to create an interface copy of the egress interface that can deal with ingress traffic (that is what sqm-scripts does)
B) create a veth pair and a matching routing table to redirect ingress traffic over that pair and then attach the qdisc to the egress half of that pair
C) simply instantiate the ingress shaper on a LAN interface (but that only works for wired-only routers where the router itself generates next to no traffic)

What all three have in common is that, from the kernel's view, there will be two different interfaces: one for egress and one for ingress. So "overly complex" might be right, but "rare" seems incorrect, at least for users of ingress traffic shaping/AQM.
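To make method A more concrete, here is a dry-run sketch that only prints the kind of commands involved and does not execute anything. The interface names `wan` and `ifb-wan`, the 20Mbit rate and the helper name `print_ifb_setup` are illustrative assumptions. sqm-scripts does considerably more than this, so treat it as an outline rather than what sqm-scripts actually runs.

```shell
# Dry-run sketch of method A (IFB): print the commands, do not run them.
# All names and rates below are illustrative assumptions.
print_ifb_setup() {
    wan_if="$1"; ifb_if="$2"; rate="$3"
    cat <<EOF
ip link add name ${ifb_if} type ifb
ip link set dev ${ifb_if} up
tc qdisc add dev ${wan_if} handle ffff: ingress
tc filter add dev ${wan_if} parent ffff: protocol all u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ${ifb_if}
tc qdisc add dev ${ifb_if} root cake bandwidth ${rate} ingress
tc qdisc add dev ${wan_if} root cake bandwidth ${rate}
EOF
}

print_ifb_setup wan ifb-wan 20Mbit
```

The output shows the two-interface picture described above: one cake instance on the WAN interface for upload, and one on the IFB for download.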

I would recommend you install and configure sqm-scripts/luci-app-sqm first (opkg update; opkg install luci-app-sqm), then look at:

and maybe

for the configuration and once you have a working sqm installation, then deal with cake-autorate.

Thanks for the information guys. Apologies, it might be worth adding some additional notes about this. I had CAKE installed and configured, but didn't have it enabled for that interface, so "tc qdisc ls" didn't return much. I guess it's not really an "interface" in the traditional OpenWrt sense (it's more like a qdisc interface), hence my confusion. I'll try it again soon and see how we go.

An IFB will show up in ifconfig output so looks like a real interface in Linux....

Thanks guys, I've updated my config. Quick question though - if I update my config, will the autorate service use my new config automatically?

Lastly, how do I check what version I have? I can see there is a 2.0 version, but GitHub only links to the 1.2 version as a release.

cake-autorate will use whatever config file is placed in the running directory /root/cake-autorate at the time the script is launched. If you've either run the setup script or manually downloaded the files from the master branch then you've got the most up-to-date version. I should look into having the setup script write out the latest commit identifier into a version file.

I've still not actually released 2.0.0. Maybe it's time now as everything seems super stable again! Versioning is not my favourite aspect of working on cake-autorate :smile:.

Once you have it running I'd encourage you to obtain a log file showing a couple of speed tests and upload it for us to take a look and make a plot and verify all looks in order.

It seems the complexity of the below was not actually needed. So I'll not introduce the associated complexity into cake-autorate.

Summary

In the light of my recent foray into overwriting ECN bits, which seems desirable in certain circumstances, and facilitating the same in cake-qos-simple - see here:

CAKE w/ DSCPs - cake-qos-simple - #164 by Lynx

it struck me as worthwhile to facilitate working with unusual cake instantiations in respect of the tc change calls in which the cake qdisc is not necessarily placed at root, but instead placed at a specific parent band.

Here is the output from 'tc qdisc ls' on my router when using the new capability of 'cake-qos-simple' to overwrite the ECN bits with '0' on upload and download:

root@OpenWrt-1:~# tc qdisc ls
qdisc noqueue 0: dev lo root refcnt 2
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1518 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64
qdisc noqueue 0: dev lan1 root refcnt 2
qdisc noqueue 0: dev lan2 root refcnt 2
qdisc noqueue 0: dev lan3 root refcnt 2
qdisc noqueue 0: dev lan4 root refcnt 2
qdisc prio 1: dev wan root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc cake 808f: dev wan parent 1:1 bandwidth 20Mbit diffserv4 triple-isolate nat wash ack-filter split-gso rtt 100ms noatm overhead 0
qdisc ingress ffff: dev wan parent ffff:fff1 ----------------
qdisc noqueue 0: dev br-guest root refcnt 2
qdisc noqueue 0: dev br-lan root refcnt 2
qdisc noqueue 0: dev wlan1 root refcnt 2
qdisc noqueue 0: dev wlan0 root refcnt 2
qdisc noqueue 0: dev wlan0-1 root refcnt 2
qdisc noqueue 0: dev wlan0.sta8 root refcnt 2
qdisc noqueue 0: dev wlan0.sta9 root refcnt 2
qdisc noqueue 0: dev wlan1.sta1 root refcnt 2
qdisc noqueue 0: dev wlan1.sta2 root refcnt 2
qdisc cake 8090: dev ifb-wan root refcnt 2 bandwidth 20Mbit diffserv4 triple-isolate nat nowash ingress no-ack-filter split-gso rtt 100ms noatm overhead 0

I have implemented this in cake-autorate like so:

https://github.com/lynxthecat/cake-autorate/compare/master...lynxthecat-patch-1

So there are new defaults:

dl_if_tc_parent="root" # the download interface tc parent on which cake is applied (normally root)
ul_if_tc_parent="root" # the upload interface tc parent on which cake is applied (normally root)

And these can be overridden in the config like so (to match my cake instantiations as shown above):

ul_if_tc_parent="parent 1:1" # the upload interface tc parent on which cake is applied (normally root)

Does this seem OK @moeller0 and @dave14305?

In terms of nomenclature, I think this is reasonable:

ul_if_tc_parent="parent 1:1" # the upload interface tc parent on which cake is applied (normally root)
tc qdisc change dev "${interface}" ${tc_parent} cake bandwidth "${shaper_rate_kbps}Kbit" 2> /dev/null

because the tc parent can be either root or say a specific band. See, for example:

https://man7.org/linux/man-pages/man8/tc.8.html#TC_COMMANDS
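For illustration only, the sketch below shows how the proposed `tc_parent` value would slot into the `tc qdisc change` call quoted above. It prints the command instead of running it. `build_tc_change` is a hypothetical helper, not a function from cake-autorate, and as noted at the top of this post the proposal was ultimately not adopted.

```shell
# Sketch: compose (and only print) the 'tc qdisc change' call for a
# given tc parent. tc_parent is either "root" or "parent <handle>".
# build_tc_change is a hypothetical name used for this example.
build_tc_change() {
    interface="$1"; tc_parent="$2"; shaper_rate_kbps="$3"
    printf 'tc qdisc change dev %s %s cake bandwidth %sKbit\n' \
        "$interface" "$tc_parent" "$shaper_rate_kbps"
}

build_tc_change wan "parent 1:1" 20000   # cake attached to a prio band
build_tc_change ifb-wan root 20000       # cake attached at root (the usual case)
```

Note that the value carries the `parent` keyword itself, which is why the default can simply be `root`.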

Is it reasonable (or plausible) to have periods of increased latency which are independent of my bandwidth usage?

For instance, during high-traffic hours, suppose my total bandwidth usage (up & down) is zero except for pings. Could my latency still suffer due to network conditions upstream from me?

If so, what is a reasonable expectation of performance?
For context, my ISP is a smallish Fiber provider in rural New England and I have what's advertised as a symmetrical 1Gb connection.

Sure, odd things can happen... e.g. a small ISP might have a direct path to an internet exchange that typically is used with a delay of X ms, but in primetime this link might be constantly overloaded and so some connections/customers get routed via a secondary path of different length, resulting in a statically different delay to the same targets. If the path difference happens inside, say, an MPLS network, this would be really hard to diagnose for end customers...

This is however a hypothetical answer, no idea what your ISP is "cooking" there.

cake-autorate version 3.0.0 release


What's Changed

  • This version restructures the bash code for improved robustness, stability and performance (@lynxthecat and @rany2).
  • Employ FIFOs for passing not only data, but also instructions, between the major processes, obviating costly reliance on temporary files. A side effect of this is that now /var/run/cake-autorate is mostly empty during runs (@lynxthecat).
  • Significantly reduced CPU consumption - cake-autorate can now run successfully on older routers (@lynxthecat and @rany2).
  • Introduce support for one way delays (OWDs) using the 'tsping' binary developed by Lochnair. This works with ICMP type 13 (timestamp) requests to ascertain the delay in each direction (i.e. OWDs) (@lynxthecat).
  • Many changes to help catch and handle or expose unusual error conditions (@rany2).
  • Fixed eternal sleep issue (@rany2).
  • Introduce more user-friendly config format by introducing defaults.sh and config.X.sh with the basics (interface names, whether to adjust the shaper rates and the min, base and max shaper rates) and any overrides from the defaults defined in defaults.sh (@rany2).
  • More intelligent check for another running instance (@rany2).
  • Introduce more user-friendly log file exports by automatically generating an export script and a log reset script for each running cake-autorate instance inside /var/run/cake-autorate/*/ (@lynxthecat).
  • Added config file validation that checks all config file entries against those provided in defaults.sh (@rany2).
  • Improved installer and new uninstaller (@rany2).
  • Updated Octave plotter (@moeller0).
  • Updated documentation (@richb-hanover and @lynxthecat).
  • Many more fixes and improvements (@lynxthecat and @rany2).
  • Incorporates many ideas and suggestions by @moeller0 and @patrakov.

Full Changelog: v1.2.1...v3.0.0
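As background to the OWD item above: an ICMP timestamp reply carries three timestamps in milliseconds since midnight UTC (originate, receive, transmit), and the sender notes a fourth when the reply arrives. The sketch below shows the arithmetic that yields per-direction delays from those four values. It is an assumption about how such delays can be derived, not tsping's or cake-autorate's actual code, and it ignores midnight rollover.

```shell
# Sketch: derive one-way delays from ICMP timestamp values (all in ms).
#   originate = local clock when the request was sent
#   receive   = remote clock when the request arrived
#   transmit  = remote clock when the reply was sent
#   finished  = local clock when the reply arrived
# owd_from_timestamps is a hypothetical helper name.
owd_from_timestamps() {
    originate="$1"; receive="$2"; transmit="$3"; finished="$4"
    ul_owd=$(( receive - originate ))   # upload direction
    dl_owd=$(( finished - transmit ))   # download direction
    echo "${ul_owd} ${dl_owd}"
}

owd_from_timestamps 1000 1012 1013 1030   # prints "12 17"
```

Because the local and remote clocks are generally not synchronised, each value includes an unknown clock offset. The absolute numbers are therefore not meaningful on their own, but changes relative to a baseline still show in which direction delay is building up.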

And here is a tiny further release to fix the documentation URL for the setup script:

To install this version just issue:

wget -O /tmp/cake-autorate_setup.sh https://raw.githubusercontent.com/lynxthecat/cake-autorate/v3.0/setup.sh
sh /tmp/cake-autorate_setup.sh

For any who just want to try cake-autorate, an uninstaller has now also been provided to remove the few files that are added by the installer.

There is some code to migrate a previous 2.0.0 install and config, but I'd recommend backing up any previous config(s) before installing version 3.0.1.

There is now rather strict config file validation that checks any entry against defaults.sh to verify: firstly, that the key relates to a configurable parameter; and secondly, that the value is the correct type out of integer, float, string, array, etc. Guidance is given to help the user identify and fix any problematic entries.

The values set for dl_delay_thr and ul_delay_thr now need to be of type float. Just add '.0' to the end of any entries for these from previous configs in which these were set as integers. So, for example, '30' would become '30.0'.
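A minimal sketch of the kind of type check described above, together with the integer-to-float fix for old configs. The helper names `value_type` and `int_to_float` are hypothetical, and the real validation in cake-autorate may work differently (it also handles arrays, which are left out here).

```shell
# Sketch: classify a config value as integer, float or string.
# value_type and int_to_float are hypothetical helper names.
value_type() {
    if printf '%s' "$1" | grep -Eq '^-?[0-9]+$'; then
        echo integer
    elif printf '%s' "$1" | grep -Eq '^-?[0-9]+\.[0-9]+$'; then
        echo float
    else
        echo string
    fi
}

# Append ".0" to integer values; leave everything else untouched.
int_to_float() {
    if [ "$(value_type "$1")" = integer ]; then
        echo "$1.0"
    else
        echo "$1"
    fi
}

value_type 30        # prints "integer"
int_to_float 30      # prints "30.0"
```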

I know this might be a bit off topic relative to the previous post, but I've just recently discovered that my ISP has a variable overhead of 7-14% of the packet size at any given time. (I'm not entirely sure what they meant exactly. This is from word of mouth, and I don't know if they're just trying to get me to drop the matter of the excessive ping spikes that even cake can't seem to manage [up to around 3000ms now; they also appear to "DNS hijack", so setting a custom DNS appears not to help as there will still be a DNS leak to the ISP-provided servers].)

This is with an FTTH-GPON-MAPe (with draft-ietf-softwire-map-00) connection.

My speed appears to sometimes go from 128kbps up to even 1.5Gbps. Would it be possible for the script to set the overhead to a percentage of the rate set by the autorating, to see if that would improve the latency a bit more?

I'll defer to @moeller0 here.

That is not really how that works: the per-packet overhead tends to be independent of packet size, so the percentage overhead increases when packet size decreases... That said, it is well possible that you will see larger overhead for IPv4 versus IPv6.
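A quick worked example of that point. The 40-byte overhead is a made-up figure chosen only to show the arithmetic, not a claim about any particular link, and `overhead_percent` is a hypothetical helper.

```shell
# Sketch: a fixed per-packet overhead expressed as a percentage of
# packet size. Arguments: overhead in bytes, packet size in bytes.
overhead_percent() {
    awk -v o="$1" -v s="$2" 'BEGIN { printf "%.1f\n", 100 * o / s }'
}

overhead_percent 40 1500   # prints "2.7"
overhead_percent 40 500    # prints "8.0"
overhead_percent 40 100    # prints "40.0"
```

The same fixed overhead is under 3% of a full-size packet but 40% of a 100-byte one, so a single percentage figure cannot describe it across packet sizes.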

Now, I admit, I lack first-hand experience with both GPON and MAP, so your ISP might well be correct (except the part that overhead is a constant fraction of packet size).

That said, are your ping spikes in the outgoing/egress or incoming/ingress direction?

Maybe post the content of:
cat /etc/config/sqm # assuming you use sqm-scripts
tc -s qdisc
as well as a screenshot of a speedtest at:

I will try to help but am traveling ATM away from real computers so my responses will likely be delayed and not well researched....

I wish I could say that the results I've gotten on the speedtest hold 24/7, but sadly they don't, and when it spikes to 3000ms it's... not fun.

config queue 'eth1'
        option enabled '1'
        option interface 'eth1'
        option download '450000'
        option upload '450000'
        option qdisc 'cake'
        option script 'piece_of_cake.qos'
        option qdisc_advanced '1'
        option ingress_ecn 'ECN'
        option egress_ecn 'ECN'
        option qdisc_really_really_advanced '1'
        option itarget 'auto'
        option etarget 'auto'
        option linklayer 'ethernet'
        option debug_logging '0'
        option verbosity '5'
        option squash_dscp '0'
        option squash_ingress '0'
        option iqdisc_opts 'nat dual-dsthost diffserv4 ingress ack-filter'
        option eqdisc_opts 'nat dual-srchost diffserv4 egress ack-filter'
        option overhead '56'
        option linklayer_advanced '1'
        option tcMTU '2047'
        option tcTSIZE '128'
        option tcMPU '84'
        option linklayer_adaptation_mechanism 'default'

Now it's normally 3000ms on the ingress... it's variable depending on the time of day.

I would not use ACK filter for the download direction, but this certainly is not related to your issues...
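If you want to try that, the sketch below strips a single keyword from an option string such as the `iqdisc_opts` value in the config above. `remove_qdisc_opt` is a hypothetical helper. The commented `uci` lines assume the sqm section is named `eth1` as in that config.

```shell
# Sketch: remove one keyword from a space-separated qdisc option string.
# remove_qdisc_opt is a hypothetical helper name.
remove_qdisc_opt() {
    opts="$1"; drop="$2"; out=""
    for word in $opts; do
        # keep every word except the one to drop
        [ "$word" = "$drop" ] || out="${out:+$out }$word"
    done
    printf '%s\n' "$out"
}

remove_qdisc_opt 'nat dual-dsthost diffserv4 ingress ack-filter' ack-filter
# prints "nat dual-dsthost diffserv4 ingress"

# On the router the result could then be applied with something like:
#   uci set sqm.eth1.iqdisc_opts='nat dual-dsthost diffserv4 ingress'
#   uci commit sqm
#   /etc/init.d/sqm restart
```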

About 3 second spikes, sure that is poison for anything more interactive than correspondence chess....

I'm not familiar with fiber, but from what little I've heard about it, is it possible that the endpoint's receiver is taking too long to separate the channels, regardless of PPPoE/IPoE connections? (They claim that PPPoE causes the latency spikes, however they appear to be there with both IPoE and PPPoE... it is just that IPoE has faster, more consistent speeds.)
Without ack-filter on the ingress, I get weird errors related to... gaming (like stuttering, weird error corrections... I'm not entirely sure if the cause is ACKs or something else).

That would be a broken receiver... For download all CPE/ONTs/modems receive the same packets but only the correct/targeted ONT will have the correct key to decrypt the data, and for upload each ONT needs to first request transmit slots from the OLT before it can send data (the request/grant traffic to request those timeslots is organized as a second logical channel using the same frequency IIRC).
Sure, PPPoE is more demanding than IP as it essentially is a tunneling protocol that requires transformation of data frames on send and receive, but that is something that can be 'solved' by using a router with powerful enough CPU(s) and/or PPPoE offloading capabilities.

In my limited experience PPPoE, if configured competently enough, will not cause reliable latency spikes in the multi-second range by itself.

The ACK filter really should not help noticeably on ingress; if it does, something else seems off.... Maybe it is time to get some packet captures?

@moeller0 curious issue here.

We have two televisions: an LG and a Samsung. Both are connected via the 2.4GHz guest network provided by wirelessly connected extension APs, and from testing it seems that provides circa 60Mbit/s download for clients.

Now for some reason when cake with autorate is enabled (I'm not sure if the autorate is relevant yet, but I think not, since I didn't see throttling kicking in and cake set to 60Mbit/s download seems problematic), when watching one particular series on Prime that requires higher than normal bandwidth (bursts of circa 40Mbit/s), Prime Video on the LG reduces the streaming quality whenever cake is enabled. But this doesn't happen on the Samsung. The WiFi signal strengths of the LG and the Samsung are both excellent. When cake is disabled the series reliably streams at full quality on the LG.

Is there some kind of interaction between cake and WiFi when the bandwidths of both are around the same level? Is there some kind of buffering issue that's relevant here?

To be clear, it seems when the episode starts there is an initial sort of testing of the connection whereupon the viewing quality for the remainder of the session is determined. On the LG cake seems to interfere with this such that the quality is downgraded. But this doesn't happen on the Samsung!

Any idea what's happening here? An obvious thing to try seems to be to switch the LG to the much higher bandwidth 5GHz WiFi and see if that makes the issue go away. But if it did I wouldn't understand why since no cake on 2.4GHz works fine.

I can only speculate... The way modern adaptive streaming mostly works is that the receiving device requests some segment of data (at a given bitrate and quality) and then monitors how long it takes for that data to arrive in its buffers. Based on the low buffer state and potentially the temporal dynamics of the transfer the receiver then decides what bitrate/quality to request for the next segment. There seems to be no standard for the actual numerical criteria used by endpoints to decide what bitrate to request next, so this might simply be a more cautious configuration by LG compared to Samsung, but I have no way to confirm this hypothesis... (I assume that both streams are marked with the same DSCPs and hence use the same WiFi access class).
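To illustrate how two devices could reach different decisions on the same link, here is a toy sketch of a rate-selection rule. Everything in it is hypothetical: the bitrate ladder, the "safety" percentages and the helper names are invented for the example, and no claim is made about what LG, Samsung or Prime Video actually implement.

```shell
# Sketch: throughput of one segment in kbit/s (bytes * 8 / milliseconds).
segment_kbps() {
    echo $(( $1 * 8 / $2 ))
}

# Sketch: pick the highest rung of an ascending bitrate ladder that fits
# within safety_percent of the measured throughput; fall back to the
# lowest rung if none fits. All parameters are hypothetical.
choose_bitrate() {
    measured_kbps="$1"; safety_percent="$2"; shift 2
    budget=$(( measured_kbps * safety_percent / 100 ))
    choice="$1"                      # lowest rung as the fallback
    for rate in "$@"; do
        if [ "$rate" -le "$budget" ]; then choice="$rate"; fi
    done
    echo "$choice"
}

segment_kbps 10000000 2000                          # prints "40000"
choose_bitrate 60000 80 5000 10000 20000 40000      # prints "40000"
choose_bitrate 60000 60 5000 10000 20000 40000      # prints "20000"
```

With the same measured 60Mbit/s, a player that allows itself 80% of the measured rate would pick the 40Mbit/s rung while one that allows only 60% would drop to 20Mbit/s, which is the sort of difference a more cautious configuration could produce.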

My gut feeling is that this question might get more responses if posed in a new thread compared to being placed at the end of a huge thread, especially since I agree that it does not really look like autorate is involved?