CAKE w/ Adaptive Bandwidth [August 2022 to March 2024]

Yes. So your device has a raw IP interface.

EDIT: actually even that is irrelevant. What's relevant is that LTE does not have MAC addresses in the radio packets. So, even though my modem (based on cdc-wdm) does have a MAC address, it's just a dummy needed for the fake DHCP server inside the modem to work, and its size and overhead should be disregarded. Of course, the LTE-specific overhead should be included, but that's on top of raw IP.

Thanks a lot, this is a territory where I have little intuition about what is included or not...

What is added exactly? I would assume the whole L2 frame (see https://en.wikipedia.org/wiki/Ethernet_frame) from the destination MAC address to the frame check sequence (FCS), with the resulting mpu of 64 bytes (so preamble, start-of-frame delimiter and interpacket gap excluded). But is that true, or is the FCS omitted?

Only the 14 byte header is added. It's never transmitted on any link so preamble and FCS don't make much sense. I have no idea how this is accounted for. We might add a VLAN tag too, but that never even makes it into the frame. Does it count?
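(For reference: those 14 bytes are 6 bytes destination MAC + 6 bytes source MAC + 2 bytes EtherType; the FCS would add another 4 bytes, and an 802.1Q VLAN tag 4 more.)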

1 Like

@patrakov and @moeller0, how about this for the new log file handling strategy:

Mmmh, but if the FCS is not transmitted, there is likely some other checksum used and transmitted to detect on-air corruption, and that checksum will use up some of the gross transfer rate, no?
The kernel will only add 14 bytes for true Ethernet as well, as the driver/NIC will add the missing parts, but on-link all of the overhead takes up transmission time... (this is fine, no complaints against the kernel; these 14 bytes are the bytes the kernel handles, so it does know about them and needs to reserve space).

If a VLAN tag is translated into some codepoint of a permanently transmitted data structure/header it does not count; if, however, the on-air size increases when VLANs are used, it should count.

That said, by now it is pretty clear that slightly overestimating the true overhead has little throughput cost (and zero throughput cost with autorate), so I got a bit more relaxed about overhead; if we can deduce it reasonably reliably, take the precise value, otherwise set it to 42-48 and be done with it...
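For instance, a minimal sketch of what that looks like when applying cake directly with tc (the interface name and shaping rate below are placeholders; on OpenWrt the equivalent overhead value would normally go into the SQM configuration):

# hypothetical example: shape wwan0 at 20 Mbit/s and tell cake to account
# for 44 bytes of per-packet overhead on top of each IP packet
tc qdisc replace dev wwan0 root cake bandwidth 20Mbit overhead 44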

So what's our best guess here for the correct LTE overhead to use?

Hard to tell; if you follow the 3GPP links on:

you end up with some not-much-fun-to-read standards documents that leave a lot of encapsulation options to the ISP... hard to tell what is actually used. It seems, however, that the FCS is never encapsulated (but any padding is included in the payload). I still have not fully understood where the checksum ends up...
You can dive into the 3GPP standards documents; maybe you can come to some useful conclusions :wink:

There is an easier way - experimental.

I tried to determine the overhead that applies to the upload direction of my LTE link. To do so, I found an iperf3 server which is hopefully not too far (in India). Then I ran this command:

for x in 1 2 3 4 5 6 7; do
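    # $SERVER holds the address of the remote iperf3 server; each round runs
    # one 10-second UDP test with a 1300-byte payload and one with a 300-byte
    # payload, both offered at 2 Mbit/s (above the link's upload capacity)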
    iperf3 -t 10 -b 2M -l 1300 -u -c $SERVER ; sleep 3
    iperf3 -t 10 -b 2M -l 300 -u -c $SERVER ; sleep 3
done

That is, UDP-based upload speedtests with two different packet sizes, with the sending rate known to exceed the available upload bandwidth, and all forms of SQM off. The sizes refer to the UDP payload, so on the IP level, the sizes are 1328 and 328 bytes, respectively.
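(Each UDP payload gains an 8-byte UDP header and a 20-byte IPv4 header: 1300 + 28 = 1328 and 300 + 28 = 328.)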

Here is the "Bitrate" as indicated by the receiver, in kbit/s. The left column is for 1328-byte packets, the right column is for 328-byte packets.

488  496
785  418
367  408
591  266
324  468
434  546
529  562

As you see, the speed is just too variable. So the investigation below is just to show the math - but the result is known to be wrong because of the noise.

I found it more convenient to work with packet counts, not raw speeds. iperf3 reports the total and lost packet counts. Again, let me show the values (as lost/total) for 1328-byte packets on the left and for 328-byte packets on the right.

1339/1923  6118/8318
1117/1919  6469/8325
1533/1914  6519/8327
1308/1919  7138/8328
1583/1917  6255/8332
1477/1921  5905/8332
1382/1922  5834/8332

By subtracting the lost packets from the total, we get the number of packets that got through:

584  2200
802  1856
381  1808
611  1190
334  2077
444  2427
540  2498

According to https://www.graphpad.com/quickcalcs/Grubbs1.cfm, there are no outliers (P=0.05). The means are:

For 1328-byte packets: 528 (SD = 158.93)

For 328-byte packets: 2008 (SD = 445.26)

Assuming the same underlying bottleneck bandwidth, and denoting the overhead above the IP level as X, we can calculate the total volume of the data passed through the bottleneck in both cases, and put the equal sign in between:

528 * (1328 + X) = 2008 * (328 + X)
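Expanding and solving for X: X = (528 * 1328 - 2008 * 328) / (2008 - 528) = 42560 / 1480.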

Therefore, X ≈ 28.8. Well, as mentioned, this value cannot be trusted (too much noise), but at least it has the correct sign :slight_smile:
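For anyone who wants to reproduce the arithmetic (or plug in their own iperf3 results), here is a small shell/awk sketch using the delivered-packet counts from the table above:

# delivered packets per run (total minus lost), copied from above
big="584 802 381 611 334 444 540"           # 1328-byte IP packets
small="2200 1856 1808 1190 2077 2427 2498"  # 328-byte IP packets

# arithmetic mean of a space-separated list
mean() { echo "$1" | tr ' ' '\n' | awk '{ s += $1; n++ } END { print s / n }'; }

mb=$(mean "$big")    # 528
ms=$(mean "$small")  # 2008

# mb * (1328 + X) = ms * (328 + X)  =>  X = (mb*1328 - ms*328) / (ms - mb)
awk -v mb="$mb" -v ms="$ms" 'BEGIN { print (mb * 1328 - ms * 328) / (ms - mb) }'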

2 Likes

To do this you need to use layer cake (diffserv) and classify your traffic by adding DSCP tags to the packets. It's very doable, but there's a lot to learn and it takes time to get it right. There are lots of ideas and examples of firewall rules in this script.
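To give a flavour of what such rules can look like, here is a minimal hand-rolled sketch (not the referenced script; the table/chain names, the WAN interface name and the port ranges are illustrative assumptions, and it only marks upload traffic; marking downloads needs extra steps such as restoring DSCPs from conntrack with ctinfo):

# dedicated table/chain so the fw4 rules stay untouched
nft add table inet dscpmark
nft add chain inet dscpmark post '{ type filter hook postrouting priority mangle; }'
# e.g. put latency-sensitive traffic (here: STUN/TURN ports) into EF ...
nft add rule inet dscpmark post oifname "wan" udp dport 3478-3479 ip dscp set ef
# ... and a commonly cited Steam port range into CS1 (background)
nft add rule inet dscpmark post oifname "wan" tcp dport 27015-27050 ip dscp set cs1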

I've never used Qosify but it looks like people have a lot of success with it. As far as I can tell it simplifies the process somewhat, at the expense of some (but not much) of the flexibility that you get when writing firewall rules directly.

Well, I performed a similar thought experiment:

(and I did some back-of-the-envelope calculations as well about the required accuracy and trueness of the speedtests, leading me to think that this is a rather theoretical approach unlikely to work well with the vagaries of the real internet).
From my own testing, not the least issue is that a router capable of fully saturating a link with an Ethernet payload size of 1500 might not be up to the task with, say, an Ethernet payload size of 150... (and infrastructure nodes of the ISP might have similar issues as well, but one of the core assumptions of this approach is that the gross rate stays constant* and that the link can be reliably saturated (and the saturation rate can be measured)).

So in theory I agree: manipulating packet size can be used to tease apart per-packet overhead and gross rate; in practice, however, that is far from simple...

*) Essentially, what this exploits is that we can formulate the goodput as a function of gross rate, per-packet overhead and packet size; by manipulating the packet size and assuming the gross rate stays constant, we can cancel out the gross rate and are left with a single unknown and a simple equation...
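In formula form: goodput(S) = R * S / (S + O), where R is the gross rate, O the per-packet overhead and S the packet size; taking the ratio of the goodput measured at two packet sizes S1 and S2 cancels R and leaves a single equation in O.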

But now that you got me thinking over this again, maybe this method has value not so much in detecting the per-packet overhead automatically, but more in helping to establish a range of probable per-packet overheads that can then be considered as an additional (but not sufficient) piece of information when deciding which per-packet overhead might be in use...

That said, the exact per-packet overhead for SQM does not really matter all that much as long as it is >= the true per-packet overhead; that is why I maintain that setting the overhead to ~44 is pretty much the best advice we can give most users. But it took @dlakelan running and presenting numbers (thanks for your patience!) for me to grudgingly come to accept that, because out of curiosity I would really like to know the true per-packet overhead...

1 Like

While cake's diffserv modes allow a lot of things, they do not allow a reliable 50/50 split of the capacity between different tiers. In cake, priority tiers are not hard rate-limited; a tier exceeding its priority share essentially "bleeds" the excess packets down into lower priority.

The unsatisfactory answer is: either do not do this from the same computer, or use network namespaces to ensure your different applications use different IP addresses, in which case cake's per-IP fairness should help.
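A rough sketch of the namespace idea on a Linux client (the interface names, addresses and application are purely illustrative; it assumes the client's LAN NIC is eth0 and the router is 192.168.1.1):

# give the bulk application its own LAN-visible IP via macvlan in a namespace,
# so the router's per-host fairness sees it as a separate host
ip netns add bulk
ip link add bulk0 link eth0 type macvlan mode bridge
ip link set bulk0 netns bulk
ip -n bulk addr add 192.168.1.201/24 dev bulk0
ip -n bulk link set lo up
ip -n bulk link set bulk0 up
ip -n bulk route add default via 192.168.1.1
# then start the bulk application inside the namespace, e.g.:
# ip netns exec bulk <your-bulk-application>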

Yes, I should have clarified that - a hard 50/50 split isn't possible.

But a much better degree of fairness is definitely doable, i.e. in the OP's situation, prioritising YouTube streaming over Steam downloads to stop the videos from constantly buffering.

2 Likes

As I said, if fairness is desired I would go a different route than diffserv and DSCPs, as classical QoS is very much about (targeted) unfairness. In the case of Steam, the quickest solution might be to limit the maximum download speed in the Steam client, but I understand that this might not be terribly attractive, as it will require constant fiddling to avoid throttling Steam down when nothing else is in use.

However, if Steam downloads really have the lowest priority for the OP, then using qosify to put them into the bulk/background class might actually work: it will allow fast Steam downloads when nothing else is going on, but video streaming in besteffort will have priority. (This is an example of targeted unfairness: the bulk class only gets 1/16 of the capacity when the link is saturated, enough to make some forward progress but clearly not "fair"; however, all the capacity not used by video will be used by Steam, within reason*.)

*) Video streaming being bursty in nature means that Steam will not be able to use all the left-over capacity, but it should get a considerable fraction.

100% get what you're saying, but I think this really depends on the definition of "fair". Diffserv allows you to force unfairness (by one definition) in order to achieve another type of fairness.

From a content-agnostic point of view, splitting bandwidth equally between all connections/hosts is fair, and in the vast majority of situations this works brilliantly.

But from the point of view of applications running on a single client it's massively unfair that Steam is able to hog all of the client's bandwidth simply because it is using more connections.

Anyway, we're digressing significantly here and we've discussed this before, so I'll stop now :slight_smile:

2 Likes

+1; I accept the use case and the motivation here just fine. However, the "get more throughput by using more connections" approach already works without an FQ scheduler in the mix, so that is a case where FQ scheduling does not help much, and it is not a novel failure case compared to dumb FIFO scheduling.

I actually do not think we disagree much here. All I am saying is that a traffic shaper/AQM on a router does not have sufficient information out of the box to solve the application-fairness issue. There are ways around that, which can use DSCPs or other identifiers to define aggregates that should be treated similarly; it is just that cake offers little in that regard out of the box (and that pretty much by design): it offers differential treatment by DSCP, but it does not offer toggles to adjust the specific sharing properties. That DSCPs could be set by any application on any traffic as a means to exploit such a mechanism is IMHO no real concern, as this simply requires that the firewall does appropriate admission control for the selected DSCPs.
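As an illustration, admission control can be as simple as washing any DSCP coming from the LAN that is not explicitly admitted (again a hand-rolled sketch; the table/chain names, the LAN bridge name and the admitted DSCP list are assumptions, and it only covers IPv4):

nft add table inet dscpwash
nft add chain inet dscpwash pre '{ type filter hook prerouting priority mangle; }'
# let through the DSCPs we deliberately admit from the LAN ...
nft add rule inet dscpwash pre iifname "br-lan" ip dscp { cs1, af41, ef } accept
# ... and reset everything else to best effort
nft add rule inet dscpwash pre iifname "br-lan" ip dscp set cs0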

2 Likes

@moeller0, where you have one LAN IP 192.168.1.130 that has 10x Steam streams and 1x Zoom stream, clearly DSCP will help massively, but what about triple-isolate? What does triple-isolate do in this situation?

@rb1 really good to know that your findings are positive set against Gargoyle's tried and tested ACC algorithm (which is surely a benchmark). If we're being compared favourably against that then I think things are looking rather nice indeed. In general, I think the bash implementation works for quite a lot of different use cases now since we keep on seeing different users with different connection types come on here or GitHub and comment favourably.

Also, I am about to publish a very simple way to implement QoS with diffserv and cake. It is just one simple nftables script and one simple service file. I think it will serve your use case perfectly.

1 Like

Really nice idea @patrakov. Do you have a script I could try running on my connection to see if I get a similar number (around 30)?

Also, do you think we are ready for me to pull the 'testing' branch with multiple instance cake-autorate into 'main'? Perhaps this should constitute a new version since we have added a whole bunch of features. BTW I really appreciate all your input and help into this project.

Sorry, no script.

I also tried to perform this on my fiber connection, and would advise anyone against publishing the results of such measurements without a warning that they are valid only for their ISP. Reason: GPON ISPs rarely give you the whole bandwidth of the link. There is a shaper somewhere on the ISP side, and you get only what you pay for (well, Globe Telecom de facto provides a bit more than advertised). What would be measured is not the GPON overhead, but the definition of 1 Mbps in their shaper.

Full disclosure: If I select 1200 and 1300 as the test packet sizes, I reliably get a small negative overhead (on top of the raw IP packet size). Below 1000-byte packets, apparently I hit some packet-rate limit, not sure where.

If you do this, you need to check the router's CPU load; when I tested something similar, my Turris Omnia was reaching its limits before saturating the link with small packets...

The only thing that matters is the effective overhead, be it from the actual link layer or from what the shaper accounts for anyway ;).

Telling us something is off :wink:

Regarding pulling the multi-instance support into main, I would like to see a few final comment-only touches. Namely, we should clearly indicate the use case (multihomed setups) and the prerequisites that cake-autorate expects to be satisfied externally.

Something like:

### Note: for multihomed setups, it is your responsibility to ensure
### that the probes sent by this instance of cake-autorate actually
### travel through these interfaces.
### See ping_extra_args and ping_prefix_string

dl_if=ifb-dl # download interface
ul_if=ifb-ul # upload interface

and

### In multi-homed setups, it is mandatory to use either ping_extra_args
### or ping_prefix_string to direct the pings through $dl_if and $ul_if.
### No universal recommendation exists, because there are multiple
### policy-routing packages available (e.g. vpn-policy-routing and mwan3).
### Typically they either react to a firewall mark set on the pings, or
### provide a convenient wrapper.
###
### In a traditional single-homed setup, there is usually no need for these parameters.
###
### These arguments can also be used for any other purpose - e.g. for setting a
### particular QoS mark.

# extra arguments for ping or fping
# e.g., here is how you can set the correct outgoing interface and
# the firewall mark for ping:
# ping_extra_args="-I wwan0 -m $((0x300))"
# Unfortunately, fping does not offer a command line switch to set
# the firewall mark.
# WARNING: no error checking so use at own risk!
ping_extra_args=""

# a wrapper for ping or fping - used as a prefix for the real command
# e.g., when using mwan3, it is recommended to set it like this:
# ping_prefix_string="mwan3 use gpon exec"
# WARNING: the wrapper must exec ping as the final step, not run it as a subprocess!
# Running ping or fping as a subprocess will lead to problems stopping it.
# WARNING: no error checking so use at own risk!
ping_prefix_string=""
1 Like