CAKE + VPN - Overhead/Interface

Am I correct in thinking that, to correctly apply CAKE when using a VPN, the right interface to apply CAKE on is the 'tun' interface (for upload) and an IFB interface fed from the 'tun' interface (for download)? With the 'nat' option, can CAKE correctly determine the destination LAN IP on the IFB interface? Using tcpdump on the 'tun' interface I only see different source IP addresses, since the destination is always just the router IP.
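
For reference, below is a minimal sketch of the commands that would produce the configuration shown in the tc outputs further down (the IFB device name and the rates are taken from those outputs; the matchall filter is one common way to do the ingress redirect, not necessarily how my firmware scripts it):

# create the IFB device and bring it up
ip link add ifb type ifb
ip link set ifb up
# shape upload on the tunnel interface itself
tc qdisc replace dev tun11 root cake bandwidth 30720kbit diffserv3 dual-srchost nat overhead 91
# redirect tun11 ingress to the IFB and shape it there as "download"
tc qdisc add dev tun11 handle ffff: ingress
tc filter add dev tun11 parent ffff: protocol all matchall action mirred egress redirect dev ifb
tc qdisc replace dev ifb root cake bandwidth 35840kbit besteffort dual-dsthost nat wash ingress overhead 91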

I determined the VPN overhead based on:

tcpdump -vpni tun11
 12:28:09.320900 IP (tos 0x0, ttl 64, id 37907, offset 0, flags [DF], proto ICMP (1), length 84)
    10.8.2.5 > 8.8.8.8: ICMP echo request, id 53794, seq 1, length 64
12:28:09.371879 IP (tos 0x0, ttl 120, id 0, offset 0, flags [none], proto ICMP (1), length 84)
    8.8.8.8 > 10.8.2.5: ICMP echo reply, id 53794, seq 1, length 64
tcpdump -vpni eth0
12:25:43.753390 IP (tos 0x0, ttl 64, id 32446, offset 0, flags [DF], proto UDP (17), length 137)
    10.1.168.205.34090 > 195.206.183.101.1194: UDP, length 109
12:25:43.801997 IP (tos 0x28, ttl 54, id 3721, offset 0, flags [DF], proto UDP (17), length 137)
    195.206.183.101.1194 > 10.1.168.205.34090: UDP, length 109

137 - 84 = 53 bytes of VPN overhead per packet.

I have no clue what to use for the WAN packet overhead for my LTE connection(?), so I have just assumed this is 38. Is that a reasonable guesstimate? That would give 38 + 53 = 91 for the total overhead, right? Would that be for both download and upload?

tc qdisc ls:

qdisc pfifo_fast 0: dev eth0 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth1 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth2 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth3 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth4 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth5 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev spu_us_dummy root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev spu_ds_dummy root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth6 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc pfifo_fast 0: dev eth7 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc cake 8016: dev tun11 root refcnt 2 bandwidth 30720Kbit diffserv3 dual-srchost nat nowash no-ack-filter split-gso rtt 100ms noatm overhead 91
qdisc ingress ffff: dev tun11 parent ffff:fff1 ----------------
qdisc cake 8017: dev ifb root refcnt 2 bandwidth 35840Kbit besteffort dual-dsthost nat wash ingress no-ack-filter split-gso rtt 100ms noatm overhead 91

tc -s qdisc:

tc -s qdisc
qdisc pfifo_fast 0: dev eth0 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 1907462860 bytes 2451283 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc pfifo_fast 0: dev eth1 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 76974733 bytes 420991 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc pfifo_fast 0: dev eth2 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 74847091 bytes 390824 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc pfifo_fast 0: dev eth3 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 77308930 bytes 411791 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc pfifo_fast 0: dev eth4 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 3847947597 bytes 3225189 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc pfifo_fast 0: dev eth5 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc pfifo_fast 0: dev spu_us_dummy root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc pfifo_fast 0: dev spu_ds_dummy root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc pfifo_fast 0: dev eth6 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 2685520945 bytes 2702650 pkt (dropped 0, overlimits 0 requeues 7)
 backlog 0b 0p requeues 7
qdisc pfifo_fast 0: dev eth7 root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 6792469138 bytes 6569039 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc cake 8016: dev tun11 root refcnt 2 bandwidth 30720Kbit diffserv3 dual-srchost nat nowash no-ack-filter split-gso rtt 100ms noatm overhead 91
 Sent 425968333 bytes 521945 pkt (dropped 296, overlimits 671517 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 835Kb of 4Mb
 capacity estimate: 30720Kbit
 min/max network layer size:           29 /    1443
 min/max overhead-adjusted size:      120 /    1534
 average network hdr offset:            0

                   Bulk  Best Effort        Voice
  thresh       1920Kbit    30720Kbit     7680Kbit
  target         9.37ms          5ms          5ms
  interval        104ms        100ms        100ms
  pk_delay          0us       1.68ms        101us
  av_delay          0us        559us          5us
  sp_delay          0us         25us          2us
  backlog            0b           0b           0b
  pkts                0       521640          601
  bytes               0    426260646       105596
  way_inds            0         4640            0
  way_miss            0         4837            3
  way_cols            0            0            0
  drops               0          296            0
  marks               0            0            0
  ack_drop            0            0            0
  sp_flows            0            1            1
  bk_flows            0            1            0
  un_flows            0            0            0
  max_len             0         1443          437
  quantum           300          937          300

qdisc ingress ffff: dev tun11 parent ffff:fff1 ----------------
 Sent 298963180 bytes 470879 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc cake 8017: dev ifb root refcnt 2 bandwidth 35840Kbit besteffort dual-dsthost nat wash ingress no-ack-filter split-gso rtt 100ms noatm overhead 91
 Sent 287991570 bytes 462836 pkt (dropped 8043, overlimits 331808 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 1005Kb of 4Mb
 capacity estimate: 35840Kbit
 min/max network layer size:           30 /    1426
 min/max overhead-adjusted size:      121 /    1517
 average network hdr offset:            0

                  Tin 0
  thresh      35840Kbit
  target            5ms
  interval        100ms
  pk_delay        117us
  av_delay          7us
  sp_delay          1us
  backlog            0b
  pkts           470879
  bytes       298963180
  way_inds         1177
  way_miss         4576
  way_cols            0
  drops            8043
  marks               0
  ack_drop            0
  sp_flows            2
  bk_flows            1
  un_flows            0
  max_len          1426
  quantum          1093

I am a bit confused because, having set 35Mbit/s download and 30Mbit/s upload, I see on speed tests around 27Mbit/s download and 26Mbit/s upload. It seems as if the download is getting more heavily throttled as compared to the upload.

Does all this seem reasonable?

Further to the above, I am a little confused about the 'overhead' in respect of the VPN. The '38' is external to the 1500 MTU, right? Is the VPN-related overhead to be considered inside or outside the 1500 MTU? What would the bandwidth calculation look like when combining the guesstimated '38' with the VPN overhead determined to be '53'?

@moeller0 I am very much hoping for your insight here, since you seem to be one of the few individuals who understand this aspect of CAKE.

Well, I know next to nothing about LTE encapsulations with regard to the gross rate, but I am also not sure how relevant that is if your LTE data connection is not limited by your ISP according to your contract, but simply by low signal strength...

What cake typically does when an explicit overhead is defined is to take the current IP packet size and add the specified overhead. So if cake acts on a normal ethernet link without encryption, it will typically see an IP size <= 1500 bytes and will add the configured 38 bytes to account for all real and virtual ethernet overhead (the interframe gap is not real data, but "radio silence" lasting the time required to transmit 12 bytes; that is what I call "virtual" overhead here).
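
As a worked example of that accounting (assuming a full 1500 byte IPv4 packet carrying TCP without options, and your 30720 Kbit/s shaper):

# goodput estimate: shaper rate * TCP payload / (IP size + per-packet overhead)
awk 'BEGIN { rate=30720; mtu=1500; oh=38; hdrs=40;
             printf "%.0f Kbit/s goodput\n", rate * (mtu - hdrs) / (mtu + oh) }'
# -> 30720 * 1460 / 1538 = about 29162 Kbit/s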

If that ethernet interface also sees encrypted packets, no problem, because the encryption's outer IP layer still has the appropriate size (the encryption overhead is included in the payload already, so it needs no special accounting). You only need to know the encryption's per-packet overhead if you instantiate the shaper on an interface that only sees unencrypted traffic.

So if tun11 sees only encrypted data, all you need is the LTE overhead, which I know way too little about to be of help.

But if you specify too high an overhead, all you lose is a bit of throughput; if you specify too low an overhead you risk bufferbloat, especially when loads of small packets are in your traffic mix.

Again, for a variable rate link like LTE cake is not really an ideal shaper (unless your LTE rates simply are super stable, then you just configure a safe number as gross shaper rate for SQM and be done with it).

@moeller0 thank you so much for your response.

Without SQM I generally get between 35-70Mbit/s download and 30Mbit/s upload. I tried autorate-ingress and it works well for about 30 seconds and then fails - namely it reduces the download bandwidth all the way down to around 0 Mbit/s during inactivity and then ramps very slowly up during activity. Happily I have found that by sacrificing the fluctuating component of the download, CAKE works very well when setting download to 35Mbit/s and upload to 30Mbit/s. With that I consistently see A/A+ bufferbloat on the various test sites. And otherwise my connection feels snappy. Without CAKE I see huge bufferbloat that is noticeable when using Zoom or Teams.

Whereas eth0 is the WAN interface that sees only encrypted traffic, tun11 is my VPN tunnel interface that sees unencrypted traffic. For this reason I intentionally set CAKE to operate on tun11 because otherwise CAKE only sees traffic between my router IP and my VPN IP.

On tun11 I see traffic between router IP and the different external IPs. I am assuming that the 'nat' option of CAKE allows it to drill down from the router IP to the individual LAN IPs?

Since tun11 sees unencrypted traffic, and since I saw that going from tun11 to eth0 exactly 53 bytes are added, then am I correct in including '53' in the overhead?

What confuses me is that having set CAKE to 35Mbit/s download I see around 28Mbit/s on speedtest. But that doesn't seem to stack up with (1500-20-20)/(1500+91). It is a bit like my download bandwidth is rather more heavily throttled by CAKE than my upload. But they should be the same, right? If it matters, I operate CAKE on the IFB based on tun11.

Great!

That is testable: with per-internal-IP fairness, doing a single-stream upload test from one computer at the same time as a multi-stream upload test on another computer should show you...
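
For example (a sketch, assuming an iperf3 server you control is reachable through the VPN):

# client A: four parallel upload streams for 30 seconds
iperf3 -c <your-server> -P 4 -t 30
# client B, at the same time: a single upload stream
iperf3 -c <your-server> -P 1 -t 30
# with dual-srchost active, both clients should end up with roughly equal totals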

Tough question, it depends on your VPN's headers.

E.g. according to the whitepaper, WireGuard will add a 16 byte header to each IP packet it encrypts, then an 8 byte UDP header, then a 20 byte IPv4 or a 40 byte IPv6 header, for a total of 16+8+20 = 44 or 16+8+40 = 64 bytes.

EDIT: according to https://lists.zx2c4.com/pipermail/wireguard/2017-December/002201.html there seems to be an additional 16 byte tag per packet, so 16+16+8+20 = 60 or 16+16+8+40 = 80 bytes...

But at that point you are just at the outer IP packet size and still need to add the rest of the overhead (for which, on LTE, I have little intuition; this is a whole new set of acronyms, like PDCP, RLC, MAC).

Well, good approach, but I would do this for a few more packet sizes to confirm you see a stable offset.
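
Something like this would do (a sketch; run captures on both interfaces at the same time and compare the reported total length for each ping size):

# terminal 1: tcpdump -vpni tun11 icmp
# terminal 2: tcpdump -vpni eth0 udp port 1194
# then send pings of varying sizes and check the offset stays at 53 bytes:
for s in 100 300 600 900 1200; do ping -c 1 -s $s 8.8.8.8; sleep 1; done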

Well, no: in the upload direction cake sits between your sending application and the true bottleneck, so it is in full control, while in the download direction data packets first need to traverse the real bottleneck before cake sees them, so the control loop is much less tight. And you specified the ingress keyword, which makes cake aim to throttle such that the incoming rate into cake matches the requested download rate (instead of the traditional egress rate out of the shaper). That makes cake a bit more aggressive (especially if applications do not respond quickly enough), but it solves the issue that post-bottleneck shapers are typically too lenient when there are multiple inrushing flows. The traditional remedy is to require users to set a lower shaper speed, but cake sort of does that automatically and adjusts its "aggressiveness" to the incoming data's behaviour. Yes, that can cost a few Mbps, but IMO it overall does the right thing.

Thanks @moeller0 I really appreciate your insight. I have been lurking on this forum for a few months and there would appear to be very few people who have some kind of understanding about the CAKE overhead parameters. Perhaps they could be simplified somehow because they seem a little enigmatic / like black magic. I wish autorate-ingress worked for everyone and we didn't even have to determine them!

Yes in general I think CAKE is working, see here:

And here:

Bear in mind this is with LTE whilst using a VPN. (Also my wife was on a WhatsApp video call during these tests). So I am not expecting perfection. Without CAKE I saw huge bufferbloat getting on for a 500ms increase in ping.

Concerning the 'nat' aspect, I tried simultaneous (single stream) downloads using Google internet speed test from different clients and I saw absolutely perfect matching on the download - it looked really cool actually with my Google Pixel and iPhone side by side - you saw exact correspondence on download. But the upload looked rather different between the clients.

Here is some tun11 traffic:

And here is some eth0 traffic:

Perhaps too short to be meaningful, but on tun11 traffic is always between my router IP and different external IP addresses. On eth0 traffic is between my router IP and the VPN address.

Would you say my CAKE parameters look OK to work with this?

Could it be that even with the 'nat' option CAKE is not determining the correct LAN source and that explains why my upload rates didn't seem to get evenly shared but the download rates were evenly shared?

I use OpenVPN. For determining the VPN overhead I used the approach listed here:

I have tested with random packet sizes and there is always exactly 53 bytes added from tun11 to eth0. Ought this to be a reliable way of determining the VPN overhead?

Various online articles state that the VPN overhead is 69 (that is 16 more bytes than the 53 bytes I determined with the approach above).

See here: https://256.insys-icom.com/vpn-overhead
And here: https://serverfault.com/questions/249935/openvpn-performance

Any idea what might explain the difference?

So am I right in thinking that the OpenVPN overhead sits inside the IP packet but the Ethernet/LTE overhead goes outside the IP packet? I presume that between my router and my modem, the Ethernet overhead gets added. And I understand that the 4G overhead goes outside the Ethernet packet, which is transparently taken care of by the modem, so the cellular network can carry Ethernet packets with the full 1500 byte payload. So I thought I could just ignore the LTE overhead and work with the Ethernet overhead (which I am assuming is 38).

Would the above still render the calculation (1500-20-20)/(1500+91) appropriate, or would it need to be framed differently, I wonder?
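
For what it is worth, plugging in the max_len of 1426 that cake reports on the IFB, and assuming TCP timestamps (12 bytes of options, which is an assumption on my part), my attempt looks like:

# shaper 35840 Kbit/s, inner IP packet 1426 bytes, overhead 91, IPv4+TCP+timestamps = 52 bytes
awk 'BEGIN { printf "%.0f Kbit/s\n", 35840 * (1426 - 52) / (1426 + 91) }'
# -> about 32460 Kbit/s, still noticeably above the ~28 Mbit/s I measure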

Here is my OpenVPN config:

resolv-retry infinite
remote-random
tun-mtu 1500
tun-mtu-extra 32
mssfix 1450
ping 15
ping-restart 0
ping-timer-rem
remote-cert-tls server
pull
fast-io
cipher AES-256-CBC

I don't know if that actually results in an MTU of 1500 or whether it results in something higher that gets fragmented. I have a feeling it could be the latter. Not sure if I ought to play around with this or not.
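
One empirical check (a sketch; the sizes are just illustrative) would be to ping through the tunnel with the don't-fragment bit set and see where fragmentation starts:

# 1472 bytes of payload + 8 ICMP + 20 IP = a 1500 byte packet; -M do sets DF
ping -M do -c 3 -s 1472 8.8.8.8
# if that reports "message too long", step the size down until it succeeds;
# the largest working payload + 28 is the effective tunnel MTU
ping -M do -c 3 -s 1394 8.8.8.8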

Well, it does not get much simpler than adding "overhead XX", the challenge is getting to the correct XX :wink: (honestly, erring a bit on the side of too large does very little harm, in that a bit of bandwidth is wasted).
But yes, if I had a fool-proof method to autodetect the overhead, I would happily work on having that implemented in SQM.
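
Note that the overhead of an existing cake instance can be changed on the fly, so experimenting is cheap (a sketch, using your tun11 instance; parameters you do not mention are left as they are):

tc qdisc change dev tun11 root cake overhead 91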

Yes, autorate-ingress looks tempting, and it does work for cake's primary author, but it requires a specific set-up which seems rather untypical. There are a few scripts out here in the forum where people used ICMP echos to well-connected nearby nodes to estimate whether the shaped rate is too high, and built their own scripted rate-tracking tools. Myself, I am on a fixed rate DSL link, so I could not even test rate-tracking if I wanted...

[SPEEDTESTS]

These speedtests look okay to me, especially over a shared variable rate medium like LTE, but I am no expert here.

Okay, but the real test is to do a 4-flow test on client A and a single-flow test on client B; with the dual-xxxhost options, both clients should get an equal share (unless, say, the single-flow test from client B is limited by the server or the further network).

And you should run these two both through the VPN...

Well possible, but the shaper on the pre-encryption interface will see all your internal addresses even without the 'nat' keyword; only on the real WAN interface will NAT already have happened (and since we recommend to instantiate SQM on the real WAN interface, cake "grew" the nat keyword).

On first look (I have no time right now to dive into the details) the method looks sane.

No idea, and no time to research that, but the rate cost of adding 16 superfluous bytes to the overhead will be pretty small, so err on the side of too much for the overhead (just don't go overboard and double it without any indication that that might help).

Yes, pretty much so.

Yes, but unless that link is the bottleneck the ethernet overhead might not matter...

Well, you really need to know the overhead for the bottleneck link; just punching in numbers from any old link along your access path, just because you know them, might not necessarily help :wink: (That said, ignoring the added encryption-related overhead, which you have solved already, ethernet with its 38, or with a VLAN tag 42, bytes is already pretty large and might actually be a good starting point for an unknown encapsulation.) Looking at this indicates that LTE overhead might be considerably smaller than 38 bytes, but I really have not researched that sufficiently deeply to make a recommendation here.

Sorry, not an OpenVPN expert either, so nothing I can add here.

Thanks a lot for this insight. Concerning:

Would using the 'nat' keyword allow CAKE to determine the source and destination IPs if applied on eth0? I thought that doing this would fail because CAKE couldn't work out the source and destination IPs properly given the encryption.

Using tcpdump -vpni tun11 I can see all the external WAN IPs, but in terms of my LAN all traffic is between those external WAN IPs and my router IP (I don't see the ultimate destination LAN IP). I thought in this situation the purpose of the 'nat' keyword was to allow CAKE to determine the local LAN IP based on the router IP and thereby to determine the LAN destination of traffic from any external WAN IP?

Your router IP on tun11 is really the router-end of the OpenVPN client tunnel, not your WAN IP. Since all your outgoing traffic to tun11 is masqueraded by the firewall, the nat keyword should be able to look into conntrack when CAKE is instantiated on tun11, and determine your LAN addresses for appropriate host fairness.

If you instantiate CAKE on eth0, CAKE won’t peek into the conntrack masquerading happening on tun11, AFAIK. It won’t find any natting for the VPN tunnel on eth0 since the tunnel originates from your router and not your LAN, and is therefore not masqueraded at all.
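
One way to see the masquerade mappings that the nat keyword's conntrack lookup relies on (assuming the conntrack-tools package is installed):

# list only source-NATed connections; the LAN IP appears as the original
# source and the tun11 address as the reply destination
conntrack -L --src-nat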

@moeller0 and @dave14305 is it correct to state that if the bandwidth amounts are being set by trial and error there is no need to work out the overhead and it can just be set to 0? Or does CAKE use the overhead in a way that is not just about bandwidth control but also something else?

hint: do you have lots of small packets?


I am not sure. I use WireGuard. @anon50098793 see:

Looks like it?

I do not think that is a correct interpretation; see the discussion about the link between the shaper rate and the overhead here. In short, too small an overhead will lead the shaper to admit too many packets per time unit if the packets are sufficiently smaller than the packet size used to figure out a safe shaper gross rate. From a bufferbloat perspective it is always better to err on the side of "too large" for the per-packet overhead and "too small" for the shaper rate.
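
To illustrate with your numbers (a sketch): if the gross rate was derived from full-size packets but the overhead is set to 0, small packets are admitted faster than the link can actually carry them:

# wire cost of a 100 byte IP packet with the real 91 bytes of overhead vs. overhead 0
awk 'BEGIN { printf "cake accounts %d bytes, the wire carries %d: a factor of %.2f\n",
             100, 100 + 91, (100 + 91) / 100 }'
# for 1500 byte packets the mis-accounting is only 1591/1500 = ~6%,
# for 100 byte packets it is 191/100 = ~91%, hence the small-packet bufferbloat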

Nope, it is just that per-packet overhead and shaper gross rate are not truly orthogonal variables with regard to observed bufferbloat, so it matters to get both of them "right" (see above for the permitted level of imprecision).

Many thanks indeed @moeller0. So in this instance it sounds like, as compared to bandwidth=X and overhead=0, increasing the overhead to Y and the bandwidth to X+Z (to give the same overall bandwidth) would give different performance.

Taking your WireGuard value of 60 for IPv4 - what would you recommend I add to this to deal with LTE? You surely know no less than I do. Perhaps 100?