CAKE w/ Adaptive Bandwidth [August 2022 to March 2024]

Thanks for the explanation. I'm just learning as I go along.
I will wait for the throttle to ease up, then restart autorate and log the results to see how it copes.

Great and helpful explanation from @moeller0 above, @hammerjoe.

Should be:

reflector_ping_interval_s=4 # (seconds, e.g. 0.2s or 2s)

And @hammerjoe, don't forget to change:

Maybe even try 30.0 or 40.0.

And you could also relax these to:

# bufferbloat is detected when (bufferbloat_detection_thr) samples
# out of the last (bufferbloat detection window) samples are delayed
bufferbloat_detection_window=10  # number of samples to retain in detection window
bufferbloat_detection_thr=5      # number of delayed samples for bufferbloat detection
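
To illustrate the thr-out-of-window idea (a sketch only, not cake-autorate's actual code): with the values above, bufferbloat is only declared once 5 of the last 10 delay samples are delayed, so the odd slow ping on its own is ignored. Roughly:

# sketch only (not cake-autorate's real implementation): flag bufferbloat once
# bufferbloat_detection_thr of the last bufferbloat_detection_window samples
# were delayed (1 = delayed, 0 = on time)
bufferbloat_detection_window=10
bufferbloat_detection_thr=5
samples=()

record_sample() {
    samples+=("$1")
    # keep only the most recent bufferbloat_detection_window samples
    (( ${#samples[@]} > bufferbloat_detection_window )) && samples=("${samples[@]:1}")
    local delayed=0 s
    for s in "${samples[@]}"; do (( delayed += s )); done
    if (( delayed >= bufferbloat_detection_thr )); then
        echo "bufferbloat detected (${delayed}/${#samples[@]} samples delayed)"
    else
        echo "ok (${delayed}/${#samples[@]} samples delayed)"
    fi
}

# example: a run of delayed samples only trips the detector on the fifth delayed one
for s in 0 0 1 1 1 1 1; do record_sample "$s"; done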

I use a VPN because my ISP has in the past selectively throttled on ports 80 and 443. But I send ICMPs out in a way that bypasses the VPN.
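
(In case it helps anyone copying this: one way to do that, sketched here with a placeholder ISP gateway address and a placeholder eth0.2 WAN device rather than my exact setup, is to fwmark locally generated ICMP and policy-route it out of the raw WAN instead of the tunnel.)

# sketch only - 192.0.2.1 is a placeholder ISP gateway, eth0.2 a placeholder WAN device
ip route add default via 192.0.2.1 dev eth0.2 table 100
ip rule add fwmark 0x1 lookup 100
iptables -t mangle -A OUTPUT -p icmp -j MARK --set-mark 0x1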

@hammerjoe I rather doubt your ISP is a) detecting your ICMPs and then b) punishing your available bandwidth for that. That seems far-fetched. But you could indeed, as @moeller0 suggests, just push everything over a VPN, so that the ISP only sees traffic to one IP and that's it.

So I'd recommend relaxing the settings first (cake-autorate is very aggressive by default, partly because I favour low latency over high numbers in speed tests), and then posting a log file so we can see what's happening.

And don't forget, when you upload the log file, to also upload the revised config.

I think what happens is that the ISP detects ICMP "abuse" and then throttles ICMP hard, and as a result autorate does not work... I would probably look at the ISP's EULA/terms of service and politely ask them to allow high ICMP use on the link under discussion.

It could well be, but I don't think it's just ICMP that gets throttled.

Btw, do I keep monitor_achieved_rates_interval_ms=4000 # (milliseconds)?

No, please don't; my mistake. monitor_achieved_rates_interval_ms controls the granularity of our load measurements. Theoretically smaller is better here, except that making it very small is a bit costly, and if it is too small the measurements get a bit erratic. 200ms appears to be an acceptable compromise.
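
So in config terms, something like this (just restating the 200ms suggested above):

monitor_achieved_rates_interval_ms=200 # (milliseconds)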

Just did a little more testing on my newly outdoor-mounted Zyxel NR7101, with 4K video playing from around 50% to 80% of the timeline.

Notice the big rate drop on the step load (lower-bandwidth Netflix traffic) that follows the sustained zero-load baseline at 20 Mbit/s after the 4K video test? I suppose that means my base rate of 20 Mbit/s is too high, because a step load around this time of day can cause bufferbloat.

So perhaps my baseline should be more like 10/10? What do you think @moeller0?

I suppose this could be made part of an automated cake-autorate connection test routine: maintain zero load for a while, then start saturating the download, and then check how well the step load is managed? After all, such step loads could presumably interrupt low-bandwidth Zoom/Teams calls?
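
Roughly like this manual version (placeholder URL and timings, nothing that exists in cake-autorate today), run while cake-autorate is logging:

sleep 60   # hold zero load so the shaper settles to its baseline rate
curl -o /dev/null --max-time 60 https://example.com/large-test-file   # saturating download as the step load
# then inspect the cake-autorate log around the step for bufferbloat events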

@hammerjoe Just an idea: perhaps your provider is rate limiting you because your high speed is a burstable rate? Does your provider reserve the right to limit connection speed based on usage/peak times, etc.?

Yes, they have traffic management that nobody really knows the workings of, but they adapt user speeds based on congestion and on how much each user is using.

I have restarted the tests and in the log I see that the reflectors report no ping:

DEBUG; 2022-11-14-23:49:42; 1668484182.199382; no ping response from reflector: 9.9.9.10 within reflector_response_deadline: 1s
DEBUG; 2022-11-14-23:49:42; 1668484182.215437; reflector=9.9.9.10, sum_reflector_offences=0 and reflector_misbehaving_detection_thr=3
DATA; 2022-11-14-23:49:42; 1668484182.282035; 1668484182.280792; 0; 6; 0; 0; 1668484182.261700; 1.1.1.1; 0; 19577; 24950; 5377; 75350; 19577; 24950; 5377; 76434; 0; 0; dl_idle; ul_idle; 35000; 8000
LOAD; 2022-11-14-23:49:42; 1668484182.354745; 1668484182.354057; 7; 23
LOAD; 2022-11-14-23:49:42; 1668484182.556458; 1668484182.555769; 2; 0
LOAD; 2022-11-14-23:49:42; 1668484182.758376; 1668484182.757691; 10; 8
LOAD; 2022-11-14-23:49:42; 1668484182.959639; 1668484182.958948; 220; 178
LOAD; 2022-11-14-23:49:43; 1668484183.162538; 1668484183.161853; 78; 4
DEBUG; 2022-11-14-23:49:43; 1668484183.226580; no ping response from reflector: 1.0.0.1 within reflector_response_deadline: 1s
DEBUG; 2022-11-14-23:49:43; 1668484183.230083; reflector=1.0.0.1, sum_reflector_offences=0 and reflector_misbehaving_detection_thr=3
DEBUG; 2022-11-14-23:49:43; 1668484183.233702; Warning: reflector: 1.0.0.1 seems to be misbehaving.
DEBUG; 2022-11-14-23:49:43; 1668484183.237315; replacing reflector: 1.0.0.1 with 94.140.14.14.
DEBUG; 2022-11-14-23:49:43; 1668484183.308154; no ping response from reflector: 9.9.9.9 within reflector_response_deadline: 1s
DEBUG; 2022-11-14-23:49:43; 1668484183.311804; reflector=9.9.9.9, sum_reflector_offences=0 and reflector_misbehaving_detection_thr=3
DEBUG; 2022-11-14-23:49:43; 1668484183.319151; no ping response from reflector: 9.9.9.10 within reflector_response_deadline: 1s
DEBUG; 2022-11-14-23:49:43; 1668484183.322698; reflector=9.9.9.10, sum_reflector_offences=0 and reflector_misbehaving_detection_thr=3
LOAD; 2022-11-14-23:49:43; 1668484183.371436; 1668484183.370673; 3; 7

It just keeps going on and on... when I ping them from Windows I get results, so I'm not sure why it doesn't work...

What does:

fping --timestamp --loop --period 200 --interval 50 --timeout 10000 1.1.1.1 1.0.0.1 8.8.8.8 8.8.4.4

look like from your router?

I agree with @Lynx, time to do direct tests from the router to get a better idea of how/why the latency probes fail.

It works; I stopped it after just over 500 rounds:

[1668516123.30769] 1.1.1.1 : [511], 64 bytes, 72.6 ms (57.7 avg, 0% loss)
[1668516123.34165] 1.0.0.1 : [511], 64 bytes, 56.5 ms (58.0 avg, 0% loss)
[1668516123.40047] 8.8.8.8 : [511], 64 bytes, 65.2 ms (55.7 avg, 0% loss)
[1668516123.44073] 8.8.4.4 : [511], 64 bytes, 55.4 ms (57.4 avg, 0% loss)
[1668516123.50769] 1.1.1.1 : [512], 64 bytes, 72.2 ms (57.8 avg, 0% loss)
[1668516123.55064] 1.0.0.1 : [512], 64 bytes, 65.1 ms (58.0 avg, 0% loss)
[1668516123.59064] 8.8.8.8 : [512], 64 bytes, 55.0 ms (55.7 avg, 0% loss)
[1668516123.64064] 8.8.4.4 : [512], 64 bytes, 54.9 ms (57.4 avg, 0% loss)
[1668516123.67567] 1.1.1.1 : [513], 64 bytes, 39.8 ms (57.7 avg, 0% loss)
[1668516123.75563] 1.0.0.1 : [513], 64 bytes, 69.7 ms (58.0 avg, 0% loss)
[1668516123.80164] 8.8.8.8 : [513], 64 bytes, 65.5 ms (55.8 avg, 0% loss)
[1668516123.84163] 8.8.4.4 : [513], 64 bytes, 55.4 ms (57.4 avg, 0% loss)
[1668516123.91068] 1.1.1.1 : [514], 64 bytes, 74.3 ms (57.8 avg, 0% loss)
[1668516123.94574] 1.0.0.1 : [514], 64 bytes, 59.3 ms (58.0 avg, 0% loss)
[1668516124.00034] 8.8.8.8 : [514], 64 bytes, 63.8 ms (55.8 avg, 0% loss)
[1668516124.04565] 8.8.4.4 : [514], 64 bytes, 59.0 ms (57.4 avg, 0% loss)
[1668516124.10166] 1.1.1.1 : [515], 64 bytes, 64.9 ms (57.8 avg, 0% loss)
[1668516124.15164] 1.0.0.1 : [515], 64 bytes, 64.8 ms (58.0 avg, 0% loss)
[1668516124.20531] 8.8.8.8 : [515], 64 bytes, 68.4 ms (55.8 avg, 0% loss)
[1668516124.24063] 8.8.4.4 : [515], 64 bytes, 53.6 ms (57.4 avg, 0% loss)
[1668516124.28066] 1.1.1.1 : [516], 64 bytes, 43.6 ms (57.7 avg, 0% loss)
[1668516124.34065] 1.0.0.1 : [516], 64 bytes, 53.5 ms (58.0 avg, 0% loss)
[1668516124.40665] 8.8.8.8 : [516], 64 bytes, 69.2 ms (55.8 avg, 0% loss)
[1668516124.44667] 8.8.4.4 : [516], 64 bytes, 59.2 ms (57.4 avg, 0% loss)
[1668516124.50022] 1.1.1.1 : [517], 64 bytes, 62.6 ms (57.7 avg, 0% loss)
[1668516124.54064] 1.0.0.1 : [517], 64 bytes, 52.9 ms (58.0 avg, 0% loss)
[1668516124.59764] 8.8.8.8 : [517], 64 bytes, 59.8 ms (55.8 avg, 0% loss)
[1668516124.63728] 8.8.4.4 : [517], 64 bytes, 49.4 ms (57.4 avg, 0% loss)
[1668516124.69764] 1.1.1.1 : [518], 64 bytes, 59.7 ms (57.8 avg, 0% loss)
[1668516124.73567] 1.0.0.1 : [518], 64 bytes, 47.4 ms (58.0 avg, 0% loss)
^C[1668516124.77766] 8.8.8.8 : [518], 64 bytes, 39.3 ms (55.8 avg, 0% loss)

1.1.1.1 : xmt/rcv/%loss = 519/519/0%, min/avg/max = 36.4/57.8/169
1.0.0.1 : xmt/rcv/%loss = 519/519/0%, min/avg/max = 38.1/58.0/120
8.8.8.8 : xmt/rcv/%loss = 519/519/0%, min/avg/max = 34.7/55.8/229
8.8.4.4 : xmt/rcv/%loss = 518/518/0%, min/avg/max = 34.4/57.4/181

Here are the latest logs.

Same issue as before: no delay data, just DEBUG lines reporting failure...

What is the output of:

df -h

Maybe there is not enough room to store the logfile?

Filesystem                Size      Used Available Use% Mounted on
/dev/root                 4.0M      4.0M         0 100% /rom
tmpfs                    59.7M      2.7M     57.0M   5% /tmp
/dev/mtdblock4            9.8M      3.4M      6.4M  35% /overlay
overlayfs:/overlay        9.8M      3.4M      6.4M  35% /
tmpfs                   512.0K         0    512.0K   0% /dev

Mmmh, that looks sort of OK... still, why can you run fping successfully from the command line, but not from within the bash script?
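
A couple of quick checks from the router shell might narrow it down (just guesses at likely culprits, not a known fix):

ps w | grep '[f]ping'   # is the fping spawned by the script actually running, and with what arguments?
fping --version         # which fping build is installed, and does it support the options being passed?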

Could it be because I have the wrong setup for:
dl_if=ifb4eth0.2 # download interface
ul_if=eth0.2 # upload interface

I'm still confused about interfaces and switches and stuff.
SQM is set for eth0.2 (switch VLAN: "eth0.2" (wan, wan6)).

The other choices are:
bridge: "br-lan" (lan)
ethernet switch: "eth0"
ethernet switch: "eth1"
switch VLAN: "eth1.1"
and the wireless network masters for 2G and 5G.

Output from: tc qdisc ls?

What does ifstatus wan | grep -e device report?

root@The_Lair:~# ifstatus wan | grep -e device
"l3_device": "eth0.2",
"device": "eth0.2",