Cake-autorate Breaks After Reboot

Hello,

I'm hoping that maybe I'm overlooking something and someone can help.

My cake-autorate does not seem to work correctly after the router reboots.
The command “# service cake-autorate status” returns “running”

When I first installed and set up cake-autorate it worked great: I got a steady download and almost no bufferbloat. Speed tests usually averaged about 50Mbps download and 7Mbps upload. However, after rebooting the router the speed drops to 1-3Mbps download and about the same, 1-3Mbps, upload.

If I connect directly to the Wi-Fi of the T-Mobile home internet router, the speed is around 100Mbps, but the bufferbloat makes it virtually unusable.

This is the second time this has happened. The first time was several months ago, after a power outage. I assumed something went haywire as a result of that, because after reinstalling OpenWrt and cake-autorate everything was fine, until I rebooted the router this morning. Now it’s broken once again.

Router: Linksys WRT1200AC
OpenWRT Firmware: 23.05.3 r23809-234f1a2efa

cake-autorate:
version=3.2.0-PRERELEASE
commit=97fb144aa877f2196a54647b7c557af84b019736

config.primary.sh:
#!/usr/bin/env bash

# *** INSTANCE-SPECIFIC CONFIGURATION OPTIONS ***

# cake-autorate will run one instance per config file present in the /root/cake-autorate
# directory in the form: config.instance.sh. Thus multiple instances of cake-autorate
# can be established by setting up appropriate config files like config.primary.sh and
# config.secondary.sh for the respective first and second instances of cake-autorate.

# For multihomed setups, it is the responsibility of the user to ensure that the probes
# sent by this instance of cake-autorate actually travel through these interfaces.
# See ping_extra_args and ping_prefix_string

dl_if=ifb4wan # download interface
ul_if=wan # upload interface

# Set either of the below to 0 to adjust one direction only
# or alternatively set both to 0 to simply use cake-autorate to monitor a connection

adjust_dl_shaper_rate=1 # enable (1) or disable (0) actually changing the dl shaper rate
adjust_ul_shaper_rate=1 # enable (1) or disable (0) actually changing the ul shaper rate

min_dl_shaper_rate_kbps=4000 # minimum bandwidth for download (Kbit/s)
base_dl_shaper_rate_kbps=20000 # steady state bandwidth for download (Kbit/s)
max_dl_shaper_rate_kbps=60000 # maximum bandwidth for download (Kbit/s)

min_ul_shaper_rate_kbps=2000 # minimum bandwidth for upload (Kbit/s)
base_ul_shaper_rate_kbps=5000 # steady state bandwidth for upload (KBit/s)
max_ul_shaper_rate_kbps=10000 # maximum bandwidth for upload (Kbit/s)

connection_active_thr_kbps=2000 # threshold in Kbit/s below which dl/ul is considered idle

# Logging toggles for various stats

output_processing_stats=0 # enable (1) or disable (0) output monitoring lines showing processing stats
output_load_stats=0 # enable (1) or disable (0) output monitoring lines showing achieved loads
output_reflector_stats=0 # enable (1) or disable (0) output monitoring lines showing reflector stats
output_summary_stats=0 # enable (1) or disable (0) output monitoring lines showing summary stats

# *** OVERRIDES ***

# See defaults.sh for additional configuration options
# that can be set in this configuration file to override the defaults.
# Place any such overrides below this line.
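
As a quick sanity check on the rate values in the config above, here is a minimal sketch that confirms each direction satisfies min <= base <= max. The `check_rates` helper is hypothetical and not part of cake-autorate; the numbers are copied by hand from the config rather than read from the file.

```shell
#!/bin/sh
# Hypothetical helper, not part of cake-autorate.
# Usage: check_rates LABEL MIN BASE MAX   (all rates in Kbit/s)
check_rates() {
    if [ "$2" -le "$3" ] && [ "$3" -le "$4" ]; then
        echo "$1: ok ($2 <= $3 <= $4)"
    else
        echo "$1: inconsistent (min=$2 base=$3 max=$4)"
        return 1
    fi
}

# Values copied from config.primary.sh as posted above.
check_rates dl 4000 20000 60000
check_rates ul 2000 5000 10000
```

Both directions pass for the posted values, so the configured ordering of the rates itself does not look like the problem.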

How long did you give it before deciding it was stuck?

This time I gave it a few hours. The previous time I gave it a few days, because I was frustrated and assumed it was a T-Mobile issue related to the storms. For whatever reason, last time I did not think to connect to the T-Mobile Wi-Fi directly and double-check speeds until a few days later.

I did some looking, and this time I found an uninstall script on the GitHub page. I ran that, rebooted, and reinstalled cake-autorate while keeping the same config file. It looks like it's working again.

Question: did the version change? Maybe the behaviour was changed for the better by the update...

It appears that a different version is installed.
cake_autorate_version="3.3.0-PRERELEASE"

Sorry for being unclear about "working" above. The reboot issue did not seem to be resolved; it was working only in the sense that the network speed came back temporarily. Overall the issue does seem to remain for me. The service appeared to work after the reinstall, but a short time later, after a brownout during the storm, the router rebooted and the issue started again. I had to do the uninstall and reinstall again to get back to normal.

Ah, I guess I need to see an autorate logfile from the working and the non-working condition to make heads or tails of this report. I am sure @Lynx will help us with the current invocation to get a logfile... (thanks in advance)

Seems like cake-autorate is not surviving a reboot in this instance? Can’t think what’s going on since when an interface is down it normally just keeps retrying.

Just to add: after the router reboots and things are not working, the status of the cake-autorate service still shows as running. I know it seems a bit strange, but one of the things I tried previously was to stop and also disable the cake-autorate service. Even then, after rebooting the router to be sure it did not start, nothing was better.

If possible I would like to see the following both for the working as expected and the 'wedged performance' condition:

  1. tc -s qdisc
  2. an autorate logfile (ideally with information at what time the issue occurred)
  3. a full screenshot of the final output of https://speed.cloudflare.com (I want to see the working latency plots as well as all the rest). ATTENTION, the top right info box contains your estimated location and your IP address, if these are sensitive, redact them before posting.
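
To make the working and wedged captures easy to compare, something like the following sketch could save the service status and the `tc -s qdisc` output to a labelled, timestamped file. The function name, the label argument, and the default path under /tmp are illustrative assumptions; it also assumes the init script lives at /etc/init.d/cake-autorate, which is what `service cake-autorate status` would normally call on OpenWrt.

```shell
#!/bin/sh
# Sketch only: function name, label argument and default output path
# are illustrative assumptions, not part of cake-autorate.
snapshot_state() {
    label="${1:-unlabelled}"   # e.g. "working" or "wedged"
    out="${2:-/tmp/autorate-snapshot-$label-$(date +%Y%m%d-%H%M%S).txt}"
    {
        echo "condition: $label"
        echo "taken at: $(date)"
        echo "=== service status ==="
        if [ -x /etc/init.d/cake-autorate ]; then
            # Keep going even if the status call returns non-zero.
            /etc/init.d/cake-autorate status || true
        else
            echo "(init script not found)"
        fi
        echo "=== tc -s qdisc ==="
        if command -v tc >/dev/null 2>&1; then
            tc -s qdisc || true
        else
            echo "(tc not found)"
        fi
    } >"$out" 2>&1
    # Print the path so the caller knows which file to attach.
    echo "$out"
}
```

Running `snapshot_state working` while things are fine and `snapshot_state wedged` right after the problem shows up would give two files that can be posted side by side.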

Also output from:

logread | grep cake-autorate

shortly after problem manifests would be helpful.

I suspect for some reason the cake interfaces are not set up after the reboot, but I'm not sure. Hopefully the tc -s qdisc will show that.
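
A rough way to test that suspicion without reading the whole dump is to check whether a cake qdisc is listed on each shaped interface. This is only a sketch: `has_cake` is a hypothetical helper that parses tc output from stdin, and the interface names wan and ifb4wan are taken from the config posted earlier.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the tc output on stdin contains a
# "qdisc cake ... dev <name>" line for the device given as $1.
has_cake() {
    awk -v dev="$1" '
        $1 == "qdisc" && $2 == "cake" {
            for (i = 3; i < NF; i++)
                if ($i == "dev" && $(i + 1) == dev) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}

# Example use on the router (interface names from config.primary.sh):
#   for ifc in wan ifb4wan; do
#       if tc qdisc show | has_cake "$ifc"; then
#           echo "$ifc: cake present"
#       else
#           echo "$ifc: cake MISSING"
#       fi
#   done
```

If either interface reports cake as missing after a reboot, that would point at the shaper setup rather than at cake-autorate itself.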

Thanks,
I will try to find a time to reboot and recreate the issue, then run the command for the log and grab screenshots. I need to find a time when others are not relying on the Internet. Maybe late Thursday or Friday night.

Will do hopefully soon. See above. Thanks.

Well, I feel a bit foolish now. I swear it was an issue, but now I can’t recreate it. I tried rebooting the router multiple times, as well as just pulling the power cord, and I can’t duplicate what I was seeing before.

Just because, here’s the output of the requested commands, run shortly before sending this, so around 1:35pm CST.

root@OpenWrt:~# service cake-autorate status
running
root@OpenWrt:~# tc -s qdisc
qdisc noqueue 0: dev lo root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc mq 0: dev eth0 root
Sent 12518303 bytes 53103 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc fq_codel 0: dev eth0 parent :8 limit 10240p flows 1024 quantum 1522 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth0 parent :7 limit 10240p flows 1024 quantum 1522 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth0 parent :6 limit 10240p flows 1024 quantum 1522 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64
maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth0 parent :5 limit 10240p flows 1024 quantum 1522 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth0 parent :4 limit 10240p flows 1024 quantum 1522 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth0 parent :3 limit 10240p flows 1024 quantum 1522 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth0 parent :2 limit 10240p flows 1024 quantum 1522 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth0 parent :1 limit 10240p flows 1024 quantum 1522 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64
Sent 12518303 bytes 53103 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev lan4 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev lan3 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev lan2 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev lan1 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc cake 8011: dev wan root refcnt 2 bandwidth 5Mbit besteffort triple-isolate nonat nowash no-ack-filter split-gso rtt 100ms noatm overhead 44
Sent 11735184 bytes 50167 pkt (dropped 59, overlimits 47998 requeues 0)
backlog 0b 0p requeues 0
memory used: 486272b of 4Mb
capacity estimate: 8Mbit
min/max network layer size: 28 / 1436
min/max overhead-adjusted size: 72 / 1480
average network hdr offset: 14

              Tin 0 

thresh 5Mbit
target 5ms
interval 100ms
pk_delay 2.01ms
av_delay 100us
sp_delay 6us
backlog 0b
pkts 50226
bytes 11813079
way_inds 2804
way_miss 3176
way_cols 0
drops 59
marks 0
ack_drop 0
sp_flows 7
bk_flows 1
un_flows 0
max_len 1450
quantum 300

qdisc ingress ffff: dev wan parent ffff:fff1 ----------------
Sent 98732007 bytes 84804 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev br-lan root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev phy0-ap0 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev phy1-ap0 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc cake 8012: dev ifb4wan root refcnt 2 bandwidth 20Mbit besteffort triple-isolate nonat wash no-ack-filter split-gso rtt 100ms noatm overhead 44
Sent 98947421 bytes 84030 pkt (dropped 774, overlimits 138179 requeues 0)
backlog 0b 0p requeues 0
memory used: 719276b of 4Mb
capacity estimate: 60Mbit
min/max network layer size: 46 / 1456
min/max overhead-adjusted size: 90 / 1500
average network hdr offset: 14

              Tin 0

thresh 20Mbit
target 5ms
interval 100ms
pk_delay 39ms
av_delay 14.9ms
sp_delay 3us
backlog 0b
pkts 84804
bytes 100054299
way_inds 2212
way_miss 2204
way_cols 0
drops 774
marks 0
ack_drop 0
sp_flows 6
bk_flows 1
un_flows 0
max_len 15774
quantum 610

qdisc noqueue 0: dev phy1-ap1 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
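
In the output above, cake reports bandwidth 5Mbit on wan and 20Mbit on ifb4wan, which matches base_ul_shaper_rate_kbps=5000 and base_dl_shaper_rate_kbps=20000 from the config. To pull just those values out of a dump for comparison, a small sketch along these lines might help. `cake_bandwidth` is a hypothetical helper that reads tc output from stdin.

```shell
#!/bin/sh
# Hypothetical helper: prints the "bandwidth" value of the cake qdisc
# attached to the device given as $1, reading tc output from stdin.
# Prints nothing if no cake qdisc is found on that device.
cake_bandwidth() {
    awk -v dev="$1" '
        $1 == "qdisc" && $2 == "cake" {
            match_dev = 0
            for (i = 3; i < NF; i++) {
                if ($i == "dev" && $(i + 1) == dev) match_dev = 1
                if ($i == "bandwidth" && match_dev) { print $(i + 1); exit }
            }
        }
    '
}

# Example use on the router:
#   tc qdisc show | cake_bandwidth wan
#   tc qdisc show | cake_bandwidth ifb4wan
```

Comparing these values between the working and the wedged state would show whether the shaper rate itself is what collapses after a reboot.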

Was not able to recreate the issue again. I suppose that's good. Here is the output regardless.

root@OpenWrt:~# logread | grep cake-autorate
Wed Jun 5 13:20:33 2024 user.notice cake-autorate.primary: INFO: 1717611633.858113 Starting cake-autorate with PID: 4582 and config: /root/cake-autorate/config.primary.sh
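
The start line above includes the PID, so one possible follow-up is to check whether that process is still alive when the problem shows up. The sketch below only relies on the "Starting cake-autorate with PID: N" wording seen in this log line; `autorate_pids` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads log lines on stdin and prints the PID from
# every "Starting cake-autorate with PID: N" line, one PID per line.
autorate_pids() {
    sed -n 's/.*Starting cake-autorate with PID: \([0-9][0-9]*\).*/\1/p'
}

# Example use on the router:
#   logread | autorate_pids | while read -r pid; do
#       if kill -0 "$pid" 2>/dev/null; then
#           echo "$pid: still running"
#       else
#           echo "$pid: no longer running"
#       fi
#   done
```

More than one PID in the list would also suggest that the service was started several times within the period covered by the log buffer.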

Things like this happen, and occasionally the root cause is still a real bug that is simply hard to trigger; folks sometimes call these Heisenbugs. But if things work now, I guess we are all happy?