CAKE w/ Adaptive Bandwidth

Hey @rb1 , sure, I can share the script. I'm currently at work and just realized that my wireguard tunnel to my server is down :pensive: so I can only share it later.

Meanwhile, shouldn't we split this discussion into a new thread, or even proceed from the SQM Reporting thread, where it suits better?

EDIT: @rb1 In case you want to start something without waiting for my script, I can advance some details:

  1. The script reads the JSON output from tc -s -j qdisc dev ...., then uses jq to save each metric into a small file named after the metric. So, at each run, it updates the files bytes, bandwidth, drops and so on. I set it to run every 10 seconds as a systemd service, started by a timer. Warning: there is one jq execution per metric, so the script isn't exactly resource friendly;
  2. I created some extended SNMP OIDs, one for each metric. Each OID's value is obtained simply by running cat on the corresponding metric file;
  3. In collectd, I use the snmp plugin, and set it up with all the metric OIDs.

By the way, there are two servers (x86/64) involved here: the router (where the script and snmp runs) and the grafana/prometheus/collectd server.

After starting all this, I noticed an increase in the router CPU usage from around 1% to 8%, mainly due to the script running every 10 seconds, as well as snmp being called (by collectd) every 10 seconds.

This is a messy setup, I recognize, but like I said before, I took the script + snmp processes that were already running as collectors for my now dead Cacti monitoring. It can certainly be improved and simplified. But since I have spare resources in the router, I kinda relaxed (well... maybe I should look closely at the energy bill...)


ok, here's the script that pulls cake metrics on the router.

This one is for upstream. There's another one for pulling downstream metrics.


/sbin/tc -s -j qdisc show dev enp2s0 > /tmp/cakeup

cat /tmp/cakeup | jq '.[].backlog' > /etc/snmp/cakeresults/backlogup
cat /tmp/cakeup | jq '.[].bytes' > /etc/snmp/cakeresults/bytesup
cat /tmp/cakeup | jq '.[].packets' > /etc/snmp/cakeresults/packetsup
cat /tmp/cakeup | jq '.[].qlen' > /etc/snmp/cakeresults/qlenup
cat /tmp/cakeup | jq '.[].drops' > /etc/snmp/cakeresults/dropsup
cat /tmp/cakeup | jq '.[].options.bandwidth' > /etc/snmp/cakeresults/bandwidthup

# bulk
cat /tmp/cakeup | jq '.[].tins[0].target_us' > /etc/snmp/cakeresults/0targetup
cat /tmp/cakeup | jq '.[].tins[0].peak_delay_us' > /etc/snmp/cakeresults/0peak_delayup
cat /tmp/cakeup | jq '.[].tins[0].avg_delay_us' > /etc/snmp/cakeresults/0avg_delayup
cat /tmp/cakeup | jq '.[].tins[0].base_delay_us' > /etc/snmp/cakeresults/0base_delayup
cat /tmp/cakeup | jq '.[].tins[0].drops' > /etc/snmp/cakeresults/0dropsup
cat /tmp/cakeup | jq '.[].tins[0].ecn_mark' > /etc/snmp/cakeresults/0ecn_markup
cat /tmp/cakeup | jq '.[].tins[0].sparse_flows' > /etc/snmp/cakeresults/0sparse_flowsup
cat /tmp/cakeup | jq '.[].tins[0].bulk_flows' > /etc/snmp/cakeresults/0bulk_flowsup
cat /tmp/cakeup | jq '.[].tins[0].unresponsive_flows' > /etc/snmp/cakeresults/0unresponsive_flowsup
cat /tmp/cakeup | jq '.[].tins[0].sent_bytes' > /etc/snmp/cakeresults/0sent_bytesup
# best effort
cat /tmp/cakeup | jq '.[].tins[1].target_us' > /etc/snmp/cakeresults/1targetup
cat /tmp/cakeup | jq '.[].tins[1].peak_delay_us' > /etc/snmp/cakeresults/1peak_delayup
cat /tmp/cakeup | jq '.[].tins[1].avg_delay_us' > /etc/snmp/cakeresults/1avg_delayup
cat /tmp/cakeup | jq '.[].tins[1].base_delay_us' > /etc/snmp/cakeresults/1base_delayup
cat /tmp/cakeup | jq '.[].tins[1].drops' > /etc/snmp/cakeresults/1dropsup
cat /tmp/cakeup | jq '.[].tins[1].ecn_mark' > /etc/snmp/cakeresults/1ecn_markup
cat /tmp/cakeup | jq '.[].tins[1].sparse_flows' > /etc/snmp/cakeresults/1sparse_flowsup
cat /tmp/cakeup | jq '.[].tins[1].bulk_flows' > /etc/snmp/cakeresults/1bulk_flowsup
cat /tmp/cakeup | jq '.[].tins[1].unresponsive_flows' > /etc/snmp/cakeresults/1unresponsive_flowsup
cat /tmp/cakeup | jq '.[].tins[1].sent_bytes' > /etc/snmp/cakeresults/1sent_bytesup
# video
cat /tmp/cakeup | jq '.[].tins[2].target_us' > /etc/snmp/cakeresults/2targetup
cat /tmp/cakeup | jq '.[].tins[2].peak_delay_us' > /etc/snmp/cakeresults/2peak_delayup
cat /tmp/cakeup | jq '.[].tins[2].avg_delay_us' > /etc/snmp/cakeresults/2avg_delayup
cat /tmp/cakeup | jq '.[].tins[2].base_delay_us' > /etc/snmp/cakeresults/2base_delayup
cat /tmp/cakeup | jq '.[].tins[2].drops' > /etc/snmp/cakeresults/2dropsup
cat /tmp/cakeup | jq '.[].tins[2].ecn_mark' > /etc/snmp/cakeresults/2ecn_markup
cat /tmp/cakeup | jq '.[].tins[2].sparse_flows' > /etc/snmp/cakeresults/2sparse_flowsup
cat /tmp/cakeup | jq '.[].tins[2].bulk_flows' > /etc/snmp/cakeresults/2bulk_flowsup
cat /tmp/cakeup | jq '.[].tins[2].unresponsive_flows' > /etc/snmp/cakeresults/2unresponsive_flowsup
cat /tmp/cakeup | jq '.[].tins[2].sent_bytes' > /etc/snmp/cakeresults/2sent_bytesup
# voice
cat /tmp/cakeup | jq '.[].tins[3].target_us' > /etc/snmp/cakeresults/3targetup
cat /tmp/cakeup | jq '.[].tins[3].peak_delay_us' > /etc/snmp/cakeresults/3peak_delayup
cat /tmp/cakeup | jq '.[].tins[3].avg_delay_us' > /etc/snmp/cakeresults/3avg_delayup
cat /tmp/cakeup | jq '.[].tins[3].base_delay_us' > /etc/snmp/cakeresults/3base_delayup
cat /tmp/cakeup | jq '.[].tins[3].drops' > /etc/snmp/cakeresults/3dropsup
cat /tmp/cakeup | jq '.[].tins[3].ecn_mark' > /etc/snmp/cakeresults/3ecn_markup
cat /tmp/cakeup | jq '.[].tins[3].sparse_flows' > /etc/snmp/cakeresults/3sparse_flowsup
cat /tmp/cakeup | jq '.[].tins[3].bulk_flows' > /etc/snmp/cakeresults/3bulk_flowsup
cat /tmp/cakeup | jq '.[].tins[3].unresponsive_flows' > /etc/snmp/cakeresults/3unresponsive_flowsup
cat /tmp/cakeup | jq '.[].tins[3].sent_bytes' > /etc/snmp/cakeresults/3sent_bytesup

#cat /tmp/cakeupout | tr '\r\n' ' '  && echo " "  > /tmp/cakeupout

#cat /tmp/cakeupout | sed -i 's/null/0/' /tmp/cakeupout

#cat /tmp/cakeupout

(I'm really embarrassed to share this dumb, inefficient piece of code ...)
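For what it's worth, the per-metric jq calls above could probably be collapsed into a single jq pass. The following is only a sketch under assumptions: the function name `write_cake_metrics`, its three arguments, and the choice to pick the qdisc by `.kind == "cake"` are mine, not part of the original script. It writes the same file names (e.g. `bytesup`, `0targetup`) and replaces null with 0.

```shell
#!/bin/sh
# Sketch: write every cake metric with ONE jq invocation.
# Assumptions (not from the original script): function name, arguments,
# and selecting the qdisc via .kind == "cake".

write_cake_metrics() {
    json_file=$1   # saved output of: tc -s -j qdisc show dev <iface>
    out_dir=$2     # directory read by snmpd, e.g. /etc/snmp/cakeresults
    suffix=$3      # "up" for upstream, empty string for downstream

    # jq prints one "<name> <value>" pair per line; null values become 0.
    jq -r '
      .[] | select(.kind == "cake") as $q
      | ( $q
          | {backlog, bytes, packets, qlen, drops, bandwidth: .options.bandwidth}
          | to_entries[]
          | "\(.key) \(.value // 0)" ),
        ( range($q.tins | length) as $i
          | $q.tins[$i]
          | {target_us, peak_delay_us, avg_delay_us, base_delay_us, drops,
             ecn_mark, sparse_flows, bulk_flows, unresponsive_flows, sent_bytes}
          | to_entries[]
          | "\($i)\(.key | rtrimstr("_us")) \(.value // 0)" )
    ' "$json_file" |
    while read -r name value; do
        # One small file per metric, same naming as the original script.
        printf '%s\n' "$value" > "$out_dir/$name$suffix"
    done
}
```

Usage would be something like `/sbin/tc -s -j qdisc show dev enp2s0 > /tmp/cakeup` followed by `write_cake_metrics /tmp/cakeup /etc/snmp/cakeresults up`.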

Having all these small files available, then a correctly configured snmp daemon is able to deliver them whenever needed:
(excerpt from /etc/snmpd.conf)


#  Arbitrary extension commands
# sqm down
extend backlog /bin/cat /etc/snmp/cakeresults/backlog
extend bandwidth /bin/cat /etc/snmp/cakeresults/bandwidth
extend drops /bin/cat /etc/snmp/cakeresults/drops
extend bytes /bin/cat /etc/snmp/cakeresults/bytes
extend packets /bin/cat /etc/snmp/cakeresults/packets
extend qlen /bin/cat /etc/snmp/cakeresults/qlen
# sqm down - bulk
extend 0target /bin/cat /etc/snmp/cakeresults/0target
extend 0peak_delay /bin/cat /etc/snmp/cakeresults/0peak_delay
extend 0avg_delay /bin/cat /etc/snmp/cakeresults/0avg_delay
extend 0base_delay /bin/cat /etc/snmp/cakeresults/0base_delay
extend 0drops /bin/cat /etc/snmp/cakeresults/0drops
extend 0ecn_mark /bin/cat /etc/snmp/cakeresults/0ecn_mark
extend 0sparse_flows /bin/cat /etc/snmp/cakeresults/0sparse_flows
extend 0bulk_flows /bin/cat /etc/snmp/cakeresults/0bulk_flows
extend 0unresponsive_flows /bin/cat /etc/snmp/cakeresults/0unresponsive_flows
extend 0sent_bytes /bin/cat /etc/snmp/cakeresults/0sent_bytes
# sqm down - best_effort
extend 1target /bin/cat /etc/snmp/cakeresults/1target
extend 1peak_delay /bin/cat /etc/snmp/cakeresults/1peak_delay
extend 1avg_delay /bin/cat /etc/snmp/cakeresults/1avg_delay
extend 1base_delay /bin/cat /etc/snmp/cakeresults/1base_delay
extend 1drops /bin/cat /etc/snmp/cakeresults/1drops
extend 1ecn_mark /bin/cat /etc/snmp/cakeresults/1ecn_mark
extend 1sparse_flows /bin/cat /etc/snmp/cakeresults/1sparse_flows
extend 1bulk_flows /bin/cat /etc/snmp/cakeresults/1bulk_flows
extend 1unresponsive_flows /bin/cat /etc/snmp/cakeresults/1unresponsive_flows
extend 1sent_bytes /bin/cat /etc/snmp/cakeresults/1sent_bytes
# sqm down - video
extend 2target /bin/cat /etc/snmp/cakeresults/2target
extend 2peak_delay /bin/cat /etc/snmp/cakeresults/2peak_delay
extend 2avg_delay /bin/cat /etc/snmp/cakeresults/2avg_delay
extend 2base_delay /bin/cat /etc/snmp/cakeresults/2base_delay
extend 2drops /bin/cat /etc/snmp/cakeresults/2drops
extend 2ecn_mark /bin/cat /etc/snmp/cakeresults/2ecn_mark
extend 2sparse_flows /bin/cat /etc/snmp/cakeresults/2sparse_flows
extend 2bulk_flows /bin/cat /etc/snmp/cakeresults/2bulk_flows
extend 2unresponsive_flows /bin/cat /etc/snmp/cakeresults/2unresponsive_flows
extend 2sent_bytes /bin/cat /etc/snmp/cakeresults/2sent_bytes
# sqm down - voice
extend 3target /bin/cat /etc/snmp/cakeresults/3target
extend 3peak_delay /bin/cat /etc/snmp/cakeresults/3peak_delay
extend 3avg_delay /bin/cat /etc/snmp/cakeresults/3avg_delay
extend 3base_delay /bin/cat /etc/snmp/cakeresults/3base_delay
extend 3drops /bin/cat /etc/snmp/cakeresults/3drops
extend 3ecn_mark /bin/cat /etc/snmp/cakeresults/3ecn_mark
extend 3sparse_flows /bin/cat /etc/snmp/cakeresults/3sparse_flows
extend 3bulk_flows /bin/cat /etc/snmp/cakeresults/3bulk_flows
extend 3unresponsive_flows /bin/cat /etc/snmp/cakeresults/3unresponsive_flows
extend 3sent_bytes /bin/cat /etc/snmp/cakeresults/3sent_bytes
#sqm up
extend backlogup /bin/cat /etc/snmp/cakeresults/backlogup
extend bandwidthup /bin/cat /etc/snmp/cakeresults/bandwidthup
extend dropsup /bin/cat /etc/snmp/cakeresults/dropsup    
extend bytesup /bin/cat /etc/snmp/cakeresults/bytesup
extend packetsup /bin/cat /etc/snmp/cakeresults/packetsup
extend qlenup /bin/cat /etc/snmp/cakeresults/qlenup
#sqm up - bulk
extend 0targetup /bin/cat /etc/snmp/cakeresults/0targetup
extend 0peak_delayup /bin/cat /etc/snmp/cakeresults/0peak_delayup
extend 0avg_delayup /bin/cat /etc/snmp/cakeresults/0avg_delayup
extend 0base_delayup /bin/cat /etc/snmp/cakeresults/0base_delayup
extend 0dropsup /bin/cat /etc/snmp/cakeresults/0dropsup
extend 0ecn_markup /bin/cat /etc/snmp/cakeresults/0ecn_markup
extend 0sparse_flowsup /bin/cat /etc/snmp/cakeresults/0sparse_flowsup
extend 0bulk_flowsup /bin/cat /etc/snmp/cakeresults/0bulk_flowsup
extend 0unresponsive_flowsup /bin/cat /etc/snmp/cakeresults/0unresponsive_flowsup
extend 0sent_bytesup /bin/cat /etc/snmp/cakeresults/0sent_bytesup
#sqm up - best effort
extend 1targetup /bin/cat /etc/snmp/cakeresults/1targetup
extend 1peak_delayup /bin/cat /etc/snmp/cakeresults/1peak_delayup
extend 1avg_delayup /bin/cat /etc/snmp/cakeresults/1avg_delayup
extend 1base_delayup /bin/cat /etc/snmp/cakeresults/1base_delayup
extend 1dropsup /bin/cat /etc/snmp/cakeresults/1dropsup
extend 1ecn_markup /bin/cat /etc/snmp/cakeresults/1ecn_markup
extend 1sparse_flowsup /bin/cat /etc/snmp/cakeresults/1sparse_flowsup
extend 1bulk_flowsup /bin/cat /etc/snmp/cakeresults/1bulk_flowsup
extend 1unresponsive_flowsup /bin/cat /etc/snmp/cakeresults/1unresponsive_flowsup
extend 1sent_bytesup /bin/cat /etc/snmp/cakeresults/1sent_bytesup
#sqm up - video
extend 2targetup /bin/cat /etc/snmp/cakeresults/2targetup
extend 2peak_delayup /bin/cat /etc/snmp/cakeresults/2peak_delayup
extend 2avg_delayup /bin/cat /etc/snmp/cakeresults/2avg_delayup
extend 2base_delayup /bin/cat /etc/snmp/cakeresults/2base_delayup
extend 2dropsup /bin/cat /etc/snmp/cakeresults/2dropsup
extend 2ecn_markup /bin/cat /etc/snmp/cakeresults/2ecn_markup
extend 2sparse_flowsup /bin/cat /etc/snmp/cakeresults/2sparse_flowsup
extend 2bulk_flowsup /bin/cat /etc/snmp/cakeresults/2bulk_flowsup
extend 2unresponsive_flowsup /bin/cat /etc/snmp/cakeresults/2unresponsive_flowsup
extend 2sent_bytesup /bin/cat /etc/snmp/cakeresults/2sent_bytesup
#sqm up - voice
extend 3targetup /bin/cat /etc/snmp/cakeresults/3targetup
extend 3peak_delayup /bin/cat /etc/snmp/cakeresults/3peak_delayup
extend 3avg_delayup /bin/cat /etc/snmp/cakeresults/3avg_delayup
extend 3base_delayup /bin/cat /etc/snmp/cakeresults/3base_delayup
extend 3dropsup /bin/cat /etc/snmp/cakeresults/3dropsup
extend 3ecn_markup /bin/cat /etc/snmp/cakeresults/3ecn_markup
extend 3sparse_flowsup /bin/cat /etc/snmp/cakeresults/3sparse_flowsup
extend 3bulk_flowsup /bin/cat /etc/snmp/cakeresults/3bulk_flowsup
extend 3unresponsive_flowsup /bin/cat /etc/snmp/cakeresults/3unresponsive_flowsup
extend 3sent_bytesup /bin/cat /etc/snmp/cakeresults/3sent_bytesup

Again, still embarrassed, because there's certainly a cleverer way of configuring the OIDs.
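One low-tech option is to generate the `extend` lines with a loop instead of maintaining them by hand. This is just a sketch: the function name `gen_extend_lines` is hypothetical, and it only reproduces the same metric names and paths as the excerpt above.

```shell
#!/bin/sh
# Sketch: print the snmpd "extend" lines for all cake metrics.
# gen_extend_lines is a hypothetical helper, not part of net-snmp.

gen_extend_lines() {
    dir=${1:-/etc/snmp/cakeresults}
    for suffix in "" up; do                       # "" = down, "up" = up
        for m in backlog bandwidth drops bytes packets qlen; do
            printf 'extend %s /bin/cat %s/%s\n' "$m$suffix" "$dir" "$m$suffix"
        done
        for tin in 0 1 2 3; do                    # bulk, best effort, video, voice
            for m in target peak_delay avg_delay base_delay drops ecn_mark \
                     sparse_flows bulk_flows unresponsive_flows sent_bytes; do
                printf 'extend %s /bin/cat %s/%s\n' \
                    "$tin$m$suffix" "$dir" "$tin$m$suffix"
            done
        done
    done
}
```

Running `gen_extend_lines` prints the lines to stdout, so they can be reviewed before being pasted into /etc/snmpd.conf.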

This ends the router side.

On the monitor server side, collectd snmp plugin is configured to issue snmpgets (or walks, not sure) to finally pull the metrics:

<Plugin snmp>
#	<Data "powerplus_voltge_input">
#		Type "voltage"
#		Table false
#		Instance "input_line1"
#		Scale 0.1
#		Values "SNMPv2-SMI::enterprises.6050."
#	</Data>
#	<Data "hr_users">
#		Type "users"
#		Table false
#		Instance ""
#		Shift -1
#		Values "HOST-RESOURCES-MIB::hrSystemNumUsers.0"
#	</Data>
#	<Data "std_traffic">
#		Type "if_octets"
#		Table true
#		InstancePrefix "traffic"
#		Instance "IF-MIB::ifDescr"
#		Values "IF-MIB::ifInOctets" "IF-MIB::ifOutOctets"
#	</Data>
#	Cake - Down
	<Data "cake_bandwidth">
		Type  "cake_bandwidth"
		Table false
		TypeInstance ""
		Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"bandwidth\""
	</Data>
        <Data "cake_drops">
                Type  "cake_drops"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"drops\""
        </Data>
        <Data "cake_bytes">
                Type  "cake_bytes"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"bytes\""
        </Data>
        <Data "cake_packets">
                Type  "cake_packets"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"packets\""
        </Data>
        <Data "cake_backlog">
                Type  "cake_backlog"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"backlog\""
        </Data>
        <Data "cake_qlen">
                Type  "cake_qlen"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"qlen\""
        </Data>
#       Cake - Up
        <Data "cake_bandwidthup">
                Type  "cake_bandwidthup"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"bandwidthup\""
        </Data>
        <Data "cake_dropsup">
                Type  "cake_dropsup"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"dropsup\""
        </Data>
        <Data "cake_bytesup">
                Type  "cake_bytesup"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"bytesup\""
        </Data>
        <Data "cake_packetsup">
                Type  "cake_packetsup"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"packetsup\""
        </Data>
        <Data "cake_backlogup">
                Type  "cake_backlogup"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"backlogup\""
        </Data>
        <Data "cake_qlenup">
                Type  "cake_qlenup"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"qlenup\""
        </Data>
        <Data "cake_0target">
                Type  "cake_0target"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"0target\""
        </Data>
        <Data "cake_0targetup">
                Type  "cake_0targetup"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"0targetup\""
        </Data>
        <Data "cake_1target">
                Type  "cake_1target"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"1target\""
        </Data>
        <Data "cake_1targetup">
                Type  "cake_1targetup"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"1targetup\""
        </Data>
        <Data "cake_2target">
                Type  "cake_2target"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"2target\""
        </Data>
        <Data "cake_2targetup">
                Type  "cake_2targetup"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"2targetup\""
        </Data>
        <Data "cake_3target">
                Type  "cake_3target"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"3target\""
        </Data>
        <Data "cake_3targetup">
                Type  "cake_3targetup"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"3targetup\""
        </Data>
        <Data "cake_0peak_delay">
                Type  "cake_0peak_delay"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"0peak_delay\""
        </Data>
        <Data "cake_0peak_delayup">
                Type  "cake_0peak_delayup"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"0peak_delayup\""
        </Data>
        <Data "cake_1peak_delay">
                Type  "cake_1peak_delay"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"1peak_delay\""
        </Data>
        <Data "cake_1peak_delayup">
                Type  "cake_1peak_delayup"
                Table false
                TypeInstance ""
                Values "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\"1peak_delayup\""
        </Data>
(there's more in here but I omitted it because the post is already too long)

#	<Host "">
#		Address ""
#		Version 1
#		Community "community_string"
#		Collect "std_traffic"
#		Interval 120
#		Timeout 10
#		Retries 1
#	</Host>
	<Host "">
		Address ""
		Version 2
		Community "public"
		Collect "cake_bandwidth" "cake_drops" "cake_bytes" "cake_packets" "cake_backlog" "cake_qlen" \
                        "cake_bandwidthup" "cake_dropsup" "cake_bytesup" "cake_packetsup" "cake_backlogup" "cake_qlenup" \
                        "cake_0target" "cake_0targetup" "cake_1target" "cake_1targetup" "cake_2target" "cake_2targetup" "cake_3target" "cake_3targetup" \
                        "cake_0peak_delay" "cake_0peak_delayup"  "cake_1peak_delay" "cake_1peak_delayup"  "cake_2peak_delay" "cake_2peak_delayup"  "cake_3peak_delay" "cake_3peak_delayup" \
                        "cake_0avg_delay" "cake_0avg_delayup" "cake_1avg_delay" "cake_1avg_delayup" "cake_2avg_delay" "cake_2avg_delayup" "cake_3avg_delay" "cake_3avg_delayup" \
                        "cake_0base_delay" "cake_0base_delayup"  "cake_1base_delay" "cake_1base_delayup"  "cake_2base_delay" "cake_2base_delayup"  "cake_3base_delay" "cake_3base_delayup" \
                        "cake_0drops" "cake_0dropsup"   "cake_1drops" "cake_1dropsup"   "cake_2drops" "cake_2dropsup"   "cake_3drops" "cake_3dropsup" \
                        "cake_0ecn_mark" "cake_0ecn_markup"  "cake_1ecn_mark" "cake_1ecn_markup"  "cake_2ecn_mark" "cake_2ecn_markup"  "cake_3ecn_mark" "cake_3ecn_markup" \
                        "cake_0sparse_flows" "cake_0sparse_flowsup"  "cake_1sparse_flows" "cake_1sparse_flowsup"  "cake_2sparse_flows" "cake_2sparse_flowsup"  "cake_3sparse_flows" "cake_3sparse_flowsup" \
                        "cake_0bulk_flows" "cake_0bulk_flowsup" "cake_1bulk_flows" "cake_1bulk_flowsup" "cake_2bulk_flows" "cake_2bulk_flowsup" "cake_3bulk_flows" "cake_3bulk_flowsup" \
                        "cake_0unresponsive_flows" "cake_0unresponsive_flowsup" "cake_1unresponsive_flows" "cake_1unresponsive_flowsup" "cake_2unresponsive_flows" "cake_2unresponsive_flowsup" "cake_3unresponsive_flows" "cake_3unresponsive_flowsup" \
                        "cake_0sent_bytes" "cake_0sent_bytesup" "cake_1sent_bytes" "cake_1sent_bytesup" "cake_2sent_bytes" "cake_2sent_bytesup" "cake_3sent_bytes" "cake_3sent_bytesup"
	</Host>
</Plugin>

The part where prometheus grabs data from collectd is, I believe, not an issue for you.
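Since every Data block in the collectd snmp config above has the same shape, the blocks could be generated rather than typed. A minimal sketch, assuming a hypothetical helper name `gen_collectd_data`; note the custom Type names (e.g. "cake_bandwidth") would still need matching entries in a collectd types.db file, which is not shown in this post.

```shell
#!/bin/sh
# Sketch: print one collectd snmp-plugin <Data> block for a metric name.
# gen_collectd_data is a hypothetical helper, not a collectd tool.

gen_collectd_data() {
    name=$1   # e.g. bandwidth, dropsup, 0targetup
    printf '\t<Data "cake_%s">\n' "$name"
    printf '\t\tType  "cake_%s"\n' "$name"
    printf '\t\tTable false\n'
    printf '\t\tTypeInstance ""\n'
    # The OID index is the quoted name given to "extend" in snmpd.conf.
    printf '\t\tValues "NET-SNMP-EXTEND-MIB::nsExtendOutput1Line.\\"%s\\""\n' "$name"
    printf '\t</Data>\n'
}

# Example: generate the six top-level downstream blocks.
gen_top_level_blocks() {
    for m in bandwidth drops bytes packets backlog qlen; do
        gen_collectd_data "$m"
    done
}
```

The output of `gen_top_level_blocks` can be redirected to a file and included from the main collectd config.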

Hi @gadolf, thanks for taking the time to explain in detail how you get the data into prometheus. Very interesting. When I get some time, I'll see if I can set up something similar in Grafana.

Thanks for pointing me towards the SQM Reporting thread. Yeah, I guess this could be split into a different thread since it is a bit off topic; I haven't explored this forum much beyond this topic.

One more question, just curious with your cake setup, looks like you have 4 tins: bulk, best effort, video and voice. I'm happy with the built in piece of cake with the one tin but if I use layer_cake.qos or simple.qos, tins I get are bulk, best effort and voice. What do you use to get the video tin?


I use diffserv4 (you're probably using diffserv3)

See my sqm conf file

gustavo@srv:/etc/sqm$ cat enp2s0.iface.conf 
# Default SQM config; the variables defined here will be applied to all
# interfaces. To override values for a particular interface, copy this file to
# <dev>.iface.conf (e.g., "eth0.iface.conf" for eth0).
# When using ifupdown, the interface config file needs to exist for sqm-scripts
# to be activated on that interface. However, these defaults are still applied,
# so the interface config can be empty.

# Uplink and Downlink values are in kbps

# SQM recipe to use. For more information, see /usr/lib/sqm/*.help

# Optional/advanced config




# ECN ingress resp. egress. Values are ECN or NOECN.

# Extra qdisc options ingress resp. egress
IQDISC_OPTS="diffserv4 nat dual-dsthost"
#EQDISC_OPTS="diffserv4 nat dual-srchost ack-filter"
EQDISC_OPTS="diffserv4 nat dual-srchost"

# CoDel target


As for the whole grafana/prometheus/collectd setting, I was thinking that if you use a single machine, there's no need to bring snmp into the equation, and just use collectd_exec plugin, as proposed in that thread I pointed before.
I would probably also be able to eliminate snmp from my setup. All I would have to do is install collectd on the router machine and have prometheus on the monitor machine point to the collectd URL. Maybe I'll do that soon.
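As a rough idea of what the exec-plugin route could look like: collectd's exec plugin reads `PUTVAL` lines from the script's stdout and sets `COLLECTD_HOSTNAME` and `COLLECTD_INTERVAL` in the script's environment. The sketch below only formats such a line; the function name `putval_line`, the plugin instance "cake", and the use of the generic `gauge` type for every metric are my assumptions (counters such as bytes would arguably fit a `derive` type better).

```shell
#!/bin/sh
# Sketch: format one value in the collectd exec plugin's PUTVAL syntax.
# putval_line, the "cake" plugin instance and the "gauge" type are assumptions.

putval_line() {
    metric=$1   # e.g. bandwidthup
    value=$2    # numeric value read from tc/jq
    host=${COLLECTD_HOSTNAME:-localhost}
    interval=${COLLECTD_INTERVAL:-10}
    # "N" means "use the current time as the timestamp".
    printf 'PUTVAL "%s/exec-cake/gauge-%s" interval=%s N:%s\n' \
        "$host" "$metric" "$interval" "$value"
}

# A real exec script would loop forever, for example:
#   while sleep "${COLLECTD_INTERVAL:-10}"; do
#       putval_line bandwidthup "$(cat /etc/snmp/cakeresults/bandwidthup)"
#   done
```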

Please remember: this whole setup runs on Debian machines, not OpenWrt.

Ok thanks. SQM and cake are something I want to learn more about.

I'm thinking about making another python prometheus exporter starting off with the json values in your first script. I've played around with collectd a long time ago, (I know there is the collectd-mod-sqm package) but had some troubles with collectd exporter getting data from it into Prometheus. I believe there is another way using collectd to influx then grafana but haven't tested that yet. Looks like in that SQM reporting thread, there are some nice grafana plots using collectd-influx.

Just tried to install via the method defined on the Github readme page, it looks like I am getting a 404 error when running the following command:

wget -O /tmp/

Is it just me?

Can't tell, I typically do a separate git clone/pull on a separate machine, look at the scripts and manually copy them to my router, so I never tried the installation script. But maybe open a new issue on the github page, so you get the developer's attention to the issue?

Typo in the name? Works for me from here:

That did it - the URL is case sensitive. The correct command is as follows:

wget -O /tmp/

Alright, got it installed, but for some reason the script refuses to function if I want to run QOS for both download and upload.

Surely this is a bug with the script? This is the error message I get:

download interface and upload interface are both set to: 'usb0', but cannot be the same. Exiting script.

Not sure who has download and upload split across two interfaces, that seems like a really rare and overly complex setup.

cake-autorate adjusts the bandwidth of cake instances, which must already have been instantiated - see:

There are always separate instances for download and upload, and so setting the download and upload interface to the same interface does not make sense.

So two questions arise.

Firstly, have you properly instantiated cake instances for download and upload?

Secondly, what are the interfaces on which the cake instances have been applied?

The answer to both questions can be determined by running 'tc qdisc ls'. If you could run that command and paste the output in a response on this thread then we can advise further.
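If the full listing is long, a small filter can pull out just the devices that carry a cake qdisc. This is only a sketch; `cake_ifaces` is a hypothetical helper name, and it reads the text output of 'tc qdisc ls' on stdin rather than calling tc itself.

```shell
#!/bin/sh
# Sketch: list the devices that have a cake qdisc attached.
# Reads `tc qdisc ls` output on stdin; cake_ifaces is a hypothetical name.

cake_ifaces() {
    awk '$1 == "qdisc" && $2 == "cake" {
        # The device name is the word following "dev".
        for (i = 3; i < NF; i++)
            if ($i == "dev") print $(i + 1)
    }'
}

# Usage on the router:  tc qdisc ls | cake_ifaces
# Two different device names are expected: one for upload, one for download.
```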

That is because Linux will only instantiate a qdisc (like cake) on an egress interface (which for WAN would only cover the upload traffic). So to be able to attach qdisc's to ingress traffic we need to somehow convert what is ingress traffic into some sort of egress traffic (as far as the kernel is concerned, ingress and egress only matter in respect to a given interface, so internet download traffic is ingress for WAN, but egress for LAN).
There are three more or less common methods to achieve that:
A) use the kernel's intermediate functional block (IFB) device to create a copy of the egress interface that can deal with ingress traffic (that is what sqm-scripts does)
B) create a veth pair and a matching routing table to redirect ingress traffic over that pair and then attach the qdisc to the egress half of that pair
C) simply instantiate the ingress shaper on a LAN interface (but that only works for wired-only routers where the router itself generates next to no traffic)

What all three have in common is that, from the kernel's view, there will be two different interfaces: one for egress, one for ingress. So "overly complex" might be right, but "rare" seems incorrect, at least for users of ingress traffic shaping/AQM.
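To make method A a bit more concrete, here is a sketch that only prints the commands involved, so they can be reviewed before anything is run as root. It is not what sqm-scripts literally executes: the function name `ifb_setup_cmds`, the interface names and the rates are illustrative, and the `matchall` filter is just one way to redirect everything to the IFB.

```shell
#!/bin/sh
# Sketch (dry run): print the commands for shaping ingress via an IFB device.
# ifb_setup_cmds and all argument values are illustrative assumptions.

ifb_setup_cmds() {
    wan=$1        # real WAN interface, e.g. wan
    ifb=$2        # IFB device to create, e.g. ifb-wan
    down_kbit=$3  # download shaper rate in kbit/s
    up_kbit=$4    # upload shaper rate in kbit/s

    # Create the IFB and bring it up.
    printf 'ip link add name %s type ifb\n' "$ifb"
    printf 'ip link set dev %s up\n' "$ifb"
    # Redirect everything arriving on the WAN interface to the IFB.
    printf 'tc qdisc add dev %s handle ffff: ingress\n' "$wan"
    printf 'tc filter add dev %s parent ffff: protocol all matchall action mirred egress redirect dev %s\n' "$wan" "$ifb"
    # Download shaper sits on the IFB, upload shaper on the WAN interface.
    printf 'tc qdisc add dev %s root cake bandwidth %skbit ingress\n' "$ifb" "$down_kbit"
    printf 'tc qdisc add dev %s root cake bandwidth %skbit\n' "$wan" "$up_kbit"
}
```

For example, `ifb_setup_cmds wan ifb-wan 20000 20000` prints six commands; the two cake instances end up on two different devices, which is exactly why cake-autorate expects distinct download and upload interfaces.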

I would recommend you install and configure sqm-scripts/luci-app-sqm first (opkg update; opkg install luci-app-sqm), then look at:

and maybe

for the configuration and once you have a working sqm installation, then deal with cake-autorate.

Thanks for the information guys. Apologies; it might be worth adding some additional notes about this. I had CAKE installed and configured, but didn't have it enabled for that interface, so "tc qdisc ls" didn't return much. I guess it's not really an "interface" in the traditional OpenWrt sense (it's more like a qdisc interface), hence my confusion. I'll try it again soon and see how it goes.

An IFB will show up in ifconfig output so looks like a real interface in Linux....

Thanks guys, I've updated my config. Quick question though - if I update my config, will the autorate service use my new config automatically?

Lastly - how do I check what version I have? I can see there is a 2.0 version, but the github only links to the 1.2 version as a release?

cake-autorate will use whatever config file is placed in the running directory /root/cake-autorate at the time the script is launched. If you've either run the setup script or manually downloaded the files from the master branch then you've got the most up to date version. I should look into having the setup script write out the latest commit identifier into a version file.

I've still not actually released 2.0.0. Maybe it's time now as everything seems super stable again! Versioning is not my favourite aspect of working on cake-autorate :smile:.

Once you have it running I'd encourage you to obtain a log file showing a couple of speed tests and upload it for us to take a look and make a plot and verify all looks in order.

It seems the complexity of the below was not actually needed. So I'll not introduce the associated complexity into cake-autorate.


In the light of my recent foray into overwriting ECN bits, which seems desirable in certain circumstances, and facilitating the same in cake-qos-simple - see here:

CAKE w/ DSCPs - cake-qos-simple - #164 by Lynx

it struck me as worthwhile to facilitate working with unusual cake instantiations in respect of the tc change calls in which the cake qdisc is not necessarily placed at root, but instead placed at a specific parent band.

Here is the output from 'tc qdisc ls' on my router when using the new capability of 'cake-qos-simple' to overwrite the ECN bits with '0' on upload and download:

root@OpenWrt-1:~# tc qdisc ls
qdisc noqueue 0: dev lo root refcnt 2
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1518 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64
qdisc noqueue 0: dev lan1 root refcnt 2
qdisc noqueue 0: dev lan2 root refcnt 2
qdisc noqueue 0: dev lan3 root refcnt 2
qdisc noqueue 0: dev lan4 root refcnt 2
qdisc prio 1: dev wan root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
qdisc cake 808f: dev wan parent 1:1 bandwidth 20Mbit diffserv4 triple-isolate nat wash ack-filter split-gso rtt 100ms noatm overhead 0
qdisc ingress ffff: dev wan parent ffff:fff1 ----------------
qdisc noqueue 0: dev br-guest root refcnt 2
qdisc noqueue 0: dev br-lan root refcnt 2
qdisc noqueue 0: dev wlan1 root refcnt 2
qdisc noqueue 0: dev wlan0 root refcnt 2
qdisc noqueue 0: dev wlan0-1 root refcnt 2
qdisc noqueue 0: dev wlan0.sta8 root refcnt 2
qdisc noqueue 0: dev wlan0.sta9 root refcnt 2
qdisc noqueue 0: dev wlan1.sta1 root refcnt 2
qdisc noqueue 0: dev wlan1.sta2 root refcnt 2
qdisc cake 8090: dev ifb-wan root refcnt 2 bandwidth 20Mbit diffserv4 triple-isolate nat nowash ingress no-ack-filter split-gso rtt 100ms noatm overhead 0

I have implemented this in cake-autorate like so:

So there are new defaults:

dl_if_tc_parent="root" # the download interface tc parent on which cake is applied (normally root)
ul_if_tc_parent="root" # the upload interface tc parent on which cake is applied (normally root)

And these can be overridden in the config like so (to match my cake instantiations as shown above):

ul_if_tc_parent="parent 1:1" # the upload interface tc parent on which cake is applied (normally root)
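To illustrate how such a parent setting ends up in the tc call, here is a small dry-run sketch. The function name `build_tc_change_cmd` is hypothetical and it only prints the command; the point is that the parent value is expanded unquoted, so "parent 1:1" becomes two separate words while "root" stays one.

```shell
#!/bin/sh
# Sketch (dry run): show the tc command that results from a given tc parent.
# build_tc_change_cmd is a hypothetical helper that prints, never executes.

build_tc_change_cmd() {
    interface=$1          # e.g. wan or ifb-wan
    tc_parent=$2          # "root" or e.g. "parent 1:1"
    shaper_rate_kbps=$3   # new shaper rate in kbit/s

    # $tc_parent is deliberately unquoted so that "parent 1:1" splits
    # into two arguments, as tc expects.
    # shellcheck disable=SC2086
    set -- tc qdisc change dev "$interface" $tc_parent cake bandwidth "${shaper_rate_kbps}Kbit"
    echo "$*"
}

build_tc_change_cmd wan "parent 1:1" 20000
build_tc_change_cmd ifb-wan root 20000
```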

Does this seem OK @moeller0 and @dave14305?

In terms of nomenclature, I think this is reasonable:

ul_if_tc_parent="parent 1:1" # the upload interface tc parent on which cake is applied (normally root)
tc qdisc change dev "${interface}" ${tc_parent} cake bandwidth "${shaper_rate_kbps}Kbit" 2> /dev/null

because the tc parent can be either root or say a specific band. See, for example:

Is it reasonable (or plausible) to have periods of increased latency which are independent of my bandwidth usage?

For instance, during high-traffic hours, suppose my total bandwidth usage (up & down) is zero except for pings. Could my latency still suffer due to network conditions upstream from me?

If so, what is a reasonable expectation of performance?
For context, my ISP is a smallish Fiber provider in rural New England and I have what's advertised as a symmetrical 1Gb connection.

Sure, odd things can happen... e.g. a small ISP might have a direct path to an internet exchange that typically is used with a delay of Xms, but in primetime this link might be constantly overloaded and so some connections/customers get e.g. routed via a secondary path of different length, resulting in a statically different delay to the same targets. If the path difference happens inside, say, an MPLS network, this would be really hard to diagnose for end customers...

This is however a hypothetical answer, no idea what your ISP is "cooking" there.


cake-autorate version 3.0.0 release

What's Changed

  • This version restructures the bash code for improved robustness, stability and performance (@lynxthecat and @rany2).
  • Employ FIFOs for passing not only data, but also instructions, between the major processes, obviating costly reliance on temporary files. A side effect of this is that now /var/run/cake-autorate is mostly empty during runs (@lynxthecat).
  • Significantly reduced CPU consumption - cake-autorate can now run successfully on older routers (@lynxthecat and @rany2).
  • Introduce support for one way delays (OWDs) using the 'tsping' binary developed by Lochnair. This works with ICMP type 13 (timestamp) requests to ascertain the delay in each direction (i.e. OWDs) (@lynxthecat).
  • Many changes to help catch and handle or expose unusual error conditions (@rany2).
  • Fixed eternal sleep issue (@rany2).
  • Introduce more user-friendly config format by introducing and with the basics (interface names, whether to adjust the shaper rates and the min, base and max shaper rates) and any overrides from the defaults defined in (@rany2).
  • More intelligent check for another running instance (@rany2).
  • Introduce more user-friendly log file exports by automatically generating an export script and a log reset script for each running cake-autorate instance inside /var/run/cake-autorate/*/ (@lynxthecat).
  • Added config file validation that checks all config file entries against those provided in (@rany2).
  • Improved installer and new uninstaller (@rany2).
  • Updated Octave plotter (@moeller0).
  • Updated documentation (@richb-hanover and @lynxthecat).
  • Many more fixes and improvements (@lynxthecat and @rany2).
  • Incorporates many ideas and suggestions by @moeller0 and @patrakov.
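As a side note on the one way delay item above: an ICMP timestamp exchange yields four millisecond timestamps (originate, receive, transmit, and the local arrival time of the reply), from which per-direction delays can be derived. The sketch below shows only that arithmetic; `owd_ms` is a hypothetical helper, not part of tsping or cake-autorate. Because the two clocks are generally not synchronised, the raw numbers include the clock offset and can even be negative, so only their change relative to a baseline is meaningful.

```shell
#!/bin/sh
# Sketch: raw one way delays from the four timestamps of an ICMP
# timestamp exchange (all in milliseconds). owd_ms is a hypothetical helper.

owd_ms() {
    originate=$1   # local clock: request sent
    receive=$2     # remote clock: request received
    transmit=$3    # remote clock: reply sent
    finished=$4    # local clock: reply received

    up=$(( receive - originate ))     # upload direction (includes clock offset)
    down=$(( finished - transmit ))   # download direction (includes -offset)
    echo "$up $down"
}
```

For example, `owd_ms 1000 1012 1013 1030` gives an upload value of 12 and a download value of 17.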

Full Changelog: v1.2.1...v3.0.0