SQM has no effect

Hey :slight_smile: I'm having the same issues as OP:

  • It's a GL-AR750S.
  • I'm using the Web UI (OpenWrt 18.06.1).
  • I've selected "eth0.2" in the UI.
  • My internet's pretty bad, I've entered 7000 down / 500 up.
  • By default the UI selects "fq_codel" and "simple.qos". I've changed it to "cake" and "piece_of_cake".
  • My link layer is set to ATM and 44.
  • I've done nothing on the CLI yet (apart from run the same commands as OP, the output is below).
  • The values in the UI seem to have no effect on the network.
# cat /etc/config/sqm

config queue 'eth1'
	option qdisc_advanced '0'
	option debug_logging '0'
	option verbosity '5'
	option linklayer 'atm'
	option overhead '44'
	option enabled '1'
	option download '7000'
	option upload '500'
	option qdisc 'cake'
	option script 'piece_of_cake.qos'
	option interface 'eth0.2'
# ifstatus wan
...
	"l3_device": "eth0.2",
	"device": "eth0.2",
# tc -s qdisc
...
qdisc cake 803d: dev eth0.2 root refcnt 2 besteffort flowblind rtt 5.0ms raw atm overhead 44 
# tc -d qdisc
qdisc cake 803e: dev ifb4eth0.2 root refcnt 2 besteffort flowblind wash rtt 5.0ms raw atm overhead 44
root@GL-AR750S:~# SQM_DEBUG=1 SQM_VERBOSITY_MAX=8 /etc/init.d/sqm stop
SQM: Stopping SQM on eth0.2
root@GL-AR750S:~# SQM_DEBUG=1 SQM_VERBOSITY_MAX=8 /etc/init.d/sqm start
SQM: Starting SQM script: piece_of_cake.qos on eth0.2, in: 7000 Kbps, out: 500 Kbps
SQM: QDISC cake is useable.
SQM: Starting piece_of_cake.qos
SQM: ifb associated with interface eth0.2: 
SQM: Currently no ifb is associated with eth0.2, this is normal during starting of the sqm system.
SQM: egress
SQM: LLA: default link layer adjustment method for cake is cake
SQM: cake link layer adjustments: atm overhead 44 mpu 0
SQM: egress shaping activated
SQM: QDISC ingress is useable.
SQM: ingress
SQM: LLA: default link layer adjustment method for cake is cake
SQM: cake link layer adjustments: atm overhead 44 mpu 0
SQM: ingress shaping activated
SQM: piece_of_cake.qos was started on eth0.2 successfully

Following the stop+start, the output of tc -s qdisc remains the same:

qdisc cake 803e: dev ifb4eth0.2 root refcnt 2 besteffort flowblind wash rtt 5.0ms raw atm overhead 44 

I can confirm I'm clicking "Save and Apply" on the GUI.

Any help would be greatly appreciated :smile:

That is not good: it should not default to flowblind, as that effectively disables the flow-queueing part of cake. I wonder why it does that. I also note that 18.06.1 is not the most recent version, so I would kindly propose re-testing with at least 18.06.2 (or, if you wait a few days, 18.06.3 or 4, or even 19.07.1).

There seems to be something wrong, but first let's figure out whether we are chasing an already fixed bug :wink:

P.S.: It generally is better to open a new topic. You can always link to the thread you came from / wanted to respond to in your first post, but having individual threads that can each reach a solution seems overall nicer than open-ended threads collecting similar issues.

I've poked around the scripts, made a backup of piece_of_cake.qos (apologies for hacking around your scripts :wink:) and twiddled the debugging. The core commands it's running are:

$ /usr/sbin/tc qdisc add dev eth0 root cake bandwidth 500kbit atm overhead 44 mpu 0 besteffort
$ /usr/sbin/tc qdisc add dev ifb4eth0 root cake bandwidth 7000kbit atm overhead 44 mpu 0 besteffort wash 

which look about right compared to the man page I found (http://man7.org/linux/man-pages/man8/tc-cake.8.html)

Not really my scripts, it is a team effort; and the one thing I like about shell scripts is that they are easily hackable; so hack away, with my blessing :slight_smile:
Actually, it would probably be better to rename the script slightly, so it will be easier to transfer your script to newer versions of sqm-scripts....

Yes, this looks decent, but it does not explain where the "flowblind" in your earlier example came from. Flowblind really is only useful for comparison and otherwise typically not the right thing...

Ok, this is odd. If I do horrible things to the scripts to force flow isolation (I've gone with triple-isolate as it's apparently a default) so the commands become this:

$ /usr/sbin/tc qdisc add dev eth0 root  cake bandwidth 500kbit atm overhead 44 mpu 0 besteffort triple-isolate
$ /usr/sbin/tc qdisc add dev ifb4eth0 root  cake bandwidth 7000kbit atm overhead 44 mpu 0 besteffort triple-isolate wash

then the statistics read a bit better:

# tc -s qdisc
qdisc cake 8055: dev eth0 root refcnt 2 besteffort triple-isolate rtt 5.0ms raw atm overhead 44 
qdisc ingress ffff: dev eth0 parent ffff:fff1 ---------------- 
qdisc cake 8056: dev ifb4eth0 root refcnt 2 besteffort triple-isolate wash rtt 5.0ms raw atm overhead 44 

Running dslreports and checking back, packets are definitely flowing through these (interfaces?):

# tc -s qdisc
qdisc noqueue 0: dev lo root refcnt 2 
 Sent 0 bytes 0 pkts (dropped 0, overlimits 0) 
qdisc cake 8055: dev eth0 root refcnt 2 besteffort triple-isolate rtt 5.0ms raw atm overhead 44 
 Sent 19310732 bytes 22840 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
 memory used: 14b of 2240b
 capacity estimate: 532691Tbit
 min/max network layer size:         1492 /      35
 min/max overhead-adjusted size:     1499 /11534337
 average network hdr offset:           28

qdisc ingress ffff: dev eth0 parent ffff:fff1 ---------------- 
 Sent 19085817 bytes 22911 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc cake 8056: dev ifb4eth0 root refcnt 2 besteffort triple-isolate wash rtt 5.0ms raw atm overhead 44 
 Sent 19406571 bytes 22911 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
 memory used: 14b of 1984b
 capacity estimate: 532691Tbit
 min/max network layer size:         1492 /      53
 min/max overhead-adjusted size:     1499 /11534337
 average network hdr offset:           46

but there's still no flow control happening:

I think you may be right that an upgrade could fix this... However the router is an off-the-shelf product that ships with this software baked in, along with some extra nice stuff on the side. I'm reluctant to throw that away if it can be avoided. I'll keep poking around tomorrow.

You do not, by chance, have a flow offload feature enabled anywhere?

??? Unless this is an evenroute iqrouter, I am puzzled that it seems to offer cake at all. And if it is an iqrouter, have you talked to them already?

@moeller0 not likely, as he stated the device is a :

Ah, thanks, I was a bit inattentive it seems....
It is just that when I think about an OpenWrt commercial distribution with modern qdiscs, all I can think of is your fine product.

This is the device:
https://openwrt.org/toh/gl.inet/gl-ar750s

A forum post linked from that article (https://forum.gl-inet.com/t/installing-openwrt-on-gl-750s/5291/4) says the device uses NAND instead of NOR. The TL;DR from reading through the links is that they need some changes to land in core OpenWrt (https://github.com/openwrt/openwrt/pull/1428) to enable NAND, and those are being held back until OpenWrt moves to Linux 4.19. They have their own fork of OpenWrt (https://github.com/gl-inet/openwrt) where they're adding their own software (and, I presume, have patched in NAND support).

They do, however, have some pre-release firmware on their website, so I've flashed the latest build from their fork as of about a week ago (openwrt-ar750s-3.025-0626) and things are now happening.

After clicking through the UI as before, I'm left with a much healthier looking:

root@GL-AR750S:~# tc -s qdisc
...
qdisc cake 8007: dev eth0 root refcnt 2 bandwidth 600Kbit besteffort triple-isolate split-gso rtt 100.0ms atm overhead 44 
qdisc cake 8008: dev ifb4eth0 root refcnt 2 bandwidth 7Mbit besteffort triple-isolate wash split-gso rtt 100.0ms atm overhead 44 
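
For anyone comparing their own output against this thread: the broken lines earlier showed `flowblind` and no `bandwidth` token, while the lines above show `bandwidth` and `triple-isolate`. Below is a minimal sketch that flags those two symptoms in a single `qdisc cake` line. The function name `check_cake_line` and the warning texts are made up for illustration, and the check only covers what was observed in this thread, not everything that can go wrong with cake.

```python
def check_cake_line(line):
    """Return a list of warnings for one 'qdisc cake ...' line of `tc -s qdisc` output.

    Hypothetical helper: it only looks for the two symptoms seen in this thread.
    """
    tokens = line.split()
    if "cake" not in tokens:
        return ["not a cake qdisc line"]

    warnings = []
    # The healthier output in this thread printed e.g. 'bandwidth 600Kbit'.
    if "bandwidth" not in tokens:
        warnings.append("no bandwidth shown")
    # flowblind effectively disables cake's flow queueing (see earlier reply).
    if "flowblind" in tokens:
        warnings.append("flowblind is set")
    return warnings


# Lines copied from this thread.
broken = ("qdisc cake 803d: dev eth0.2 root refcnt 2 besteffort flowblind "
          "rtt 5.0ms raw atm overhead 44")
healthy = ("qdisc cake 8007: dev eth0 root refcnt 2 bandwidth 600Kbit besteffort "
           "triple-isolate split-gso rtt 100.0ms atm overhead 44")

print(check_cake_line(broken))
print(check_cake_line(healthy))
```

Pasting one line at a time from `tc -s qdisc` into such a check is enough to tell the two cases in this thread apart.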

Running dslreports shows the bufferbloat mitigation is indeed working, but my bandwidth numbers are terrible:

Ok, it looks like the egress shaping massively limits my ingress... possibly because my upstream bandwidth is so terrible? As I edge the upstream bandwidth up from 600, the downstream bandwidth goes up with it; however, anything over 700 triggers bufferbloat.

If I set the ingress threshold to where I need it and disable egress shaping, I can download huge files whilst maintaining a consistent ping. So I'm half way there. For me, the most valuable half is there :slight_smile:

@moeller0 was right - all it took was a firmware update (to a pre-release version). Thanks for the support, really appreciated :+1:

I note that according to the openwrt wiki WAN is on eth0.1 so try to select that interface for sqm, or better post the output of ifstatus wan here....

Erm, sorry I am stupid, but according to your first post it should be eth0.2 unless that changed with the new firmware...

root@GL-AR750S:~# ifstatus wan
...
	"l3_device": "eth0.2",
	"device": "eth0.2",

If I pick eth0.2, the egress shaping works great, but the ingress shaping is a bit off. Setting the values to 7000/600, where I think they should be, yields:

After playing with the numbers, going to 2000/600 shows the issue - it starts great, then half way in it seems to give up and releases the flood gates:

Nah this is terrible, you requested 2.0 and got 7.96, which with your encapsulation denotes a gross shaper rate of:

7960 / ((1500 - 8 - 20 - 20) / (ceil(1536 / 48) * 53)) = 9297.63 Kbps

and even the 7810 for a set 7000 seems off to me. Again, this could be related to some offload engine actively routing packets around the traffic shaper.... especially your comment

indicates the same (except it is not consistently visible in all dslreports graphs)
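
The gross-rate arithmetic above can be reproduced with a short sketch. It assumes, following the formula in the post, a TCP payload of 1500 - 8 - 20 - 20 = 1452 bytes per packet and a framed size of 1536 bytes; reading 1536 as the 1492-byte maximum packet size from the tc statistics plus the configured 44 bytes of overhead is my interpretation, not something stated in the thread. The function names are made up for illustration.

```python
import math

ATM_CELL_PAYLOAD = 48  # payload bytes carried per ATM cell
ATM_CELL_SIZE = 53     # bytes on the wire per ATM cell (5-byte header + payload)


def atm_wire_bytes(framed_bytes):
    """Bytes on the wire once a frame is split into whole ATM cells."""
    return math.ceil(framed_bytes / ATM_CELL_PAYLOAD) * ATM_CELL_SIZE


def gross_rate_kbps(goodput_kbps, payload_bytes=1452, framed_bytes=1536):
    """Gross (on-the-wire) rate implied by a measured TCP goodput.

    payload_bytes: TCP payload per packet (1500 - 8 - 20 - 20 in the post).
    framed_bytes:  packet size including per-packet overhead, before ATM cells
                   (assumed to be 1492 + 44; see the note above).
    """
    efficiency = payload_bytes / atm_wire_bytes(framed_bytes)
    return goodput_kbps / efficiency


# The measured 7.96 Mbit/s from the speed test discussed above.
print(round(gross_rate_kbps(7960), 2))  # 9297.63, matching the figure in the post
```

Calling `gross_rate_kbps` with a different measured goodput gives the gross rate to compare against the configured shaper rate.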