SQM makes bufferbloat worse

Hello,

I installed SQM but I got some weird results.

SQM disabled: http://www.dslreports.com/speedtest/45535955
SQM enabled: http://www.dslreports.com/speedtest/45535913

So the upload dropped as expected, but the download gets worse.
Why does this happen?

Device: TP-Link TL-WR1043ND
OpenWrt 18.06.2

Ah, could you post the output of the following commands please:

cat /etc/config/sqm
tc -s qdisc
cat /proc/cpuinfo 
cat /etc/os-release

Looking at https://openwrt.org/toh/tp-link/tl-wr1043nd I believe it might well be that you are running out of CPU cycles, which might explain the low bandwidth after shaping (though until I see /etc/config/sqm I have no idea what goodput to expect after shaping). Also, is this a cable link, and are you by any chance using a modem affected by the Puma 5/6/7 latency bug (see e.g. https://www.badmodems.com)? In that case SQM will only help to improve the average latency but will not be able to get rid of the annoying >=100ms latency spikes introduced by the modem...

Hi, thanks for your reply

config queue 'eth1'
        option upload '10000'
        option qdisc_advanced '0'
        option linklayer 'none'
        option interface 'eth0.2'
        option debug_logging '0'
        option verbosity '5'
        option qdisc 'cake'
        option script 'piece_of_cake.qos'
        option download '180000'
        option enabled '1'

root@OpenWrt:~# tc -s qdisc
qdisc noqueue 0: dev lo root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 4Mb ecn
 Sent 682768184 bytes 2022722 pkt (dropped 0, overlimits 0 requeues 14)
 backlog 0b 0p requeues 14
  maxpacket 1514 drop_overlimit 0 new_flow_count 14 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth1 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 4Mb ecn
 Sent 4180756735 bytes 3162375 pkt (dropped 0, overlimits 0 requeues 2)
 backlog 0b 0p requeues 2
  maxpacket 1462 drop_overlimit 0 new_flow_count 2 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev br-lan root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth1.1 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc cake 800a: dev eth0.2 root refcnt 2 bandwidth 10Mbit besteffort triple-isolate nonat nowash no-ack-filter split-gso rtt 100.0ms raw overhead 0
 Sent 542982 bytes 3874 pkt (dropped 1, overlimits 1415 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 62Kb of 4Mb
 capacity estimate: 10Mbit
 min/max network layer size:           42 /    1514
 min/max overhead-adjusted size:       42 /    1514
 average network hdr offset:           14

                  Tin 0
  thresh         10Mbit
  target          5.0ms
  interval      100.0ms
  pk_delay        577us
  av_delay         51us
  sp_delay          7us
  backlog            0b
  pkts             3875
  bytes          544374
  way_inds            0
  way_miss          134
  way_cols            0
  drops               1
  marks               0
  ack_drop            0
  sp_flows            1
  bk_flows            1
  un_flows            0
  max_len          1514
  quantum           305

qdisc ingress ffff: dev eth0.2 parent ffff:fff1 ----------------
 Sent 14029693 bytes 11999 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev wlan0 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc cake 800b: dev ifb4eth0.2 root refcnt 2 bandwidth 180Mbit besteffort triple-isolate nonat wash no-ack-filter split-gso rtt 100.0ms raw overhead 0
 Sent 14196165 bytes 11998 pkt (dropped 1, overlimits 12608 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 103168b of 9000000b
 capacity estimate: 180Mbit
 min/max network layer size:           60 /    1514
 min/max overhead-adjusted size:       60 /    1514
 average network hdr offset:           14

                  Tin 0
  thresh        180Mbit
  target          5.0ms
  interval      100.0ms
  pk_delay        188us
  av_delay         57us
  sp_delay          4us
  backlog            0b
  pkts            11999
  bytes        14197679
  way_inds            0
  way_miss          133
  way_cols            0
  drops               1
  marks               0
  ack_drop            0
  sp_flows            2
  bk_flows            2
  un_flows            0
  max_len          1514
  quantum          1514

root@OpenWrt:~# cat /proc/cpuinfo
system type             : Qualcomm Atheros QCA9558 ver 1 rev 0
machine                 : TP-LINK TL-WR1043ND v2
processor               : 0
cpu model               : MIPS 74Kc V5.0
BogoMIPS                : 358.80
wait instruction        : yes
microsecond timers      : yes
tlb_entries             : 32
extra interrupt vector  : yes
hardware watchpoint     : yes, count: 4, address/irw mask: [0x0ffc, 0x0ffc, 0x0ffb, 0x0ffb]
isa                     : mips1 mips2 mips32r1 mips32r2
ASEs implemented        : mips16 dsp dsp2
shadow register sets    : 1
kscratch registers      : 0
package                 : 0
core                    : 0
VCED exceptions         : not available
VCEI exceptions         : not available

root@OpenWrt:~# cat /etc/os-release
NAME="OpenWrt"
VERSION="18.06.2"
ID="openwrt"
ID_LIKE="lede openwrt"
PRETTY_NAME="OpenWrt 18.06.2"
VERSION_ID="18.06.2"
HOME_URL="http://openwrt.org/"
BUG_URL="http://bugs.openwrt.org/"
SUPPORT_URL="http://forum.lede-project.org/"
BUILD_ID="r7676-cddd7b4c77"
LEDE_BOARD="ar71xx/generic"
LEDE_ARCH="mips_24kc"
LEDE_TAINTS=""
LEDE_DEVICE_MANUFACTURER="OpenWrt"
LEDE_DEVICE_MANUFACTURER_URL="http://openwrt.org/"
LEDE_DEVICE_PRODUCT="Generic"
LEDE_DEVICE_REVISION="v0"
LEDE_RELEASE="OpenWrt 18.06.2 r7676-cddd7b4c77"

Modem: Cisco EPC3925. No puma chip.

On cable networks I recommend:

config queue
	option debug_logging '0'
	option verbosity '5'
	option qdisc_advanced '1'
	option squash_dscp '0'
	option squash_ingress '0'
	option ingress_ecn 'ECN'
	option egress_ecn 'ECN'
	option qdisc_really_really_advanced '1'
	option linklayer 'ethernet'
	option overhead '18'
	option linklayer_advanced '1'
	option tcMTU '2047'
	option tcTSIZE '128'
	option linklayer_adaptation_mechanism 'default'
	option iqdisc_opts 'nat dual-dsthost ingress'
	option eqdisc_opts 'nat dual-srchost'
	option interface 'eth0.2'
	option download '180000'
	option upload '10000'
	option qdisc 'cake'
	option script 'piece_of_cake.qos'
	option enabled '1'
	option tcMPU '64'

as the mandatory DOCSIS shaper assumes 18 bytes overhead. The ingress keyword in iqdisc_opts will keep the link slightly more responsive when multiple bulk flows are saturating the downstream. The two dual-XXXhost options in that combination should give you equal bandwidth sharing between your internal IP addresses (if you do not want this, simply delete these two options).
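To make the overhead '18' / tcMPU '64' settings a bit more concrete, here is a minimal Python sketch of the accounting. It assumes a simplified model in which the shaper adds the overhead on top of the IP packet length and then rounds up to the minimum packet unit; the function names are made up for illustration and are not part of sqm-scripts or cake.

```python
def accounted_size(ip_len: int, overhead: int = 18, mpu: int = 64) -> int:
    """Bytes charged against the shaper for one packet (simplified model).

    Assumption: overhead is added to the IP packet length, and the result
    is rounded up to at least `mpu` bytes.
    """
    return max(ip_len + overhead, mpu)


def goodput_fraction(ip_len: int, overhead: int = 18, mpu: int = 64) -> float:
    """Share of the shaped rate that carries IP bytes at this packet size."""
    return ip_len / accounted_size(ip_len, overhead, mpu)


if __name__ == "__main__":
    # Small ACK-sized packets pay proportionally far more than full-size ones.
    for size in (40, 576, 1500):
        print(size, accounted_size(size), round(goodput_fraction(size), 3))
```

The point of the sketch: with full-size packets the 18 bytes barely matter, but for small packets the mpu of 64 dominates, which is why getting both values right matters mostly under ACK-heavy load.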

I note that the openwrt wiki for v2 of your router says:
"The v2.x and v3.x get the lan and wan interface not via eth0.1/eth0.2 but via eth0 (wan) and eth1 (lan). The eth0 is in the same vlan as port 5 on the switch. Because of that, port 6 on the v2.x, v3.x routers is an additional CPU port - used for wan traffic only." So something looks off with your eth0.2, but it might be the wiki (see https://openwrt.org/toh/tp-link/tl-wr1043nd)

That is a 720 MHz single-core MIPS CPU that will in all likelihood be overtaxed when shaping at the combined rate of 190000 Kbps... To test this, log into your router via ssh while running a speedtest, run top -d 1 concurrently and monitor the %idle value in the second row; if this gets < 10 you are most likely out of CPU cycles when you need them.

Respect, that is quick :wink: given that it was just announced...


Thank you for your clear explanation! I think you are right, my idle gets down to 0% when testing download speed. So it looks like the router simply does not have enough horsepower..

Download CPU: 0% usr 0% sys 0% nic 0% idle 0% io 0% irq 98% sirq
Upload CPU: 0% usr 0% sys 0% nic 81% idle 0% io 0% irq 17% sirq
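If you want to check such readings automatically, a small Python sketch along these lines could parse the CPU summary line and apply the "idle below 10" rule of thumb from earlier in the thread. The function names and the regular expression are my own assumptions based on the two lines quoted above, not something shipped with OpenWrt.

```python
import re


def parse_top_cpu(line: str) -> dict:
    """Turn '... CPU: 0% usr ... 81% idle ...' into {'usr': 0.0, ..., 'idle': 81.0}."""
    pairs = re.findall(r"(\d+(?:\.\d+)?)%\s+(\w+)", line)
    return {name: float(pct) for pct, name in pairs}


def out_of_cpu(line: str, idle_threshold: float = 10.0) -> bool:
    """True if the idle percentage is below the threshold (10 by default)."""
    return parse_top_cpu(line)["idle"] < idle_threshold


if __name__ == "__main__":
    download = "Download CPU: 0% usr 0% sys 0% nic 0% idle 0% io 0% irq 98% sirq"
    upload = "Upload CPU: 0% usr 0% sys 0% nic 81% idle 0% io 0% irq 17% sirq"
    for label, line in (("download", download), ("upload", upload)):
        print(label, parse_top_cpu(line)["sirq"], out_of_cpu(line))
```

Note that almost all of the load shows up as sirq (softirq), which is where packet processing and shaping happen, so watching idle alone is usually enough.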

Okay, there are two stop-gap measures you could try (short of getting a beefier router):

  1. try simplest.qos/fq_codel; there you will still not be able to shape more than, say, 70-80 Mbps combined, but instead of getting higher bandwidth with terrible bufferbloat, the bufferbloat should stay under control.

  2. if 1) actually improves things, try to get the most recent version of sqm-scripts installed (from source, see '"Installing" the current development version from git' on https://github.com/tohojo/sqm-scripts), as that will potentially allow you to explicitly trade off latency against bandwidth (if you want to try that, please holler and I will walk you through it).

Is the docsis keyword in cake not enough for cable?


Sure, using the DOCSIS keyword will also, under the hood, set overhead to 18 and mpu to 64, but the only way to set cake specific keywords in the GUI is to forego most sanity checking via the advanced option strings, so I tend not to recommend that...

Alright, I think it's best to not use SQM at all. With all the different configurations I get worse download results than when I have SQM disabled. Thanks for your help.

Try disabling the ingress/download shaper but leaving the upload shaper in place; your CPU should handle typical upload speeds (5 to 10 Mbps for most cable connections).


Why?

Which configurations did you try?

I can understand if you are happy with the non-sqm performance, but I would like to understand what you tried and why it did not deliver on its promise, please.


No SQM 200 mbit

SQM on your recommended settings

SQM on fq_codel simplest

I also tested < 60 so the CPU is not overloaded. I see improvements in download latency but dropping that much in speed is not an option for me. :wink:

In fact, it only improves upload latency.

A 400Mhz Atheros single core (or thereabouts). So CPU is really the bottleneck even just for WiFi + Routing your 200Mbps download. You'll experience a lot better results in general with a hardware upgrade. Today the best all-in-one is either the WRT32X or the ZyXEL NBG6817 and either one will do SQM and routing and NAT and WiFi at your speeds without problem.

I've found that one has to watch out not to set it in both places, or they add up... i.e. if I set the link layer to ethernet with an overhead of 18, and also use the "docsis" keyword in the advanced line, it will happily set the overhead to 36, giving me the 18 + 18 I asked for. Not sure whether it should do that or not, but something to be aware of.

I used to have the OP's problem of running out of CPU trying to sqm my bandwidth. (C7, 300/30mbit cable connection) A C7, for instance, can really only run cake and everything else being a NAT router/AP at 100-120mbit before you run out of idle time. But, I did the above, and ran 0 and 28-30mbit for the shaper speeds, which helped a lot while being a light load for the router. Even the download side seemed to benefit from the upload being better managed.

Another note... I don't know if you watched with top -d 1 to see whether even that lower speed ran you out of CPU; that might explain the lack of improvement there. Best is to do trial and error with the shaper speed until you stay above 0% idle during the download test. Then you might see an improvement, but at too much cost in speed, I'd agree. Try the shaper only on the upload, with a "0" for download speed, and see what that does for you.
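The trial-and-error loop described here can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: min_idle_at is a hypothetical callback standing in for "set the shaper to this rate, run a speedtest, and read the lowest idle value from top", and toy_idle is a made-up CPU model, not a measurement of any real router.

```python
from typing import Callable, Optional


def highest_sustainable_rate(
    min_idle_at: Callable[[int], float],
    start_kbps: int,
    step_kbps: int = 10_000,
    floor_kbps: int = 10_000,
    idle_margin: float = 0.0,
) -> Optional[int]:
    """Step the shaper rate down until the CPU keeps some idle time under load.

    Returns the first (highest) rate whose minimum idle percentage stays
    above idle_margin, or None if no rate down to floor_kbps qualifies.
    """
    rate = start_kbps
    while rate >= floor_kbps:
        if min_idle_at(rate) > idle_margin:
            return rate
        rate -= step_kbps
    return None


def toy_idle(rate_kbps: int) -> float:
    """Made-up model: the CPU is fully busy at 80 Mbps and above."""
    return max(0.0, 100.0 - rate_kbps / 800.0)


if __name__ == "__main__":
    print(highest_sustainable_rate(toy_idle, start_kbps=180_000))
```

In practice each step is a manual speedtest, so a coarse step size (10 Mbps here) keeps the number of runs manageable; whether the resulting rate is worth the speed loss is a separate question.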


You could try to use a build with fast path included.

It seems like it also does increase SQM performance.
In case you want to try it:

Oh, I agree, especially since the ISP's bufferbloat on download is not totally terrible to begin with. Setting

        option upload '18000'
        option download '0'

should give you a slightly better debloated upstream and enough CPU cycles for everything else. The 0 tells sqm not to instantiate a shaper in that direction... This is @dlakelan's idea repeated in more words :wink:
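As a tiny illustration of that rule, here is a Python sketch of how the two rate options map to shapers. It only mirrors the behaviour described in this thread (a rate of 0 means no shaper in that direction); shaper_plan is a made-up name, not a function from sqm-scripts.

```python
from typing import Dict, Optional


def shaper_plan(download_kbps: int, upload_kbps: int) -> Dict[str, Optional[int]]:
    """Map the configured rates to the shapers that would be set up.

    Assumption taken from the discussion: a rate of 0 disables shaping in
    that direction, so no CPU time is spent shaping that traffic.
    """
    return {
        "ingress": download_kbps if download_kbps > 0 else None,
        "egress": upload_kbps if upload_kbps > 0 else None,
    }


if __name__ == "__main__":
    # Upload-only shaping as suggested: download '0', upload '18000'.
    print(shaper_plan(download_kbps=0, upload_kbps=18_000))
```

With only the egress shaper active, the router shapes the slow direction and simply forwards the download traffic unshaped, which is what frees up the CPU on a device like this.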


Yes, this is on purpose: if you ask for it twice, you will get it twice :wink:

I believe that none of the packet accelerators currently work with sqm.


Setting the download to 0 seems to work very well. I think this is the max I can get out of this "weak" router.
Thanks for all the help!
