SQM QoS doesn't work in most cases

Hey, I've spent a long time trying to understand SQM QoS and have been using it for quite a while, but I've never had much luck. I followed the guide on the LEDE wiki step by step.

On our network only simple HTTP downloads (e.g. downloading a Debian ISO) get shaped. Most of the traffic we actually use, like Steam, the Battle.net app, Windows Update (the worst of all), etc., still consumes the whole connection. It even goes above the limit set by SQM, up to the maximum our connection can physically deliver.

config queue 'eth1'
    option debug_logging '0'
    option verbosity '5'
    option linklayer 'atm'
    option qdisc 'cake'
    option qdisc_advanced '1'
    option squash_dscp '1'
    option squash_ingress '1'
    option ingress_ecn 'ECN'
    option egress_ecn 'NOECN'
    option qdisc_really_really_advanced '1'
    option iqdisc_opts 'nat dual-dsthost'
    option eqdisc_opts 'nat dual-srchost'
    option script 'piece_of_cake.qos'
    option download '15200'
    option upload '2280'
    option enabled '1'
    option interface 'pppoe-wan'
    option overhead '44'


root@lede:~# tc -d qdisc
qdisc noqueue 0: dev lo root refcnt 2
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc noqueue 0: dev br-lan root refcnt 2
qdisc noqueue 0: dev eth0.1 root refcnt 2
qdisc noqueue 0: dev eth0.2 root refcnt 2
qdisc noqueue 0: dev wlan0 root refcnt 2
qdisc noqueue 0: dev wlan1 root refcnt 2
qdisc cake 8007: dev pppoe-wan root refcnt 2 bandwidth 2280Kbit besteffort dual-srchost nat rtt 100.0ms raw
linklayer atm overhead 44 mtu 2047 tsize 512
qdisc ingress ffff: dev pppoe-wan parent ffff:fff1 ----------------
qdisc cake 8008: dev ifb4pppoe-wan root refcnt 2 bandwidth 15200Kbit besteffort dual-dsthost nat wash rtt 100.0ms raw
 linklayer atm overhead 44 mtu 2047 tsize 512


root@lede:~# tc -s qdisc
qdisc noqueue 0: dev lo root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 183703058951 bytes 342374353 pkt (dropped 0, overlimits 0 requeues 1541)
 backlog 0b 0p requeues 1541
  maxpacket 1514 drop_overlimit 0 new_flow_count 2504 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev br-lan root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.1 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.2 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev wlan0 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev wlan1 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc cake 8007: dev pppoe-wan root refcnt 2 bandwidth 2280Kbit besteffort dual-srchost nat rtt 100.0ms raw
 Sent 30258006658 bytes 201329051 pkt (dropped 47015, overlimits 101379515 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 1857856b of 4Mb
 capacity estimate: 2280Kbit
                 Tin 0
  thresh      2280Kbit
  target         8.0ms
  interval     103.0ms
  pk_delay       292us
  av_delay        19us
  sp_delay         7us
  pkts       201376066
  bytes    30296853750
  way_inds    15425613
  way_miss      698773
  way_cols           0
  drops          47015
  marks              0
  sp_flows           1
  bk_flows           1
  un_flows           0
  max_len         1696

qdisc ingress ffff: dev pppoe-wan parent ffff:fff1 ----------------
 Sent 356742246085 bytes 278612682 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc cake 8008: dev ifb4pppoe-wan root refcnt 2 bandwidth 15200Kbit besteffort dual-dsthost nat wash rtt 100.0ms raw
 Sent 403028898467 bytes 274833458 pkt (dropped 3779224, overlimits 453382139 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 315456b of 4Mb
 capacity estimate: 15200Kbit
                 Tin 0
  thresh     15200Kbit
  target         5.0ms
  interval     100.0ms
  pk_delay       4.9ms
  av_delay       2.7ms
  sp_delay        29us
  pkts       278612682
  bytes   409333945364
  way_inds    12315548
  way_miss      704113
  way_cols           0
  drops        3779224
  marks              0
  sp_flows           2
  bk_flows           1
  un_flows           0
  max_len         1696

Connection: ADSL, 16 Mbit down, 2.4 Mbit up
Router: TP-Link TL-WDR3600 v1

Hi Mezo,

that sounds bad. Unfortunately, Steam, the Battle.net app (not 100% sure about that one), and Windows Update are all known to be bad internet citizens that do not faithfully follow all RFC recommendations. That might actually not be an issue with the services themselves, but rather with their use of aggressive CDNs close to the end customers (a short RTT, combined with say IW20 and a number of concurrent flows, can simply produce hard-to-control flows; see https://www.cdnplanet.com/blog/initcwnd-settings-major-cdn-providers/).
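To get a feel for why short-RTT CDN flows are hard to shape, here is a rough back-of-the-envelope sketch (the MSS of 1460 bytes is an assumption for illustration; on a PPPoE link it would be slightly smaller): a server's whole initial window arrives before your shaper's drops can feed back to the sender, and with many concurrent flows those bursts add up.

```python
# Back-of-the-envelope: size of a server's initial flight and how long
# a 15.2 Mbit/s ingress shaper needs to drain it.
# MSS = 1460 bytes is an assumed round number, not a measured value.
MSS = 1460  # bytes per segment (assumption)

def initial_burst_bits(initcwnd, mss=MSS):
    """Bits a sender may push in its initial congestion window."""
    return initcwnd * mss * 8

for iw in (10, 20):
    burst = initial_burst_bits(iw)
    drain_ms = burst / 15_200_000 * 1000  # drain time at 15.2 Mbit/s
    print(f"IW{iw}: {burst} bits, ~{drain_ms:.1f} ms to drain")
```

With, say, 20 parallel CDN flows at IW20 that is roughly 0.3 s of line-rate traffic the shaper has to absorb before any congestion signal can possibly take effect.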

I would try the following:

  1. add the following to /etc/config/sqm:

    option linklayer_advanced '1'
    option linklayer_adaptation_mechanism 'default'

  2. I would scale the download shaper back to 8000 (for testing), and if that improves things a bit, iteratively try to increase it again. But 15200 out of 16000 (100*15200/16000 = 95%) seems too optimistic; realistically I would aim for 80-90% of the sync rate for the download. On upload, with correct link-layer accounting, you should be able to go up to say 99% of the sync rate. (But always test: a number of ISPs employ their own traffic shapers at the BRAS/BNG level, and if the ISP shapes, sqm-scripts needs to be set to 80-90% of that ISP shaper's bandwidth. Unfortunately it is typically hard to get reliable data from ISPs on whether, and to what rate, their shaper is set.)
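For step 2, here is a tiny sketch of that rule of thumb as shell arithmetic (the sync rates are the ones from this thread, 16000/2400 kbit/s; the 85%/99% factors are just the starting points suggested above, not magic numbers):

```shell
#!/bin/sh
# Sketch: derive SQM shaper set-points from the modem's sync rates,
# using the "85% down / 99% up" starting points discussed above.
DOWN_SYNC=16000   # kbit/s downstream sync (this line)
UP_SYNC=2400      # kbit/s upstream sync (this line)

DOWN_SHAPED=$(( DOWN_SYNC * 85 / 100 ))  # start conservative, raise iteratively
UP_SHAPED=$(( UP_SYNC * 99 / 100 ))      # only with correct link-layer accounting

echo "option download '$DOWN_SHAPED'"
echo "option upload '$UP_SHAPED'"
```

That would suggest starting around download '13600' and upload '2376', then testing latency under load (e.g. dslreports speedtest or flent) and nudging the download figure up while the bufferbloat stays acceptable.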