Bandwidth control

For example, if you have wlan0 and eth0.1 bridged together, you have an issue because both wlan0 and eth0.1 would need SQM on them (br-lan itself doesn't queue). Here you can create a veth0/veth1 pair, place veth1 into the bridge, and route traffic to veth0 instead of br-lan; then you put SQM on veth0. This sends all the traffic through one place rather than having it branch out to two places.
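Roughly, the plumbing could look like this (just a sketch; interface names and addresses are examples, and on OpenWrt you would normally make this persistent via /etc/config/network rather than typing it in live):

ip link add veth0 type veth peer name veth1
ip link set veth1 master br-lan   # veth1 becomes a port of the existing bridge
ip link set veth1 up
ip link set veth0 up
ip addr del 192.168.1.1/24 dev br-lan   # move the LAN address off the bridge (example address)
ip addr add 192.168.1.1/24 dev veth0
# now point SQM at veth0 instead of br-lan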


Makes sense, thank you!

@dlakelan

@moeller0

I did some more research on this topic and came across the hashlimit module, which someone used to limit bandwidth. Wouldn't something like the following do the job as well, and is it even possible this way?

  1. Specify the subnets and mark them

iptables -t nat -A prerouting_rule_u -s 192.168.1.0/24 -j MARK --set-mark 100 -m comment --comment "Subnet Restriction Upload"

iptables -t nat -A prerouting_rule_d -d 192.168.1.0/24 -j MARK --set-mark 101 -m comment --comment "Subnet Restriction Download"

  2. Mark the whole connection

iptables -t nat -A prerouting_rule_u -m mark --mark 100 -j CONNMARK --save-mark

iptables -t nat -A prerouting_rule_d -m mark --mark 101 -j CONNMARK --save-mark

  3. Limit the marked connections

iptables -A forwarding_rule_u -m mark --mark 100 -m conntrack --ctstate ESTABLISHED,RELATED -m hashlimit --hashlimit-name "Upload speed" --hashlimit-above 250kb/s -j DROP

iptables -A forwarding_rule_d -m mark --mark 101 -m conntrack --ctstate ESTABLISHED,RELATED -m hashlimit --hashlimit-name "Download speed" --hashlimit-above 500kb/s -j DROP
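If I understand the connmark mechanism correctly, later packets of a connection would also need the saved mark restored before the forwarding rules can match them, so I would probably add something like:

iptables -t mangle -A PREROUTING -j CONNMARK --restore-mark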

Couldn't run a test yet; I first have to wait until the hardware is ready.

Well, certainly try to run a dslreports speedtest with the high-resolution bufferbloat setting and compare the performance of, say, shaping to 500 Kbps with the hashlimit option versus using sqm-scripts' cake/layer_cake to shape the WAN interface to 500 Kbps.
Also I wonder, have you tried the per-IP isolation modes yet? If yes, would you be willing to share your assessment? I do not want to claim that these are a solution for all QoS challenges, and they might well not be a good fit for your specific problem, but for a number of users who were simply used to having to write fine-grained QoS rules, these isolation modes seem to have been good enough that they no longer bother with all the details. (Yes, I am trying to solicit both success and failure stories, so I can give better advice in the future about when those isolation modes might be applicable.)

It seems to me that hashlimit is a bit of a sledgehammer and again doesn't let you fully utilize available bandwidth.

I think DSCP tagging and a smart switch is a good solution, and not particularly expensive. Even a TP-Link SG108E (less than $40) will do weighted round robin for you. Mark guest network traffic CS1, normal traffic CS2, game traffic CS5, and VoIP traffic CS6.

Put layer_cake/SQM queues on WAN outbound, and let the switch handle inbound post-tag; no goofy veth required.
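Purely as an illustration (the subnets and ports here are made up; adjust the matches to your own setup), the tagging on the router could look roughly like:

iptables -t mangle -A FORWARD -d 192.168.2.0/24 -j DSCP --set-dscp-class CS1   # guest network (example subnet)
iptables -t mangle -A FORWARD -d 192.168.1.0/24 -j DSCP --set-dscp-class CS2   # normal LAN (example subnet)
iptables -t mangle -A FORWARD -p udp --dport 3074 -j DSCP --set-dscp-class CS5 # game traffic (example port)
iptables -t mangle -A FORWARD -p udp --sport 5060 -j DSCP --set-dscp-class CS6 # VoIP traffic (example port)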

Edit: thinking about this, it doesn't work, because the inbound bottleneck will be in your ISP's equipment... So you really do need to shape the packets in the router.

The advantage of shaping is that the queueing system sees the length of the queue and drops the appropriate packets. Hashlimit is just fixed policing of the subset that's tagged, which is in general not as good.

This matches what I expect, but I really would like to see real-world measurements...


No doubt, measurements would be useful. In fact I'd like to do some research on these things. It'd be nice to get a network of 5 machines and a couple of smart switches and APs and set up some intense latency / bandwidth / game play / VOIP / shaping tests.

@accelerate it seems the package I mentioned (nft-qos) has now made it into the repository:
https://github.com/openwrt/packages/pull/6193

But that still needs to be tested with regard to latency increase under load...

Hi guys,

I was finally able to test SQM and I have to say it really reduces bufferbloat, and the cake/piece_of_cake shaping functionality is very nice as well! I have currently set SQM to 8500 kbit/s download and 700 kbit/s upload, which is basically what my ISP is providing me.

Wow, nft-qos is just what was missing up to now; in addition to SQM it will be a great solution in some cases.

I would like to test nft-qos, but can't find it in the available package list. Did I miss something?

Also thank you very much for your help - it was a pleasure and sorry for my late reply!

Great that it is doing something for you! Now, if you post the output of:
cat /etc/config/sqm
tc -s qdisc

I can offer to have a look at your current configuration, and potentially help you to optimize it a bit...

Good question, did you do an "opkg update" before checking the list?

Here are the requested outputs:

cat /etc/config/sqm:

config queue 'eth1'
	option qdisc_advanced '0'
	option interface 'eth0.3'
	option debug_logging '0'
	option verbosity '5'
	option qdisc 'cake'
	option script 'piece_of_cake.qos'
	option linklayer 'atm'
	option overhead '44'
	option enabled '1'
	option download '8500'
	option upload '700'

tc -s qdisc:

qdisc noqueue 0: dev lo root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 4Mb ecn 
 Sent 670502806 bytes 730634 pkt (dropped 0, overlimits 0 requeues 1) 
 backlog 0b 0p requeues 1
  maxpacket 1392 drop_overlimit 0 new_flow_count 2 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev br-guest root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.2 root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev br-lan root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.1 root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc cake 8007: dev eth0.3 root refcnt 2 bandwidth 700Kbit besteffort triple-isolate nonat nowash no-ack-filter split-gso rtt 100.0ms atm overhead 44 
 Sent 24734283 bytes 259846 pkt (dropped 1122, overlimits 149324 requeues 0) 
 backlog 0b 0p requeues 0
 memory used: 329344b of 4Mb
 capacity estimate: 700Kbit
 min/max network layer size:           28 /    1500
 min/max overhead-adjusted size:      106 /    1749
 average network hdr offset:           14

                  Tin 0
  thresh        700Kbit
  target         26.0ms
  interval      121.0ms
  pk_delay       17.9ms
  av_delay        1.3ms
  sp_delay         24us
  backlog            0b
  pkts           260968
  bytes        25451154
  way_inds          383
  way_miss         3068
  way_cols            0
  drops            1122
  marks               0
  ack_drop            0
  sp_flows            0
  bk_flows            1
  un_flows            0
  max_len          1514
  quantum           300

qdisc ingress ffff: dev eth0.3 parent ffff:fff1 ---------------- 
 Sent 643340210 bytes 468737 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev wlan1 root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev wlan0 root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev wlan0-1 root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev wlan1-1 root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc cake 8008: dev ifb4eth0.3 root refcnt 2 bandwidth 8500Kbit besteffort triple-isolate nonat wash no-ack-filter split-gso rtt 100.0ms atm overhead 44 
 Sent 642474839 bytes 463507 pkt (dropped 5230, overlimits 824640 requeues 0) 
 backlog 0b 0p requeues 0
 memory used: 62Kb of 4Mb
 capacity estimate: 8500Kbit
 min/max network layer size:           46 /    1500
 min/max overhead-adjusted size:      106 /    1749
 average network hdr offset:           14

                  Tin 0
  thresh       8500Kbit
  target          5.0ms
  interval      100.0ms
  pk_delay        3.9ms
  av_delay        1.4ms
  sp_delay          9us
  backlog            0b
  pkts           468737
  bytes       649902528
  way_inds          319
  way_miss         3042
  way_cols            0
  drops            5230
  marks               0
  ack_drop            0
  sp_flows            1
  bk_flows            2
  un_flows            0
  max_len          1514
  quantum           300

Good question, did you do an "opkg update" before checking the list?

Yes I did, but doesn't seem to be in the repository at the moment :thinking:

It seems it is in the master packages repository (https://github.com/openwrt/packages/tree/master/net/nft-qos) but not in the lede-17.01 or openwrt-18.06 branches, so you might need to install a snapshot from the master branch. In that case, please note that the LuCI GUI does not seem to be part of the snapshot firmware images AFAIK, so you will need to install it via the command line before you can configure the router via a browser; using a terminal will always work...
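Roughly, on a snapshot image with working internet access, installing the GUI is just standard opkg usage:

opkg update
opkg install luci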

More on your sqm config below; in short, it looks good, but it is missing the settings to enable per-internal-IP-address fairness, which should help a bit with your use case, even though with your low bandwidth I can see why you might want to throttle some hosts unconditionally.
First, https://openwrt.org/docs/guide-user/network/traffic-shaping/sqm-details gives more information about sqm-scripts configuration.

But in your case I would recommend trying the following options in /etc/config/sqm (merged into your existing queue section):

config queue
        option qdisc_advanced '1'
        option squash_dscp '0'
        option squash_ingress '0'
        option ingress_ecn 'ECN'
        option qdisc_really_really_advanced '1'
        option linklayer_advanced '1'
        option tcMTU '2047'
        option tcTSIZE '128'
        option linklayer_adaptation_mechanism 'default'
        option iqdisc_opts 'nat dual-dsthost ingress'
        option eqdisc_opts 'nat dual-srchost'
        option enabled '1'
        option qdisc 'cake'
        option script 'layer_cake.qos'

"layer_cake": will allow to use DSCP markings to expedite some packets over others,
"nat": will allow cake to see the true external and internal addresses which is required for
"dual-srchost" & "dual-dsthost": configured as above (assuming eth0.3 is your WAN interface will try to first distribute the available bandwidth evenly between all concurrently active internal IP-address, and then for each of these internal IP-addresses it will distribute the bandwidth fairly between all concurrent flows
"ingress" finally will instruct cake to shape ingress in a way that does aim for reducing the incoming rate to the specified value (normally shaper aim to shape their outgoing rate, but for a shaper on the wrong ed of a bottleneck that is sub-optimal)


I prefer to wait until it's available in the branches then.

Changed my SQM configuration to:

config queue 'eth1'
	option qdisc_advanced '1'
	option squash_dscp '0'
	option squash_ingress '0'
	option interface 'eth0.3'
	option debug_logging '0'
	option verbosity '5'
	option qdisc 'cake'
	option script 'layer_cake.qos'
	option linklayer 'atm'
	option overhead '44'
	option enabled '1'
	option download '8500'
	option upload '700'
	option ingress_ecn 'ECN'
	option qdisc_really_really_advanced '1'
	option tcMTU '2047'
	option tcTSIZE '128'
	option iqdisc_opts 'nat dual-dsthost ingress'
	option eqdisc_opts 'nat dual-srchost'
	option egress_ecn 'NOECN'
	option linklayer_advanced '1'
	option tcMPU '0'
	option linklayer_adaptation_mechanism 'default'

Thanks again for your help and time!

I am not 100% sure, but I have a hunch that new software will not show up in old branches; what probably is going to happen is that this becomes part of the planned 19.X branch. But that is going to happen eventually :wink:

That said, I would be interested to learn how well (or how badly) cake with internal IP isolation, as configured in your new /etc/config/sqm, actually works for your use case.

Depending on your ISP and the encapsulations used you might have more (rather unlikely) or less per-packet overhead; you could try to follow the instructions in https://github.com/moeller0/ATM_overhead_detector to empirically measure the overhead on your link, so that you neither waste bandwidth nor use a shaper that is too loose.
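The data collection underneath is nothing magic; it boils down to pinging a close-by host with increasing packet sizes, roughly like this (just to give an idea, this is not the actual script from the repository):

# sweep the ICMP payload size over a couple of ATM cells (48 bytes each)
for size in $(seq 16 1 116); do
    ping -c 3 -s ${size} 192.0.2.1   # replace with your first hop on the ISP side
done > ping_sweep.log

The RTT versus packet-size data then shows the 48-byte ATM cell quantization, from which the per-packet overhead can be deduced.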

Well, maybe they are going to do it; otherwise I will wait for the next release.

"cake with internal-IP-isolation" seems to do the same job. I didn't really notice a difference.

I measured the overhead and got a result of 10 bytes. At this point I have to add that this is interesting, but we shouldn't put much more effort into it; I should get a fiber connection soon, and we could have a look at it again then. Thank you for the interesting discussion!

Glad to hear that this works as intended. Now this will not replace a fully bespoke QoS rule setup, but for many use cases it can be good enough to not bother :wink:

Ah, a PPPoA link by any chance?

Well, let me return the compliments then :wink: thanks for the nice discussion and doing the testing!

Ah, a PPPoA link by any chance?

Yes, it's a PPPoA link.

Well, let me return the compliments then :wink: thanks for the nice discussion and doing the testing!

You are welcome!

Just some additional feedback:
Your ATM overhead detector works really well and gives you some nice information and graphs, but to be honest it was a little bit annoying to have to use Matlab/Octave. If you want more information then additional tools are fine, but it would have been a more pleasant experience if the ping collector script had calculated the overhead itself.


I'm going to hazard a guess that the script does some linear regression and calculates an intercept, or some similar operation; it's not trivial to do that kind of calculation in a simple shell script :slight_smile:
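(To be fair, a bare-bones least-squares fit could probably be shoehorned into awk, something like the sketch below reading hypothetical "size rtt" pairs from a file, but the full fitting and plotting the detector does is another matter.)

# rtt_vs_size.dat: one "packet-size rtt" pair per line (hypothetical input file)
awk '{ n++; sx+=$1; sy+=$2; sxx+=$1*$1; sxy+=$1*$2 }
     END { slope=(n*sxy-sx*sy)/(n*sxx-sx*sx); print "slope:", slope, "intercept:", (sy-slope*sx)/n }' rtt_vs_size.dat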


Oh, I agree that it would be nice to do this detection directly on the router instead of using a combination of shell/batch and Matlab/Octave. But I lack the time and expertise to make that happen, unfortunately. About the first you will need to trust me; about the second, just look at the Matlab code and it will be clear that my background is not in computer science :wink:
Python would be a theoretical alternative, since I want/need to learn some Python anyway, but that would not solve the problem, just replace Octave with Python, since neither fits on a typical router...