For example, if you have wlan0 and eth0.1 bridged together, you have an issue because both wlan0 and eth0.1 would need SQM on them (br-lan itself doesn't queue). Here you can create a veth0/veth1 pair, place veth1 into the bridge, and route traffic to veth0 instead of br-lan; then you put SQM on veth0. This funnels all the traffic through one place rather than having it branch out to two.
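The veth setup described above could look roughly like this (a minimal sketch; the interface names and the 192.168.1.1/24 address are assumptions, and the exact steps depend on how your LAN is addressed):

```shell
# Hedged sketch: bridge name, veth names and the LAN address are
# assumptions -- adapt to your actual setup. Requires root.

# Create the veth pair and attach one end to the existing bridge
ip link add veth0 type veth peer name veth1
ip link set veth1 master br-lan
ip link set veth0 up
ip link set veth1 up

# Move the LAN gateway address from br-lan to veth0, so routed
# traffic enters the bridge through the veth pair
ip addr del 192.168.1.1/24 dev br-lan
ip addr add 192.168.1.1/24 dev veth0

# Now point SQM (e.g. in /etc/config/sqm) at veth0 instead of br-lan
```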
Makes sense, thank you!
I did some more research on this topic and came across the hashlimit module, which someone used to limit bandwidth. Wouldn't something like this do the job as well, and is it even possible this way?
- Specify the subnets and mark them
iptables -t nat -A prerouting_rule_u -s 192.168.1.0/24 -j MARK --set-mark 100 -m comment --comment "Subnet Restriction Upload"
iptables -t nat -A prerouting_rule_d -d 192.168.1.0/24 -j MARK --set-mark 101 -m comment --comment "Subnet Restriction Download"
- Mark the whole connection
iptables -t nat -A prerouting_rule_u -m mark --mark 100 -j CONNMARK --save-mark
iptables -t nat -A prerouting_rule_d -m mark --mark 101 -j CONNMARK --save-mark
- Limit the marked connections
iptables -A forwarding_rule_u -m mark --mark 100 -m conntrack --ctstate ESTABLISHED,RELATED -m hashlimit --hashlimit-name "Upload speed" --hashlimit-above 250kb/s -j DROP
iptables -A forwarding_rule_d -m mark --mark 101 -m conntrack --ctstate ESTABLISHED,RELATED -m hashlimit --hashlimit-name "Download speed" --hashlimit-above 500kb/s -j DROP
Couldn’t run a test yet, first have to wait until the hardware is ready.
Well, certainly try to run a dslreports speedtest with the high-resolution bufferbloat setting and compare the performance for, say, shaping to 500 Kbps with the hashlimit option versus using sqm-scripts' cake/layer_cake to shape the WAN interface to 500 Kbps.
Also I wonder, have you tried the per-IP isolation modes yet? If yes, would you be willing to share your assessment? I do not want to claim that these are a solution for all QoS challenges, and they might well not be a good fit for your specific problem, but for a number of users who were used to writing fine-grained QoS rules these isolation modes seem to have been good enough that they no longer bother with all the details. (Yes, I am trying to solicit both success and failure stories, so I can give better advice in the future when those isolation modes might be applicable.)
It seems to me that hashlimit is a bit of a sledgehammer and again doesn't let you fully utilize available bandwidth.
I think DSCP tagging and a smart switch is a good solution and not particularly expensive. Even a TP-Link SG108E (less than $40) will do weighted round robin for you. Mark guest-network traffic CS1, normal traffic CS2, game traffic CS5, and VoIP traffic CS6.
Put layercake/SQM queues on WAN outbound, and let the switch handle inbound post-tag, no goofy veth required.
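The tagging itself could be done on the router with iptables' DSCP target; a minimal sketch, assuming the guest network sits on a bridge called br-guest and that VoIP signalling runs on SIP's standard port (both are assumptions, adjust to your setup):

```shell
# Hedged sketch: interface name and port are assumptions.
# Mark guest-network traffic CS1 (background/scavenger class)
iptables -t mangle -A FORWARD -i br-guest -j DSCP --set-dscp-class CS1
# Mark VoIP traffic CS6 (assuming SIP on UDP 5060; RTP ports vary by setup)
iptables -t mangle -A FORWARD -p udp --dport 5060 -j DSCP --set-dscp-class CS6
```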
Edit: thinking about this, it doesn't work for inbound, because the inbound bottleneck will be in your ISP's equipment... so you really do need to shape the packets in the router.
The advantage of shaping is that the queueing system sees the length of the queue and drops the appropriate packets. Hashlimit is just fixed policing of the subset that's tagged, which in general is not as good.
This matches what I expect, but I really would like to see real-world measurements...
No doubt, measurements would be useful. In fact I'd like to do some research on these things. It'd be nice to get a network of 5 machines and a couple of smart switches and APs and set up some intense latency / bandwidth / game play / VOIP / shaping tests.
@accelerate it seems the package I mentioned (nft-qos) has now made it into the repository:
https://github.com/openwrt/packages/pull/6193
But that still needs to be tested in regards to latency-increase under load...
Hi guys,
I was finally able to test SQM and I have to say it really reduces bufferbloat, and the shaping functionality (cake/piece_of_cake) is very nice as well! I have currently set SQM to 8500 kbit/s download and 700 kbit/s upload, which is basically what my ISP provides.
Wow, nft-qos is just what was missing up to now; in addition to SQM it will be a great solution in some cases.
I would like to test nft-qos, but can't find it in the available package list. Did I miss something?
Also thank you very much for your help - it was a pleasure and sorry for my late reply!
Great that it is doing something for you! Now, if you post the output of:
cat /etc/config/sqm
tc -s qdisc
I can offer to have a look at your current configuration, and potentially help you to optimize it a bit...
Good question, did you do a "opkg update" before checking the list?
Here the requested outputs:
cat /etc/config/sqm:
config queue 'eth1'
option qdisc_advanced '0'
option interface 'eth0.3'
option debug_logging '0'
option verbosity '5'
option qdisc 'cake'
option script 'piece_of_cake.qos'
option linklayer 'atm'
option overhead '44'
option enabled '1'
option download '8500'
option upload '700'
tc -s qdisc:
qdisc noqueue 0: dev lo root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 4Mb ecn
Sent 670502806 bytes 730634 pkt (dropped 0, overlimits 0 requeues 1)
backlog 0b 0p requeues 1
maxpacket 1392 drop_overlimit 0 new_flow_count 2 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev br-guest root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.2 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev br-lan root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.1 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc cake 8007: dev eth0.3 root refcnt 2 bandwidth 700Kbit besteffort triple-isolate nonat nowash no-ack-filter split-gso rtt 100.0ms atm overhead 44
Sent 24734283 bytes 259846 pkt (dropped 1122, overlimits 149324 requeues 0)
backlog 0b 0p requeues 0
memory used: 329344b of 4Mb
capacity estimate: 700Kbit
min/max network layer size: 28 / 1500
min/max overhead-adjusted size: 106 / 1749
average network hdr offset: 14
Tin 0
thresh 700Kbit
target 26.0ms
interval 121.0ms
pk_delay 17.9ms
av_delay 1.3ms
sp_delay 24us
backlog 0b
pkts 260968
bytes 25451154
way_inds 383
way_miss 3068
way_cols 0
drops 1122
marks 0
ack_drop 0
sp_flows 0
bk_flows 1
un_flows 0
max_len 1514
quantum 300
qdisc ingress ffff: dev eth0.3 parent ffff:fff1 ----------------
Sent 643340210 bytes 468737 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev wlan1 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev wlan0 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev wlan0-1 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev wlan1-1 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc cake 8008: dev ifb4eth0.3 root refcnt 2 bandwidth 8500Kbit besteffort triple-isolate nonat wash no-ack-filter split-gso rtt 100.0ms atm overhead 44
Sent 642474839 bytes 463507 pkt (dropped 5230, overlimits 824640 requeues 0)
backlog 0b 0p requeues 0
memory used: 62Kb of 4Mb
capacity estimate: 8500Kbit
min/max network layer size: 46 / 1500
min/max overhead-adjusted size: 106 / 1749
average network hdr offset: 14
Tin 0
thresh 8500Kbit
target 5.0ms
interval 100.0ms
pk_delay 3.9ms
av_delay 1.4ms
sp_delay 9us
backlog 0b
pkts 468737
bytes 649902528
way_inds 319
way_miss 3042
way_cols 0
drops 5230
marks 0
ack_drop 0
sp_flows 1
bk_flows 2
un_flows 0
max_len 1514
quantum 300
Yes I did, but it doesn't seem to be in the repository at the moment.
It seems it is in the master package repository (https://github.com/openwrt/packages/tree/master/net/nft-qos) but not in the lede-17.1 or openwrt-18.6 branches, so you might need to install a snapshot from the master branch. (In that case, please note that the LuCI GUI does not seem to be part of the snapshot firmware images AFAIK, so you will need to install it via the command line before you can configure the router via a browser; using a terminal will always work...)
More on your sqm config below, but in short: it looks good, but it is missing the settings to enable per-internal-IP-address fairness, which should help a bit with your use case, even though with your low bandwidth I can see why you might want to throttle some hosts unconditionally.
First https://openwrt.org/docs/guide-user/network/traffic-shaping/sqm-details gives more information about sqm-scripts configurations.
But in your case I would recommend trying to add the following to /etc/config/sqm:
config queue
option qdisc_advanced '1'
option squash_dscp '0'
option squash_ingress '0'
option ingress_ecn 'ECN'
option qdisc_really_really_advanced '1'
option linklayer_advanced '1'
option tcMTU '2047'
option tcTSIZE '128'
option linklayer_adaptation_mechanism 'default'
option iqdisc_opts 'nat dual-dsthost ingress'
option eqdisc_opts 'nat dual-srchost'
option enabled '1'
option qdisc 'cake'
option script 'layer_cake.qos'
"layer_cake": will allow to use DSCP markings to expedite some packets over others,
"nat": will allow cake to see the true external and internal addresses which is required for
"dual-srchost" & "dual-dsthost": configured as above (assuming eth0.3 is your WAN interface will try to first distribute the available bandwidth evenly between all concurrently active internal IP-address, and then for each of these internal IP-addresses it will distribute the bandwidth fairly between all concurrent flows
"ingress" finally will instruct cake to shape ingress in a way that does aim for reducing the incoming rate to the specified value (normally shaper aim to shape their outgoing rate, but for a shaper on the wrong ed of a bottleneck that is sub-optimal)
I prefer to wait until it's available in the branches then.
Changed my SQM configuration to:
config queue 'eth1'
option qdisc_advanced '1'
option squash_dscp '0'
option squash_ingress '0'
option interface 'eth0.3'
option debug_logging '0'
option verbosity '5'
option qdisc 'cake'
option script 'layer_cake.qos'
option linklayer 'atm'
option overhead '44'
option enabled '1'
option download '8500'
option upload '700'
option ingress_ecn 'ECN'
option qdisc_really_really_advanced '1'
option tcMTU '2047'
option tcTSIZE '128'
option iqdisc_opts 'nat dual-dsthost ingress'
option eqdisc_opts 'nat dual-srchost'
option egress_ecn 'NOECN'
option linklayer_advanced '1'
option tcMPU '0'
option linklayer_adaptation_mechanism 'default'
Thanks again for your help and time!
I am not 100% sure, but I have a hunch that new software will not show up in old branches; what probably is going to happen is that this becomes part of the planned 19.X branch. But that is going to happen eventually.
That said, I would be interested to learn how well (or how poorly) cake with internal-IP isolation, as configured in your new /etc/config/sqm, actually works for your use case.
Depending on your ISP and the encapsulations used you might have more (rather unlikely) or less per-packet overhead; you could try to follow the instructions in https://github.com/moeller0/ATM_overhead_detector to empirically measure the overhead on your link, so that you neither waste bandwidth nor use a shaper that is too loose.
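The measurement behind that detector boils down to pinging with a sweep of payload sizes and recording the RTTs; a minimal collection loop could look like this (a sketch under the assumptions that the target is the first hop that answers ICMP and that you repeat it long enough to gather many samples per size):

```shell
# Hedged sketch of the data-collection idea, not the actual script.
TARGET=192.168.0.1   # assumption: nearest hop beyond the ATM link
for size in $(seq 16 4 172); do
    # one probe per payload size; the real method repeats this many times
    rtt=$(ping -c 1 -s "$size" "$TARGET" | sed -n 's/.*time=\([0-9.]*\).*/\1/p')
    echo "$size $rtt"
done > rtt_vs_size.txt
```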
Well maybe they are going to do it, otherwise I will wait for the next release.
"cake with internal-IP-isolation" seems to do the same job. I didn't really notice a difference.
I measured the overhead and got a result of 10 bytes. At this point I have to add: it is interesting, but we shouldn't put much more effort into it; I should get a fiber connection soon and we could have a look at it again then. Thank you for the interesting discussion!
Glad to hear that this works as intended. Now, this will not replace a fully bespoke QoS rule setup, but for many use cases it can be good enough not to bother.
Ah, a PPPoA link by any chance?
Well, let me return the compliments then thanks for the nice discussion and doing the testing!
Yes, it's a PPPoA link.
You are welcome!
Just some additional feedback:
Your ATM overhead detector works really well and gives you some nice information and graphs, but to be honest it was a little annoying to use Matlab/Octave. If you want more information, then the additional tools are fine, but it would have been a more pleasant experience if the ping-collector script had calculated the overhead itself.
I'm going to hazard a guess that the script does some linear regression and calculates an intercept, or some similar operation; it's not trivial to do that kind of calculation in a simple shell script.
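For what it's worth, an ordinary least-squares fit is doable in awk, which is usually available even on routers; a minimal sketch (the input format, one "size rtt" pair per line, is an assumption):

```shell
# Hedged sketch: least-squares slope and intercept of RTT vs payload size.
# Reads "size rtt" pairs on stdin, e.g. the output of a ping sweep.
awk '{ n++; sx += $1; sy += $2; sxx += $1*$1; sxy += $1*$2 }
     END {
         slope = (n*sxy - sx*sy) / (n*sxx - sx*sx)
         intercept = (sy - slope*sx) / n
         printf "slope=%.6f intercept=%.6f\n", slope, intercept
     }'
```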
Oh, I agree that it would be nice to do this detection directly on the router instead of using a combination of shell/batch and Matlab/Octave. But I lack the time and expertise to make that happen, unfortunately. About the first you need to trust me; about the second, just look at the Matlab code and it will be clear that my background is not in computer science.
Python would be a theoretical alternative, since I want/need to learn some Python anyway, but just replacing Octave with Python would not solve the problem, since neither fits on a typical router...