SQM vs QoS with DSCP

Hi, I've been using the SQM scripts to protect against bufferbloat on the two interfaces balanced by mwan3.

I have two SQM instances, both using cake with piece_of_cake, but I now want to start prioritising traffic on my local LAN to give VoIP the highest priority and video streaming the lowest.

To explain the network: I have 20 desktops, 11 VoIP phones, 3 servers, 5 printers and a few kiosks connected to the one LAN.

If users are watching YouTube videos and Windows 10 is doing its P2P thing of hogging bandwidth, my local VoIP calls suffer 60 to 90% packet loss whenever more than 5 calls are going at the same time.

If there is no other traffic then VoIP is fine, so it's clearly not getting the highest priority. The phone handsets are set to DSCP 46.

I've seen that the qos-scripts support DSCP, but I can't see any mention of it for SQM, and you can't install both at once. Can someone help me please?

Thanks

QoS with DSCP is a great idea here. I do think FireQOS is the way to go. We have been discussing it in detail on a separate thread, "[Solved] I can't getting firehol to work on latest lede trunk", but the details there are mainly about handling a complex situation with a link that multiplexes two separate network speeds (cached content served from the ISP's cache vs. the rest over a slow ISP link).

Assuming your DSCP tags are already on the VoIP packets, you do not need all the veth stuff we tried there. That is also true if you have a simple router routing between two ethernet interfaces rather than a bridge bridging multiple interfaces. Assuming you have a simple router, put QoS queues on the output of the WAN and the output of the LAN (whatever interfaces those are). Input QoS is trickier because it happens before the firewall sees the packets, so it's better to stick to output QoS.

In FireQOS your setup could be as simple as:

  interface eth1 lan rate XXX
    class voip rate 1100kbit ## 100kbit for each phone
      match dscp EF
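A slightly fuller sketch covering both the LAN and WAN directions (device names, rates, and class parameters here are placeholders to adapt; check the FireQOS documentation for the exact syntax on your version):

```
# /etc/firehol/fireqos.conf -- hypothetical sketch, adjust names and rates
interface eth1 lan output rate 1000Mbit
  class voip commit 1100kbit     # reserve ~100kbit for each of the 11 phones
    match dscp EF

interface pppoe-wan wan output rate 600kbit
  class voip commit 300kbit      # leave headroom; this link is only 600kbit up
    match dscp EF
```

FireQOS automatically creates a default class for unmatched traffic, so everything else still flows, just at lower priority than the voip class.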


Also I'd definitely suggest a separate wire path to your phones. Ideally you'd have something like this:

Router ---> managed switch  ---> voip phones
                            ---> computers

Tell the managed switch to prioritize by DSCP, so that every packet received on the phone ports gets to your router before anything from the computers, and everything sent from your router gets to the phones before anything else. Even a 100-dollar web-managed switch can do this. You can tell the switch not to trust DSCP on the computer ports if needed. The main thing is to get the phones onto their own wire and port if possible. If the wiring is too complex to segregate, you can also put the phones on a separate VLAN and prioritize that whole VLAN in the switch.

Try layer_cake, which does honor some common DSCP markings; it might not be enough for your high number of phones though.
Could you post the output of:

  1. cat /etc/config/sqm
  2. tc -d qdisc
  3. tc -s qdisc

please?

Best Regards

P.S.: In case the allotted bandwidth per tier in cake is not sufficient, have a look at simple.qos; it should be relatively easy to adjust the bandwidths for the three tiers there.

This is the output of my SQM config:

config queue
	option verbosity '5'
	option enabled '1'
	option interface 'pppoe-wan'
	option download '8000'
	option upload '600'
	option qdisc_advanced '0'
	option qdisc 'cake'
	option script 'piece_of_cake.qos'
	option linklayer 'ethernet'
	option overhead '8'
	option debug_logging '0'

config queue
	option verbosity '5'
	option enabled '1'
	option interface 'pppoe-wan2'
	option download '8000'
	option upload '600'
	option qdisc_advanced '0'
	option qdisc 'cake'
	option script 'piece_of_cake.qos'
	option linklayer 'ethernet'
	option overhead '8'
	option debug_logging '0'

config queue
	option debug_logging '0'
	option verbosity '5'
	option linklayer 'none'
	option enabled '1'
	option qdisc 'cake'
	option script 'layer_cake.qos'
	option qdisc_advanced '0'
	option interface 'eth1.3'
	option download '50000'
	option upload '50000'


root@Router:~# tc -d qdisc
qdisc noqueue 0: dev lo root refcnt 2 
qdisc mq 0: dev eth0 root 
qdisc fq_codel 0: dev eth0 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev eth0 parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev eth0 parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev eth0 parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev eth0 parent :5 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev eth0 parent :6 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev eth0 parent :7 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev eth0 parent :8 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc mq 0: dev eth1 root 
qdisc fq_codel 0: dev eth1 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev eth1 parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev eth1 parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev eth1 parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev eth1 parent :5 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev eth1 parent :6 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev eth1 parent :7 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev eth1 parent :8 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc noqueue 0: dev eth0.1 root refcnt 2 
qdisc mq 0: dev wlan1 root 
qdisc fq_codel 0: dev wlan1 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev wlan1 parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev wlan1 parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev wlan1 parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc mq 0: dev wlan0 root 
qdisc fq_codel 0: dev wlan0 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev wlan0 parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev wlan0 parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc fq_codel 0: dev wlan0 parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
qdisc cake 8080: dev pppoe-wan root refcnt 2 bandwidth 600Kbit besteffort triple-isolate rtt 100.0ms raw 
 linklayer ethernet overhead 8 
qdisc ingress ffff: dev pppoe-wan parent ffff:fff1 ---------------- 
qdisc noqueue 0: dev br-lan root refcnt 2 
qdisc cake 807d: dev eth1.3 root refcnt 2 bandwidth 50Mbit diffserv3 triple-isolate rtt 100.0ms raw 
qdisc ingress ffff: dev eth1.3 parent ffff:fff1 ---------------- 
qdisc cake 807e: dev ifb4eth1.3 root refcnt 2 bandwidth 50Mbit besteffort triple-isolate wash rtt 100.0ms raw 
qdisc cake 8081: dev ifb4pppoe-wan root refcnt 2 bandwidth 8Mbit besteffort triple-isolate wash rtt 100.0ms raw 
 linklayer ethernet overhead 8 
qdisc noqueue 0: dev eth0.2 root refcnt 2 
qdisc cake 808f: dev pppoe-wan2 root refcnt 2 bandwidth 600Kbit besteffort triple-isolate rtt 100.0ms raw 
 linklayer ethernet overhead 8 
qdisc ingress ffff: dev pppoe-wan2 parent ffff:fff1 ---------------- 
qdisc cake 8090: dev ifb4pppoe-wan2 root refcnt 2 bandwidth 8Mbit besteffort triple-isolate wash rtt 100.0ms raw 
 linklayer ethernet overhead 8 

and

root@Router:~# tc -s qdisc
qdisc noqueue 0: dev lo root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
qdisc mq 0: dev eth0 root 
 Sent 5566499140 bytes 35100834 pkt (dropped 0, overlimits 0 requeues 266) 
 backlog 0b 0p requeues 266 
qdisc fq_codel 0: dev eth0 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 5566499140 bytes 35100834 pkt (dropped 0, overlimits 0 requeues 266) 
 backlog 0b 0p requeues 266 
  maxpacket 1514 drop_overlimit 0 new_flow_count 60690 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth0 parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth0 parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth0 parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth0 parent :5 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth0 parent :6 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth0 parent :7 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth0 parent :8 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc mq 0: dev eth1 root 
 Sent 46469812245 bytes 40294450 pkt (dropped 0, overlimits 0 requeues 149) 
 backlog 0b 0p requeues 149 
qdisc fq_codel 0: dev eth1 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 46469812245 bytes 40294450 pkt (dropped 0, overlimits 0 requeues 149) 
 backlog 0b 0p requeues 149 
  maxpacket 1514 drop_overlimit 0 new_flow_count 35220 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth1 parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth1 parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth1 parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth1 parent :5 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth1 parent :6 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth1 parent :7 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev eth1 parent :8 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev eth0.1 root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
qdisc mq 0: dev wlan1 root 
 Sent 2652803437 bytes 3493721 pkt (dropped 3741, overlimits 0 requeues 88) 
 backlog 0b 0p requeues 88 
qdisc fq_codel 0: dev wlan1 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 2309557 bytes 9979 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 86 drop_overlimit 0 new_flow_count 1 ecn_mark 0
  new_flows_len 1 old_flows_len 0
qdisc fq_codel 0: dev wlan1 parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev wlan1 parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 2650493880 bytes 3483742 pkt (dropped 3741, overlimits 0 requeues 88) 
 backlog 0b 0p requeues 88 
  maxpacket 1514 drop_overlimit 0 new_flow_count 6558 ecn_mark 0
  new_flows_len 1 old_flows_len 0
qdisc fq_codel 0: dev wlan1 parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc mq 0: dev wlan0 root 
 Sent 7433546200 bytes 6584535 pkt (dropped 251, overlimits 0 requeues 95) 
 backlog 0b 0p requeues 95 
qdisc fq_codel 0: dev wlan0 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 1891883 bytes 4997 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev wlan0 parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 1615 bytes 8 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev wlan0 parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 7431652702 bytes 6579530 pkt (dropped 251, overlimits 0 requeues 95) 
 backlog 0b 0p requeues 95 
  maxpacket 1514 drop_overlimit 0 new_flow_count 12784 ecn_mark 0
  new_flows_len 1 old_flows_len 0
qdisc fq_codel 0: dev wlan0 parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc cake 8080: dev pppoe-wan root refcnt 2 bandwidth 600Kbit besteffort triple-isolate rtt 100.0ms raw 
 Sent 337032325 bytes 3227089 pkt (dropped 16258, overlimits 1250644 requeues 0) 
 backlog 0b 0p requeues 0 
 memory used: 1400128b of 4Mb
 capacity estimate: 600Kbit
                 Tin 0
  thresh       600Kbit
  target        30.4ms
  interval     125.4ms
  pk_delay      12.0ms
  av_delay       1.9ms
  sp_delay         2us
  pkts         3243347
  bytes      354615292
  way_inds      489868
  way_miss      109051
  way_cols           0
  drops          16258
  marks              1
  sp_flows           1
  bk_flows           1
  un_flows           0
  max_len         1500

qdisc ingress ffff: dev pppoe-wan parent ffff:fff1 ---------------- 
 Sent 7045919147 bytes 5384842 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
qdisc noqueue 0: dev br-lan root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
qdisc cake 807d: dev eth1.3 root refcnt 2 bandwidth 50Mbit diffserv3 triple-isolate rtt 100.0ms raw 
 Sent 18045714246 bytes 14883311 pkt (dropped 7, overlimits 739700 requeues 0) 
 backlog 0b 0p requeues 0 
 memory used: 528192b of 4Mb
 capacity estimate: 50Mbit
                 Bulk   Best Effort      Voice
  thresh      3125Kbit      50Mbit   12500Kbit
  target         5.8ms       5.0ms       5.0ms
  interval     100.8ms     100.0ms      10.0ms
  pk_delay         0us       131us        77us
  av_delay         0us        11us         3us
  sp_delay         0us         1us         1us
  pkts               0    14822693       60625
  bytes              0 18037844437     7880407
  way_inds           0      163561           0
  way_miss           0      417619          82
  way_cols           0           0           0
  drops              0           7           0
  marks              0           0           0
  sp_flows           0           0           0
  bk_flows           0           1           0
  un_flows           0           0           0
  max_len            0        1514         590

qdisc ingress ffff: dev eth1.3 parent ffff:fff1 ---------------- 
 Sent 1547222880 bytes 10306113 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
qdisc cake 807e: dev ifb4eth1.3 root refcnt 2 bandwidth 50Mbit besteffort triple-isolate wash rtt 100.0ms raw 
 Sent 1693093682 bytes 10305889 pkt (dropped 224, overlimits 1683247 requeues 0) 
 backlog 0b 0p requeues 0 
 memory used: 482688b of 4Mb
 capacity estimate: 50Mbit
                 Tin 0
  thresh        50Mbit
  target         5.0ms
  interval     100.0ms
  pk_delay       356us
  av_delay        35us
  sp_delay         1us
  pkts        10306113
  bytes     1693431522
  way_inds      348254
  way_miss      443185
  way_cols           0
  drops            224
  marks              0
  sp_flows           3
  bk_flows           1
  un_flows           0
  max_len         5904

qdisc cake 8081: dev ifb4pppoe-wan root refcnt 2 bandwidth 8Mbit besteffort triple-isolate wash rtt 100.0ms raw 
 Sent 7088980137 bytes 5384830 pkt (dropped 12, overlimits 806541 requeues 0) 
 backlog 0b 0p requeues 0 
 memory used: 12672b of 4Mb
 capacity estimate: 8Mbit
                 Tin 0
  thresh         8Mbit
  target         5.0ms
  interval     100.0ms
  pk_delay       853us
  av_delay       107us
  sp_delay         1us
  pkts         5384842
  bytes     7088997883
  way_inds       86601
  way_miss      106107
  way_cols           0
  drops             12
  marks              0
  sp_flows           4
  bk_flows           2
  un_flows           0
  max_len         1508

qdisc noqueue 0: dev eth0.2 root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
qdisc cake 808f: dev pppoe-wan2 root refcnt 2 bandwidth 600Kbit besteffort triple-isolate rtt 100.0ms raw 
 Sent 27248646 bytes 129532 pkt (dropped 1741, overlimits 91969 requeues 0) 
 backlog 1548b 2p requeues 0 
 memory used: 335360b of 4Mb
 capacity estimate: 600Kbit
                 Tin 0
  thresh       600Kbit
  target        30.4ms
  interval     125.4ms
  pk_delay     219.5ms
  av_delay      29.1ms
  sp_delay       3.8ms
  pkts          131275
  bytes       29009381
  way_inds         902
  way_miss        4452
  way_cols           0
  drops           1741
  marks              0
  sp_flows           5
  bk_flows           1
  un_flows           0
  max_len         1500

qdisc ingress ffff: dev pppoe-wan2 parent ffff:fff1 ---------------- 
 Sent 161036217 bytes 165596 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0 
qdisc cake 8090: dev ifb4pppoe-wan2 root refcnt 2 bandwidth 8Mbit besteffort triple-isolate wash rtt 100.0ms raw 
 Sent 159512062 bytes 163656 pkt (dropped 1939, overlimits 184266 requeues 0) 
 backlog 1500b 1p requeues 0 
 memory used: 91264b of 4Mb
 capacity estimate: 8Mbit
                 Tin 0
  thresh         8Mbit
  target         5.0ms
  interval     100.0ms
  pk_delay       2.8ms
  av_delay       334us
  sp_delay        13us
  pkts          165596
  bytes      162360985
  way_inds         833
  way_miss        4564
  way_cols           0
  drops           1939
  marks              0
  sp_flows           0
  bk_flows           1
  un_flows           0
  max_len         1500

My current configuration is attached!

I have the 192.168.0.0 traffic tagged into a VLAN, but I'd prefer to limit the wiring needed, so I have the desktop PCs connected to the back of the phones.

I have been curious about switch speed: my server and switches all negotiate at gigabit, apart from the antique PBX equipment and the PCs attached to the backs of phones. I don't think the network is anywhere near its full potential, but I want it future-proof, as soon we will be upgrading the twin VDSL lines to a 100/100 dedicated ethernet line and might drop the ISDN lines. Hence I want to make sure DSCP 46 always has top priority over any other traffic.

@r_al_sim thanks for the data, please let me stew over that for a few more days.

It looks to me like you've got 4 switches. Replacing each with a Zyxel 24-port web-managed switch would cost you something like $400. If your diagram isn't just schematic, you'd need something like 12 wires run to individual PCs; then the PCs wouldn't hang off the back of phones and would get gigabit ethernet instead of 100 Mbit, which would make them much more responsive when accessing your server. Furthermore, you could use LACP to link-aggregate between the switches and bonding on the Windows server... and get something like 2 or 3 gigabit of aggregate connectivity internally. Your LAN would be a LOT faster than hanging PCs off the back of cheap phones.

Another option, without running wires, would be to put a desktop 8- or 16-port gigabit dumb switch behind each phone, or near a cluster of several phones. Instant gigabit connection for your PCs. I assume you're using that Windows server for file sharing; all your PC users will thank you for that relatively cheap upgrade.

Mmmh, so do I understand correctly that by "local calls" you do not mean intra-organisational calls, but VoIP calls that are routed via your DSL lines? I also note your LAN shaper is set to layer_cake at 50/50 Mbps (eth1.3), but your combined pppoe-wan shapers are set to piece_of_cake at 8/0.6. So the two pppoe-wans are your bottleneck shapers, and they are not honoring DSCP markings (piece_of_cake defaults to besteffort). Have you tried layer_cake on the two WAN interfaces as well? (Which might not work well, given how bandwidth-starved these are.) BTW, overhead 8 is pretty much guaranteed to be too optimistic on real DSL links. So could you elaborate on the true nature of those links? From the graphic it looks like low-bandwidth VDSL1 links, but I have a hard time believing that (my guess is they are either ADSL or VDSL2).
I note the rule of thumb for a VoIP flow is 100 Kbps per direction, so your uploads will only barely allow for 11 concurrent VoIP calls... If that is the case, I can guarantee that cake's settings will not work for you, since none of the diffserv modes (diffserv3, diffserv4, diffserv8) will allow high-priority classes to use effectively 100% of the bandwidth (the idea is that of a trade-off: high priority comes with low bandwidth and vice versa). I do note that cake tries to be graceful if traffic exceeds its priority bin and will still process those packets, albeit with lowered priority.
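As a back-of-the-envelope check of that rule of thumb against the posted link rates (the ~100 kbit/s per call figure is the rough estimate from above, not a measured value):

```shell
# Rough VoIP capacity check: all 11 phones talking at once vs. combined upload
phones=11
per_call_kbit=100                        # rule of thumb per direction, incl. overhead
needed_kbit=$((phones * per_call_kbit))  # bandwidth needed if every phone is on a call
combined_up_kbit=$((2 * 600))            # both VDSL uploads at the configured 600 kbit
echo "need ${needed_kbit} kbit/s, have ${combined_up_kbit} kbit/s of upload"
```

With 1100 kbit/s needed against 1200 kbit/s available, there is almost no headroom left for anything else.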

Best Regards


@moeller0

The DSL lines are purely for web traffic; the calls are routed through the ISDN lines, so we can currently take 4 simultaneous calls. When I say local calls, I mean VoIP calls within the PBX, from one office phone to another.

The VDSL2 lines are nominally 8 Mbit each but vary in speed; currently one is at 10334kbps/1073kbps and the other at 6500kbps/797kbps, hence why I use piece_of_cake 8/0.6 on them as more of an average. Do you think I should drop the overhead?

The shaper on the local LAN (eth1.3) is set to layer_cake 50/50; this prevented the phones from getting any packet loss (I assume it's the diffserv handling). Prior to this, calls were dropping out constantly. Do you think I have room for more than 50/50 on this? The dumb switches and most devices are gigabit, but the phones are only 10/100.

I won't be ditching the ISDN lines till I have a 100/100 dedicated line, and only then will I go with VoIP for external calls. But I need to know layer_cake can manage it before I make the upgrade, as I will be removing the twin VDSL lines and no longer using piece_of_cake on the WAN.

@dlakelan
I will be buying a managed switch when the new office is wired; each desk will get 2 ports, one for the phone and one for the desktop. But that is almost a year away, and I intend to link-aggregate the file server then.

The Windows server just isolates the Windows Update traffic to a single point and hosts a PBX logging server; one day it might host Active Directory and Office 365. The Linux servers run local intranet websites, data-cloning scripts, file servers, and a VM server for building and testing my dev environments.

@moeller0

When I looked at the file /tmp/run/sqm/eth1.3.state, it showed IGNORE_DSCP="1" and ENABLED="", which concerned me, as the phones use DSCP 46 to identify their priority:

IFACE="eth1.3"
UPLINK="50000"
DOWNLINK="50000"
SCRIPT="layer_cake.qos"
ENABLED=""
QDISC="cake"
LLAM="tc_stab"
LINKLAYER="none"
OVERHEAD="0"
STAB_MTU="2047"
STAB_MPU="0"
STAB_TSIZE="512"
AUTOFLOW="0"
ILIMIT=""
ELIMIT=""
TARGET="5ms"
ITARGET=""
ETARGET=""
IECN="ECN"
EECN="ECN"
ZERO_DSCP="1"
IGNORE_DSCP="1"
IQDISC_OPTS=""
EQDISC_OPTS=""
INGRESS_CAKE_OPTS="diffserv3 besteffort wash"
EGRESS_CAKE_OPTS="diffserv3"
SQM_DEBUG="0"
SQM_DEBUG_LOG=""
OUTPUT_TARGET="/dev/null"

Thanks for the help in understanding this; SQM and QoS are new to me.

Ah, okay so local was truly local and not "metro" :wink:

I think that modeling the link correctly is essential, even though underestimates in either bandwidth or overhead can be compensated for by the other measure. With PPPoE on a VDSL2 link, I expect you need to account for at least 30 bytes of per-packet overhead (at least 34 bytes if your ISP also uses a VLAN tag). So I would not drop the overhead (i.e., not specify any), but would rather set it to >= 30 bytes.
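In terms of the /etc/config/sqm posted above, that would mean something like this (a sketch; the '34' assumes a VLAN-tagged PPPoE/VDSL2 link, use '30' if there is no VLAN tag):

```
config queue
	option interface 'pppoe-wan'
	option linklayer 'ethernet'
	option overhead '34'
```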

Sure, as long as your router can handle it there is no reason not to specify more; after all, you want the phone calls not to be disturbed by other traffic. But I am amazed that this has any influence at all, since I would assume that the internal VoIP traffic should run completely within your switches (I would try to isolate the phones and the PBX on their own dedicated switch though; that should isolate them well from all the rest, even without AQM/QoS).
But is the eth1.3 shaper a new addition? And does it fully solve your internal call drops?

Typically SQM is instantiated on a WAN link, and in that case the ingress side sees the DSCP markings as delivered by your ISP. Since ISPs are free to use DSCPs any way they like, one should not blindly trust the markings of incoming packets, hence SQM's default of ignoring them on ingress. You might want to set ZERO_DSCP=0 and IGNORE_DSCP=0 to make your ingress shaper honor the DSCP markings (even though I am not fully sure why your internal VoIP packets should ever be visible on the router). I would recommend either editing /etc/config/sqm or using the GUI; editing the state file /tmp/run/sqm/eth1.3.state will not have any effect (but you knew that :wink: ).
The best test for an SQM instance being enabled is to run "tc -s qdisc".
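For reference, the corresponding knobs in /etc/config/sqm are the "squash" options, which only take effect with the advanced settings enabled (option names as I recall them from sqm-scripts/luci-app-sqm; please verify against your installed version):

```
config queue
	option qdisc_advanced '1'
	option squash_dscp '0'      # do not re-mark incoming DSCP to 0
	option squash_ingress '0'   # do not ignore DSCP on the ingress shaper
```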

When shaping ethernet close to the link rate, one needs to account for the full effective ethernet header: overhead=38, or 42 in your case since you use an additional VLAN (though I note that with a 50/50 shaper on a gigabit link this will not make much observable difference).
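For the record, the 38 bytes is the full on-the-wire framing per ethernet frame, and the 42 adds the 802.1Q tag:

```shell
# Per-packet ethernet overhead on the wire, in bytes
preamble=7; sfd=1; macs=12; ethertype=2; fcs=4; ifg=12
eth_overhead=$((preamble + sfd + macs + ethertype + fcs + ifg))
vlan_overhead=$((eth_overhead + 4))   # plus a 4-byte 802.1Q VLAN tag
echo "ethernet overhead: ${eth_overhead} bytes, with VLAN: ${vlan_overhead} bytes"
```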

Best Regards

thanks this has been a great help.

The 50/50 shaper was the only way I could get it to work at all, but I wasn't sure where my packet-loss problem on the internal LAN comes from, and I assumed it was because SQM didn't support DSCP.

No packet that goes from a phone either to your PBX or directly to another phone should ever hit that 50/50 LAN queue. The only thing that 50/50 LAN queue should do is wind up re-ordering your inbound packets as they leave the router headed towards the LAN, it won't even drop anything because it's far faster than the inbound queues on the DSL lines.

Otherwise, phone-to-phone in your organization... all of it should stay in the switches. The fact that the 50/50 LAN queue has any effect at all is suspicious, and to me indicates maybe a CPU bottleneck on the router, faulty RAM, a driver issue, or some other weirdness that the queue winds up accidentally working around.

Before doing anything else, I really think that switch up at the upper right of your diagram that the router, PBX and Centos box are attached to should be swapped out for this device: https://www.amazon.com/Zyxel-24-Gigabit-Managed-Rackmount-GS1900-24E/dp/B00GU1KSHS

And I really think you should have one in operations, sales, and warehouse as well...

Make sure you download and upgrade to the latest firmware (there is an especially obnoxious display bug in the older firmwares that makes some text invisible in certain browsers in the management interface). Then set up QoS to honor DSCP, and adjust your queues so that DSCP 46 and higher go into queue 7, the highest priority.

Then your internal VOIP traffic is prioritized and traveling over a high quality switching fabric. Short of faulty wiring, or bursts of RF noise, you shouldn't have a single dropped VOIP packet ever.

Next, since you're going to have a 100/100 metro ethernet connection, upgrade your router to one of these: https://www.amazon.com/Firewall-Appliance-Gigabit-AES-NI-Barebone/dp/B072ZTCNLK

With a single 4 GB DIMM and a small flash drive, the whole thing is about $350. Since you already run CentOS, run CentOS on the router as well (I'd use Debian because that's what I'm familiar with), and use FireQOS to shape your traffic. Bond two of the NICs, make that your LAN, and plug it into the managed switch on a link aggregation group you set up in the switch... so you have some redundancy and get better multi-CPU utilization. Then plug each remaining ethernet port separately into each VDSL line... later you'll just have the one metro ethernet plugged into the router, and when that's the case, add the freed-up port into the bonded group.

Put separate FireQOS queues on each of your DSL lines with numbers tuned to each line. Later upgrade it to just the one metro ethernet queue. Under the metro ethernet queue scenario, put shaping on the output of the WAN and on the output of the LAN, and do no inbound shaping anywhere. Then use firewall rules to ensure all packets are DSCP tagged as they cross the router. Because you're shaping output, the firewall rules run before the queues, and you'll be able to use the DSCP in the queues.
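As a sketch of such marking rules (the SIP/RTP ports and the example address below are assumptions for illustration; match them to your PBX's actual configuration):

```
# mangle/POSTROUTING runs before the egress queues, so the queues see the marks
iptables -t mangle -A POSTROUTING -p udp --dport 5060 -j DSCP --set-dscp-class EF
iptables -t mangle -A POSTROUTING -p udp --dport 10000:20000 -j DSCP --set-dscp-class EF
# and push known bulk traffic down, e.g. from a hypothetical update server:
# iptables -t mangle -A POSTROUTING -s 192.168.0.10 -j DSCP --set-dscp-class CS1
```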

Next, add a squid proxy to the router and have all the internal devices access the web entirely through the proxy. Add some "delay pools" in the squid proxy to limit the speed that people access YouTube or other non-critical bandwidth hogs. This is the best place to slow down YouTube and the like because you can do it based on the URL that's requested rather than based on IP addresses. You can tell squid to DSCP tag its output, so you can lower the priority on this kind of stuff as well.
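A minimal squid.conf sketch of that idea (the domain list, rate, and TOS value are placeholders to tune):

```
acl videosites dstdomain .youtube.com .googlevideo.com
delay_pools 1
delay_class 1 1                      # class 1: one aggregate bucket
delay_access 1 allow videosites
delay_parameters 1 250000/250000     # ~2 Mbit/s total for matching traffic
tcp_outgoing_tos 0x20 videosites     # CS1 (DSCP 8) on squid's upstream fetches
```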

Once you've got all this in place, cutting over to the metro ethernet should be easy. You can drop the ISDN lines entirely once you have metro ethernet. From a cost perspective my guess is replacing two ISDN lines and two DSL lines with one metro ethernet is probably cheaper, and so spending the money on the hardware to get it working right is a cost savings. Also, no point in working out bugs in your current system when you will get way better performance on the new system, better to upgrade your hardware in prep for that scenario, tune the software, and then cut over.

Can a consumer router running LEDE handle the traffic on metro ethernet? Yes, there are devices that can. Can your current router? My guess is just barely. Your performance with a full CentOS distribution, 4 GB of RAM, and 4 Intel cores to shape traffic, run a squid proxy, and so on will be enormously better, and the extra hardware cost for this setup is marginal, probably less than an afternoon of your salary.

My 2c

AH NO WAIT, I see, you have two separate routed internal LANs using VLANs? In that case, yes, you might somehow have things configured so that traffic has to hit the router to be routed between, say, the phones and the PBX or something similar. If that's the case, eliminate that by getting the managed switch and reorganizing traffic in the switches!

All packets from a phone destined for the PBX should hit a switch and get switched directly to the PBX, not routed. The PBX will probably need two VLAN interfaces, one for management and one for the voice VLAN.
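If the PBX runs Linux, those two VLAN interfaces could be defined along these lines; the VLAN IDs and addresses here are hypothetical:

```
# /etc/network/interfaces fragment on the PBX -- VLAN 10/20 and addresses assumed
auto eth0.10
iface eth0.10 inet static        # management VLAN
    address 192.168.10.5
    netmask 255.255.255.0

auto eth0.20
iface eth0.20 inet static        # voice VLAN
    address 192.168.20.5
    netmask 255.255.255.0
```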

EDIT: also, for this to work the phones have to be configured with tagged VLANs, because you've got computers hanging off the back. The computers go on one VLAN and the phones on the other, both on the same wire... The computers could be untagged, but the phones need to be tagged.

Again, you'll benefit from having managed switches in each department as well... gigabit for the computers on separate ports, and 100M for the phones. You can do the phone tagging in the managed switches.

If this is what's going on I even more STRONGLY recommend that you replace that switch with the managed one I mentioned. I have that switch and it works like a champ, well built, and the firmware is easy to use. With that in place, the switch will see your phones on their VLAN and the PBX on the same VLAN and just switch the traffic with the router completely out of the loop. This will also reduce CPU load on your router.

So @dlakelan's advice about managed switches certainly seems like a good idea!
I have a hunch though, that in your case you might be getting almost as much by cleverly distributing/partitioning your devices with the available switches (in the end VLANs just emulate physically separate L2 domains :wink: ).
So if you can collect all phones and the PBX on one switch shared with no other devices, and then connect that switch (say via a routed, tagged VLAN) to your router, you have effectively isolated the phones from the rest (unless the rest needs to talk to the phones). As long as the dedicated switch's internal bandwidth is higher than the combined traffic the phones can produce, I would expect zero packet loss (again, using prioritization on a shared switch will also be limited by the total bandwidth available to the phones' priority tier). That will not be as flexible as the bigger solution, but it might be easier to keep organized by simple behavioral rules (connect only phones to the phone switch, connect other devices to the other switches...)

Best Regards

Of course, you're right about VLANs emulating L2 switches. But in this case, he's got the computers hanging off a second port on the phones, so he'd have to rewire every computer to a new switch. Depending on the physical arrangement that could be problematic and costly. Eventually, under the "new" metro-ethernet situation (which I'm inferring is actually at a new office they'll occupy next year), setting up the wiring correctly would be worthwhile, and you can still use the managed switches there. But under the current situation, the options seem to be either rewiring a building you're about to abandon, or getting managed switches and using VLANs. The managed switches seem more drop-in than a rewire of all the computers.

Thinking about this some more, the only way it makes sense to me that the 50/50 shaper on the router would have any effect at all is if the desktop computers and wifi clients are going through the router to route to the file server, and the dropped packets are caused by long delays under high fileserving load. In that scenario the 50/50 shaper would be essentially reserving time on the switches for voice packets, and the cake qdisc is probably prioritizing the voice packets automatically.

If, on the other hand, you have 100Mbit switches and you let computers hit the fileserver hard, then you can get bufferbloat in the switching gear and delays that cause packets to arrive too late, and the voice quality suffers. This is especially true on the switching trunks between the office switches in operations or sales or warehouse and the main switch near the CentOS server. Along those links, if they're 100Mbit, a single desktop computer asking for a file from the CentOS server can saturate the link, and without managed switches and QoS your voice packets will drop out. If you're routing through the router to get to the CentOS switch, and you choke off the maximum speed to 50Mbit... then of course you're reserving time on the trunk lines for voice packets, but it's not a good solution comparatively: all the desktop computers will slow down, the router carries a lot of unnecessary CPU load, and so on.

Replacing all the switches with gigabit managed switches is the best idea. But even if that isn't what you do, just replacing the one up near the CentOS server with a gigabit managed switch, and the ones in operations/sales/warehouse with gigabit dumb switches, would relieve a lot of the switching pressure along those trunk lines.

The other thing though, is that the PBX and the phones should all be on the same IP network and VLAN, and the computers and fileserver should be on their own VLAN and IP network. It shouldn't be the case that either the phones or the desktop computers are routing through the router to get to their destination. If that's already the case, that they don't go through the router, then the fact that putting a 50/50 shaper on the router LAN interface had an effect on your VOIP is ... suspect. What the heck? How could that happen unless the packets are received by the router, and then re-transmitted through the queue?

I think this is the case ^^

Sorry for the month-late reply, I've been working on other projects. The switches are all gigabit, but as the PBX is old hardware it's a 10/100 network at the end of the day. The router is a WRT1200AC. I would prefer to drop the lot for a dedicated server, and your advice above is certainly the way I'm going, but we are waiting on Openreach to get our dedicated ethernet line installed; then we'll ditch the dual WANs. In the next 6 months the office is having a rewire, so I'm getting twin ethernet ports at each desk and I can separate PCs and phones physically.

I'll certainly use your advice for the firewall and squid proxy, though.

I've used squid proxies in gaming machines we built for Italian casinos at a company I worked for.

CentOS and Red Hat are my favourite Linux distros; all my web servers and application servers are built on them, but I use Debian-based distros on any desktop machines. I find keeping your feet wet in both pools helps a lot.

Thanks for all the insightful help.

Looking at your network diagram again, I see that the fileserver has two sets of IPs: a 192.168.1.0/24 address and a 10.0.0.0/24 address.

I also see that your desktop machines are getting DHCP in the range 10.0.0.0/24. If you want to avoid routing through the router when your desktops do filesharing, you need to ensure that your fileserver's address is advertised as the 10.0.0.0/24 version to the desktop machines. Whether that's through DNS internally or through some old school WINS or whatnot.

If the desktop machines are connecting to the 10.0.0.0/24 address then all the traffic is on the same subnet and should go straight through the switches. However, if even some of the desktop machines are connecting to the 192.168.1.0/24 address for the fileserver, then they will route through the router.
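For example, if internal DNS is served by dnsmasq on the router, one line pins the fileserver's name to its 10.0.0.0/24 address so the desktops never see the routed one (hostname and address here are illustrative):

```
# /etc/dnsmasq.conf fragment -- name and IP are hypothetical
address=/fileserver.lan/10.0.0.10
```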

Just making this change in DNS and/or how people map drives or whatever could relieve some of your congestion issue right away "for free".