SQM - BB vs LEDE - major diff in performance

So here are the results with a C7v2 on a 250 Mbps cable line.

Interestingly, the C7v2 gets more throughput than the 3600 with SQM off but less throughput than the 3600 with SQM on.

30 streams for 20 seconds to load the line

239 Mbps Total with QoS off
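For reference, the load was generated roughly like this (a sketch only: the server hostname and exact netperf options here are stand-ins, not the literal commands from the test):

# launch 30 parallel 20-second download (TCP_MAERTS) streams and wait for them all
for i in $(seq 1 30); do
    netperf -H netperf.bufferbloat.net -t TCP_MAERTS -l 20 &
done
wait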

/etc/config/sqm

config queue 'eth1'
    option qdisc 'fq_codel'
    option linklayer 'none'
    option qdisc_advanced '1'
    option squash_dscp '1'
    option squash_ingress '1'
    option ingress_ecn 'ECN'
    option egress_ecn 'NOECN'
    option etarget 'auto'
    option script 'simple.qos'
    option interface 'eth0'
    option upload '10000'
    option enabled '1'
    option download '239000'
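The same settings can also be dumped non-interactively if you prefer:

uci show sqm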

stop/start SQM
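That is, via the SQM init script:

/etc/init.d/sqm stop
/etc/init.d/sqm start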

tc -s qdisc

qdisc noqueue 0: dev lo root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc htb 1: dev eth0 root refcnt 2 r2q 10 default 12 direct_packets_stat 0 direct_qlen 1000
Sent 1074 bytes 12 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc fq_codel 110: dev eth0 parent 1:11 limit 1001p flows 1024 quantum 300 target 5.0ms interval 100.0ms
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc fq_codel 120: dev eth0 parent 1:12 limit 1001p flows 1024 quantum 300 target 5.0ms interval 100.0ms
Sent 1074 bytes 12 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
maxpacket 159 drop_overlimit 0 new_flow_count 8 ecn_mark 0
new_flows_len 0 old_flows_len 1
qdisc fq_codel 130: dev eth0 parent 1:13 limit 1001p flows 1024 quantum 300 target 5.0ms interval 100.0ms
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc ingress ffff: dev eth0 parent ffff:fff1 ----------------
Sent 656 bytes 8 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc fq_codel 0: dev eth1 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
Sent 1390126 bytes 3154 pkt (dropped 0, overlimits 0 requeues 3)
backlog 0b 0p requeues 3
maxpacket 542 drop_overlimit 0 new_flow_count 7 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev br-lan root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev wlan1 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc htb 1: dev ifb4eth0 root refcnt 2 r2q 10 default 10 direct_packets_stat 0 direct_qlen 32
Sent 878 bytes 10 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc fq_codel 110: dev ifb4eth0 parent 1:10 limit 1001p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
Sent 878 bytes 10 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
maxpacket 145 drop_overlimit 0 new_flow_count 7 ecn_mark 0
new_flows_len 1 old_flows_len 0

30 streams for 20 seconds to load the line

98 Mbps Total with QoS on

tc stats after the test with SQM on:

tc -s qdisc

qdisc noqueue 0: dev lo root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc htb 1: dev eth0 root refcnt 2 r2q 10 default 12 direct_packets_stat 0 direct_qlen 1000
Sent 6939184 bytes 97579 pkt (dropped 0, overlimits 289 requeues 0)
backlog 0b 0p requeues 0
qdisc fq_codel 110: dev eth0 parent 1:11 limit 1001p flows 1024 quantum 300 target 5.0ms interval 100.0ms
Sent 1865 bytes 19 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
maxpacket 157 drop_overlimit 0 new_flow_count 19 ecn_mark 0
new_flows_len 1 old_flows_len 0
qdisc fq_codel 120: dev eth0 parent 1:12 limit 1001p flows 1024 quantum 300 target 5.0ms interval 100.0ms
Sent 6937319 bytes 97560 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
maxpacket 1514 drop_overlimit 0 new_flow_count 55729 ecn_mark 0
new_flows_len 0 old_flows_len 1
qdisc fq_codel 130: dev eth0 parent 1:13 limit 1001p flows 1024 quantum 300 target 5.0ms interval 100.0ms
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc ingress ffff: dev eth0 parent ffff:fff1 ----------------
Sent 261763953 bytes 175045 pkt (dropped 1, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc fq_codel 0: dev eth1 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
Sent 1526595 bytes 3439 pkt (dropped 0, overlimits 0 requeues 3)
backlog 0b 0p requeues 3
maxpacket 542 drop_overlimit 0 new_flow_count 7 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev br-lan root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev wlan1 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc htb 1: dev ifb4eth0 root refcnt 2 r2q 10 default 10 direct_packets_stat 0 direct_qlen 32
Sent 261672753 bytes 173367 pkt (dropped 1680, overlimits 122368 requeues 0)
backlog 0b 0p requeues 0
qdisc fq_codel 110: dev ifb4eth0 parent 1:10 limit 1001p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
Sent 261672753 bytes 173367 pkt (dropped 1680, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
maxpacket 1514 drop_overlimit 647 new_flow_count 36204 ecn_mark 0
new_flows_len 0 old_flows_len 2

tc -d qdisc

qdisc noqueue 0: dev lo root refcnt 2
qdisc htb 1: dev eth0 root refcnt 2 r2q 10 default 12 direct_packets_stat 0 ver 3.17 direct_qlen 1000
qdisc fq_codel 110: dev eth0 parent 1:11 limit 1001p flows 1024 quantum 300 target 5.0ms interval 100.0ms
qdisc fq_codel 120: dev eth0 parent 1:12 limit 1001p flows 1024 quantum 300 target 5.0ms interval 100.0ms
qdisc fq_codel 130: dev eth0 parent 1:13 limit 1001p flows 1024 quantum 300 target 5.0ms interval 100.0ms
qdisc ingress ffff: dev eth0 parent ffff:fff1 ----------------
qdisc fq_codel 0: dev eth1 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc noqueue 0: dev br-lan root refcnt 2
qdisc noqueue 0: dev wlan1 root refcnt 2
qdisc htb 1: dev ifb4eth0 root refcnt 2 r2q 10 default 10 direct_packets_stat 0 ver 3.17 direct_qlen 32
qdisc fq_codel 110: dev ifb4eth0 parent 1:10 limit 1001p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn

So it looks like there is a shelf limit on throughput with QoS on in LEDE, and we can't load the line with QoS set to line speed on high-speed connections. If it were due to processing power, you would expect the C7v2 to get more speed than the 3600. So it looks like SQM is doing something that prevents loading the line on high-speed connections. If I drop QoS below 100 Mbps, the netperf load tests get within 10% of the QoS setting, as expected. As I raise the setting above 100 Mbps, I see very little upside in actual throughput.

So, setting QoS to 50 Mbps download I get 45 Mbps throughput.

Going to QoS of 100 Mbps download I get 85 Mbps throughput.

From there I get very little improvement as I increase QoS.

QoS of 125 Mbps only improves to 89 Mbps throughput.

QoS of 200 Mbps only improves to 94 Mbps throughput.
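Each step just changed the download rate and restarted SQM, something like this (assuming the section keeps the name 'eth1' from the config above; rates are in kbit/s):

# e.g. set the shaper to 125 Mbps download
uci set sqm.eth1.download=125000
uci commit sqm
/etc/init.d/sqm restart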

It does not look like it runs out of CPU or sirq capacity either; something else limits the flow above a certain point. Could it be because we are running the netperf process on the router itself and SQM limits locally sourced traffic?
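For anyone who wants to check this on their own router, watching BusyBox top during a run is enough to spot an obvious CPU or sirq wall:

# refresh every second during the load test and watch the sirq percentage in the CPU line
top -d 1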