Questions about SQM

hbr · December 27, 2016, 10:59am

Hello,

I'm currently running LEDE CAPRICORN 1.2 r2640-08db3e1 / LuCI Master (git-16.358.28306-df0d765) on a Linksys WRT3200ACM. Build runs pretty good and is from @cybrnook

I'm on a 120/6 cable connection behind a Cisco EPC3212.

I started fiddling with SQM a couple of days ago and currently settled on "cake" / "piece_of_cake". All other settings are on default.

When doing tests on the DSLReports speedtest I can see improvements in the test results. It usually goes from Cs or Ds to As or Bs. Most of the time I'm getting "A B A" with SQM enabled.

But when I enable ingress shaping the results usually get way worse. Sometimes all the way down to F for bufferbloat.

I tried a lot of different combinations and values for ingress / egress but it didn't matter much. As soon as ingress shaping has any other value than 0 the results get worse.

So I just tested with "ingress shaping off & egress shaping on" for a while and keep getting good results on the test with values between 80 - 95% of my maximum upload speed.

Today I also disabled egress shaping and am still getting the same grades and good results (still "A B A" most of the time, now with "ingress shaping off & egress shaping off").

Rebooting the router after changing settings in SQM didn't seem to make a difference. Neither does changing queuing disciplines or scripts (similar good / bad results on the ones I tried).

So now I'm confused and have a couple of questions:

Any ideas why ingress shaping could lead to way worse results (and how to change / avoid that)?
Any idea why setting ingress / egress shaping to 0 still gets me good results?
Is SQM even in effect (much) when those are disabled?

fsclavo · December 27, 2016, 6:33pm

Cable networks have a shared bus network topology, and can have some weird (congestion related) behavior if CMTS is overloaded.
I have a SQM (fq_codel) enabled router and sometimes obtain an "A" or "A+" from DSLReports, and sometimes an "F", no matter if shaping is set on 90% or 60% of ingress/egress rate.

anomeome · December 27, 2016, 6:47pm

Something to check would be the NIC that is defaulted for WAN OOTB; sqm will probably choose the WAN side incorrectly on every device in the series except mamba, but this is build dependent, see PR.

richb-hanover · December 27, 2016, 8:33pm

If you're still having questions, I'd ask you to review the questions in Debugging SQM to Eliminate Bufferbloat and paste the answers into this thread. Thanks.

hbr · December 27, 2016, 10:12pm

Thanks for the replies so far,

here goes:

What brand/model router do you have?
- Linksys WRT3200ACM
LEDE Version
Community build LEDE CAPRICORN 1.2 r2640-08db3e1 / LuCI Master (git-16.358.28306-df0d765)
How do you connect to the internet? Cable? DSL, other?
Cable
What's your nominal/expected/advertised download speed? Upload speed?
120Mbit down / 6Mbit up
If you turn off all QoS/management, what are your measured download/upload speeds?
110.2Mbit down / 5.08Mbit up from a test run at dslreports at 11:41:10 this morning.
What is the WAN "interface name" in the Network -> Interfaces page?
eth0
What parameters do you see in the Network -> SQM-QoS values?
Enable this SQM instance: Is checked
Interface name: eth0
Download Speed (in kbps)
- Right now 0, setting it to anything else usually results in worse values for bufferbloat compared to SQM "off" or just setting a value for egress
Upload Speed (in kbps)
- Right now 0, values from 4800 to 5700 also seemed to work well
Queue discipline: cake / piece_of_cake
Set "Which Link Layer to account for:" none

SQM off:
http://www.dslreports.com/speedtest/7925700

SQM on (iirc this was with 4800 or 5100 egress):
http://www.dslreports.com/speedtest/7925535

moeller0 · December 27, 2016, 10:38pm

Mmmh, I am uncertain about the actual topology of that router's WAN access port. Looking at https://wiki.openwrt.org/toh/linksys/wrt_ac_series it seems like all ports go through a switch. Could you try to reduce the rates to 50% of what you measure without sqm and retest? Also, could you please post the output of the following commands on your router's command line:

cat /etc/config/sqm
tc -d qdisc
/etc/init.d/sqm stop ; SQM_VERBOSITY_MAX=8 /etc/init.d/sqm start

This all seems a bit suboptimal, but I have no real idea what is going wrong. It could be buffering in the switch (which I believe was the initial bufferbloat phenotype) or sqm simply not issuing the right commands, so the diagnostics I asked for might help in figuring out which avenue to follow.

Best Regards

hbr · December 27, 2016, 10:49pm

Here's the cmd line output. Will do the 50% tests tomorrow morning, connection is too busy in the evening.

root@hbr-router-wrt:~# cat /etc/config/sqm
config queue 'eth1'
    option qdisc_advanced '0'
    option linklayer 'none'
    option interface 'eth0'
    option verbosity '5'
    option download '0'
    option qdisc 'cake'
    option script 'piece_of_cake.qos'
    option enabled '1'
    option debug_logging '1'
    option upload '0'

root@hbr-router-wrt:~# tc -d qdisc
qdisc noqueue 0: dev lo root refcnt 2
qdisc mq 0: dev eth0 root
qdisc fq_codel 0: dev eth0 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev eth0 parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev eth0 parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev eth0 parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev eth0 parent :5 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev eth0 parent :6 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev eth0 parent :7 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev eth0 parent :8 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc mq 0: dev eth1 root
qdisc fq_codel 0: dev eth1 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev eth1 parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev eth1 parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev eth1 parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev eth1 parent :5 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev eth1 parent :6 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev eth1 parent :7 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev eth1 parent :8 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc noqueue 0: dev br-lan root refcnt 2
qdisc mq 0: dev wlan1 root
qdisc fq_codel 0: dev wlan1 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev wlan1 parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev wlan1 parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev wlan1 parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc mq 0: dev wlan1-1 root
qdisc fq_codel 0: dev wlan1-1 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev wlan1-1 parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev wlan1-1 parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev wlan1-1 parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc mq 0: dev wlan1-2 root
qdisc fq_codel 0: dev wlan1-2 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev wlan1-2 parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev wlan1-2 parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
qdisc fq_codel 0: dev wlan1-2 parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn

root@hbr-router-wrt:~# /etc/init.d/sqm stop ; SQM_VERBOSITY_MAX=8 /etc/init.d/sqm start
SQM: Stopping SQM on eth0
SQM: Starting SQM script: piece_of_cake.qos on eth0, in: 0 Kbps, out: 0 Kbps
SQM: QDISC cake is useable.
SQM: Starting piece_of_cake.qos
SQM: ifb associated with interface eth0:
SQM: Currently no ifb is associated with eth0, this is normal during starting of the sqm system.
SQM: egress shaping deactivated
SQM: ingress shaping deactivated
SQM: piece_of_cake.qos was started on eth0 successfully
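
With both rates set to 0 the start-up log above reports egress and ingress shaping as deactivated, and the `tc -d qdisc` listing shows only the default mq / fq_codel qdiscs. A minimal Python sketch of how such a listing could be checked for an active shaper follows. The helper names and the set of "shaper" qdisc kinds (cake, htb, hfsc) are assumptions made for illustration, not part of sqm-scripts.

```python
# Sketch: group the qdisc kinds reported by `tc -d qdisc` per device and
# check whether a shaping qdisc is present on a given interface.

# Assumption: these are the kinds an SQM script would install when shaping.
SHAPER_KINDS = {"cake", "htb", "hfsc"}


def qdiscs_by_device(tc_output):
    """Map each device name to the list of qdisc kinds attached to it."""
    result = {}
    for line in tc_output.splitlines():
        fields = line.split()
        # Expected shape: qdisc <kind> <handle> dev <device> ...
        if len(fields) < 5 or fields[0] != "qdisc" or "dev" not in fields:
            continue
        dev_pos = fields.index("dev")
        if dev_pos + 1 >= len(fields):
            continue
        device = fields[dev_pos + 1]
        result.setdefault(device, []).append(fields[1])
    return result


def is_shaped(tc_output, device):
    """Return True if any assumed shaper kind is attached to the device."""
    kinds = qdiscs_by_device(tc_output).get(device, [])
    return any(kind in SHAPER_KINDS for kind in kinds)


# Three lines copied from the listing above.
sample = """\
qdisc noqueue 0: dev lo root refcnt 2
qdisc mq 0: dev eth0 root
qdisc fq_codel 0: dev eth0 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
"""

print(is_shaped(sample, "eth0"))  # False: only mq and fq_codel are attached
```

If this reading is right, the listing above shows the stock queueing setup rather than anything SQM configured, which would fit the "shaping deactivated" log lines.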

hbr · December 28, 2016, 9:30am

And here are the new test results (all done with cake / piece_of_cake).

SQM at 80 - 85% egress only (0 ingress, 4800 - 5100 egress) still seems to be the winner overall.

SQM "off":
http://www.dslreports.com/speedtest/7980589

SQM at 50% for ingress & egress:
http://www.dslreports.com/speedtest/7980622

SQM at 80% for ingress & egress:
http://www.dslreports.com/speedtest/7980665

SQM at 80% egress only:
http://www.dslreports.com/speedtest/7980868

SQM at 85% egress only:
http://www.dslreports.com/speedtest/7980728
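
For reference, a small Python sketch of how the percentages above translate into the kbps values entered in SQM. It assumes the percentages are taken from the nominal 120/6 Mbit rates, which matches 4800 and 5100 kbps for 80% and 85% egress; the function name is made up for illustration.

```python
# Sketch: convert a percentage of the nominal line rate into a kbps value
# for the SQM download/upload fields.

NOMINAL_DOWN_MBIT = 120  # nominal download rate from this thread
NOMINAL_UP_MBIT = 6      # nominal upload rate from this thread


def shaper_rate_kbps(nominal_mbit, percent):
    """Return the shaper rate in kbps for a given percentage of the nominal rate."""
    return round(nominal_mbit * 1000 * percent / 100)


for percent in (50, 80, 85):
    down = shaper_rate_kbps(NOMINAL_DOWN_MBIT, percent)
    up = shaper_rate_kbps(NOMINAL_UP_MBIT, percent)
    print(f"{percent}%: download {down} kbps, upload {up} kbps")
```

Using the measured rates (110.2 / 5.08 Mbit) instead of the nominal ones would give somewhat lower values for the same percentages.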

hbr · December 28, 2016, 10:19am

Just in case it matters:

There is a second switch (Cisco SG 100D-08, unmanaged) connected to the router, it goes:

Wired devices -> Switch -> Router

moeller0 · December 28, 2016, 11:48pm

Okay, so the cmd output was with both ingress and egress shaping deactivated, so everything is as expected, but it is also not diagnostic of anything. Could you please redo this with at least ingress shaping activated? Also please add:

tc -d class show dev eth0
tc -s class show dev eth0
tc -d class show dev ifb4eth0
tc -s class show dev ifb4eth0

with both ingress and egress shapers active, right after running a speedtest.

The test results show that your egress is over-buffered and that shaping does bring a noticeable improvement. You might want to set the link layer accounting to ethernet and specify 4 bytes of additional overhead: on eth0 the Linux kernel will already silently account for 14 bytes of overhead, so you only need to add the missing 4 to reach the 18 that DOCSIS systems seem to require. I would even try, after setting the proper overhead, to set the egress shaper at 100%.
The ingress shaper is more of a concern; I would guess that your cable segment might be quite full (as the tests without ingress shaping give pretty variable/hideous results). In that case sqm would be off the hook, as with congestion our ingress shaping simply is at the mercy of the CMTS. But I agree that the no-ingress shaping results have more samples with acceptable latencies than the ingress shaping ones, so unfortunately sqm-scripts might still be involved...
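
To make the overhead arithmetic above concrete, here is a rough Python sketch. The 18 and 14 byte figures are the ones given in this post; the function names and the example packet sizes are only illustrative assumptions.

```python
# Sketch: per-packet overhead accounting for a DOCSIS link as described above.

DOCSIS_OVERHEAD_BYTES = 18   # overhead DOCSIS systems seem to require (per this post)
KERNEL_ACCOUNTED_BYTES = 14  # Ethernet header the kernel already counts on eth0


def extra_overhead_bytes():
    """Bytes still missing after the kernel's own accounting."""
    return DOCSIS_OVERHEAD_BYTES - KERNEL_ACCOUNTED_BYTES


def accounted_frame_bytes(ip_packet_bytes):
    """Size the shaper should charge for one packet: IP packet plus 18 bytes."""
    return ip_packet_bytes + DOCSIS_OVERHEAD_BYTES


def overhead_share(ip_packet_bytes):
    """Fraction of the charged size that is overhead rather than IP payload."""
    return DOCSIS_OVERHEAD_BYTES / accounted_frame_bytes(ip_packet_bytes)


print("additional overhead to configure:", extra_overhead_bytes())
# Example IP packet sizes chosen for illustration only.
for size in (40, 576, 1500):
    print(size, accounted_frame_bytes(size), f"{overhead_share(size):.1%}")
```

The share of overhead grows as packets get smaller, which is why the missing 4 bytes matter most for traffic dominated by small packets.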

Final question: have you tried simple.qos with fq_codel as qdisc for ingress as well, and could you post a link to a dslreports speedtest, please? (Also, you could try to activate the high resolution bufferbloat tests and up the test duration to 30 seconds for both directions to get more data quicker.)

Best Regards

hbr · December 29, 2016, 9:40am

Thanks for the tips.

And yes, sadly my segment is pretty full. So picking the holidays to test something like this was not the best idea.

Will repeat the tests with your suggestions around next week when things should start to calm down again.

I tried fq_codel / simple in the beginning but cake / piece_of_cake seemed to get fewer spikes during the tests. Will stick to fq_codel / simple this time (seems to be more mature according to some reading I did so far).

richb-hanover · December 29, 2016, 1:33pm

Just to confirm: Is this the topology?

Wired devices -> Switch -> Router -> Cisco EPC3212 -> your cable ISP

moeller0 · December 29, 2016, 4:22pm

Oh, I do not want to imply cake might only be half baked, but rather testing htb+fq_codel at least once for comparison seems like a good thing to do. My hypothesis is that it behaves similarly to cake, but real data would be nice.

Best Regards

hbr · December 29, 2016, 6:54pm

@richb-hanover yeah that's right.

Never had trouble with the cables but just for completeness:

Ethernet cables used are labeled CAT7 but are most likely just CAT6 with the highest grade of shielding since there's still (shielded) RJ45 plugs on them (and iirc real CAT7 cables don't have RJ45). The COAX cable is also one with a higher grade of shielding.

Topology with cable length:

Wired devices (3m and shorter) -> Switch (3m) -> Router (15m) -> Cisco EPC3212 (50cm) -> Cable stuff (outlet, amp etc) -> ISP

MattBroekemeier · December 29, 2016, 6:58pm

Should 4 bytes be added to every cable connection? I thought it only applied to DSL connections. Is it always 4 bytes or should I run the ATM detector script to find the correct value?

moeller0 · December 29, 2016, 8:52pm

Hi Matt,

well, yes and no. The shaper used in DOCSIS systems that limits a user's maximal bandwidth completely ignores DOCSIS overhead and only includes ethernet frames including their frame check sequence (FCS, 4 bytes). (The Linux kernel accounts for ethernet framing without the FCS.)

To cite the relevant section from the DOCSIS standard (http://www.cablelabs.com/specification/docsis-3-0-mac-and-upper-layer-protocols-interface-specification/):

"C.2.2.7.2 Maximum Sustained Traffic Rate 632 This parameter is the rate parameter R of a token-bucket-based rate limit for packets. R is expressed in bits per second, and MUST take into account all MAC frame data PDU of the Service Flow from the byte following the MAC header HCS to the end of the CRC, including every PDU in the case of a Concatenated MAC Frame. This parameter is applied after Payload Header Suppression; it does not include the bytes suppressed for PHS. The number of bytes forwarded (in bytes) is limited during any time interval T by Max(T), as described in the expression: Max(T) = T * (R / 8) + B, (1) where the parameter B (in bytes) is the Maximum Traffic Burst Configuration Setting (refer to Annex C.2.2.7.3). NOTE: This parameter does not limit the instantaneous rate of the Service Flow. The specific algorithm for enforcing this parameter is not mandated here. Any implementation which satisfies the above equation is conformant. In particular, the granularity of enforcement and the minimum implemented value of this parameter are vendor specific. The CMTS SHOULD support a granularity of at most 100 kbps. The CM SHOULD support a granularity of at most 100 kbps. NOTE: If this parameter is omitted or set to zero, then there is no explicitly-enforced traffic rate maximum. This field specifies only a bound, not a guarantee that this rate is available."

So in essence DOCSIS users need to (only) account for 18 bytes of ethernet overhead in both ingress and egress directions under non-congested conditions. But since on an ethN interface the Linux kernel already accounts for 14 of those, for fq_codel+HTB specify the overhead as 4. For recent cake you can and should specify the overhead as 18, as cake can undo the kernel's automatic overhead addition.
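
The token-bucket bound in the quoted section, Max(T) = T * (R / 8) + B, can be tried out with a few numbers. The Python sketch below uses the 6 Mbit/s upload rate from this thread for R; the burst size B is an assumed example value, not something reported here, and the function name is made up.

```python
# Sketch: evaluate the quoted bound Max(T) = T * (R / 8) + B for a few intervals.


def max_bytes_forwarded(interval_s, rate_bps, burst_bytes):
    """Upper bound on bytes forwarded during an interval of interval_s seconds."""
    return interval_s * (rate_bps / 8) + burst_bytes


RATE_BPS = 6_000_000  # R: the 6 Mbit/s upload rate mentioned in this thread
BURST_BYTES = 3044    # B: assumed example burst size, not taken from this thread

for interval_s in (0.01, 0.1, 1.0):
    limit = max_bytes_forwarded(interval_s, RATE_BPS, BURST_BYTES)
    average_mbit = limit * 8 / interval_s / 1e6
    print(f"T={interval_s}s: up to {limit:.0f} bytes "
          f"({average_mbit:.2f} Mbit/s average over the interval)")
```

Over short intervals the burst term B dominates, so the bound allows an average rate above R, in line with the note that the parameter does not limit the instantaneous rate.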

Best Regards

hbr · December 30, 2016, 11:48am

Found a good slot to do the tests earlier.

Overall results didn't change. Setting "linklayer" also did not make a difference. Still getting best results with ingress set to 0.

Changes for this test run:

Switched to "LEDE Reboot SNAPSHOT r2701-c5ca304 / LuCI Master (git-16.363.68908-f12fdba)" with minimal extra packages.
Used "fq_codel / simple" for all tests.
Values for ingress / egress range from 80% to 95%.
Enabled "Hi-Res BufferBloat" on the speed test and set download / upload times to 30 seconds.

Did the tests as per suggestions from @moeller0

Since it's a lot of text, I put the results on pastebin (including the links to the speed test results). They will expire in 2 weeks.

Speed test result with SQM off & cmd line output just after enabling SQM:

http://pastebin.com/cTKFUMnN

Ingress & egress on & link layer ethernet & cmd line output just after running a speed test:

Ingress & egress on & link layer none & cmd line output just after running a speed test:

Egress only & link layer none & cmd line output just after running a speed test:

Egress only & link layer ethernet & cmd line output just after running a speed test:

hbr · December 30, 2016, 11:54pm

Also just found these in the kernel log:

http://pastebin.com/3RXwFyAG

I guess they are from SQM_VERBOSITY_MAX=8? Hadn't noticed before tho.

moeller0 · December 31, 2016, 12:34am

Hi hbr,

thanks for the data. It will take a few days before I find time to look over it closely, as I did not find any smoking gun on my first reading (and I might not find one on a closer reading either). The error messages you got in the log are an already known issue with the hfsc kernel module used by some of the qos scripts (hfsc_lite.qos, hfsc_litest.qos, and nxt_routed_hfsc.qos); if you tried those, that might have triggered the message. But we also load the hfsc module during sqm start-up to have it available for the listed hfsc-using scripts, since auto-loading of modules is not reliable on all "supported" distributions.

Best Regards

moeller0 · January 13, 2017, 9:43pm

Okay, I have looked closer into your files and I am sorry to say I made you do all these tests for no gain. I have no real idea why you seem to be better off with only upstream shaping. I am also out of realistic ideas for what to test next...

Best Regards