SQM Optimal Settings For My DSL

moeller0 · July 31, 2019, 4:21pm

Looking at the speedtest-netperf.sh results, I actually see less latency with ATM(40) than with None(0), and also less goodput; both are signs of ATM(40) working as intended. So your problem with ATM(40) does not seem to be general, but rather a sporadic issue... too bad these are the least fun to debug...

Regarding the WinMTR result, all intermediate hops show 100% packet loss (indicating that they actively drop the mtr packets, probably out of a misunderstanding of the security issues of responding to these probes), which unfortunately makes this trace not very useful for this purpose. Try mtr against a more remote host (like 77.190.170.40, which will only be responsive for a few more hours before it is going to be recycled), and look at, say, 20 mtr probes before starting the speedtest, so you get a feel for the packet loss and latency numbers without loading your link.

code403 · July 31, 2019, 5:11pm

77.190.170.40 is not responding so probably got recycled already like you mentioned

with ATM overhead im only getting like 60% of my bandwidth though, its synced at 4096/1024 right now, speedtest sqm off shows 3.3mbit down/800kbit up, what do you think would be the optimal values ? or should i trial and error that

code403 · July 31, 2019, 5:23pm

i don't know how my last reply got marked as a solution, probably a misclick

moeller0 · July 31, 2019, 5:36pm

Interesting, this is my current IPv4 address and I seem to be able to ping it; dslreports' smokeping and line quality test also seem to be able to reach it. I guess somewhere between us an aggressive filter is in place. You might have more luck with using either UDP, TCP, or ICMP probes; winmtr, I believe, uses ICMP packets by default...

From these SQM-off goodput numbers I predict the following shaper bandwidth numbers:

3.3 / ((1474-14-20-10)/(ceil((1474-14+40) / 48 ) * 53)) = 3.91 Mbps
0.8 / ((1474-14-20-10)/(ceil((1474-14+40) / 48 ) * 53)) = 0.95 Mbps

to result in similar bufferbloat numbers. Realistically I would lower these numbers by 5-10% to get some headroom.
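The arithmetic above can be checked with a short Python sketch. The function names and parameters are made up for illustration and are not part of sqm-scripts; the sketch assumes ATM/AAL5 framing, 40 bytes of per-packet overhead, and the header bytes exactly as subtracted in the formula as written (20 + 10).

```python
import math

ATM_CELL_PAYLOAD = 48  # payload bytes per ATM cell
ATM_CELL_SIZE = 53     # 48 bytes payload + 5 bytes cell header


def atm_goodput_fraction(ip_size, header_bytes, overhead):
    """Fraction of the shaper rate that ends up as TCP payload."""
    # AAL5 always sends whole cells, hence the ceil().
    cells = math.ceil((ip_size + overhead) / ATM_CELL_PAYLOAD)
    return (ip_size - header_bytes) / (cells * ATM_CELL_SIZE)


def shaper_rate_for_goodput(goodput, ip_size=1474 - 14,
                            header_bytes=20 + 10, overhead=40):
    """Shaper rate (same unit as goodput) expected to yield that goodput."""
    return goodput / atm_goodput_fraction(ip_size, header_bytes, overhead)


down = shaper_rate_for_goodput(3.3)  # measured SQM-off download, Mbps
up = shaper_rate_for_goodput(0.8)    # measured SQM-off upload, Mbps
print(f"predicted shaper rates: {down:.2f} / {up:.2f} Mbps")
# Headroom as suggested in the post; 10% is just one value from the 5-10% range.
print(f"with 10% headroom:      {down * 0.9:.2f} / {up * 0.9:.2f} Mbps")
```

The first line printed should agree with the 3.91 and 0.95 Mbps quoted above.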

Yes! That generally is the best approach IMHO. Except in your case I have a hunch that your ISP might play funny games and/or have congested uplinks (either of the DSLAM or the BRAS/BNG), and it is really hard to detect and correct this from a shaper in your home network. If you have money to burn, you might want to look at evenroute's iqrouter, which is OpenWrt-based and offers to automatically track variable rate internet connections by repeatedly measuring actual goodput and adjusting the shaper to match. (Caveat: while impressed by that product and the company, I have not tried this myself, and I also believe this will only work if the link's bandwidth/achievable goodput changes relatively slowly/infrequently, so it is no solution for an LTE/GSM/UMTS link.)

moeller0 · July 31, 2019, 5:37pm

You might be able to uncheck this box somehow, or ask the admins; otherwise the thread will auto-close after 10 days of inactivity...

code403 · July 31, 2019, 5:52pm

i seem to be able to ping it too, but its not working with winmtr, how can i try different probes ?

so 3500/800 or even 3400/700 for more headroom with 40 bytes overhead should be fine i assume ? also can you please explain the used values (1474, 14, 20, 10 ..etc)?

don't think i can do that

yea i already did that

moeller0 · July 31, 2019, 7:03pm

Actually I do not know, but you should be able to install mtr on your pi:
opkg update ; opkg install mtr
and then run mtr -z -b 77.180.5.157 (my dsl link just reconnected several times in a row and every time I get a new IP address), with mtr -h it tells me:

root@router:~# mtr -h

Usage:
 mtr [options] hostname

 -F, --filename FILE        read hostname(s) from a file
 -4                         use IPv4 only
 -6                         use IPv6 only
 -u, --udp                  use UDP instead of ICMP echo
 -T, --tcp                  use TCP instead of ICMP echo
 -a, --address ADDRESS      bind the outgoing socket to ADDRESS
 -f, --first-ttl NUMBER     set what TTL to start
 -m, --max-ttl NUMBER       maximum number of hops
 -U, --max-unknown NUMBER   maximum unknown host
 -P, --port PORT            target port number for TCP, SCTP, or UDP
 -L, --localport LOCALPORT  source port number for UDP
 -s, --psize PACKETSIZE     set the packet size used for probing
 -B, --bitpattern NUMBER    set bit pattern to use in payload
 -i, --interval SECONDS     ICMP echo request interval
 -G, --gracetime SECONDS    number of seconds to wait for responses
 -Q, --tos NUMBER           type of service field in IP header
 -e, --mpls                 display information from ICMP extensions
 -Z, --timeout SECONDS      seconds to keep probe sockets open
 -M, --mark MARK            mark each sent packet
 -r, --report               output using report mode
 -w, --report-wide          output wide report
 -c, --report-cycles COUNT  set the number of pings sent
 -j, --json                 output json
 -x, --xml                  output xml
 -C, --csv                  output comma separated values
 -l, --raw                  output raw format
 -p, --split                split output
 -t, --curses               use curses terminal interface
     --displaymode MODE     select initial display mode
 -n, --no-dns               do not resove host names
 -b, --show-ips             show IP numbers and host names
 -o, --order FIELDS         select output fields
 -y, --ipinfo NUMBER        select IP information in output
 -z, --aslookup             display AS number
 -h, --help                 display this help and exit
 -v, --version              output version information and exit

so, -u or -T seem interesting. But it seems that TE simply filters a lot of things it should not filter.
I would not be surprised if your ping spikes are caused by your ISP's filtering equipment...
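To compare the probe types side by side, one option is to generate the three command lines and run them one after the other. The following Python sketch only builds and prints the commands; it uses flags listed in the `mtr -h` output above (`-r`, `-c`, `-z`, `-b`, `-u`, `-T`, `-P`), while the helper name `mtr_command`, the cycle count and the example port are assumptions for illustration.

```python
# Probe type -> extra mtr flag, as listed in the `mtr -h` output above.
PROBE_FLAGS = {"icmp": [], "udp": ["-u"], "tcp": ["-T"]}


def mtr_command(host, probe="icmp", cycles=20, port=None):
    """Build an mtr report-mode command line for the given probe type."""
    cmd = ["mtr", "-r", "-c", str(cycles), "-z", "-b"] + PROBE_FLAGS[probe]
    # -P only applies to TCP/UDP (and SCTP) probes, not to ICMP.
    if port is not None and probe != "icmp":
        cmd += ["-P", str(port)]
    return cmd + [host]


for probe in ("icmp", "udp", "tcp"):
    print(" ".join(mtr_command("77.180.5.157", probe)))
```

If one probe type gets through where the others show 100% loss, that points at filtering of specific protocols rather than at real packet loss.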

I assume TE from the dns-servers, in case you wonder...:

                                                                                           Packets               Pings
 Host                                                                                    Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. AS???    nacktmulle.lan (192.168.1.1)                                                 0.0%   267    1.5   2.0   0.8  74.6   5.9
 2. AS6805   loopback1.0002.acln.02.fra.de.net.telefonica.de (62.52.201.193)              0.0%   267   49.2  20.2  11.8 282.1  23.6
 3. AS6805   bundle-ether29.0002.dbrx.02.fra.de.net.telefonica.de (62.53.0.64)            0.0%   267   12.9  13.3  12.0 141.9   8.3
 4. AS6805   bundle-ether2.0004.prrx.02.fra.de.net.telefonica.de (62.53.8.189)            0.4%   267   12.4  13.4  12.1 119.2   7.8
 5. AS12956  176.52.252.28 (176.52.252.28)                                                0.0%   267   13.0  13.7  12.2  83.5   5.5
 6. AS174    be12956.agr41.fra03.atlas.cogentco.com (130.117.14.117)                      0.0%   267   13.2  13.4  12.5  42.5   1.9
 7. AS174    be3187.ccr42.fra03.atlas.cogentco.com (130.117.1.118)                        0.0%   267   13.5  14.9  12.6 137.2  13.2
 8. AS174    be2960.ccr22.muc03.atlas.cogentco.com (154.54.36.254)                        0.4%   267   19.0  19.5  18.3  83.6   5.5
 9. AS174    be3073.ccr52.zrh02.atlas.cogentco.com (130.117.0.61)                         0.0%   267   23.5  24.9  23.5 147.5  10.3
10. AS174    be3081.ccr22.mrs01.atlas.cogentco.com (130.117.49.113)                       0.0%   267   37.0  37.6  35.5 167.3  11.6
11. AS174    telecom-egypt.demarc.cogentco.com (149.14.125.170)                           0.0%   267   71.7  75.2  70.3 210.8  13.7
12. ???
13. ???
14. ???
15. AS8452   dns-cache.tedata.net (163.121.128.134)                                       0.0%   266   78.8  81.2  78.3 251.0  17.2

code403 · July 31, 2019, 8:05pm

here are the results while a download was running on a different device

sqm 3400/700, here's a concurrent test:
Screenshot_6
another run:
Screenshot_7

3.3 / ((1474-14-20-10)/(ceil((1474-14+40) / 48 ) * 53)) = 3.91 Mbps
0.8 / ((1474-14-20-10)/(ceil((1474-14+40) / 48 ) * 53)) = 0.95 Mbps

can you please explain the formula values ? like what is 53, 48, 14, 1474, 20, 10 ?

moeller0 · July 31, 2019, 9:00pm

Okay, comparing Best versus Worst, I see "ping_spikes" of ~11ms already to your modem, and in the 20 - 100ms range to all other hosts. This weakly indicates that part of your observed ping spikes might be caused by network elements outside of your home network. Then again, without seeing the last hop, all of this might simply be caused by deprioritization of the ICMP probes by intermediate network elements...
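The Best-versus-Worst comparison can be automated for a text report. Below is a minimal Python sketch, assuming the column layout of the mtr trace posted earlier in this thread (Loss%, Snt, Last, Avg, Best, Wrst, StDev as the last seven fields); the function name `latency_spread` is made up, and the sample rows are copied from that earlier trace, not from code403's screenshots.

```python
def latency_spread(report_line):
    """Return (host, worst - best) in ms for one mtr report row, or None."""
    fields = report_line.split()
    # Rows like "12. ???" carry no statistics and are skipped.
    if len(fields) < 9 or not fields[-7].endswith("%"):
        return None
    best, worst = float(fields[-3]), float(fields[-2])
    host = fields[2]  # layout: hop number, AS number, host name, (IP), stats...
    return host, round(worst - best, 1)


rows = [
    " 1. AS???    nacktmulle.lan (192.168.1.1)   0.0%   267    1.5   2.0   0.8  74.6   5.9",
    " 6. AS174    be12956.agr41.fra03.atlas.cogentco.com (130.117.14.117)   0.0%   267   13.2  13.4  12.5  42.5   1.9",
    "12. ???",
    "15. AS8452   dns-cache.tedata.net (163.121.128.134)   0.0%   266   78.8  81.2  78.3 251.0  17.2",
]

for row in rows:
    result = latency_spread(row)
    if result is not None:
        print(f"{result[0]}: spread {result[1]} ms")
```

A large spread at an intermediate hop alone proves little, since routers often answer probes with low priority; a spread that persists to the final hop is the more telling sign.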

Okay, out of 3400/700 I expect maximally:
3400 * ((1474-14-20-20)/(ceil((1474-14+40) / 48) * 53)) = 2846.7 Kbps
0.7 * ((1474-14-20-20)/(ceil((1474-14+40) / 48) * 53)) = 0.586 Mbps
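This is the earlier formula run in the other direction: from a shaper rate to the best-case TCP goodput. A small Python sketch, with a made-up function name and assuming ATM/AAL5 framing, 40 bytes of per-packet overhead and 20 + 20 bytes of IPv4 and TCP headers:

```python
import math


def expected_goodput(shaper_rate, ip_size=1474 - 14,
                     tcp_ip_headers=20 + 20, overhead=40):
    """Best-case TCP goodput (same unit as shaper_rate) on an ATM/AAL5 link."""
    cells = math.ceil((ip_size + overhead) / 48)  # AAL5 uses whole cells only
    wire_bytes = cells * 53                       # every cell has a 5-byte header
    return shaper_rate * (ip_size - tcp_ip_headers) / wire_bytes


# Shaper set to 3400/700 Kbps, so the results are in Kbps as well.
print(f"down: {expected_goodput(3400):.1f} Kbps")
print(f"up:   {expected_goodput(700):.1f} Kbps")
```

With the rates entered in Kbps, the download result is roughly 2847 Kbps, that is about 2.85 Mbps.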

So only seeing 2.39/0.45, and higher latencies than at 2900/600, again indicates to me that while your ADSL link might have 4000/1000, somewhere higher up your ISP only has something around 3000/600 Kbps of bandwidth available for you. In other words, something upstream of your home network is congested. Unfortunately that is a situation that is hard to debloat from your home, unless you accept a hefty bandwidth sacrifice and try to set the shaper at

2.4 / ((1474-14-20-20)/(ceil((1474-14+40) / 48) * 53)) = 2.87 Mbps
0.45 / ((1474-14-20-20)/(ceil((1474-14+40) / 48) * 53)) = 0.537 Mbps
(At least that should work for the network conditions during the time you ran the speedtest).

Sure,
53: the on-the-wire size of an ATM cell (48 bytes payload and a 5 byte ATM header)
48: the payload size of an ATM cell (all IP/PPPoE/Ethernet packets are payload from ATM's perspective)

Now, the typical encapsulation scheme for ethernet over ATM is AAL/5, and AAL/5 requires that each individual user packet is packaged into an integer number of ATM cells, with any bytes of the last ATM cell not consumed by payload data padded out. So ceil(payload/48) gives the number of ATM cells required for the given payload size, and ceil(payload/48)*53 is the effective size of the data on the ATM link. On ATM links, the bandwidth required for the ATM headers (and the padding) has to be serviced out of the sync bandwidth...
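To make the cell arithmetic concrete, here is a small Python sketch that splits the on-the-wire size into its parts. The function name is made up, and "payload" is taken to mean everything handed to the ATM layer, that is the IP packet plus whatever per-packet overhead is configured.

```python
import math


def aal5_breakdown(payload_bytes):
    """Split the on-the-wire cost of one packet on an ATM/AAL5 link."""
    cells = math.ceil(payload_bytes / 48)        # whole cells only
    return {
        "cells": cells,
        "padding": cells * 48 - payload_bytes,   # unused bytes in the last cell
        "cell_headers": cells * 5,               # 5-byte header per cell
        "wire_bytes": cells * 53,
    }


# 1460 byte IP packet + 40 bytes of per-packet overhead
print(aal5_breakdown(1460 + 40))
```

For this 1500 byte example the packet needs 32 cells and occupies 1696 bytes on the wire, of which 36 bytes are padding and 160 bytes are cell headers.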

14: from cake's tc -s qdisc: average network hdr offset: 14, the number of header bytes the kernel added to the IP size of your packets
1474: from cake's tc -s qdisc: max_len 0 1474 590, the size of the biggest packet the kernel saw on that interface; this size is size(IP) + size(kernel header bytes)
so 1474-14 = 1460 the size of the largest IP packet
20: one of the 20s is to account for the 20 byte IPv4 header
10: the other 20 is there to account for the 20 byte TCP header

Ah, I see I made a mistake the 10 should have been a 20 as well, sorry for that:
3.3 / ((1474-14-20-20)/(ceil((1474-14+40) / 48 ) * 53)) = 3.94 Mbps
0.8 / ((1474-14-20-20)/(ceil((1474-14+40) / 48 ) * 53)) = 0.955 Mbps
Notice that the error introduced by that mistake is relatively minor, given that you reduced the rates down to 3400/700.

I wonder whether you had a look at gargoyle, which also offers an adaptive traffic shaper that might be able to automatically adjust to variable rate links like yours.
Also, could you post the result from the "Share your results" box (simply copy and paste here into the forum) of https://www.speedguide.net/analyzer.ph so we can see how your packet headers actually look? Here is an example for my link:

« SpeedGuide.net TCP Analyzer Results » 
Tested on: 2019.07.31 16:54 
IP address: 77.180.x.xxx 
Client OS/browser: Mac OS (Safari 605.1.15) 
 
TCP options string: 020405ac010303060101080a5626d5f70000000004020000 
MSS: 1452 
MTU: 1492 
TCP Window: 131712 (not multiple of MSS) 
RWIN Scaling: 6 bits (2^6=64) 
Unscaled RWIN : 2058 
Recommended RWINs: 63888, 127776, 255552, 511104, 1022208 
BDP limit (200ms): 5268kbps (659KBytes/s)
BDP limit (500ms): 2107kbps (263KBytes/s) 
MTU Discovery: ON 
TTL: 52 
Timestamps: ON 
SACKs: ON 
IP ToS: 00000000 (0)

See in my case the TCP/IPv4 payload would be calculated as 1500-8-20-20-12, (my MTU is 1500, the 8 bytes is for PPPoE, 20 for IPv4 and 20 for TCP, and an additional 12 for TCP timestamps)
(Tip: if you paste preformatted or fixed font width text, start and end with otherwise empty lines of three backticks "`" )
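The "TCP options string" in such a report can also be decoded by hand, which is a handy cross-check of the MSS and timestamp lines. A minimal Python sketch follows; it only handles the standard option kinds that appear in these reports (0 end of list, 1 no-op, 2 MSS, 3 window scale, 4 SACK permitted, 8 timestamps), and the function name is made up for illustration.

```python
def parse_tcp_options(hex_string):
    """Decode a hex dump of TCP options into a small dict."""
    data = bytes.fromhex(hex_string)
    opts = {}
    i = 0
    while i < len(data):
        kind = data[i]
        if kind == 0:            # end of option list
            break
        if kind == 1:            # no-op, single padding byte
            i += 1
            continue
        if i + 1 >= len(data):   # truncated option, stop parsing
            break
        length = data[i + 1]     # length covers kind + length + value
        if length < 2 or i + length > len(data):
            break
        value = data[i + 2:i + length]
        if kind == 2:
            opts["mss"] = int.from_bytes(value, "big")
        elif kind == 3:
            opts["window_scale"] = value[0]
        elif kind == 4:
            opts["sack_permitted"] = True
        elif kind == 8:
            opts["timestamps"] = True
        i += length
    return opts


print(parse_tcp_options("020405ac010303060101080a5626d5f70000000004020000"))
```

For the string from the report above this yields an MSS of 1452, a window scale of 6, and both timestamps and SACK enabled, matching the lines the analyzer prints.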

Final thought: 10.45.10.23 and an IP size of 1460 hint at your ISP using ds-lite:

computer:~ user$ whois 10.45.10.23
% IANA WHOIS server
% for more information on IANA, visit http://www.iana.org
% This query returned 1 object

inetnum:      10.0.0.0 - 10.255.255.255
organisation: IANA - Private Use
status:       RESERVED

remarks:      Reserved for Private-Use Networks [RFC1918].Complete
remarks:      registration details for 10.0.0.0/8 are found
remarks:      iniana-ipv4-special-registry.

changed:      1995-06
source:       IANA

In that case there will be another 40 byte IPv6 header inside your packets, and far worse, there will be a carrier-grade network address translation device (CG-NAT) somewhere between you and the internet. These beasts are known to be occasionally overloaded, which might effortlessly explain your spurious problems. Do you have any chance of using IPv6 instead? Because ds-lite will only slow down IPv4 and should leave IPv6 alone. (But I heard that game companies are well behind in the whole IPv4-to-IPv6 transition train.)

code403 · July 31, 2019, 9:29pm

my line is synced at 4000/1000 but the actual speeds i get are 418kbyte down / 105kbyte up which translates to 3344kbit down / 840kbit up so thats my actual speed

so i should set the sqm to 2870/537 ? i think that would be too low with the overhead
(btw the 2.4 and 0.45 are only on concurrent tests, sequential has higher speeds)

i will look at it

« SpeedGuide.net TCP Analyzer Results » 
Tested on: 2019.07.31 17:25 
IP address: 156.222.xxx.xxx 
Client OS/browser: Windows 10 (Chrome 75.0.3770.142) 

TCP options string: 0204058c0103030801010402
MSS: 1420  
MTU: 1460 
TCP Window: 66560 (not multiple of MSS) 
RWIN Scaling: 8 bits (2^8=256) 
Unscaled RWIN : 260 
Recommended RWINs: 65320, 130640, 261280, 522560, 1045120 
BDP limit (200ms): 2662kbps (333KBytes/s)
BDP limit (500ms): 1065kbps (133KBytes/s) 
MTU Discovery: ON 
TTL: 111 
Timestamps: OFF 
SACKs: ON 
IP ToS: 00000000 (0)

my isp doesn't support IPV6

my MTU is 1460 but the analyzer recommends 1500 should i change it ?

moeller0 · July 31, 2019, 9:53pm

How did you measure the 418/105 KByte/sec numbers?

Well, that is a tradeoff: reliably low latency-under-load increases or full bandwidth utilization. Which compromise to make between the two really is up to your own policy; your network, your rules.

Ah, good point, plug in the numbers from the sequential tests then; with the bidirectional test you also have the ACK packets, which consume bandwidth but are not counted as they do not transfer any payload.

That seems to indicate MTU to the internet 1500-40 = 1460 (40 byte IPv6 header)
MSS = MTU - IPv4 header - TCP header: 1460-20-20 = 1420

The puzzling bit is that this seems to be missing the 8 byte PPPoE header.

And that in turn might indicate your ISP using baby-jumboframes of 1508 bytes, so the PPPoE header does not eat into the normal 1500 byte ethernet MTU.
That will slightly change our calculations to:
2.4 / ((1474-14-20-20)/(ceil((1474-14+40+8) / 48) * 53)) = 2.87 Mbps
0.45 / ((1474-14-20-20)/(ceil((1474-14+40+8) / 48) * 53)) = 0.537 Mbps
Due to the ATM cell quantization this actually does not change anything... The fun thing about AAL/5 encapsulation is that its effects are somewhat non-intuitive.
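Why the extra 8 bytes make no difference can be seen by counting cells. A short Python sketch with a made-up helper name, using the same 1460 byte IP size as the formulas above:

```python
import math


def atm_cells(ip_size, overhead):
    """Number of 48-byte ATM cell payloads needed for one packet."""
    return math.ceil((ip_size + overhead) / 48)


ip_size = 1474 - 14
for overhead in (40, 40 + 8):
    cells = atm_cells(ip_size, overhead)
    print(f"overhead {overhead}: {cells} cells = {cells * 53} bytes on the wire")

# Unused room in the last cell when the overhead is 40 bytes.
slack = atm_cells(ip_size, 40) * 48 - (ip_size + 40)
print(f"room left in the last cell at overhead 40: {slack} bytes")
```

Both overheads land in the same number of cells because the last cell still has room to spare; only once the added overhead exceeds that slack does the packet spill into one more 53 byte cell.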

Oh, yes they do; they might not actually hand IPv6 prefixes to end customers, but both the additional 40 bytes of overhead (seen in the 1460 MTU) and the use of private addresses in your ISP's network are strong indicators of dual-stack lite (ds-lite), a system where IPv4 packets are tunneled inside IPv6 packets between the end customers and the CG-NAT appliances. I have no absolute proof that your ISP uses such a system, but a very strong hypothesis... Not that this will help you very much...

code403 · July 31, 2019, 10:11pm

via Net Limiter internet monitoring, they're pretty near sqm off dslreports speed test (3.3mbit/0.8mbit)

i guess i will stick with my values for now and lower if necessary later

there's a couple users on my internet right now and i don't think that would produce the best results so maybe tomorrow i can test it if we continue with this

this went on for too long and im pretty sure its a problem with my ISP infrastructure (being on a copper line doesn't help either) but we did some good tests, thanks for the help ^^

moeller0 · July 31, 2019, 10:21pm

Interesting question is what kind of overhead they actually take into account; often these things go by the kernel-reported packet size, which typically is IP packet size + 14 bytes (6 source MAC, 6 destination MAC, 2 ethertype), and this is neither fish nor flesh.
Real ethernet effective overhead that a shaper needs to account for is 38 bytes.
Also 418KBytes/second can mean both 418 * 1024 Bytes (so actually KiB) or 418 * 1000 (KB) so reporting KB/s is maximally hard to interpret correctly.
That said, for all I know NetLimiter might do the right thing here.
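The size of the KB/KiB ambiguity is easy to put a number on. A tiny Python sketch (the function name is made up) converting the reported 418 "KBytes/s" under both readings:

```python
def kbytes_to_kbit(rate, kilo=1000):
    """Convert a 'KBytes/s' rate to kbit/s; kilo is 1000 (KB) or 1024 (KiB)."""
    return rate * kilo * 8 / 1000


for label, kilo in (("KB  (1000 bytes)", 1000), ("KiB (1024 bytes)", 1024)):
    print(f"418 {label}/s = {kbytes_to_kbit(418, kilo):.1f} kbit/s")
```

The two readings differ by 2.4%, which is small next to the 5-10% headroom discussed earlier but still enough to blur a comparison against a speedtest result.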

That is the best method IMHO: iteratively test and see what latency-under-load increase you find tolerable given the required bandwidth sacrifice on your link. (You might also think about simply reducing the shaper rates when you play your game and returning them to higher rates afterwards, assuming others cherish bandwidth over latency more.)

Sure, you did all the work and I did all the lecturing (the easy part)

code403 · August 1, 2019, 12:07pm

3600/800 http://www.dslreports.com/speedtest/52587155
sequential test with the same values

not bad for now

moeller0 · August 1, 2019, 12:28pm

So I thought a bit more about the probable ds-lite/baby-jumbo-frame situation on your link, and I realised that with the modem doing the PPPoE/ds-lite interactions, your router does not really know about this, so the calculations should actually be like the following:

2.4 / ((1474-14-20-20)/(ceil((1474-14+40+40+8) / 48) * 53)) = 2.95 Mbps
0.45 / ((1474-14-20-20)/(ceil((1474-14+40+40+8) / 48) * 53)) = 0.554 Mbps

But at the same time you should set the overhead to 80 bytes!

Applying this logic to the new sequential numbers this would get you to:
2.71 / ((1474-14-20-20)/(ceil((1474-14+40+40+8) / 48) * 53)) = 3.337 Mbps
0.60 / ((1474-14-20-20)/(ceil((1474-14+40+40+8) / 48) * 53)) = 0.739 Mbps
Again with ATM(80), but I really believe that your ping variations are most likely caused upstream of your modem, so an adaptive method like Gargoyle's (which is free AFAIK) or the IQ router's might be the best fit for your problem...

shm0 · August 11, 2019, 12:14pm

The IQ Router thing looks interesting...
It does periodic speed tests on schedule to measure the line caps and set the shaper limits accordingly?

moeller0 · August 11, 2019, 2:11pm

Yes, as far as I know this is exactly what they do: automatic speedtests to automatically adjust the shaper settings to the real conditions. Add to this notifications for updates, and it is hard not to like the iqrouter. This is what I recommend for non-nerd family and friends, and in general for people who just want a working router without having to constantly fiddle with it. It is the iqrouter, or the Turris Omnia for slightly more ambitious users; both are solutions with a solid OpenWrt basis and automatic update/update notification systems that make reasonable security an achievable goal. My immediate family probably wishes I would practice what I preach here, but OpenWrt master is ever so tempting

slh · August 11, 2019, 8:01pm

Just be aware that speedtests do transmit a considerable amount of test data, which might become a problem on limited contracts.

ishiryoku · July 12, 2022, 12:07am

Is there a script that makes SQM adaptive? I would like to try one, or at least see whether something like that exists. It would be more useful for unstable 4G links than for fixed lines; I use ADSL, where there are usually not so many variations, but some variations do occur, and as far as I understand this is where a dynamic SQM configuration works better.

dtaht · July 12, 2022, 1:07am