SQM Optimal Settings For My DSL

If you are on a real ATM link (ADSL1, ADSL2, or ADSL2+), you really need to set this to ATM, no ifs or buts.

Yeah, that can be painful, especially the Uplink.

Ah, I see you used the Windows version of the collector script, which sees very little testing (I only have Windows in a VM and I rarely start that up). I guess I should look inside that script again to figure out why it only seems to have tested up to around 50 bytes instead of 48*3 = 144. As for the size of the files, I have no real idea either; I would guess that a longer runtime should result in larger files, but without seeing the files I can only speculate wildly, which is not going to help anybody.

Anyway, that plot now strongly indicates ATM cell encapsulation with an overhead of 40 Bytes.

You might want to add "ingress" to the list of cake keywords in the iqdisc_opts field, to be more robust against multiple concurrent ingress flows; that said, at 3000/640 this is not going to make a massive difference.
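For reference, here is a minimal sketch of the uci commands such a change could translate to. The helper name `sqm_uci_commands` is made up for illustration, and the section `sqm.@queue[0]` assumes SQM is the first queue in /etc/config/sqm. It also assumes that the two "advanced" switches need to be enabled for iqdisc_opts to take effect, so please check against your own config before running anything.

```python
def sqm_uci_commands(download_kbit, upload_kbit, overhead=40,
                     existing_iqdisc_opts="", section="sqm.@queue[0]"):
    """Build (not run) uci commands for an ATM link with cake's ingress keyword.

    Hypothetical helper; section name and defaults are assumptions.
    """
    # Keep whatever keywords are already configured and append "ingress" once.
    opts = existing_iqdisc_opts.split()
    if "ingress" not in opts:
        opts.append("ingress")

    settings = {
        "download": download_kbit,            # ingress shaper rate in kbit/s
        "upload": upload_kbit,                # egress shaper rate in kbit/s
        "linklayer": "atm",                   # account for ATM cell encapsulation
        "overhead": overhead,                 # per-packet overhead in bytes
        "qdisc_advanced": 1,                  # assumed prerequisite for the next two
        "qdisc_really_really_advanced": 1,
        "iqdisc_opts": " ".join(opts),
    }
    cmds = [f"uci set {section}.{key}='{value}'" for key, value in settings.items()]
    cmds += ["uci commit sqm", "/etc/init.d/sqm restart"]
    return cmds


if __name__ == "__main__":
    # Rates taken from the 3000/640 figure mentioned in the thread.
    print("\n".join(sqm_uci_commands(3000, 640)))
```

The script only prints the commands, so they can be reviewed and pasted into an SSH session on the router by hand.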

BTW, what modem do you use and what statistics does your modem report? Is that modem running in bridged mode or did you have to configure a username and password (for PPPoE perhaps) on that modem?

I'm running OpenWrt on a Raspberry Pi 3B. The ISP router is in PPPoE mode with Wi-Fi off, and the Raspberry Pi is connected to LAN port 1, which has DMZ enabled. The Pi is the only AP available and has SQM enabled on it. By statistics, do you mean the speed (4096 down/1024 up, if so) or the full connection statistics? Would it be better if I used a modem instead and did the PPPoE connection in OpenWrt?

Preferably the full connection statistics (and if possible the name and version of the modem).

That depends. If the Pi does the PPPoE decapsulation, it will be able to detect connection hangs, it has more freedom to hand out DHCP addresses, and you do not need double NAT. But for your ping spike issue, this change will not necessarily help...

Screenshot_26 Screenshot_27

The ISP router is a ZXHN H108N v2.5.

Thanks, now the next interesting piece of information would be the result of a dslreports speedtest, with SQM disabled and with SQM enabled. See https://forum.openwrt.org/t/sqm-qos-recommended-settings-for-the-dslreports-speedtest-bufferbloat-testing/2803 for thoughts on how to configure the test and how to report the results here in the forum.

The error counters, which I had hoped to look at in detail, are unfortunately a bit too terse (though 0/104 CRC errors is really low, unless the uptime of the link was well below one hour, but you are at 21915/3600 = 6.1 hours already, so the link seems to be clean).

SQM off http://www.dslreports.com/speedtest/52480349
SQM on http://www.dslreports.com/speedtest/52480584
The speeds are fluctuating because I'm not the only person using the internet right now. But even with a bufferbloat rating of A, if I play a game and then start downloading something on a different device, I still get huge lag spikes. I even get an A+ rating sometimes, but it keeps lagging all the same.

These seem to be two copies of the same link, I would guess with SQM active.

Fair enough, makes interpretation a bit harder, but not by much. (Except the SQM off test would be best with no other traffic active).

Do you have data that shows this? I guess one issue is that at 3000/640 you might be really bandwidth starved, and then all SQM can do is move the pain around...

Silly question, there is no other device connected to the ZTE router, only the pi3b?

sorry about that, fixed

Data like what? Sometimes it works fine with only 5-10 ms of added lag, but recently it has been really bad.

and yes there are no other devices connected to the router

Thanks, it is clear to me that SQM really does help your link to be usable under load. Now, let's try to tackle the ping spike issue.

Maybe a packet trace or the output of an mtr/winmtr session running during the experienced latency spike?

How bad exactly? One thing to try when things run badly would be a dslreports speedtest, followed by another dslreports speedtest after halving both the ingress and egress shaper rates (to check whether the issue might be related to your DSLAM's uplink; in that case SQM on your router will need to be set to the lowest reliably reachable shaper rates). It would also be good to monitor the modem's error counters: line instabilities and CRC errors will wreak havoc on your data transfers.

In reality all these hypotheses are rather weak, but without alternatives they may still be worth investigating.

SQM is definitely doing something, but the problem is that I get way better results with link layer adaptation set to None than with ATM and 40 bytes of overhead.

I see, could you run top -d 1 during one of your tests that shows ping spikes and look at the sirq percentage of the Pi? For testing you could run a dslreports speedtest, or even the fast.com speedtest (that also allows you to configure upload testing as well as longer run-times) while visually monitoring sirq.
Or you could try the speedtest package under https://forum.openwrt.org/t/speedtest-new-package-to-measure-network-performance/24647 as that will also look at the CPU load during the test.

That points to a bug or overload somewhere. Thanks to your measurements we have proof that the overhead is 40 bytes and that the encapsulation is ATM, so pretending this is not true will run the shaper at higher rates than you would think. Especially with small packets, not accounting for ATM cell encapsulation will underestimate the effective packet size by 50 to 33%, and that will make your shaper ineffective against bufferbloat if the packet size mix on your link contains too many small packets. That said, until the "bug" is fixed I do not blame you for using the "none" option, as the reason for running SQM is not being theoretically correct but being practically better.
I would just like to figure out why link-layer accounting does not seem to work for you...
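To make the cell arithmetic concrete, here is a rough sketch of how the on-the-wire size grows on an ATM link. It assumes the standard 53-byte cell with 48 bytes of payload and the 40 bytes of per-packet overhead measured above; the function names are made up for illustration and this is not how sqm-scripts or cake implement it internally.

```python
ATM_CELL_PAYLOAD = 48  # payload bytes carried per ATM cell
ATM_CELL_SIZE = 53     # bytes on the wire per cell (48 payload + 5 header)


def atm_wire_size(packet_bytes, overhead_bytes=40):
    """Bytes actually sent on an ATM link for one packet."""
    total = packet_bytes + overhead_bytes
    # Ceiling division: the last cell is padded up to a full 48-byte payload.
    cells = -(-total // ATM_CELL_PAYLOAD)
    return cells * ATM_CELL_SIZE


def unaccounted_share(packet_bytes, overhead_bytes=40):
    """Fraction of the wire size a shaper misses if it only counts packet_bytes."""
    wire = atm_wire_size(packet_bytes, overhead_bytes)
    return 1 - packet_bytes / wire


if __name__ == "__main__":
    for size in (64, 100, 576, 1500):
        share = round(100 * unaccounted_share(size))
        print(f"{size:5d} byte packet -> {atm_wire_size(size):5d} bytes on the wire "
              f"({share}% not accounted for)")
```

A 64-byte packet plus 40 bytes of overhead needs three cells, so 159 bytes on the wire, while a 1500-byte packet needs 33 cells, so 1749 bytes. The relative error is much larger for small packets, which is why a shaper without link-layer accounting struggles most when the traffic mix contains many of them.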

CPU: 0% usr 0% sys 0% nic 99% idle 0% io 0% irq 0% sirq
It is always like this; CPU usage never goes past 5%, and sirq never past 2%.
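If watching top by eye gets tedious, the summary line can also be picked apart from captured output. A small sketch, assuming the busybox top format shown above ("N% name" pairs); the helper names are hypothetical and the capture itself (for example redirecting top's batch output to a file) is left out.

```python
import re


def parse_top_cpu_line(line):
    """Turn 'CPU: 0% usr ... 0% sirq' into a dict such as {'usr': 0, ..., 'sirq': 0}."""
    return {name: int(pct) for pct, name in re.findall(r"(\d+)%\s+(\w+)", line)}


def peak(lines, field="sirq"):
    """Highest value of one field across several captured CPU summary lines."""
    values = [parse_top_cpu_line(line).get(field, 0) for line in lines]
    return max(values, default=0)


if __name__ == "__main__":
    sample = "CPU: 0% usr 0% sys 0% nic 99% idle 0% io 0% irq 0% sirq"
    print(parse_top_cpu_line(sample))
```

Feeding it the lines captured during a speedtest and asking for the peak sirq value shows whether the Pi ever got close to being CPU bound.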

2900/600 SQM btw

Interesting, now run the same for 1 minute with a concurrent mtr/WinMTR to see at what part of the path the latency rises most significantly...

So I run the speedtest-netperf script alongside a WinMTR to 1.1.1.1?

I think the upload induces higher latency.

code403 (Mohamed Shikhany), July 31, quoting moeller0: "you could try the speedtest package under Speedtest: new package to measure network performance as that will also look at the CPU load during the test."

2900/600 SQM btw

If this test was with linklayer None, could you repeat with linklayer ATM and overhead 40, please?

SQM 2900/600 Link Layer Adaptation None (Sequential And Concurrent):

Screenshot_2

ATM 40 Overhead:

Screenshot_4

Looking at the speedtest-netperf.sh results, I actually see less latency with ATM(40) versus None(0), and also less goodput; both are signs of ATM(40) working as intended. So your ATM(40) issue does not seem to be general, but rather sporadic... too bad, these are the least fun to debug...

Regarding the WinMTR result, all intermediate hops show 100% packet loss (indicating that they actively drop the mtr packets, probably out of a misunderstanding of the security issues of responding to these probes), which unfortunately makes this trace not very useful for this purpose. Try mtr against a more remote host such as 77.190.170.40 (it will only be responsive for a few more hours before it is going to be recycled), and look at, say, 20 mtr probes before starting the speedtest, so you get a feel for the packet loss and latency numbers without loading your link.
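On a machine with the command-line mtr installed (not WinMTR), the unloaded baseline could be scripted roughly like this. It is only a sketch: the helper names are made up, the target host is left as a parameter since the address above is temporary, and it assumes an mtr build that supports the --report, --no-dns and --report-cycles options.

```python
import subprocess


def mtr_report_command(host, cycles=20):
    """Command line for a non-interactive mtr run with a fixed number of probes."""
    return ["mtr", "--report", "--no-dns", "--report-cycles", str(cycles), host]


def run_baseline(host, cycles=20):
    """Run mtr once and return its text report (requires mtr to be installed)."""
    result = subprocess.run(mtr_report_command(host, cycles),
                            capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Only show the command here; call run_baseline(host) to actually probe.
    print(" ".join(mtr_report_command("1.1.1.1")))
```

Running it once before the speedtest and once during it gives two reports that can be compared hop by hop for loss and latency.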