SQM Download Speed slower than expected

  • Speedtest from my provider
  • bufferbloat servers (betterspeedtest.sh)
  • iperf3 servers from speedtest.myloc.de

All show the same behavior.

Thanks. For discussions like this I really like the dslreports speedtest (configured and shared as described in https://forum.openwrt.org/t/sqm-qos-recommended-settings-for-the-dslreports-speedtest-bufferbloat-testing/2803), as it gives a better view of how a link performs. But I assume that test would not give noticeably different results, just a more detailed view of them.

Hmm.
I don't know...
Upload is also weird.
Without sqm:
[SUM] 8.00-9.00 sec 4.82 MBytes 40.4 Mbits/sec
With sqm and the limit set to 41984:
[SUM] 7.00-8.00 sec 2.31 MBytes 19.4 Mbits/sec
With sqm and the limit set to 100000:
[SUM] 7.00-7.61 sec 3.00 MBytes 41.6 Mbits/sec

What is going on?

Hard to say.

I guess my point is, without knowing your /etc/config/sqm, the output of "tc -s qdisc" and a description of how you performed the above tests, all I could do is pure speculation (and Meltdown and Spectre probably reminded all of us that speculation might have side effects :wink: ).
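
Concretely, something like this from the router, ideally right after a test run, would already help (just a sketch):

# dump the SQM configuration and the live qdisc statistics
cat /etc/config/sqm
tc -s qdisc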

For testing I used a simple config:

config queue
	option debug_logging '0'
	option verbosity '5'
	option linklayer 'none'
	option interface 'eth1.20'
	option qdisc 'cake'
	option script 'piece_of_cake.qos'
	option download '450560'
	option qdisc_advanced '0'
	option upload '41984'
	option enabled '1'

iperf3 30-second upload test:

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.00  sec  7.46 MBytes  2.08 Mbits/sec  173             sender
[  5]   0.00-30.00  sec  7.39 MBytes  2.07 Mbits/sec                  receiver
[  7]   0.00-30.00  sec  7.44 MBytes  2.08 Mbits/sec  166             sender
[  7]   0.00-30.00  sec  7.39 MBytes  2.07 Mbits/sec                  receiver
[  9]   0.00-30.00  sec  7.42 MBytes  2.07 Mbits/sec  171             sender
[  9]   0.00-30.00  sec  7.37 MBytes  2.06 Mbits/sec                  receiver
[ 11]   0.00-30.00  sec  7.43 MBytes  2.08 Mbits/sec  168             sender
[ 11]   0.00-30.00  sec  7.38 MBytes  2.06 Mbits/sec                  receiver
[ 13]   0.00-30.00  sec  7.39 MBytes  2.07 Mbits/sec  169             sender
[ 13]   0.00-30.00  sec  7.35 MBytes  2.05 Mbits/sec                  receiver
[ 15]   0.00-30.00  sec  7.45 MBytes  2.08 Mbits/sec  165             sender
[ 15]   0.00-30.00  sec  7.40 MBytes  2.07 Mbits/sec                  receiver
[ 17]   0.00-30.00  sec  7.46 MBytes  2.09 Mbits/sec  172             sender
[ 17]   0.00-30.00  sec  7.42 MBytes  2.07 Mbits/sec                  receiver
[ 19]   0.00-30.00  sec  7.42 MBytes  2.08 Mbits/sec  166             sender
[ 19]   0.00-30.00  sec  7.38 MBytes  2.06 Mbits/sec                  receiver
[ 21]   0.00-30.00  sec  7.43 MBytes  2.08 Mbits/sec  171             sender
[ 21]   0.00-30.00  sec  7.39 MBytes  2.07 Mbits/sec                  receiver
[ 23]   0.00-30.00  sec  7.46 MBytes  2.09 Mbits/sec  158             sender
[ 23]   0.00-30.00  sec  7.43 MBytes  2.08 Mbits/sec                  receiver
[SUM]   0.00-30.00  sec  74.4 MBytes  20.8 Mbits/sec  1679             sender
[SUM]   0.00-30.00  sec  73.9 MBytes  20.7 Mbits/sec                  receiver

tc output after test:

qdisc cake 802b: root refcnt 2 bandwidth 41984Kbit besteffort triple-isolate nonat nowash no-ack-filter split-gso rtt 100.0ms raw overhead 0
 Sent 81238203 bytes 55223 pkt (dropped 1677, overlimits 95594 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 190400b of 4Mb
 capacity estimate: 41984Kbit
 min/max network layer size:           42 /    1474
 min/max overhead-adjusted size:       42 /    1474
 average network hdr offset:           14

                  Tin 0
  thresh      41984Kbit
  target          5.0ms
  interval      100.0ms
  pk_delay        7.2ms
  av_delay        4.0ms
  sp_delay          2us
  backlog            0b
  pkts            56900
  bytes        83710101
  way_inds            0
  way_miss           22
  way_cols            0
  drops            1677
  marks               0
  ack_drop            0
  sp_flows           12
  bk_flows            1
  un_flows            0
  max_len          7370
  quantum          1281

qdisc ingress ffff: parent ffff:fff1 ----------------
 Sent 2321814 bytes 42111 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

With sqm set to 100000:

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-30.01  sec  13.9 MBytes  3.89 Mbits/sec   35             sender
[  5]   0.00-30.01  sec  13.7 MBytes  3.83 Mbits/sec                  receiver
[  7]   0.00-30.01  sec  14.0 MBytes  3.90 Mbits/sec   32             sender
[  7]   0.00-30.01  sec  13.8 MBytes  3.87 Mbits/sec                  receiver
[  9]   0.00-30.01  sec  13.8 MBytes  3.85 Mbits/sec   33             sender
[  9]   0.00-30.01  sec  13.6 MBytes  3.81 Mbits/sec                  receiver
[ 11]   0.00-30.01  sec  16.8 MBytes  4.68 Mbits/sec   31             sender
[ 11]   0.00-30.01  sec  16.6 MBytes  4.63 Mbits/sec                  receiver
[ 13]   0.00-30.01  sec  12.6 MBytes  3.53 Mbits/sec   33             sender
[ 13]   0.00-30.01  sec  12.5 MBytes  3.51 Mbits/sec                  receiver
[ 15]   0.00-30.01  sec  14.1 MBytes  3.93 Mbits/sec   27             sender
[ 15]   0.00-30.01  sec  14.0 MBytes  3.91 Mbits/sec                  receiver
[ 17]   0.00-30.01  sec  14.4 MBytes  4.01 Mbits/sec   29             sender
[ 17]   0.00-30.01  sec  14.2 MBytes  3.97 Mbits/sec                  receiver
[ 19]   0.00-30.01  sec  14.4 MBytes  4.03 Mbits/sec   29             sender
[ 19]   0.00-30.01  sec  14.3 MBytes  4.00 Mbits/sec                  receiver
[ 21]   0.00-30.01  sec  15.1 MBytes  4.22 Mbits/sec   34             sender
[ 21]   0.00-30.01  sec  15.0 MBytes  4.20 Mbits/sec                  receiver
[ 23]   0.00-30.01  sec  12.6 MBytes  3.51 Mbits/sec   34             sender
[ 23]   0.00-30.01  sec  12.5 MBytes  3.49 Mbits/sec                  receiver
[SUM]   0.00-30.01  sec   142 MBytes  39.6 Mbits/sec  317             sender
[SUM]   0.00-30.01  sec   140 MBytes  39.2 Mbits/sec                  receiver

iperf Done.
qdisc cake 802e: root refcnt 2 bandwidth 100Mbit besteffort triple-isolate nonat nowash no-ack-filter split-gso rtt 100.0ms raw overhead 0
 Sent 154882204 bytes 105175 pkt (dropped 10, overlimits 150386 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 179Kb of 5000000b
 capacity estimate: 100Mbit
 min/max network layer size:           42 /    1474
 min/max overhead-adjusted size:       42 /    1474
 average network hdr offset:           14

                  Tin 0
  thresh        100Mbit
  target          5.0ms
  interval      100.0ms
  pk_delay        1.0ms
  av_delay        413us
  sp_delay          2us
  backlog            0b
  pkts           105185
  bytes       154896944
  way_inds            2
  way_miss           23
  way_cols            0
  drops              10
  marks               0
  ack_drop            0
  sp_flows            1
  bk_flows            1
  un_flows            0
  max_len          8844
  quantum          1514

qdisc ingress ffff: parent ffff:fff1 ----------------
 Sent 3072237 bytes 57369 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

Hmm, I have the feeling my ISP is doing some kind of QoS/AQM?
When I disable SQM and saturate the upload, pings are around ~75 ms.
I would expect much higher latency on a saturated link.
Still too high for my taste, though.

Do you know by chance what the maximum channel bandwidth for DOCSIS 3.0/QAM16 is (upstream)?

You mean performance-wise?
I don't know how much it affects the ARM Cortex-A9.
I think it is not affected by Meltdown, only Spectre v2.
But I don't think it will affect the shaping of a 40 Mbit/s upload...

I wanted to give the updated sqm-scripts a go with HTB or HFSC.
But somehow only fq_codel and cake show up as available qdiscs?

//edit
Tried the old qos-scripts package from OpenWrt.
It shows the same behavior as the SQM package.
As soon as I enable QoS, the bandwidth is halved.


It was a joke, @moeller0 was saying he shouldn't speculate without data.

Your situation is almost as though the math is wrong, like something's getting multiplied by 2 in the packet size calculation...


:smiley:

Seems like the problem was the CPU frequency scaling patch I recently added.
:expressionless:

But it seems like...
the segment here is overloaded.
Yesterday late at night I was able to set SQM to 100% of the sync rate
and had nice low pings while saturating the uplink.
Today pings already start to rise when upload speeds reach ~30 Mbit/s.
Yeah, fast and good internet in Germany.
Sorry, we don't do that here...
I feel like going back in time.
:expressionless:

Sorry, this one should not have slipped past my quality control :wink:

Great, I get rewarded for being late, you solved the riddle yourself...

The joy of DOCSIS. Well, I take that back, as this is not really cable-specific; all shared-medium techniques suffer from this*. The issue is segment size (measured in number of users) versus segment aggregate bandwidth. Now, at least with the prospect of DOCSIS 3.1 around the corner, you can hope that the mandatory PIE AQM in the modem will at least give you tolerable worst-case bufferbloat in the egress/uplink direction...

*) The question is not "is there a shared segment", but rather at what point in the network path the sharing starts. But DOCSIS traditionally has a less favorable split than GPON...

Yeah, the joy of shared media x)

I think it will take some time before DOCSIS 3.1 becomes available here for the broad range of users.
I read they tested 3.1 in some cities but only had problems.

Also, all upload channels are running with QAM16 modulation; in their support forum someone wrote that only a small number of connections are running with QAM64 modulation.

Would completely switching over to QAM64 give more bandwidth? More headroom?

Actually, I don't know what they are doing.
The installation down in the cellar looks like crap (the installation in the old house was also bad).
If I had the equipment I would fix it myself.
They use ds-lite; if you were lucky you could get a native IPv4 connection.
Now they offer dual stack, but the IPv4 part is crippled to a 1460 MTU.
If you want IPv4 only: sorry, we can't do that; but dual stack with IPv4 and IPv6 is no problem x)
They don't offer plain modems, and all routers they give out have the Intel Puma 6 bug.
In other countries they operate in, they at least offer a bridge/modem mode.
But sorry, in Germany we can't do that either.
And now Vodafone wants to buy that company.
When bad gets even worse x)

I assume so; QAM16 only transmits 4 bits per symbol, while QAM64 uses 6 bits per symbol, so a QAM64 channel will have 1.5 times the bandwidth of a QAM16 one. I have very little recent experience with DOCSIS, but I believe QAM16 to be the worst-case uplink modulation, which is pretty terrible.
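
Just to make the bits-per-symbol arithmetic explicit (a quick sketch; bits per symbol is log2 of the constellation size):

awk 'BEGIN {
    q16 = log(16) / log(2)   # 4 bits per symbol
    q64 = log(64) / log(2)   # 6 bits per symbol
    printf "QAM16: %g bits/symbol, QAM64: %g bits/symbol, ratio: %.1f\n", q16, q64, q64 / q16
}'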

So I assume this is UM. Are they really using full dual stack, or rather ds-lite? But even in the ds-lite case (where each IPv4 packet incurs an additional 40-byte IPv6 header), I believe the idea was that the CPE-CMTS connection should use baby jumbo frames so that the MTU towards the internet would still be 1500; just showing how naive my beliefs seem to be...

Well, personally I would never want IPv4-only; IPv6 is not only the future, but since the transition has already started it is also the present (plus IPv6 elegantly side-steps the nasty reachability-from-the-outside issues caused by CG-NAT).

Not even "fixed" firmware releases?

Not sure; I have heard great things about the DOCSIS section of Vodafone in Germany, including that they seem to allow customers with non-rented modems to choose between dual-stack, ds-lite and IPv4, and they seem to have a decent information policy towards end customers, so not everything about this coming change might be as bleak as you think...


BTW, this does not seem to be uncommon; traffic shaping and CPU frequency scaling do not seem to harmonize very well. It could be that the shaper is bursty enough for the CPU to scale down prematurely, or maybe the scaling governors are not looking at sirq load carefully enough...
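
One way to test that hypothesis (a sketch, assuming the usual cpufreq sysfs interface is exposed on your build): pin all cores to the "performance" governor, re-run the shaping test, and compare.

# force all cores to the performance governor
for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
	echo performance > "$g"
done
# confirm the governor actually changed
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor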

And for the most part NAT64 works great. I think just go with IPv6-only; they'll shove the entirety of the IPv4 internet into a tiny corner of IPv6 and translate it for you... I've tested Android, Linux and Windows machines on an IPv6-only LAN with tayga on the router and it works well for most things. I do suspect a few games and things will suffer. For those devices you can run a CLAT on the router and give out a few static IPv4s to the few machines that need them.
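
For reference, a minimal tayga.conf sketch along those lines (the pool and the router-side IPv4 address are placeholders; 64:ff9b::/96 is just the well-known NAT64 prefix, not something specific to this setup):

# tun interface tayga creates for the translated traffic
tun-device nat64
# router-side IPv4 address used by tayga itself
ipv4-addr 192.168.255.1
# well-known NAT64 prefix
prefix 64:ff9b::/96
# pool of IPv4 addresses mapped to IPv6-only clients
dynamic-pool 192.168.255.0/24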

Or just use their ds-lite, but have your router only give out IPv4 static reservations to the few devices that can't handle IPv6-only, game stations etc.


Yes. They offer full dual stack now, but you have to add an option to your plan:
either the "Power-Upload" option (which is obviously useless because the segment overloads everywhere) or the "Telefon Comfort" option.

I think the MTU is 1460 because the "main" gateway is IPv6 and the IPv4 part is handled by a different gateway. So they created a tunnel between the two?

They did after 1 year or so x)

Let's see what time brings...


That could be an IPv6 tunnel (as the IPv6 header takes 40 bytes), or potentially a tool that reports the TCP maximum segment size (MSS) instead of the MTU (the 20-byte IPv4 header and the 20-byte TCP header are deducted from the MTU to get the MSS; I am simplifying a bit here).
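
Both explanations happen to land on the same number, which is why the reported 1460 alone cannot distinguish them (a quick sketch):

awk 'BEGIN {
    printf "IPv4-in-IPv6 tunnel MTU: %d\n", 1500 - 40        # 40-byte IPv6 header
    printf "TCP MSS at 1500 MTU:     %d\n", 1500 - 20 - 20   # IPv4 + TCP headers
}'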

Could be.

I never had ds-lite (luckily I had a native IPv4 connection in the past, and now dual stack).
From forum posts (an unofficial ISP forum) I could infer that there are some problems with ds-lite...
The IP changes a lot and no port forwards work (because the AFTR is doing the NAT?).
Now the question is...
Is it possible to configure an AFTR gateway to operate like a "normal" gateway?
So it assigns a somewhat static IPv4 address and opens up the ports?
Then the "main" gateway has an IPv6 tunnel connection to the AFTR gateway, which serves the IPv4 connection?

I replaced my ISP router with a plain modem (not easy to get a EuroDOCSIS modem in Germany);
the connection is much, much better.
Fewer errors, better latency, more download channels (32 vs. 24).
I set cake to the advertised speeds (400/40 Mbit/s), which ends up at ~380/38 Mbit/s (TCP/IP overhead?).
Works well, nice low pings of ~20 ms while the connection is saturated.
Only thing that bugs me a bit...
cake by default puts all ARP traffic into the high-priority tin.
On a DOCSIS connection that can be a lot of traffic/packets ending up in the high-priority tin.
I measured gigabytes of ARP traffic over a month. Most of it is useless anyway.
Maybe I'll create an arptables rule that drops all that unneeded traffic, or remove the ARP-to-CS7 mapping in cake.
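
Roughly what I have in mind for the arptables variant (just a sketch; eth1.20 and 203.0.113.42 stand in for the real WAN interface and WAN IP, and I have not checked whether dropping here actually keeps the packets away from the ingress shaper):

# accept ARP that concerns our own address (requests for us, replies to us)
arptables -A INPUT -i eth1.20 -d 203.0.113.42 -j ACCEPT
# drop the rest of the segment's ARP chatter arriving on the WAN
arptables -A INPUT -i eth1.20 -j DROP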

Whaaaaat? On the WAN?

The ARP cache should mean you ARP for your upstream gateway once on the WAN and then just renew it every few hours or something. Even a megabyte a month would shock me; gigabytes of ARP shouldn't happen.

It looks like ARPing more than once a minute is unlikely, and even if ARP packets were MTU-sized that would be about 66 MB per month. So two orders of magnitude smaller than what you see.
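
Spelled out (a sketch; 1514 bytes stands in for an MTU-sized Ethernet frame, which real ARP packets never reach anyway):

awk 'BEGIN {
    pkts = 60 * 24 * 30                 # one ARP per minute for 30 days
    printf "%.1f MB per month\n", pkts * 1514 / 1000000
}'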

This is a peculiarity of DOCSIS: all ARP requests end up on each other's WAN interfaces.
When the node is crowded, quite a bit of ARP traffic is flowing...

A small update on the MTU 1460: with my own modem + OpenWrt router, the MTU gets reported as expected (1500).
But path MTU discovery is still forced...

Ah, you're receiving other people's ARPs; yes, that makes sense, as the cable line is kind of a shared medium.

Once you've received the ARP, there's not much to do about it. It seems unlikely that diffserv in the download direction makes sense for SQM unless you can verify that useful DSCP markings actually come in on the WAN. Perhaps save diffserv for just the upload direction.

This leaves the suspicion that they still had you on ds-lite, or at least that they still used an IPv6 tunnel.

This would also indicate that ds-lite is still active somehow...

Also, I believe that full dual stack does not use an AFTR... I have been told that Vodafone allows users with their own modem to freely select from/switch between dual stack, ds-lite, and IPv4-only, but the end user needs to configure her/his modem accordingly; maybe it is similar with UM and the modem was still only half-way configured? Anyway, since you now run your own modem and the issue seems gone, there is no need to spend more time on this question :wink:.

400 * (1500-20-20)/(1518) = 384.72 Mbps
40 * (1500-20-20)/(1518) = 38.47 Mbps

Seems to be in the right ballpark....
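
For completeness, the same arithmetic spelled out (a sketch, assuming the denominator is the full Ethernet frame on the wire, 1500 + 14-byte header + 4-byte FCS = 1518, and the numerator is the TCP payload left after the IPv4 and TCP headers):

awk 'BEGIN {
    payload = 1500 - 20 - 20   # TCP payload per full-size packet
    frame   = 1500 + 14 + 4    # Ethernet frame on the wire
    printf "down: %.2f Mbit/s  up: %.2f Mbit/s\n", 400 * payload / frame, 40 * payload / frame
}'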

Well, unless your ISP uses meaningful DSCP markings (or does not touch the DSCPs coming in from its upstreams), you should either set the ingress cake instance to a single-tier scheme or re-mark all packets automatically (though even that is not going to help with the ARP traffic). I believe the rationale for the ARP prioritization is that if one drops ARP packets (or delays them too much) in a normal situation, this will cause user-visible irritations...
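
In /etc/config/sqm terms that could look roughly like this (a sketch, not a tested config; it assumes the usual sqm-scripts advanced option pass-through and a layer_cake.qos setup):

config queue
	option interface 'eth1.20'
	option qdisc 'cake'
	option script 'layer_cake.qos'
	option qdisc_advanced '1'
	option qdisc_really_really_advanced '1'
	# single tin on ingress, keep diffserv only on egress
	option iqdisc_opts 'besteffort'
	option eqdisc_opts 'diffserv3'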

Maybe you could create a new issue on https://github.com/dtaht/sch_cake/issues proposing to make the ARP prioritization configurable via a keyword? The more data you supply, the better. :wink: