SQM on Netgear R7800 problem

Hi all,

after some years and several updates on my R7800 I tried again to enable SQM.

So this is what I did:

  1. Upgraded to OpenWrt 19.07.2 without taking over the old settings, so I had to set up the router from scratch.
  2. Installed SQM according to the user guide
  3. Registered at dslreports.com and tweaked the settings as recommended by moeller0

First tests with SQM disabled ended with similar results:


Download: 129 Megabit/s = 129.000 Kilobit/s
Upload: 6 Megabit/s = 6.000 Kilobit/s

  1. Enabled SQM
    Interface name: eth0.2 (wan,wan6)
    Download speed (kbit/s): 122550 (95%)
    Upload speed (kbit/s): 5700 (95%)
    ...
    Queuing disciplines: fq_codel (default)
    Queue setup script: simplest.qos
    ...
    Per Packet Overhead (byte): 22
    ...

  2. The new speed test now shows a dramatically lower download speed than before.
    https://www.dslreports.com/speedtest/62332245

Any ideas?

I remember there was also a thread specifically about the R7800 and the KONG firmware, which included some discussion of SQM. One of the reasons given there was the CPU being a bottleneck. However, that was on the old firmware, and several users with the same router are getting much better results than I am. Somehow I cannot find that thread anymore.

Thanks for your help in advance.

One big reason is that in the "slow" test you lost the third upstream test server, which provided quite a lot of the speed...

Slow:

Quick one:


Some general comments:
Your asymmetric 129/6 connection is one basic problem. A roughly 21:1 ratio means that at full download speed much of your upload bandwidth likely goes to protocol traffic alone (mainly TCP ACKs).
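
To put a rough number on that, here is a minimal sketch. It assumes 1500-byte data packets, 66-byte ACKs (14 bytes Ethernet + 20 bytes IPv4 + 32 bytes TCP with timestamps) and one ACK per two data packets. These are assumptions, not measurements from this link, and the function name is made up for illustration.

```shell
#!/bin/sh
# Rough estimate of the upstream bandwidth consumed by TCP ACKs while a
# download saturates the link. All packet sizes are assumptions.

# ack_load_kbit DOWN_KBIT [DATA_BYTES] [ACK_BYTES] [SEGMENTS_PER_ACK]
ack_load_kbit() {
    down_kbit=$1
    data_bytes=${2:-1500}   # assumed size of a full data packet
    ack_bytes=${3:-66}      # assumed size of a bare ACK on the wire
    per_ack=${4:-2}         # assumed delayed ACK: one ACK per two packets

    # Data packets per second, divided by the number of packets per ACK.
    acks_per_sec=$(( down_kbit * 1000 / (data_bytes * 8) / per_ack ))
    # ACK bits per second, converted back to kbit/s.
    echo $(( acks_per_sec * ack_bytes * 8 / 1000 ))
}

ack_load_kbit 129000
```

With those assumptions the script prints 2838, i.e. about 2.8 Mbit/s of ACKs, which is close to half of a 6 Mbit/s upstream. Real numbers will differ with the actual ACK size, DOCSIS framing and how the receiver batches ACKs.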

In that fast test, your bufferbloat was huge, and latency during download averaged 550-600 ms, which makes the internet really sluggish.

Using SQM to restrict upload, even by that small amount, likely causes part of that drop. You might test going up a bit, from 95% to 97% or so.
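
For reference, a small sketch of the arithmetic behind those percentages, using the 129000/6000 kbit/s figures measured without SQM. The helper name is hypothetical; the values it prints go into the SQM download/upload fields.

```shell
#!/bin/sh
# shaper_rate_kbit MEASURED_KBIT PERCENT
# Integer share of the measured line rate to enter as the shaper rate.
shaper_rate_kbit() {
    echo $(( $1 * $2 / 100 ))
}

for pct in 95 97; do
    echo "$pct%: down $(shaper_rate_kbit 129000 "$pct") kbit/s, up $(shaper_rate_kbit 6000 "$pct") kbit/s"
done
```

At 95% this gives 122550/5700 kbit/s, the values already in use, and at 97% it gives 125130/5820 kbit/s.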

Also test the normal "simple.qos" instead of "simplest.qos".
And also try the cake qdisc with the piece_of_cake.qos script.

Ps. is that "per packet overhead" right (and needed) for you?

Give cake (with piece_of_cake.qos) a shot. Turn on ack-filter on the slow upstream path.

Hi you two,

thanks very much for your support. I tried your suggestions. The hardest part was restarting the speed test again and again until three server locations were successfully selected, so that 16 streams were used. Please find my results below:

  1. no SQM: https://www.dslreports.com/speedtest/62369628
  2. SQM: 122550(down); 5700(up); fq_codel; simplest.qos: https://www.dslreports.com/speedtest/62369678
  3. SQM: 122550(down); 5700(up); fq_codel; simple.qos: https://www.dslreports.com/speedtest/62369947
  4. SQM: 122550(down); 5700(up); cake; piece_of_cake.qos: https://www.dslreports.com/speedtest/62370022

What do you think?

Your asymmetric 129/6 connection is one basic problem.
...
Ps. is that "per packet overhead" right (and needed) for you?

I'm using a cable provider with 120 MBit/s down and 6 MBit/s up according to my contract. Therefore I also selected 22 as Overhead, correct?

Turn on ack-filter on the slow upstream path.

What does that mean?

In SQM, for cake (only), there are a few parameters that are not well supported by the GUI. If you are NATing, add nat. Adding ack-filter makes sense given a down/up ratio of more than 15:1. I don't use the overhead parameter if it is cable, just the docsis keyword.

    option iqdisc_opts 'docsis besteffort ingress nat'
    option eqdisc_opts 'docsis ack-filter nat'

Otherwise your tests look pretty good. I'm puzzled as to why you are having trouble getting all your flows started, however.

This is what ack-filter does: http://blog.cerowrt.org/post/ack_filtering/

More detail: https://arxiv.org/pdf/1804.07617.pdf

Your results are pretty much identical. You reach something like 115/5.3 Mbit/s with A+ on all of them.
Good, pretty much the maximum that you can achieve.

Just use the qdisc that feels best (and gives lowest CPU load). I guess that cake might be a bit more CPU intensive, but with your pretty low upload speed it should be manageable.

I've had similar issues on the R7800 with SQM cake enabled. Since I don't seem to have much bufferbloat on the download side, just on the upload (DOCSIS 3.0), I disabled shaping on the download by setting the bandwidth to 0. Now I get full speed in the download part of the speed test, and very little bloat in the upload part. I was not aware of the ack-filter. I'll try disabling the overhead and enabling the docsis options.

What modem do you use? By any chance one with the notorious Intel Puma 6/7 bug? I ask because I see weird >100 ms spikes in the idle and in the SQM-shaped results (these can be caused by measuring over wifi or by the Puma 6 bug)...

Hi everyone,

thanks very much again for your support.

Thanks for the advice. Even though I read the first blog article, I don't really understand what it does :wink: How would I measure the effect of this setting? Should I give it a try, even though cake uses much more CPU (see below)?

So I measured the %idle value via "top -d 1":
no SQM: ~80% down / ~95% up
fq_codel/simplest: ~80% down / ~93% up
fq_codel/simple: ~80% down / ~93% up
cake/piece_of_cake.qos: ~40% down / ~93% up
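
As a cross-check for the top readings, the idle share can also be computed from two samples of the aggregate "cpu" line in /proc/stat. This is only a sketch: the function names are made up, it assumes awk is available on the router (busybox provides it), and it reports the average over all cores, so a single saturated core on the dual-core R7800 can hide behind a figure like 40-50% idle.

```shell
#!/bin/sh
# idle_pct "FIRST_CPU_LINE" "SECOND_CPU_LINE"
# Both arguments are the aggregate "cpu ..." line from /proc/stat, taken at
# two points in time. Prints the idle percentage over that interval.
idle_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            # Fields 2..9 are user, nice, system, idle, iowait, irq,
            # softirq and steal; field 5 is the idle counter.
            total = 0
            for (i = 2; i <= NF && i <= 9; i++) total += $i
            if (NR == 1) { t0 = total; i0 = $5 }
            else         { t1 = total; i1 = $5 }
        }
        END {
            if (t1 > t0) printf "%d\n", (i1 - i0) * 100 / (t1 - t0)
            else print "n/a"
        }'
}

# sample_idle [SECONDS]
# Takes two samples SECONDS apart (default 1) and prints the idle percentage.
sample_idle() {
    first=$(grep '^cpu ' /proc/stat)
    sleep "${1:-1}"
    second=$(grep '^cpu ' /proc/stat)
    idle_pct "$first" "$second"
}
```

Running `sample_idle 5` in an SSH session during the download phase of the speed test gives one number per qdisc/script combination to compare.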

Would you then recommend the 2nd or 3rd option?

I'm using a rather old "Cisco EPC3208 EuroDocsis 3.0 2-PORT EMTA" with firmware e3200-E10-5-v302r125562-130611c_upc.
I don't see it listed on that bug page.

I would. I noticed earlier that cake is CPU intensive, which is why I highlighted the CPU issue. High CPU load from the qdisc might slow down the router's other functions.

But in the end it depends on your use case, the download practices and the other load for the router. Just try both for some time and pick the one that feels best...

Personally I use simple/fq_codel, but I have something like 190/19 Mbit, so my SQM needs are rather modest (with normal browsing & office apps).

With an R7800, the settings that gave me the best balance of throughput and latency (speed tests using master with kernel 4.19, ~2 months ago) were the following:

fq_codel + simplest.qos + software offloading enabled + more aggressive CPU settings:

fq_codel + simplest_tbf.qos + software offloading enabled + more aggressive CPU settings:


root@OpenWrt:~# echo 35 > /sys/devices/system/cpu/cpufreq/ondemand/up_threshold;
 echo 10 > /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor

root@OpenWrt:~# cat /etc/config/sqm

config queue 'eth0'
        option interface 'eth0.2'
        option qdisc 'fq_codel'
        option qdisc_advanced '0'
        option enabled '1'
        option download '540000'
        option debug_logging '0'
        option verbosity '5'
        option script 'simplest_tbf.qos'
        option linklayer 'none'
        option upload '34000'

I usually run with SQM only on the upload side (Same settings as above - just zero for download):

Regarding your last result: I'm pretty sure the bufferbloat component of the dslreports test has been failing for a lot of people lately, and I'm not sure why. It could be the link getting slammed so hard when you are not also shaping inbound. I'd be pretty interested in cake's behavior with ack-filter enabled on the uplink and no download shaping....


root@OpenWrt:~# uname -a
Linux OpenWrt 5.4.33 #0 SMP Sun Apr 19 07:26:12 2020 armv7l GNU/Linux

r7800, Performance CPU governor + software flow offloading enabled + irqbalance enabled + packet steering enabled + these SQM settings:


root@OpenWrt:~# cat /etc/config/sqm

config queue 'eth0'
        option interface 'eth0.2'
        option enabled '1'
        option debug_logging '0'
        option verbosity '5'
        option linklayer 'none'
        option upload '32500'
        option qdisc 'cake'
        option script 'piece_of_cake.qos'
        option qdisc_advanced '1'
        option squash_dscp '1'
        option squash_ingress '1'
        option ingress_ecn 'ECN'
        option egress_ecn 'NOECN'
        option qdisc_really_really_advanced '1'
        option eqdisc_opts 'docsis ack-filter nat'
        option download '0'

http://www.dslreports.com/speedtest/62492577

r7800, Performance CPU governor + software flow offloading enabled + irqbalance enabled + packet steering enabled + these SQM settings:


root@OpenWrt:~# cat /etc/config/sqm

config queue 'eth0'
        option interface 'eth0.2'
        option enabled '1'
        option debug_logging '0'
        option verbosity '5'
        option linklayer 'none'
        option upload '32500'
        option download '0'
        option qdisc 'fq_codel'
        option script 'simplest_tbf.qos'
        option qdisc_advanced '0'

http://www.dslreports.com/speedtest/62492769

Analysis: Middle of the afternoon download speed is a little below normal due to usual increased network load. Cake is grouchy about software offloading and/or 0 for download. :man_shrugging:

to me it looks like yer running out of cpu, and should stick to tbf + fq_codel

and one of these damn days we should summon the moxie to inbound shape across cores.

And outbound, pls: some of us are suffering with symmetrical fibre links :wink:

How big is the suffering on symmetrical fiber?

if someone would toss at least $4k/month into https://www.patreon.com/dtaht I'd get back more F/T in fixing bloat issues. Living in yurt got old multiple years back. Terribly chilly in the winters.

There was a smile in there. It will cost way less to throw a Xeon based server at the problem :wink:

and good for intel's sales too! :confused:

Another useful feature all these offload engines could have would be a programmable completion interrupt for tx. You'd set that to the rate you want to shape to, and completely eliminate the CPU cost of outbound shaping by treating it (with backpressure) as, for example, a 35 Mbit/s interface.

There are a few ethernet devices that can do this, including one from Intel, but I don't remember which one.

The use case for a symmetrical line is to work with the cloud (storage in particular): I often have to upload and download a lot of data. With fq_codel/simplest_tbf this router cannot reliably shape more than 200 Mbps on the uplink on a good day (download is not shaped), so I am losing a lot of bulk upload bandwidth in order to be able to have several video calls going through the router at the same time.
Having said that, this router might be the wrong tool for the job to begin with.

What does the dslreports speed test look like with 250000 / 250000, aggressive CPU settings (or the performance governor), zero overhead and no advanced SQM settings?