Similar issues here!
Just upgraded to an Archer C2600 because my old WDR4300 ran out of resources.
I always had great ping and bufferbloat results on my WDR4300: http://www.dslreports.com/speedtest/40858371
With the new C2600, the best result I can get is much worse: http://www.dslreports.com/speedtest/41761077
According to htop, one core runs at 100% during a speed test.
I have a VDSL2 Vectoring connection.
Even when I limit my speed to 50%, ping and bufferbloat stay the same.
I also changed the CPU governor to performance, so both cores are fixed at 1400 MHz.
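(For reference, setting that looks roughly like this via the standard cpufreq sysfs interface; the exact paths can differ per target:)

```sh
# pin both cores to the performance governor (standard cpufreq sysfs paths)
for gov in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
    echo performance > "$gov"
done
```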
I hope there's a chance to fix this.
This is very sad for such an otherwise very powerful device.
For now I would prefer to stay with my WDR4300, even though it has much worse WiFi.
It generally is better to start a new thread for a new issue (you can post a link to a related thread if you believe things to be very similar).
Questions:
1.) Which ISP are you using?
2.) What are the reported sync values in the dsl-modem?
3.) Are you running the pppoe-client on the modem or on the router?
Okay, thanks. Should I create a new thread now or answer here?
My ISP is Deutsche Telekom.
My modem is a VMG1312-B30A in Full Bridge Mode, so OpenWrt has to do PPPoE.
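(For reference, a wan section in /etc/config/network for this kind of setup looks roughly like the sketch below; VLAN 7 is what Telekom typically expects, and the credentials are placeholders.)

```
config interface 'wan'
        option proto 'pppoe'
        # PPPoE session inside VLAN 7 towards the bridged modem
        option ifname 'eth0.7'
        # placeholder credentials
        option username 'user@t-online.de'
        option password 'secret'
```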
Modem sync details
Mode: VDSL2 Annex B
VDSL Profile: Profile 17a
G.Vector: Enable
Traffic Type: PTM Mode
===============================================================
VDSL Port Details           Upstream        Downstream
Line Rate:                  41.998 Mbps     109.999 Mbps
Actual Net Data Rate:       41.999 Mbps     110.000 Mbps
Trellis Coding:             ON              ON
SNR Margin:                 15.0 dB         14.8 dB
Actual Delay:               0 ms            0 ms
Transmit Power:             -4.5 dBm        12.3 dBm
Receive Power:              -5.7 dBm        7.1 dBm
Actual INP:                 38.0 symbols    40.0 symbols
Total Attenuation:          0.0 dB          5.6 dB
Attainable Net Data Rate:   51.124 Mbps     138.162 Mbps
Ideally a new thread; I took the liberty of asking the moderators to move this part into a new thread to expedite things.
Great, in that case the overhead is either 34 bytes on top of the pppoe-wan interface (recommended) or 26 bytes on top of eth0.
Please enable ppp debugging (by replacing the "#debug" line in /etc/ppp/options with "debug"; you could use the following command: "sed -i 's/#debug/debug/g' /etc/ppp/options") and then do "logread | grep -e SRD". The PPP ACK will contain an estimate of the achievable goodput, so take the SRD and SRU values and multiply them:
Uplink: SRU * (1526)/(1500-8-20-20)
Downlink: SRD * (1526)/(1500-8-20-20)
Note these are the theoretical maximal shaper gross rates, so I would start with 99% of the value you just calculated for the uplink and 90% for the downlink.
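Not a must, but the arithmetic can be scripted; a small sketch (the SRU/SRD numbers below are purely illustrative placeholders, use the values logread actually reports):

```sh
# placeholders: replace with the SRU/SRD values from "logread | grep -e SRD"
SRU=41999
SRD=109999

awk -v sru="$SRU" -v srd="$SRD" 'BEGIN {
    # gross rate = goodput * 1526 / (1500 - 8 - 20 - 20)
    factor = 1526 / (1500 - 8 - 20 - 20)
    printf "upload   shaper (99%%): %.0f\n", sru * factor * 0.99
    printf "download shaper (90%%): %.0f\n", srd * factor * 0.90
}'
```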
The next step after getting overhead and bandwidth configured is to look at the finer details of what cake can offer (see https://openwrt.org/docs/guide-user/network/traffic-shaping/sqm-details for an overview, especially the section titled "Making cake sing and dance, on a tight rope without a safety net").
Thanks for your help!
According to this wiki page, I used 38 as overhead because my ISP requires VLAN tagging.
Is this correct, or was the VLAN tag already included in your calculation?
I did some dslreports tests with your recommended settings on both routers:
WDR4300:
C2600:
As you can see, the C2600 still behaves worse.
Maybe the C2600 really needs some fixes like the R7800, as you mentioned in the other thread?
Or the hardware is just not able to perform better.
I think that may be a decent result now, but I expected the same or an even better result with newer hardware.
I see the baseline RTT seems to be almost twice as high with the C2600 as with the WDR4300, but in both cases the latency under load seems to be pretty flat, which I would chalk up as a success.
Not sure; it would be interesting to look at the CPU load on the router while you perform a speedtest. You might want to have a look at https://forum.openwrt.org/t/speedtest-new-package-to-measure-network-performance/24647/36 which introduced a package that will run a speedtest from your router and also monitor CPU load and CPU frequency. This might give an indication of whether your router is running out of steam. I will add that this only averages CPU load over 1-second blocks, so it will not show load spikes << 1 second, which still might negatively influence sqm performance. There is also https://github.com/dlakelan/routerperf by @dlakelan but this is in an early alpha stage...
Anyway, to get all the bells and whistles that cake offers tested:
Here is my proposed replacement for your /etc/config/sqm to enable per-internal-IP fairness, nat-lookup and ingress-awareness; this will also enable ECN on outbound traffic (since your uplink seems fast enough):
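A sketch of what such a config might look like with the stock sqm-scripts options (the rates and the script choice here are placeholders; use the numbers calculated above and double-check the option names against your installed sqm version):

```
config queue 'wan'
        option enabled '1'
        option interface 'pppoe-wan'
        # placeholder rates in kbit/s; replace with the values calculated from SRD/SRU
        option download '104000'
        option upload '43700'
        option qdisc 'cake'
        option script 'layer_cake.qos'
        option linklayer 'ethernet'
        option overhead '34'
        option qdisc_advanced '1'
        option ingress_ecn 'ECN'
        option egress_ecn 'ECN'
        option squash_dscp '1'
        option squash_ingress '1'
        option qdisc_really_really_advanced '1'
        option iqdisc_opts 'nat dual-dsthost ingress'
        option eqdisc_opts 'nat dual-srchost'
```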
1.) Stop the current sqm instance: /etc/init.d/sqm stop
2.) Edit /etc/config/sqm with the editor of your choice (I like nano; if not installed, just run opkg update ; opkg install nano to get hold of an editor that is both less capable and more user-friendly than vi)
3.) Start sqm again: /etc/init.d/sqm start
4.) Check (and post) the output of: tc -s qdisc
Give this a try and report back any comments you might have.
"nat" will allow cake to get to the true internal and external addresses wich seems important for the ingress shaper
"dual-xxxhost" will make cake first try to split the available bandwidth even between all concurrently active hosts (the way configured here will try to give each internal address an equal share of the bandwidth, this mode while super simple often comes close enough to what people want so they stop searching for the last ounce of QoS detail)
"ingress" will instruct cake to not try the customary approach where a shaper tries to enforce its outgoing bandwidth, but rather its incoming bandwidth. A subtle difference that makes cake deal better with different number of flows on the ingress side.
I used your sqm config for now, but I think I prefer per-stream over per-IP fairness.
But anyway, here are the results with your config.
They don't look much different to me.
2018-11-15 17:16:02 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are sequential, each with 5 simultaneous streams.
............................................................
Download: 70.65 Mbps
Latency: [in msec, 61 pings, 0.00% packet loss]
Min: 11.540
10pct: 12.203
Median: 18.489
Avg: 19.734
90pct: 26.607
Max: 44.805
CPU Load: [in % busy (avg +/- std dev), 57 samples]
cpu0: 46.0% +/- 6.9%
cpu1: 85.2% +/- 4.0%
Overhead: [in % total CPU used]
netperf: 21.2%
.............................................................
Upload: 37.25 Mbps
Latency: [in msec, 61 pings, 0.00% packet loss]
Min: 11.598
10pct: 12.201
Median: 13.659
Avg: 14.166
90pct: 14.788
Max: 24.127
CPU Load: [in % busy (avg +/- std dev), 58 samples]
cpu0: 43.0% +/- 1.9%
cpu1: 21.2% +/- 5.5%
Overhead: [in % total CPU used]
netperf: 7.4%
speedtest.sh --concurrent
2018-11-15 17:27:37 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are concurrent, each with 5 simultaneous streams.
............................................................
Download: 60.71 Mbps
Upload: 34.52 Mbps
Latency: [in msec, 61 pings, 0.00% packet loss]
Min: 22.621
10pct: 26.431
Median: 30.994
Avg: 34.386
90pct: 41.274
Max: 71.895
CPU Load: [in % busy (avg +/- std dev), 56 samples]
cpu0: 87.0% +/- 0.0%
cpu1: 91.2% +/- 0.0%
Overhead: [in % total CPU used]
netperf: 31.2%
I've also monitored both CPU clocks while testing. Mostly stable at 1400 MHz, with just some drops to 1200 MHz.
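(The clocks can be watched with something as simple as this, assuming the standard cpufreq sysfs paths:)

```sh
# print the current frequency of both cores once per second during the test
while sleep 1; do
    cat /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq
done
```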
But how can this produce such a high load?
This device has about twice the clock speed per core and twice as many cores as my WDR4300.
Your network, your decision ;). I guess I should note that the dual-xxxhost isolation modes do both: first-level fairness by host IP address, and then per-stream/per-flow fairness for each IP address. So with only one IP address active, the dual modes behave like per-stream fairness.
And yes, traffic shaping unfortunately is expensive...
Only two devices here need multiple streams with high bandwidth and low latency.
So these two devices with the most streams should get the most bandwidth.
All other clients only do unimportant casual stuff.
But for now I will keep your settings and watch how they behave in practice.
Thank you very much for your help and the explanation of the sqm settings.
Can you tell me how the overhead is calculated? I still wonder about the additional bytes for VLAN tagging.
What's the conclusion now about the C2600?
Does it just have a higher RTT, making it not the best choice if 20 ms are of importance?
Sure, then you might not need the dual-xxxhost isolation modes, but they might still save your bacon when one of the other devices decides to use a significant number of concurrently active flows/streams (which might never happen in reality).
Well, for VDSL2/PTM the only method I know is to look into the relevant ITU standards (in the G. series). Once you understand what is actually packaged into those PTM-frames you can start to make educated guesses...
Well, you also need to take into account things you know about the link (in the DTAG case, the fact that a dual VLAN tag is used at the BNG and a single VLAN tag is used on the VDSL2 link itself).
Since I described this in a similar thread, let me just quote myself from https://forum.openwrt.org/t/sqm-flow-offloading-vlan-tagging-and-gaming/25113/14?u=moeller0:
"Well, on a Telekom vdsl2-link the actual overhead on top of the pppoe-wan device is actually 34 Bytes (8 bytes for PPPoE, 22 bytes for the ethernet frame (src-mac(6), dst-mac(6), ethertype(2), frame-check-sequence(4), VLAN(4))) and 4 bytes for the PTM overhead."
"DTAG actually uses a traffic shaper at the BNG/BRAS level, [...]" that accounts for the double VLAN tag which makes up for the "missing" PTM overhead.
It also helps that for the longest time 1TR112 documented 1526 as the maximum frame size on DSL links, but that got changed recently to 1590 or so. It is super unlikely that this has any relevance for a Telekom-branded link at the moment (I expect the use of baby jumbo frames in the future, or MTU 1508 towards the BNG so that the internet-visible MTU increases to 1500, but that is idle speculation).
Since I have too little experience with this router I will withhold judgement.
The performance is good so far. It performs only slightly worse than my old router with these optimized settings, thanks to @moeller0. And this is only under load on the WAN interface. Everything else runs as expected.
I think I wouldn't call it an issue anymore.