Similar issues here!
Just upgraded to an Archer C2600 because my old WDR4300 ran out of resources.
I always had great ping and bufferbloat results on my WDR4300: http://www.dslreports.com/speedtest/40858371
With the new C2600, the best result I can get is much worse: http://www.dslreports.com/speedtest/41761077
According to htop, one core runs at 100% during a speed test.
I have a VDSL2 Vectoring connection.
Even when I limit my speed to 50%, ping and bufferbloat stay the same.
I also changed the CPU governor to performance, so both cores are fixed at 1400 MHz.
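(For reference, setting that looks roughly like this via the standard cpufreq sysfs interface; the exact paths can differ per target:)

```sh
# pin both cores to the performance governor (standard cpufreq sysfs paths)
for gov in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
    echo performance > "$gov"
done
```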
I hope there's a chance to fix this.
This is very sad for such an otherwise very powerful device.
For now I would prefer to stay with my WDR4300, even though it has much worse WiFi.
It generally is better to start a new thread for a new issue (you can post a link to a related thread if you believe things to be very similar).
Questions:
1.) Which ISP are you using?
2.) What are the reported sync values in the dsl-modem?
3.) Are you running the pppoe-client on the modem or on the router?
Okay, thanks. Should I create a new thread now or answer here?
My ISP is Deutsche Telekom.
My modem is a VMG1312-B30A in Full Bridge Mode, so OpenWrt has to do PPPoE.
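(For reference, a wan section in /etc/config/network for this kind of setup looks roughly like the sketch below; VLAN 7 is what Telekom typically expects, and the credentials are placeholders.)

```
config interface 'wan'
        option proto 'pppoe'
        # PPPoE session inside VLAN 7 towards the bridged modem
        option ifname 'eth0.7'
        # placeholder credentials
        option username 'user@t-online.de'
        option password 'secret'
```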
Modem sync details
Mode: VDSL2 Annex B
VDSL Profile: Profile 17a
G.Vector: Enable
Traffic Type: PTM Mode
===============================================================
VDSL Port Details           Upstream        Downstream
Line Rate:                  41.998 Mbps     109.999 Mbps
Actual Net Data Rate:       41.999 Mbps     110.000 Mbps
Trellis Coding:             ON              ON
SNR Margin:                 15.0 dB         14.8 dB
Actual Delay:               0 ms            0 ms
Transmit Power:             -4.5 dBm        12.3 dBm
Receive Power:              -5.7 dBm        7.1 dBm
Actual INP:                 38.0 symbols    40.0 symbols
Total Attenuation:          0.0 dB          5.6 dB
Attainable Net Data Rate:   51.124 Mbps     138.162 Mbps
Ideally a new thread; I took the liberty of asking the moderators to move this part into a new thread to expedite things.
Great, in that case the overhead is either 34 bytes on top of the pppoe-wan interface (recommended) or 26 bytes on top of eth0.
Please enable ppp debugging (by replacing the "#debug" line in /etc/ppp/options with "debug"; you could use the following command: "sed -i 's/#debug/debug/g' /etc/ppp/options") and then do "logread | grep -e SRD". The PPP ACK will contain an estimate of the achievable goodput, so take the SRD and SRU values and multiply them:
Uplink: SRU * (1526)/(1500-8-20-20)
Downlink: SRD * (1526)/(1500-8-20-20)
Note these are the theoretical maximal shaper gross rates, so I would start with 99% of the value you just calculated for the uplink and 90% for the downlink.
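Not a must, but the arithmetic can be scripted; a small sketch (the SRU/SRD numbers below are purely illustrative placeholders, use the values logread actually reports):

```sh
# placeholders: replace with the SRU/SRD values from "logread | grep -e SRD"
SRU=41999
SRD=109999

awk -v sru="$SRU" -v srd="$SRD" 'BEGIN {
    # gross rate = goodput * 1526 / (1500 - 8 - 20 - 20)
    factor = 1526 / (1500 - 8 - 20 - 20)
    printf "upload   shaper (99%%): %.0f\n", sru * factor * 0.99
    printf "download shaper (90%%): %.0f\n", srd * factor * 0.90
}'
```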
The next step after getting overhead and bandwidth configured is to look at the finer details of what cake can offer (see https://openwrt.org/docs/guide-user/network/traffic-shaping/sqm-details for an overview, especially the section titled "Making cake sing and dance, on a tight rope without a safety net").
Thanks for your help!
According to this wiki page, I used 38 as overhead because my ISP requires VLAN tagging.
Is this correct, or was the VLAN tag already included in your calculation?
I did some dslreports tests with your recommended settings on both routers:
WDR4300:
C2600:
As you can see, the C2600 still behaves worse.
Maybe the C2600 really needs some fixes like the R7800, as you mentioned in the other thread?
Or the hardware is just not able to perform better.
I think that may be a decent result now, but I expected the same or an even better result with newer hardware.
I see the baseline RTT seems to be almost twice as high with the C2600 as with the WDR4300, but in both cases the latency under load seems to be pretty flat, which I would chalk up as a success.
Not sure; it would be interesting to look at the CPU load on the router while you perform a speedtest. You might want to have a look at https://forum.openwrt.org/t/speedtest-new-package-to-measure-network-performance/24647/36 which introduced a package that will run a speedtest from your router and also monitor CPU load and CPU frequency. This might give an indication of whether your router is running out of steam. I will add that this only averages CPU load over 1-second blocks, so it will not show load spikes << 1 second, which still might negatively influence sqm performance. There is also https://github.com/dlakelan/routerperf by @dlakelan but this is in an early alpha stage...
Anyway, to get all the bells and whistles that cake offers tested:
Here is my proposed replacement for your /etc/config/sqm to enable per-internal-IP fairness, nat-lookup and ingress-awareness; this will also enable ECN on outbound traffic (since your uplink seems fast enough):
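A sketch of what such a config might look like with the stock sqm-scripts options (the rates and the script choice here are placeholders; use the numbers calculated above and double-check the option names against your installed sqm version):

```
config queue 'wan'
        option enabled '1'
        option interface 'pppoe-wan'
        # placeholder rates in kbit/s; replace with the values calculated from SRD/SRU
        option download '104000'
        option upload '43700'
        option qdisc 'cake'
        option script 'layer_cake.qos'
        option linklayer 'ethernet'
        option overhead '34'
        option qdisc_advanced '1'
        option ingress_ecn 'ECN'
        option egress_ecn 'ECN'
        option squash_dscp '1'
        option squash_ingress '1'
        option qdisc_really_really_advanced '1'
        option iqdisc_opts 'nat dual-dsthost ingress'
        option eqdisc_opts 'nat dual-srchost'
```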
1.) Stop the current sqm instance: /etc/init.d/sqm stop
2.) Edit /etc/config/sqm with the editor of your choice (I like nano; if not installed, just run opkg update ; opkg install nano to get hold of an editor that is both less capable and more user-friendly than vi)
3.) Start sqm again: /etc/init.d/sqm start
4.) Check (and post) the output of: tc -s qdisc
Give this a try and report back any comments you might have.
"nat" will allow cake to get to the true internal and external addresses wich seems important for the ingress shaper
"dual-xxxhost" will make cake first try to split the available bandwidth even between all concurrently active hosts (the way configured here will try to give each internal address an equal share of the bandwidth, this mode while super simple often comes close enough to what people want so they stop searching for the last ounce of QoS detail)
"ingress" will instruct cake to not try the customary approach where a shaper tries to enforce its outgoing bandwidth, but rather its incoming bandwidth. A subtle difference that makes cake deal better with different number of flows on the ingress side.
I used your sqm config for now, but I think I prefer per-stream over per-IP fairness.
But anyway, here are the results with your config.
They don't look much different to me.
2018-11-15 17:16:02 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are sequential, each with 5 simultaneous streams.
............................................................
Download: 70.65 Mbps
Latency: [in msec, 61 pings, 0.00% packet loss]
Min: 11.540
10pct: 12.203
Median: 18.489
Avg: 19.734
90pct: 26.607
Max: 44.805
CPU Load: [in % busy (avg +/- std dev), 57 samples]
cpu0: 46.0% +/- 6.9%
cpu1: 85.2% +/- 4.0%
Overhead: [in % total CPU used]
netperf: 21.2%
.............................................................
Upload: 37.25 Mbps
Latency: [in msec, 61 pings, 0.00% packet loss]
Min: 11.598
10pct: 12.201
Median: 13.659
Avg: 14.166
90pct: 14.788
Max: 24.127
CPU Load: [in % busy (avg +/- std dev), 58 samples]
cpu0: 43.0% +/- 1.9%
cpu1: 21.2% +/- 5.5%
Overhead: [in % total CPU used]
netperf: 7.4%
speedtest.sh --concurrent
2018-11-15 17:27:37 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are concurrent, each with 5 simultaneous streams.
............................................................
Download: 60.71 Mbps
Upload: 34.52 Mbps
Latency: [in msec, 61 pings, 0.00% packet loss]
Min: 22.621
10pct: 26.431
Median: 30.994
Avg: 34.386
90pct: 41.274
Max: 71.895
CPU Load: [in % busy (avg +/- std dev), 56 samples]
cpu0: 87.0% +/- 0.0%
cpu1: 91.2% +/- 0.0%
Overhead: [in % total CPU used]
netperf: 31.2%
I've also monitored both CPU clocks while testing. Mostly stable at 1400 MHz, with just some drops to 1200 MHz.
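(The clocks can be watched with something as simple as this, assuming the standard cpufreq sysfs paths:)

```sh
# print the current frequency of both cores once per second during the test
while sleep 1; do
    cat /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq
done
```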
But how can this produce such a high load?
This device has about twice the clock speed per core and twice as many cores as my WDR4300.
Your network, your decision ;). I guess I should note that the dual-xxxhost isolation modes do both: first-level fairness by host IP address, and then per-stream/per-flow fairness for each IP address. So with only one IP address active, the dual modes behave like per-stream fairness.
And yes, traffic shaping unfortunately is expensive...
Only two devices here need multiple streams with high bandwidth and low latency.
So these two devices with the most streams should get the most bandwidth.
All other clients only do unimportant casual stuff.
But for now I will keep your settings and watch how they behave in practice.
Thank you very much for your help and the explanation of the sqm settings.
Can you tell me how the overhead is calculated? I still wonder about the additional bytes for VLAN tagging.
What's the conclusion now about the C2600?
Does it just have a higher RTT, making it not the best choice if 20 ms are of importance?
Sure, then you might not need the dual-xxxhost isolation modes, but they might still save your bacon when one of the other devices decides to use a significant number of concurrently active flows/streams (which might never happen in reality).
Well, for VDSL2/PTM the only method I know is to look into the relevant ITU standards (in the G. series). Once you understand what is actually packaged into those PTM-frames you can start to make educated guesses...
Well, you also need to take into account things you know about the link (in the DTAG case, the fact that a dual VLAN tag is used at the BNG and a single VLAN tag is used on the VDSL2 link itself).
Since I described this in a similar thread, let me just quote myself from https://forum.openwrt.org/t/sqm-flow-offloading-vlan-tagging-and-gaming/25113/14?u=moeller0:
"Well, on a Telekom vdsl2-link the actual overhead on top of the pppoe-wan device is actually 34 Bytes (8 bytes for PPPoE, 22 bytes for the ethernet frame (src-mac(6), dst-mac(6), ethertype(2), frame-check-sequence(4), VLAN(4))) and 4 bytes for the PTM overhead."
"DTAG actually uses a traffic shaper at the BNG/BRAS level, [...]" that accounts for the double VLAN tag which makes up for the "missing" PTM overhead.
It also helps that for the longest time 1TR112 documented 1526 as the maximum frame size on DSL links, but that got changed recently to 1590 or so. It is super unlikely that this has any relevance for a Telekom-branded link at the moment (I expect the use of baby jumbo frames in the future, or MTU 1508 towards the BNG so that the internet-visible MTU increases to 1500, but that is idle speculation).
Since I have too little experience with this router I will withhold judgement.
The performance is good so far. It performs only slightly worse than my old router with these optimized settings, thanks to @moeller0. And this is only under load on the WAN interface. Everything else runs as expected.
I think I wouldn't call it an issue anymore.