E8450/RT3200 gigabit speeds tweaking?

These routers are the best supported routers in OpenWrt, thanks to several developers and their great community.

I have a 1 Gbps/500 Mbps fiber connection and an MT7622-based router, like the RT3200. I don't have any particular settings on my router: no SQM, only hardware and software offloading enabled.
I run my tests on fast.com, and I don't see any problem at all...

Great that you are taking things with humor...

The point is that low-latency traffic shaping is quite CPU demanding (less because of the throughput than because of the low delay). The actual load depends on packet size, and the available CPU cycles depend on how much other work a router needs to do. Traffic shaping at 1 Gbps is quite a lot of work even with maximum-sized packets (1538 bytes on the wire for Ethernet):
1000*1000^2/((1500+38)*8) = 81274.4 pps
or
1000/(1000*1000^2/((1500+38)*8)) = 0.012304 milliseconds per packet...
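
To play with these numbers for other rates or packet sizes, here is a small sketch of the same arithmetic. The function names and the smaller example packet sizes are my own choices for illustration; the 38 bytes of per-packet Ethernet overhead is the figure used above.

```python
# Sketch: per-packet time budget for a traffic shaper (names are illustrative).
ETH_OVERHEAD_BYTES = 38  # per-packet Ethernet overhead assumed in the calculation above


def packets_per_second(rate_mbps, ip_packet_bytes=1500, overhead_bytes=ETH_OVERHEAD_BYTES):
    """Packets per second needed to fill rate_mbps with packets of the given size."""
    bits_on_wire = (ip_packet_bytes + overhead_bytes) * 8
    return rate_mbps * 1000**2 / bits_on_wire


def ms_per_packet(rate_mbps, ip_packet_bytes=1500, overhead_bytes=ETH_OVERHEAD_BYTES):
    """Time available to process one packet, in milliseconds."""
    return 1000.0 / packets_per_second(rate_mbps, ip_packet_bytes, overhead_bytes)


if __name__ == "__main__":
    # IP packet sizes in bytes; the two smaller ones are arbitrary examples.
    for size in (1500, 576, 100):
        pps = packets_per_second(1000, size)
        us = ms_per_packet(1000, size) * 1000
        print(f"{size:5d} B: {pps:12.1f} pps, {us:8.3f} us per packet")
```

Smaller packets shrink the per-packet budget quickly, which is why the load depends so much on packet size.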

IMHO traffic shaping at ~1 Gbps is possible with a few consumer-grade non-x86 routers, but typically there is little reserve for the unexpected, so depending on what else a router does you will fail to achieve the ~1 Gbps throughput.
For example, my Turris Omnia, when streamlined to do only minimal other chores (e.g. no WiFi) and with all the configuration tricks I managed to come up with (manually adjusted packet steering), allowed bidirectionally saturating traffic shaping at 550/550 Mbps (or unidirectional 1 Gbps), but adding a few more services degraded that shaping performance considerably. (My access link only runs at 116/37 Mbps, so even with other duties traffic shaping with cake is no problem.)

I guess the issue here is to come up with the correct expectations. Since 1 Gbps Ethernet interfaces have become ubiquitous and many cheap routers manage to do NAT/PPPoE/firewalling at 1 Gbps rates, one intuitively assumes that network processing at 1 Gbps is a piece of cake. But OEM firmware often only achieves throughput close to 1 Gbps by employing accelerators, which often are very specialized and only accelerate, say, PPPoE under very specific conditions and not generically. Think unencrypted PPPoE running like a bat out of hell, while encrypted PPPoE (I do not know of any ISP actually using that, so this is a thought experiment) would probably be punted to the router's main CPU and hence achieve considerably less throughput. The main CPUs of those routers often are not up to the task of doing much at 1 Gbps. Traffic shaping, however, typically is not something offered by those accelerators, so sqm/cake do not profit from them; running cake/sqm will therefore expose a router's raw CPU capabilities, which often are not as high as expected.

And for good reason, as far as I can tell it is the only/best-supported WiFi6 router under OpenWrt...

Well, we can have a look there if you want. I would need the output of:

  1. ifstatus wan
  2. cat /etc/config/sqm
  3. tc -s qdisc
  4. tc -d qdisc

as well as the link to a result of a dslreports speedtest, configured like this (please note the dslreports speedtest is somewhat in decline, but it still offers a few unique pieces of information).

It's not something we support in our drivers (afaik) and it will require quite specific tc setup (ie. using specific queuing disciplines), but it can actually be done in hardware by many common router SoCs including the MT7622:

HW QoS: Seamlessly co-work with HW NAT engine, SFQ w/ 1k queues
64 hardware queues to guarantee the min/max bandwidth of each flow

Is it something that could be supported in the future? Does this include the specific SoC, MT7622BV, in the RT3200?

Afaik all MT7622 variants should support that feature.
To support it in a future OpenWrt release someone would need to write a tc-offloading driver. The infrastructure for this is already present in the kernel, so it's probably not terribly hard to implement. Afaik nobody is working on that at the moment, and I have no idea if and how it is implemented in the MediaTek SDK kernel.

I guess for egress it would be enough to just offload the actual traffic shaper and use BQL to create back pressure into a normal kernel qdisc. For ingress, however, I am not sure that would work, and we would need to move the whole qdisc into the accelerator (like for the NSS cores in the R7800). All of this is, however, far outside my area of expertise... My personal solution is to use a primary router with a sufficiently powerful CPU for the required traffic shaping needs (often a Raspberry Pi 4B will do, or one of the alternative ARM-based SBCs with 2 Ethernet ports), and use something like the E8450/RT3200 as AP, but I understand that this is not really as attractive as having a single WiFi router that "does it all".

ALL I changed was my DHCP DNS address to 8.8.8.8 and 8.8.4.4 and: https://www.waveform.com/tools/bufferbloat?test-id=d8d170fc-c293-422c-93e2-d9f3f46b3d44

I think I found a bug

I went up in steps of 50 Mbps (entered in kbit) all the way to 1950. Up to 950 I was getting a latency below +10; above that it would go to around +90...

But if I go back to 950 and run a test on the bufferbloat site, it hits +90; if I then add a 0, apply, and remove the 0 again, it is under +10.

Well, different DNS servers can direct you to different nodes in Cloudflare's network, and these can be loaded differently. This might indicate that throughput and latency in the Waveform test are not limited by your own router...

It would be interesting to see:

  1. output of tc -s qdisc (copy and paste the output as 'Preformatted Text')
  2. run a speedtest and post the results here (copy and paste the result link)
  3. immediately after the speedtest finishes run tc -s qdisc again (copy and paste the output as 'Preformatted Text')

Possible, but still inconclusive, as it is not clear whether Waveform/Cloudflare works well for you. You can realistically only address the bufferbloat on your access link, but the Waveform test will report the aggregate bufferbloat, including bloat on segments of the path outside of your home network.

Give me 5, and it's Google, not Cloudflare.

Waveform uses Cloudflare for its data sinks and sources. My theory is that switching from your (ISP's?) default DNS servers to Google's DNS servers might simply steer you towards different Cloudflare nodes. Sorry, I probably should have elaborated on this.

No worries :slight_smile: I'm set to 800 Mbps download and 40 Mbps upload, and the bufferbloat test is:

But fast.com: 840 Mbps and speedtest.net: https://www.speedtest.net/result/12856762505.png, with settings download: 870400 and upload: 40960. I'm happy enough.

Sometimes these test pages give me high pings by mistake.

Don't blindly trust the first results.

Better to use a program like PingPlotter or MultiPing to check the bufferbloat while running those tests.
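
If you would rather collect round-trip times yourself (for example by running ping before and during a speedtest), a rough sketch of the comparison could look like this. The function name and the sample values are made up for illustration; using the median is my own choice so that a single spurious high ping does not dominate the result.

```python
# Sketch: estimate the latency increase under load from two sets of RTT samples.
from statistics import median


def latency_increase_ms(idle_rtts_ms, loaded_rtts_ms):
    """Median RTT under load minus median idle RTT, in milliseconds."""
    return median(loaded_rtts_ms) - median(idle_rtts_ms)


if __name__ == "__main__":
    # Hypothetical samples, not measurements from this thread.
    idle = [11.8, 12.1, 12.0, 12.4, 11.9]
    loaded = [19.5, 22.3, 101.7, 21.0, 20.4]  # one outlier on purpose
    print(f"latency increase under load: +{latency_increase_ms(idle, loaded):.1f} ms")
```

This only measures the path to whatever host you ping, so the same caveat applies: bloat outside your own access link will show up too.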

Don't expect wonders but this helps a bit :slight_smile:

These specific compilation switches could improve size, power, or performance. It would be nice if you could share the script to build with those optimizations or compare it against the standard build without them. If the optimization is effective and it is not generating any side effects, it could, and should, be included in the master.

Based on previous efforts, I've given up on pushing any kind of optimization that increases size, simply because it's not worth the effort. I think there are still people trying to get builds using NEON optimizations accepted, at least one year later. There have also been attempts to make x86-64 more modern, with little success.

That will all depend on the number of users using a given optimization. On the R7800 I see the NSS drivers coming soon as default due to the speed increase. The best you can do to get any optimization into master is to promote it.
Size is not a problem on these routers. I am sure that if you share your experiments here (scripts, results, etc.) they will be replicated by many people. A repo can be forked in seconds, and others could leverage your results to go even further.

Depending on your overhead configuration, the best you can expect to measure as IPv4/TCP throughput (what online speedtests measure, also called goodput) is:

800.0 * ((1500-20-20)/(1500+14)) = 771.47 Mbps
40.0 * ((1500-20-20)/(1500+14)) = 38.57 Mbps

so Waveform still undershoots, but that appears not to be your router's fault

870.400 * ((1500-20-20)/(1500+14)) = 839.36 Mbps
40.960 * ((1500-20-20)/(1500+14)) = 39.50 Mbps

Both Netflix's fast.com and Ookla's speedtest.net results look within expectations. For Ookla the challenge is that individual measurement servers only need to be connected via 1 Gbps Ethernet, and if you measure against such a node while other speedtests are running at the same time, it might be hard to saturate ...
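
For other shaper settings, the goodput estimate above can be wrapped in a small helper. This is only a sketch of the same formula; the function and parameter names are mine, and the defaults (1500-byte MTU, 20-byte IPv4 header, 20-byte TCP header, 14 bytes of per-packet overhead) are the values assumed in the calculations above.

```python
# Sketch: expected IPv4/TCP goodput for a given shaper rate (illustrative names).
def expected_goodput_mbps(shaper_rate_mbps, mtu=1500, ip_header=20,
                          tcp_header=20, link_overhead=14):
    """Shaper rate scaled by the TCP payload share of each packet on the link."""
    payload_bytes = mtu - ip_header - tcp_header
    accounted_bytes = mtu + link_overhead
    return shaper_rate_mbps * payload_bytes / accounted_bytes


if __name__ == "__main__":
    # Shaper rates in Mbps taken from the settings discussed above.
    for rate in (800.0, 40.0, 870.4, 40.96):
        print(f"{rate:8.2f} Mbps shaped -> {expected_goodput_mbps(rate):7.2f} Mbps goodput")
```

If your overhead setting differs from 14 bytes, pass it as link_overhead and the estimate shifts accordingly.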

Feel free to pick up the baton :wink:
I've moved on and spend most of my time on other projects...