E8450/RT3200 gigabit speeds tweaking?

Well, I really hope this fix in the mt76 driver makes it into the first stable release for this device; it will have to land in master first.
I have been struggling to get gigabit throughput out of the R7800, and the E8450 does it straight out of the box without heavy CPU load and without getting as hot as other devices. However, CPU throttling still has to be added manually. It should be enabled by default for this device.

How're you getting 1gbps?

This is with CAKE enabled, layer cake and set to 500/50: https://www.waveform.com/tools/bufferbloat?test-id=ea7df2ef-b2c3-40b8-89f5-56e0247b3327

IRQ is installed. I set the governor as advised above to sched.
If I try fq_codel, the speeds are okayish for a 1 Gbps line, but the latency can rise by more than 100 ms.

Am I doing something wrong?

Firmware Version OpenWrt SNAPSHOT r19043-c2d7896a65 / LuCI Master git-22.058.70382-d29400e

@JimmyValentine What's your latency like with fq_codel?

using very basic settings with fq_codel and simple.qos (900 download/45 upload) I get this:

I think I might be able to tame the bufferbloat fluctuations even more if I start tuning some settings or just reduce the bandwidth further. But for now, it does the job.

Can you upload your settings please?

I am getting 938 Mbps:

Test Complete. Summary Results:
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-28.65  sec   533 MBytes   156 Mbits/sec                  sender
[  4]   0.00-28.65  sec  0.00 Bytes  0.00 bits/sec                  receiver
[  6]   0.00-28.65  sec   518 MBytes   152 Mbits/sec                  sender
[  6]   0.00-28.65  sec  0.00 Bytes  0.00 bits/sec                  receiver
[  8]   0.00-28.65  sec   307 MBytes  89.9 Mbits/sec                  sender
[  8]   0.00-28.65  sec  0.00 Bytes  0.00 bits/sec                  receiver
[ 10]   0.00-28.65  sec   287 MBytes  84.0 Mbits/sec                  sender
[ 10]   0.00-28.65  sec  0.00 Bytes  0.00 bits/sec                  receiver
[ 12]   0.00-28.65  sec   269 MBytes  78.9 Mbits/sec                  sender
[ 12]   0.00-28.65  sec  0.00 Bytes  0.00 bits/sec                  receiver
[ 14]   0.00-28.65  sec   248 MBytes  72.5 Mbits/sec                  sender
[ 14]   0.00-28.65  sec  0.00 Bytes  0.00 bits/sec                  receiver
[ 16]   0.00-28.65  sec   258 MBytes  75.5 Mbits/sec                  sender
[ 16]   0.00-28.65  sec  0.00 Bytes  0.00 bits/sec                  receiver
[ 18]   0.00-28.65  sec   274 MBytes  80.4 Mbits/sec                  sender
[ 18]   0.00-28.65  sec  0.00 Bytes  0.00 bits/sec                  receiver
[ 20]   0.00-28.65  sec   225 MBytes  65.8 Mbits/sec                  sender
[ 20]   0.00-28.65  sec  0.00 Bytes  0.00 bits/sec                  receiver
[ 22]   0.00-28.65  sec   286 MBytes  83.8 Mbits/sec                  sender
[ 22]   0.00-28.65  sec  0.00 Bytes  0.00 bits/sec                  receiver
[SUM]   0.00-28.65  sec  3.13 GBytes   938 Mbits/sec                  sender
[SUM]   0.00-28.65  sec  0.00 Bytes  0.00 bits/sec                  receiver

I am using quite a synthetic method to find the highest throughput: iperf3 running 10 parallel data streams between an iperf3 server on the WAN side and a Windows 11 laptop connected wirelessly on a 160 MHz channel, while I listen to online radio with a low buffer on another PC that is also connected to the E8450 wirelessly. This is pretty much the out-of-the-box performance, as I have not yet decided which SSL library to use and I am testing performance with different sets. The test was done using openssl, but I get the same results using wolfssl.
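
For anyone who wants to sanity-check a multi-stream result like the one above, here is a minimal Python sketch that adds up the per-stream sender rates and compares them with the [SUM] line. The sender lines are copied from the summary above; the regular expression and the function name `sender_rates` are my own assumptions for illustration, not part of iperf3.

```python
import re

# Sender lines copied from the iperf3 summary above (receiver lines omitted).
IPERF3_SUMMARY = """\
[  4]   0.00-28.65  sec   533 MBytes   156 Mbits/sec                  sender
[  6]   0.00-28.65  sec   518 MBytes   152 Mbits/sec                  sender
[  8]   0.00-28.65  sec   307 MBytes  89.9 Mbits/sec                  sender
[ 10]   0.00-28.65  sec   287 MBytes  84.0 Mbits/sec                  sender
[ 12]   0.00-28.65  sec   269 MBytes  78.9 Mbits/sec                  sender
[ 14]   0.00-28.65  sec   248 MBytes  72.5 Mbits/sec                  sender
[ 16]   0.00-28.65  sec   258 MBytes  75.5 Mbits/sec                  sender
[ 18]   0.00-28.65  sec   274 MBytes  80.4 Mbits/sec                  sender
[ 20]   0.00-28.65  sec   225 MBytes  65.8 Mbits/sec                  sender
[ 22]   0.00-28.65  sec   286 MBytes  83.8 Mbits/sec                  sender
[SUM]   0.00-28.65  sec  3.13 GBytes   938 Mbits/sec                  sender
"""

# Matches "[ ID] interval sec transfer unit RATE Mbits/sec sender".
# Hypothetical pattern written for this summary layout only.
LINE_RE = re.compile(
    r"^\[\s*(\d+|SUM)\]\s+\S+\s+sec\s+[\d.]+\s+\w+\s+([\d.]+)\s+Mbits/sec\s+sender"
)


def sender_rates(summary):
    """Return {stream id: sender rate in Mbits/sec} for every matching line."""
    rates = {}
    for line in summary.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            rates[match.group(1)] = float(match.group(2))
    return rates


rates = sender_rates(IPERF3_SUMMARY)
reported_sum = rates.pop("SUM")      # the total iperf3 printed itself
stream_total = sum(rates.values())   # the total of the individual streams

print(f"{len(rates)} streams add up to {stream_total:.1f} Mbits/sec; "
      f"iperf3 reported {reported_sum:.0f} Mbits/sec")
```

The two totals should agree to within rounding, since iperf3 rounds each per-stream figure before printing it.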

This is the list of custom packages I used in imagebuilder:
nano-plus htop ncdu iperf3 irqbalance auc ca-certificates -wpad-basic-wolfssl wpad-wolfssl openvpn-wolfssl luci-ssl -wpad-basic-wolfssl -libustream-wolfssl -px5g-wolfssl wpad-openssl libustream-openssl luci-ssl-openssl luci luci-compat luci-mod-dashboard luci-app-attendedsysupgrade luci-app-vnstat2 luci-app-nlbwmon luci-app-adblock luci-app-banip luci-app-bcp38 luci-app-commands luci-app-ddns ddns-scripts-noip luci-app-openvpn -luci-ssl-openssl luci-app-sqm luci-app-wireguard luci-app-upnp luci-app-uhttpd luci-app-statistics collectd-mod-conntrack collectd-mod-cpu collectd-mod-cpufreq collectd-mod-dhcpleases collectd-mod-entropy collectd-mod-exec collectd-mod-interface collectd-mod-iwinfo collectd-mod-load collectd-mod-memory collectd-mod-network collectd-mod-ping collectd-mod-rrdtool collectd-mod-sqm collectd-mod-thermal collectd-mod-uptime collectd-mod-wireless blockd cryptsetup e2fsprogs f2fs-tools kmod-fs-exfat kmod-fs-ext4 kmod-fs-f2fs kmod-fs-hfs kmod-fs-hfsplus kmod-fs-msdos kmod-fs-nfs kmod-fs-nfs-common kmod-fs-nfs-v3 kmod-fs-nfs-v4 kmod-fs-vfat kmod-nls-base kmod-nls-cp1250 kmod-nls-cp437 kmod-nls-cp850 kmod-nls-iso8859-1 kmod-nls-iso8859-15 kmod-nls-utf8 kmod-usb-storage kmod-usb-storage-uas libblkid ntfs-3g nfs-utils ip6tables-mod-nat 6in4 6rd 6to4 ip6tables-nft

@YesNO please think through the superb and carefully crafted advice by @elan above, which will have taken time to put together.

How would that change if I throw a VPN in the mix? If I try to maximize OpenVPN (client) or Wireguard speed on this router, should I enable flow offloading?

fq_codel isn't that "obsolete" that it requires bolding (especially in this context).

Although cake is hyped here, it is not perfect for all situations, as it may try a bit too much classification and is thus too CPU intensive at the highest speeds, as you imply.

For Gigabit speeds the SQM simple.qos/fq-codel (or simplest.qos) may offer the needed QoS but be much lighter for the CPU.

Fully agree with that :slight_smile:
The measured top speed is pretty irrelevant for most users, as the real-life internet traffic will never reach that except for really short bursts (or the speedtests).

Lol, nah. I've just paid £90 for this after people on IRC said this is the recommended router to get. It's crap: it can't reach anywhere near 1 Gbps and the latency is a joke!

So one of cake's goals was to make setting up competent AQM simple for novice users (reducing the need for, and complexity of, set-up scripts like sqm-scripts); IMHO it mostly succeeded.
The other big goal, however, was reducing the CPU cycle cost of the often-required traffic shaper, and that part did not succeed. In the end cake is even more CPU hungry than HTB+fq_codel; in fairness, it also does more. But doing more is not helpful when CPU cycles are scarce.

As it stands, neither is obsolete, and neither is as clear a win over the other as fq_codel was over single-queue codel.

@elan
is this a good candidate ? X86 and 2.5gbps ports, intel i225 nics.

No. Quote ALL of what I said, not just a bit.
People over on Reddit and IRC are recommending this router. I have no idea why, you can't reach 1gbps with it and even if you use fq_codel, your latency is crap!

I might as well go back to my old router, at least I could hit 1gbps with it. lmao

This is WITHOUT the CAKE script running so NO fq_codel: https://www.waveform.com/tools/bufferbloat?test-id=cf9da65d-5e2b-4202-bfab-b4a12d2ae7c1

What a joke of a system!!!!!!

can you try to do your test on https://fast.com/ ?

I have a 1 Gbps/500 Mbps fiber connection and an MT7622-based router, like the RT3200. I don't have any particular settings on my router: no SQM, only hardware and software offloading enabled.
I do my tests on fast.com, and I don't see any problem at all.

Great that you are taking things with humor...

The point is that low-latency traffic shaping is quite CPU demanding (not so much the throughput, but the low delay). The actual load depends on packet size, and the available CPU cycles depend on how much other work a router needs to do. Traffic shaping at 1 Gbps is quite a lot of work even with maximum-sized packets (1538 bytes on the wire for Ethernet):
1000*1000^2/((1500+38)*8) = 81274.4 pps
or
1000/(1000*1000^2/((1500+38)*8)) = 0.012304 milliseconds per packet...
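
The same arithmetic can be repeated for other packet sizes with a small Python sketch like the one below. The 38 bytes of per-packet overhead and the 1500-byte case come from the calculation above; the function names and the smaller packet sizes in the loop are assumptions picked purely for illustration.

```python
def packets_per_second(rate_mbps, payload_bytes, overhead_bytes=38):
    """Packets per second needed to fill rate_mbps at the given packet size.

    overhead_bytes=38 is the per-packet Ethernet overhead used in the
    calculation above.
    """
    bits_per_packet = (payload_bytes + overhead_bytes) * 8
    return rate_mbps * 1000**2 / bits_per_packet


def microseconds_per_packet(rate_mbps, payload_bytes, overhead_bytes=38):
    """Time budget per packet, in microseconds, at that packet rate."""
    return 1e6 / packets_per_second(rate_mbps, payload_bytes, overhead_bytes)


# 1500 is the case worked out above; 576 and 100 are illustrative sizes only.
for size in (1500, 576, 100):
    pps = packets_per_second(1000, size)
    budget = microseconds_per_packet(1000, size)
    print(f"{size:5d} byte packets: {pps:10.1f} pps, {budget:7.3f} us per packet")
```

For 1500-byte packets this gives the same 81274.4 pps as above, i.e. roughly 12.3 microseconds per packet. Smaller packets shrink that budget further, which is why the load depends so much on packet size.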

IMHO traffic shaping at ~1 Gbps is possible with a few consumer-grade non-x86 routers, but typically there is little reserve for the unexpected, so depending on what else a router does you will fail to achieve the ~1 Gbps throughput.
For example, my Turris Omnia, when streamlined to do only minimal other chores (like no WiFi) and with all the configuration tricks I managed to come up with (manually adjusted packet steering), allowed bidirectionally saturating traffic shaping at 550/550 Mbps (or unidirectional 1 Gbps). Adding a few more services degraded that shaping performance considerably. (My access link only runs at 116/37 Mbps, so even with other duties traffic shaping with cake is no problem.)

I guess the issue here is setting the correct expectations. Since 1 Gbps Ethernet interfaces have become ubiquitous and many cheap routers manage NAT/PPPoE/firewalling at 1 Gbps rates, one intuitively assumes that network processing at 1 Gbps is a piece of cake. But OEM firmware often only achieves throughput close to 1 Gbps by employing accelerators, and the main CPUs of those routers often are not up to the task of doing much at 1 Gbps. These accelerators are often very specialized and only accelerate, say, PPPoE under very specific conditions and not generically: think unencrypted PPPoE running like a bat out of hell, while encrypted PPPoE (I do not know of any ISP actually using that, so this is a thought experiment) would probably be punted to the router's main CPU and hence achieve considerably less throughput. Traffic shaping, however, is typically not something those accelerators offer, so sqm/cake do not benefit from them, and hence running cake/sqm will expose a router's raw CPU capabilities, which often are not as high as expected.

And for good reason, as far as I can tell it is the only/best-supported WiFi6 router under OpenWrt...

Well, we can have a look there if you want. I would need the output of:

  1. ifstatus wan
  2. cat /etc/config/sqm
  3. tc -s qdisc
  4. tc -d qdisc

as well as the link to a result of a dslreports speedtest, configured like this (please note the dslreports speedtest is somewhat in decline, but it still offers a few unique pieces of information).

It's not something we support in our drivers (afaik) and it will require quite a specific tc setup (i.e. using specific queuing disciplines), but it can actually be done in hardware by many common router SoCs, including the MT7622:

HW QoS: Seamlessly co-work with HW NAT engine, SFQ w/ 1k queues
64 hardware queues to guarantee the min/max bandwidth of each flow

Is it something that could be supported in the future? Does this include the specific SoC in the RT3200, the MT7622BV?
