I am using quite a synthetic method to see the highest throughput: iperf3 sending 10 parallel data streams from an iperf3 server on the WAN to a Windows 11 laptop connected wirelessly on a 160 MHz channel, while I listen to online radio with a low buffer on another PC that is also connected to the E8450 wirelessly. This is pretty much the performance out of the box, as I have not yet decided which SSL library to use and I am testing performance with different sets. The test was done using OpenSSL, but I am getting the same results using wolfSSL.
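For reference, a test along those lines can be run like this (the server address below is a placeholder, not the actual WAN server used above):

```shell
# 10 parallel TCP streams for 30 seconds against an iperf3 server
# (192.0.2.1 is a placeholder address)
iperf3 -c 192.0.2.1 -P 10 -t 30
```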
If you can't buy an x86 router that shows you 1 Gbps on Speedtest while you're using CAKE, then why don't you limit your bandwidth to 400 or 500 Mbps and live happy?
I don't understand the urgency you have in trying to get 1 Gbps when 99% of the time you will NEVER use that speed, because you will only see it on Speedtest.
You have these two choices until you buy an x86 router:
Get more speed using the old and obsolete qdisc "fq_codel".
Use the new and better qdisc "CAKE" (but you have to limit your bandwidth to about half, because the router CPU can't handle that speed), completely FIX the bufferbloat, and also prioritize your important traffic using the new Qosify package.
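For the second option, a minimal /etc/config/sqm for luci-app-sqm could look something like this (interface name and rates here are only examples; use your actual WAN device and roughly half your line rate in kbit/s):

```
config queue 'eth1'
	option enabled '1'
	option interface 'wan'
	option download '450000'
	option upload '450000'
	option qdisc 'cake'
	option script 'piece_of_cake.qos'
```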
P.S. If you use packages for traffic shaping (QoS, SQM, Qosify, etc.), I recommend that you NEVER use "Software flow offloading" or "Hardware flow offloading" options to try to get more performance because it comes with drawbacks and if you want more performance, then build an x86 router.
I don't know if the offloading options improve the speed when you are using a VPN on the router, but I think the speed depends more on the encryption protocol you are using.
If you're not using any packages for traffic shaping (QoS, SQM, Qosify, etc.) and you just want to try to get more speed or bandwidth because the router CPU becomes a bottleneck due to lack of performance, you can try the offloading options.
This is why fq_codel is obsolete:
CAKE is easy to configure.
CAKE fixes the bufferbloat better than fq_codel.
CAKE already has categories to prioritize traffic by default. (Use Qosify to take advantage of this feature.)
CAKE divides bandwidth equally among all devices.
CAKE has an interface that shows all information and is easy to understand.
CAKE is one level higher than fq_codel.
It's not CAKE's fault that people buy a router that can't handle 1 Gbps speeds.
If you have 1 Gbps of bandwidth and you want to use CAKE as your queue discipline, you must have a good x86 router sized for that bandwidth, plus suitable switches and access points, so you don't have problems and don't end up blaming OpenWrt or CAKE because your router can't handle 1 Gbps.
So one of cake's goals was to make setting up competent AQM simple for novice users (reducing the need for, and complexity of, set-up scripts like sqm-scripts); IMHO it mostly succeeded.
The other big goal, however, was reducing the CPU cycle cost of the often-required traffic shaper, and that part did not succeed; in the end cake is even more CPU hungry than HTB+fq_codel. In fairness it also does more, but doing more is not helpful when CPU cycles are scarce.
As it stands, neither is obsolete, and neither is as clear a win over the other as fq_codel was over single-queue codel.
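To illustrate the difference in moving parts, shaping to 450 Mbit/s with each approach looks roughly like this (device name and rate are placeholders; sqm-scripts normally generates this setup for you):

```shell
# cake: shaper and AQM combined in one qdisc
tc qdisc replace dev eth0 root cake bandwidth 450Mbit

# HTB+fq_codel: a separate shaper (HTB) with fq_codel as the leaf qdisc
tc qdisc replace dev eth0 root handle 1: htb default 10
tc class add dev eth0 parent 1: classid 1:10 htb rate 450Mbit
tc qdisc add dev eth0 parent 1:10 fq_codel
```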
No. Quote ALL of what I said, not just a bit.
People over on Reddit and IRC are recommending this router. I have no idea why, you can't reach 1gbps with it and even if you use fq_codel, your latency is crap!
I might as well go back to my old router, at least I could hit 1gbps with it. lmao
I have a 1 Gbps/500 Mbps fiber connection and an MT7622-based router, like the RT3200. I don't have any particular settings in my router, no SQM, only hardware and software offloading enabled.
I do my tests on fast.com, and I don't see any problem at all...
The point is that low-latency traffic shaping is quite CPU demanding (not so much the throughput, but the low delay), the actual load depends on packet size, and the available CPU cycles depend on how much other work a router needs to do. Traffic shaping at 1 Gbps is quite a lot of work even with maximum-sized packets (1538 bytes on the wire for Ethernet): 1000*1000^2/((1500+38)*8) = 81274.4 pps,
or 1000/(1000*1000^2/((1500+38)*8)) = 0.012304 milliseconds per packet...
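The arithmetic above can be checked quickly, e.g. with awk:

```shell
# Packets per second and per-packet time budget when shaping at 1 Gbit/s
# with maximum-size Ethernet frames (1500 byte MTU + 38 bytes of framing)
awk 'BEGIN {
    rate_bps   = 1000 * 1000 * 1000   # 1 Gbit/s
    frame_bits = (1500 + 38) * 8      # bits per full-size frame on the wire
    pps = rate_bps / frame_bits
    printf "%.1f packets per second\n", pps
    printf "%.6f ms per packet\n", 1000 / pps
}'
# prints:
# 81274.4 packets per second
# 0.012304 ms per packet
```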
IMHO traffic shaping at ~1 Gbps is possible with a few consumer-grade non-x86 routers, but typically there is little reserve for the unexpected, so depending on what else the router does you will fail to achieve the ~1 Gbps shaped throughput.
For example my Turris Omnia, when streamlined to do only minimal other chores (like no WiFi) and with all the configuration tricks I managed to come up with (manually adjusted packet steering), allowed bidirectionally saturating traffic shaping at 550/550 Mbps (or 1 Gbps unidirectionally), but adding a few more services degraded that shaping performance considerably (my access link only runs at 116/37 Mbps, so even with other duties traffic shaping with cake is no problem).
I guess the issue here is to come up with the correct expectations. Since 1 Gbps Ethernet interfaces have become ubiquitous, and many cheap routers manage to do NAT/PPPoE/firewalling at 1 Gbps rates, one intuitively assumes that network processing at 1 Gbps is a piece of cake. But OEM firmware often only achieves throughput close to 1 Gbps by employing accelerators, which tend to be very specialized and only accelerate, say, PPPoE under very specific conditions and not generically: think unencrypted PPPoE running like a bat out of hell, while encrypted PPPoE (I do not know of any ISP actually using that, so this is a thought experiment) would probably be punted to the router's main CPU and hence achieve considerably less throughput. Once one realises that, it is clear the main CPUs of those routers often are not up to the task of doing much at 1 Gbps... Traffic shaping, however, typically is not something offered by those accelerators, so sqm/cake do not profit from these, and running cake/sqm will expose a router's raw CPU capabilities, which often are not as high as expected.
And for good reason, as far as I can tell it is the only/best-supported WiFi6 router under OpenWrt...
Well, we can have a look there if you want. I would need the output of:
tc -s qdisc
tc -d qdisc
as well as a link to a result of a dslreports speedtest, configured like this (please note the dslreports speedtest is somewhat in decline, but it still offers a few unique pieces of information).
It's not something we support in our drivers (afaik) and it will require a quite specific tc setup (i.e. using specific queuing disciplines), but it can actually be done in hardware by many common router SoCs, including the MT7622:
HW QoS: Seamlessly co-work with HW NAT engine, SFQ w/ 1k queues
64 hardware queues to guarantee the min/max bandwidth of each flow
Afaik all MT7622 variants should support that feature.
To support it in future OpenWrt, someone would need to write a tc-offloading driver. The infrastructure for this is already present in the kernel, so it's probably not terribly hard to implement. Afaik nobody is working on that at the moment; I also have no idea if and how it is implemented in the MediaTek SDK kernel.
I guess for egress it would be enough to just offload the actual traffic shaper and use BQL to create back pressure into a normal kernel qdisc. For ingress, however, I am not sure that would work, and we might need to move the whole qdisc into the accelerator (as with the NSS cores in the R7800). All of this is, however, far outside my area of expertise... My personal solution is to use a primary router with a CPU powerful enough for the required traffic shaping (often a Raspberry Pi 4B will do, or one of the alternative ARM-based SBCs with 2 Ethernet ports) and use something like the E8450/RT3200 as an AP, but I understand that this is not really as attractive as having a single WiFi router that "does it all".