Hello. I am looking to build in some redundancy for a network of about 20 devices. To do so, I am trying to get a router powerful enough to handle 2.5 Gbps of throughput with SQM (cake) enabled. Before picking the brains of the forum, I wondered if there was a calculation that I could use to determine the CPU power I would need to achieve my goal. I do not know if one exists, but I couldn't find one. Currently I am thinking of getting a device with an Intel N305 CPU and two Intel i226-V NICs (I am sure that there are AMD devices that would fit the bracket too). Any suggestions on how the required CPU power should be calculated, and/or which devices you would suggest?
What do you use SQM for at this level of speed? Useless in my experience.
I agree, it's obsolete even at 1 Gbit, at least for me.
ISPs typically design their backbones without traffic shapers or AQM, but take care that the load typically does not exceed the capacity. (This can be achieved with, e.g., traffic shapers at the ingress links, and by closely monitoring the traffic load and increasing the capacity.)
On a home link there are typically limits on how much capacity is available, but the same logic still applies: if you never saturate your link, a traffic shaper / AQM is not going to do much for you. If you expect saturating loads occasionally or more often, then you might actually notice whether you run SQM or not.
SQM at 2.5G sounds like something we won't see on embedded hardware until 2030...
For x86-64, short of going crazy on power consumption with an Intel i5, maybe the new N5105 (edit: N100 now), with i226-V ethernet, that might get close.
For the ARM side, probably nothing. Maybe the NanoPi R6S can get close.
My ISP is providing me a good 1 Gbit connection and they have impeccable transit / peering. I have been able to saturate that link frequently just by downloading from Steam (actually any major CDN is able to saturate my link). Without SQM, I would have frequent lags and ping spikes, making any multiplayer game unplayable.
As expected, it is not the internet access rate per se, but whether that rate is ever saturated or not...
I had a look at the N5105, the N305 seems to be the successor: Intel Core i3-N305 vs. Intel Celeron N5105 - Cpu Benchmark Specs & Test (cpu-benchmark.org) with a bit more frequency to play with.
Yeah, agreed. What is your lag currently? I assume you are seeking to reduce/eliminate bufferbloat?
Good catch, I'm seeing N305 boxes in the $400-500 range with i226-V (2.5Gbit) like this one:
[https://www.amazon.com/Mrroute-i3-N305-Fanless-Screens-Computer/dp/B0CFLZH2BC/]
Would be interesting to see power draw. There are some i3-1251U setups on there too, at the same price. It's a mixed bag with those random x64 boxes though.
Without SQM on a 900 Mbps line, my bufferbloat can reach 100 ms. With SQM I can reach 900 Mbps with no additional bufferbloat. That is on an Intel i7-8550U CPU.
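To put that 100 ms into perspective: the extra delay is just the amount of data waiting in a buffer divided by the link rate. Below is a minimal Python sketch of that arithmetic, using the 900 Mbps and 100 ms figures from this post. The function names are made up for illustration, and it assumes a single FIFO queue in front of the bottleneck.

```python
def queued_bytes(delay_s: float, rate_bps: float) -> float:
    """Bytes that must be waiting in a queue to add delay_s at rate_bps."""
    return delay_s * rate_bps / 8


def queue_delay_s(backlog_bytes: float, rate_bps: float) -> float:
    """Extra latency caused by backlog_bytes waiting on a link of rate_bps."""
    return backlog_bytes * 8 / rate_bps


if __name__ == "__main__":
    # 100 ms of bufferbloat on a 900 Mbps line
    backlog = queued_bytes(0.100, 900e6)
    print(f"backlog: {backlog / 1e6:.2f} MB")
    # The same backlog drained at 2.5 Gbps
    print(f"delay at 2.5 Gbps: {queue_delay_s(backlog, 2.5e9) * 1e3:.0f} ms")
```

So 100 ms at 900 Mbps corresponds to roughly 11 MB sitting in a buffer somewhere. A faster link drains the same backlog sooner, but an oversized buffer can still fill up whenever the link is saturated.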
Power draw on the N305 is 15W. Beelink is selling their version for £299 (with a £100 discount)
And don't ignore the twice as many cores part...
With sqm and failover at 2.5 GBit/s, I wouldn't even look below higher-end i3 or better medium range i5 hardware. Even slightly dated SFF (not USFF) systems from the big four (Dell, Fujitsu, HP, Lenovo) tend to have the performance and the (low-profile-) PCIe slots to have fun - obviously the sky is the limit.
From my understanding, cake is not really multi-threaded. I think an N305 with 8 cores will provide roughly 5-10% higher performance in your use case compared to any 4-core N95/N100/N200, due to its ~400 MHz higher turbo. You have to decide whether that ~5% performance is worth double the price, as N100 devices are around $150 or less on AliExpress. Long story short, IF an N100 is inadequate, then an N305 probably won't cut it either. At that point your options are a "big" core Alder Lake or newer CPU, or some AMD.
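I don't know of a formula that predicts cake's CPU cost, but if one core has to handle a direction by itself, you can at least work out the per-packet budget that core gets. A rough Python sketch follows. The 3.4 and 3.8 GHz clocks are illustrative assumptions for the ~400 MHz turbo difference, not measured sustained clocks. The budget has to cover everything done per packet (driver, NAT, conntrack, cake), and it says nothing about how many cycles cake actually needs.

```python
# Per-frame Ethernet overhead on the wire, in bytes:
# 14 header + 4 FCS + 8 preamble + 12 inter-frame gap
WIRE_OVERHEAD = 38


def packets_per_second(rate_bps: float, ip_packet_bytes: int = 1500) -> float:
    """Packets per second needed to fill rate_bps with packets of this size."""
    return rate_bps / ((ip_packet_bytes + WIRE_OVERHEAD) * 8)


def cycles_per_packet(clock_hz: float, rate_bps: float,
                      ip_packet_bytes: int = 1500) -> float:
    """CPU cycles a single core can spend on each packet at line rate."""
    return clock_hz / packets_per_second(rate_bps, ip_packet_bytes)


if __name__ == "__main__":
    rate = 2.5e9
    pps = packets_per_second(rate)
    print(f"{pps:,.0f} packets/s, one every {1e6 / pps:.2f} us")
    for clock in (3.4e9, 3.8e9):  # assumed turbo clocks, for illustration only
        print(f"{clock / 1e9:.1f} GHz: {cycles_per_packet(clock, rate):,.0f} cycles/packet")
```

With full-size packets that is about 200,000 packets per second per direction. Smaller packets shrink the budget quickly, so treat this as a best case.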

From my understanding, cake is not really multi-threaded.
Indeed, it relies on the qdisc lock IIRC (might be called differently) so at best you can put each cake instance on a separate CPU. Also if cake operates as traffic shaper, there is the inconvenient fact that even multithreaded cake instances on different CPUs would need to synchronize their sharing of the available capacity... so not sure how much independence is actually achievable....
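To illustrate the synchronization point, here is a toy Python sketch of a shaper whose rate limit is one piece of shared state. It is a generic token bucket, NOT cake's actual shaper, and the class name is made up for this post. It only shows why several threads serving the same shaped link would still have to serialize on something.

```python
import threading


class SharedTokenBucket:
    """Generic token-bucket shaper (illustration only, not cake's algorithm)."""

    def __init__(self, rate_bps: float, burst_bytes: int):
        self.rate_bytes_per_s = rate_bps / 8
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)
        self.last = 0.0
        # Every thread that wants to send must take this lock, because
        # the remaining capacity is a single shared number.
        self.lock = threading.Lock()

    def try_send(self, packet_bytes: int, now: float) -> bool:
        """Return True if the packet may be sent at time `now` (seconds)."""
        with self.lock:
            elapsed = max(0.0, now - self.last)
            self.tokens = min(float(self.burst_bytes),
                              self.tokens + elapsed * self.rate_bytes_per_s)
            self.last = max(self.last, now)
            if self.tokens >= packet_bytes:
                self.tokens -= packet_bytes
                return True
            return False


if __name__ == "__main__":
    bucket = SharedTokenBucket(rate_bps=2.5e9, burst_bytes=3000)
    print(bucket.try_send(1500, now=0.0))   # burst allows this one
    print(bucket.try_send(1500, now=0.0))   # and this one
    print(bucket.try_send(1500, now=0.0))   # bucket is empty now
    print(bucket.try_send(1500, now=5e-6))  # refilled after 5 us
```

Two fully independent instances, one per direction, don't share such a bucket, which is why putting them on separate CPUs works.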
Been a while since I tested the N5105, but I think it can do cake at 2.5 Gbps in both directions during a flent rrul test. It's in use now on a 1 Gbit symmetrical line, so I can't run 2.5GbE tests.
And I've just tested a nanopi r6s with some irq and cpu governor tweaks that can do 1.7-1.8Gbps in both directions during a flent rrul test.
Both tested with cake, layer cake, with NAT and dual-dsthost/dual-srchost options.
If you’re considering x86 hardware, I recommend spending a little extra for something more powerful, like an AMD 5800U or 5625U based mini PC. I did have an N95-based one, and SQM worked up to a gigabit (my connection), but think about future-proofing when you are spending that much money on one particular box. For example, if you decide to run snort on the network: the N95-based PC was insufficient to maintain speeds much over 500 Mbps, while the 5800U can do that at gigabit LAN speeds with around 40% CPU load.

What do you use SQM for at this level of speed? Useless in my experience.
If the connection is not symmetric, then even with a very high bandwidth in the download direction, the upload direction may still present a challenge. I also wonder about ensuring flow fairness, even if just over small time periods. If a Steam download, Windows update or other large download can saturate even very high bandwidth connections, then there is still the issue of how to prevent that from interfering with other concurrent flows for the duration of those heavy downloads.

Also if cake operates as traffic shaper, there is the inconvenient fact that even multithreaded cake instances on different CPUs would need to synchronize their sharing of the available capacity... so not sure how much independence is actually achievable....
How so? With a cake instance on a download interface and a cake instance on an upload interface why does available capacity need to be shared? How do you assign cake instances to different cores? Would this be possible on the RT3200, which has two cores?
Supposedly irqbalance can assign resources across cores, but I don't know if it actually works on the RT3200.
Receive packet steering can be used to define which CPUs software interrupts can run on, which allows you to move independent cake instances onto different CPUs. That works today. What people mean by multithreaded (and what I tried to address in what you cited) is that one cake instance (e.g. for internet ingress) runs not as a single thread on a single CPU, but is split into multiple threads on potentially multiple CPUs.
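For reference, RPS is configured per receive queue by writing a hexadecimal CPU bitmap to `/sys/class/net/<iface>/queues/rx-<n>/rps_cpus`. Here is a small Python sketch that builds such a mask. The helper names and the interface names `eth0`/`eth1` are placeholders, it assumes fewer than 32 CPUs (larger systems use comma-separated groups), and which interface and queue matter for a given cake instance depends on your setup. It defaults to a dry run and only prints the equivalent shell command.

```python
from pathlib import Path


def rps_mask(cpus: list[int]) -> str:
    """Hex CPU bitmap for rps_cpus, e.g. CPUs [0, 1] -> '3'."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def rps_path(iface: str, rx_queue: int = 0) -> Path:
    """sysfs file that holds the RPS CPU mask for one receive queue."""
    return Path(f"/sys/class/net/{iface}/queues/rx-{rx_queue}/rps_cpus")


def set_rps(iface: str, cpus: list[int], rx_queue: int = 0,
            dry_run: bool = True) -> str:
    """Return the equivalent shell command; write the mask unless dry_run."""
    path = rps_path(iface, rx_queue)
    mask = rps_mask(cpus)
    if not dry_run:
        path.write_text(mask)  # needs root on the router
    return f"echo {mask} > {path.as_posix()}"


if __name__ == "__main__":
    # Hypothetical two-core split: one interface per CPU
    print(set_rps("eth0", [0]))
    print(set_rps("eth1", [1]))
```

On a two-core device the idea would be to steer one interface's receive processing to CPU 0 and the other's to CPU 1, then check under load whether the softirq load actually spreads out.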
As @dtaht reminded us recently, we (or rather I) lack precise knowledge of what is so costly in cake, making it hard to come up with a solution... Brute force, aka multiple beefy CPUs, can certainly help...
My gut feeling is that cake's issue is not necessarily its persistent CPU load, but the fact that it needs timely CPU access. I'm not sure, though, whether that is correct and how/if this helps with conversion to multiple threads...