2.5Gbps Throughput & SQM, what do I need?

At 2.5 Gbps the fancy stuff Cake does may be less important. You might do well with a simple TBF and fq_codel.

Imagine you have a pfifo with a length of 1000 packets (i.e. a default on some Linux machines). At 1500 bytes per packet and 8 bits per byte, the time to completely drain the queue is:

1500*8*1000/2.5e9 = .0048 seconds

So, really you shouldn't be able to get more than 5ms of delay unless your upstream ISP bottleneck device has much larger than default buffers.
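
As a rough sketch, the same arithmetic can be repeated for other link rates. The function name and the list of rates are made up for illustration; the 1000-packet queue and 1500-byte packets are the assumptions from the post above.

```python
def drain_time_s(queue_packets: int, packet_bytes: int, link_bps: float) -> float:
    """Seconds needed to transmit a completely full FIFO at line rate."""
    return queue_packets * packet_bytes * 8 / link_bps


# Worst-case delay of a full 1000-packet pfifo at a few illustrative rates.
for label, bps in [("10 Mbps", 10e6), ("100 Mbps", 100e6),
                   ("1 Gbps", 1e9), ("2.5 Gbps", 2.5e9)]:
    print(f"{label:>9}: {drain_time_s(1000, 1500, bps) * 1000:.1f} ms")
```

The same queue that costs under 5 ms at 2.5 Gbps takes over a second to drain at 10 Mbps, which is why the default matters so much more on slow links.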

A TBF keeping the rate at ~ 90% of achievable capacity and fq_codel to do fair queuing should be sufficient for most purposes.
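
A minimal sketch of what such a setup might look like, written as a small Python helper that only builds the `tc` command strings (it does not run them). The interface name `eth0`, the handles, the 5 ms `latency` value and the burst size (about 1 ms worth of data at the shaped rate) are all assumptions for illustration, not tested recommendations; this covers egress only.

```python
def tbf_fq_codel_commands(dev: str, link_bps: int, percent: int = 90) -> list[str]:
    """Build tc commands for a TBF shaper with fq_codel as its inner qdisc."""
    shaped_bps = link_bps * percent // 100   # e.g. 90% of achievable capacity
    rate_mbit = shaped_bps // 1_000_000
    # Assumption: allow roughly 1 ms of data at the shaped rate as burst.
    burst_bytes = shaped_bps // 8 // 1000
    return [
        # Root TBF shaper; a bare number for burst is interpreted as bytes.
        f"tc qdisc add dev {dev} root handle 1: tbf "
        f"rate {rate_mbit}mbit burst {burst_bytes} latency 5ms",
        # fq_codel attached as TBF's inner qdisc for fair queuing + AQM.
        f"tc qdisc add dev {dev} parent 1:1 handle 10: fq_codel",
    ]


for cmd in tbf_fq_codel_commands("eth0", 2_500_000_000):
    print(cmd)
```

Check the burst value against your own hardware before relying on it; too small a burst can keep TBF from reaching the configured rate.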

Can fq_codel run on multiple cores?

No, it cannot, same locking issue...
However, the costliest thing about cake is not the fq scheduler or the AQM but the traffic shaper, and fq_codel does not have a shaper built in. TBF is a simpler shaper than cake's and is easier to rig for less CPU cost. However, one can run cake (in unlimited mode) as a leaf qdisc behind a TBF shaper... But that TBF shaper will also not be multithreaded...

I am again willing to show my lack of deep understanding, but a traffic shaper that intends to set a hard limit for egress rate is hard to run cooperatively, each thread needs to know what the other threads already committed... somehow.
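
To illustrate why this is hard to parallelize, here is a generic token bucket sketch (not the kernel's TBF code; the class and method names are made up). Every sender has to go through the same token counter, so the check-and-spend step is serialized behind one lock no matter how many threads call it.

```python
import threading


class SharedTokenBucket:
    """Generic token bucket; all callers share one token counter."""

    def __init__(self, rate_bps: float, burst_bytes: int):
        self.rate_bytes_per_s = rate_bps / 8
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.last = 0.0                   # time of the last refill
        self.lock = threading.Lock()

    def try_send(self, packet_bytes: int, now: float) -> bool:
        # The lock is the point: each thread must see what the others
        # have already committed before it may spend tokens itself.
        with self.lock:
            elapsed = max(0.0, now - self.last)
            self.tokens = min(self.burst_bytes,
                              self.tokens + elapsed * self.rate_bytes_per_s)
            self.last = max(self.last, now)
            if self.tokens >= packet_bytes:
                self.tokens -= packet_bytes
                return True
            return False


bucket = SharedTokenBucket(rate_bps=8_000, burst_bytes=3_000)
print([bucket.try_send(1500, now=0.0) for _ in range(3)])
```

Splitting the rate into per-thread buckets would avoid the lock, but then an idle thread's share goes unused, which is a different trade-off than a single hard limit.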

Also even at 1Gbps a single packet is 1500*8/1e9 = .000012 seconds (12 microseconds). So if a packet sneaks ahead of your important packet it delays it like 1% of what you can "feel" (you can just barely feel around 1ms, according to delays that are tolerable for musicians playing MIDI instruments). On the other hand at 10Mbps if a packet sneaks ahead of you it's 1500*8/10e6 = .0012 = 1.2 ms which is exactly on the boundary of what you're going to notice.

So basically at gigabits and above you can slop around a few hundred packets and no one will likely notice, at 10Mbps you can't even get one packet misordered before someone starts to feel it.

Cake's smarts will be more important in the range 100Mbps and below, in the 300Mbps and above kind of range (1500*8/300e6 = .00004 s = 0.04 ms) you have at least tens of packets you can slop around.
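
A small sketch of the per-packet numbers above, assuming 1500-byte packets and taking the roughly 1 ms "you can barely feel it" threshold as the budget. The function names are illustrative only.

```python
def serialization_time_s(link_bps: int, packet_bytes: int = 1500) -> float:
    """Time one packet occupies the link, in seconds."""
    return packet_bytes * 8 / link_bps


def packets_within_budget(link_bps: int, budget_us: int = 1000,
                          packet_bytes: int = 1500) -> int:
    """Whole packets that can be sent within the delay budget (integer math)."""
    return budget_us * link_bps // (packet_bytes * 8 * 1_000_000)


for mbps in (10, 100, 300, 1000, 2500):
    bps = mbps * 1_000_000
    print(f"{mbps:>5} Mbps: {serialization_time_s(bps) * 1e6:8.1f} us/packet, "
          f"{packets_within_budget(bps):>3} packets per 1 ms")
```

At 10 Mbps not even one full-size packet fits into the 1 ms budget, while at 300 Mbps about 25 do, matching the "tens of packets" estimate.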

For 2.5 Gbps I'd guess a TBF and fq_codel will be fine for almost everyone. The only thing I would say is that you might benefit from a QFQ with 3 or 4 bins below the TBF so you can still reorder high priority packets.

Answering this with some real world numbers... an i5 1235U should do it easily. I have this one. Below are some graphs of a download on my 1000/35 internet link. I am doing CAKE on both ingress and egress. You can see it uses barely any CPU and only about +8W power consumption while downloading.

Network: [graph: network]

CPU: [graph: cpu]

Power consumption: [graph: powercap]

Thanks. This graph gives me an idea of just how much compute power I need to look at.

Kinda pointless, but if you insist on running Cake at 2.5Gbit, start with an x86 model with high single-core scores (shaping is not easy to parallelize by design). I tested Cake on an Intel(R) Celeron(R) N5105 @ 2.00GHz and it will almost do 1Gbit.

This is an argument that has been around for like forever (because it is not completely beside the point). However, I think it helps to clarify this:
If you never noticeably saturate a link, then there is no queue built up and hence active queue management (a core component of cake/fq_codel) will essentially do nothing. That is true, but also completely decoupled from the actual link capacity... if your load never exceeds 80 Mbps cake will be useless for any link > 80 Mbps.
Now, e.g. TCP tries to fill the available pipe, so a long running TCP flow will take a considerable fraction of even a 2.5 Gbps link, add a few of these in parallel and you might even saturate a 2.5 Gbps link. At which point running an AQM might immediately make sense again.
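
The point that queueing depends on utilisation rather than on raw capacity can be sketched with a very crude fluid model. This assumes a constant offered load and an unbounded buffer; real TCP flows back off on loss or delay, so treat the numbers as illustration only. The function name is made up.

```python
def backlog_delay_ms(offered_bps: int, capacity_bps: int, seconds: float) -> float:
    """Queueing delay (ms) after `seconds` of constant load, fluid model."""
    # A queue only builds while the offered load exceeds link capacity.
    excess_bps = max(0, offered_bps - capacity_bps)
    backlog_bits = excess_bps * seconds
    return backlog_bits / capacity_bps * 1000


# 80 Mbps of load never queues, whether the link is 100 Mbps or 2.5 Gbps.
print(backlog_delay_ms(80_000_000, 100_000_000, 1.0))
print(backlog_delay_ms(80_000_000, 2_500_000_000, 1.0))
# Offering 3 Gbps to a 2.5 Gbps link for 100 ms builds a standing queue.
print(backlog_delay_ms(3_000_000_000, 2_500_000_000, 0.1))
```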

Sidenote: if you never saturate your 2.5 Gbps link, maybe consider scaling back to a cheaper 1 Gbps link with AQM instead and save some money on the way :wink:

OP mentions "network of about 20 devices". Unless he is running a farm of web servers, he will never be able to saturate the line with ordinary "20 ipads/iphones surfing a web or watching Youtube".

"Never" is a big word here. Not knowing the OP's expected throughput demands, all I want is to make clear that the "above X Mbps capacity AQM becomes useless" argument is IMHO ill-posed; the real rule should be "if utilisation stays reliably below 100%, AQM becomes useless". The point is, you understand that, but others stumbling over this thread might misunderstand, and I prefer less ambiguity on this topic.

Agree with this. Personally I almost never even saturated my 500Mbit cable modem (4K streaming, gaming, torrents, etc. still barely used it) so cut my monthly bill down and dropped to 300Mbit recently. But SQM at 500Mbit down is easy with even my ~7 year old WRT32X.

I still find the idea of running SQM at 2.5G interesting so want to see what hardware can do it :smiley:

I take it running sqm between two routed 2.5GbE ports isn't a good way of checking the CPU requirements?
That kind of load I can generate via my router....

@moeller0 can answer better, but my understanding is that the CPU utilisation of cake is more about bandwidth than about numbers of flows, so I think the outcome of your proposed test would at least give a useful point of reference.

This seems pretty important in that downloads of various forms will try to eat up all the available bandwidth - a Windows update or Steam download does not care about the bandwidth and latency of other flows.

But @dlakelan's interesting point also seems relevant, and I have a question about the same:

I'm trying to figure out in my mind how this buffering would work with multiple flows from different sources and what happens when it is full, presumably resulting in resends? Wouldn't factors such as these tend to increase the 5ms somewhat? Would you expect just one buffer for multiple flows or one buffer for each flow?

I mean, if it's true that with a sufficiently high bandwidth link, latency would only increase by circa 5ms, then that does seem to give some credibility to @Gruntruck's point doesn't it?

Well, you need to set sqm a tad below link rate anyway, so if you can saturate the L2 link via one port, you can check whether the second port will properly traffic-shape the incoming data. But now your poor router needs to source the data, send it out via port A, receive it via port B, run it through the qdiscs, and then still sink the transmitted data somewhere. This looks like a more stringent test for sqm-utility than purely looking at data sourced and sunk from/to other end hosts. (You just need to have two ports that each have an independent connection to the CPU, or a big enough pipe.)

This assumes there is no other queue that might fill up beyond these 1000 packets... and for downloads this queue is actually managed by your ISP. Sure, you can hope that for your ISP's fastest plan the buffer sizing is somewhat optimal. But the traditional recommendation is to size buffers at 1 BDP (bandwidth-delay product), i.e. enough to buffer the full throughput over a "typical" delay, and for the internet that typical delay is often conveniently taken to be 100 ms, so even for a fast link your ISP might still configure dozens of milliseconds of queueing. However, this is all theoretical, and I am sure that there are ISPs out there with clue and taste that deliver sufficiently low-latency downstreams out of the box, so you might not want/need to run sqm on downstream.
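
For a sense of scale, here is a sketch of the 1 BDP rule of thumb at 2.5 Gbps with the conventional 100 ms delay. The function names are illustrative, and whether any given ISP actually sizes its buffers this way is an assumption.

```python
def bdp_bytes(link_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes (integer math)."""
    return link_bps * rtt_ms // 8000


def full_buffer_delay_ms(buffer_bytes: int, link_bps: int) -> float:
    """Delay added by a completely full buffer draining at line rate."""
    return buffer_bytes * 8 / link_bps * 1000


buf = bdp_bytes(2_500_000_000, 100)
print(f"1 BDP buffer: {buf / 1e6:.2f} MB, about {buf // 1500} full-size packets")
print(f"Delay when full: {full_buffer_delay_ms(buf, 2_500_000_000):.0f} ms")
```

By construction a full 1 BDP buffer adds one "typical" delay of queueing, so this is roughly twenty times the 1000-packet pfifo case discussed earlier in the thread.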

whoa!

image

in plain änglish, please...

Argh, just noticed you were talking about another post:

Sure doing a test on a single device is fine as long as the traffic is routed and not simply switched between ports. Something you knew already :wink:

sure, just wanted to know if this is useful to you :wink:

Yes, certainly. I do not see myself getting that fast a link (or a home network capable of distributing data at that rate), but I am quite curious how far up we can push sqm-scripts on modern hardware. (So that I know what class of hardware to look for once my current router stops working.)

The dual 2.5GbE card isn't installed in the router/server, so it'll take me a couple of days to install it and set it up.

unless you can settle for 10GbE NICs instead :wink:
