Hack: net: fq_codel: tune defaults for small devices

I came across this patch from @nbd that sets the fq_codel memory limit to 4 MB on non-x86_64 devices.

I changed the patch for my ipq8074x devices to use 32 MB instead of 4 MB, and it has been running for months with no issues.

I'm wondering if the 660-fq_codel_defaults.patch (in generic/hack-5.15) could be reworked to accommodate non-x86_64 devices that have enough memory (in my case 1 GB and 512 MB devices).

Could you give more detail: which patch is this, and what is it called?
I have not even heard of it. Which devices can I put it on?
Now I'm curious and want to try it myself.

How do I install it?

If you are running OpenWrt, the patch is already applied... if you want to modify the patch, you will need to build your own version.

To be honest, I did not understand anything :slight_smile:
Just post the commands for installing it, and I'll figure out the rest myself.

FYI One doesn't need to patch fq_codel to change the memory limit option on the fly:

tc qdisc replace dev eth0 root fq_codel target 5ms interval 100ms limit 10240 quantum 1514 memory_limit 32MB ecn

Man page for fq_codel

Yes, I know about it... but why default to 4 MB? I get that memory was an issue in 2017, but more and more non-x86 devices have enough memory to cope with 32 MB. I don't think this patch is required: the default 32 MB should be used, and devices with limited memory could lower it in rc.local or an init.d script with something like the command mentioned above.

I suppose it's like every other general decision, to take the highest average use case of available memory/link rate and apply the "do no harm" method.

Because that is going to be 32 MB on every interface... on my turris omnia (OpenWrt21 based) I see 17 fq_codel instances (plus two cake instances), so worst-case 32 * 17 = 544 MB versus 4 * 17 = 68 MB (IIRC non-pageable kernel memory, but I might have this wrong). Sure, most of the instances will stay (mostly) empty, but in the worst case all of these will fill up and the OOM killer goes to town...
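As a rough illustration of that worst case, here is a minimal sketch that multiplies the instance count by the per-instance memory limit and compares the result with the 512 MB and 1 GB devices mentioned earlier in the thread. It assumes every fq_codel instance fills to its limit at the same time and ignores the two cake instances; the function name is made up for this example.

```python
def worst_case_mb(instances: int, memory_limit_mb: int) -> int:
    """Upper bound on queue memory if every instance fills to its limit."""
    return instances * memory_limit_mb


# 17 fq_codel instances, as counted on the Turris Omnia in the post above.
INSTANCES = 17

for limit_mb in (4, 32):
    total_mb = worst_case_mb(INSTANCES, limit_mb)
    for ram_mb in (512, 1024):
        share = total_mb / ram_mb
        print(f"limit {limit_mb} MB x {INSTANCES} = {total_mb} MB "
              f"({share:.0%} of {ram_mb} MB RAM)")
```

With a 32 MB limit the bound already exceeds the RAM of a 512 MB device, which is the scenario the OOM concern is about.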

The other question is are you actually seeing throughput "losses" due to dropping "early" due to exhausting fq_codel queue-space?

No, I am not seeing any throughput losses... I have a 1 GB device, and about 10 clients will probably not be enough to exhaust memory. Maybe I'll just keep hacking the patch. I just thought I'd raise the point that there may be a better way to craft this patch, now that devices beyond x86 can have a lot more memory than they used to.

Well, you need both a reasonably long RTT and a highish capacity for the default 4 MB buffers to cause issues; in many cases it should be enough to increase the buffering* for the WAN interface.

*) When I say buffering I really mean the limit on fq_codel's worst-case memory consumption.

One of my fears was that folk would set up enormous queue trees with many, many fq_codel instances and attack it with ping floods, etc.

That said, I believe the 4MB limit has to be increased for > 1Gbit devices of any arch, but am lacking a clue on how to do it.

Yeah, I think we could show early on that "pinning" large amounts of kernel memory with say a source port randomizing UDP attack will quickly drive a router into OOM, and with 64 MB ram, even 4 MB was on the high side.... :wink:

Let's run the numbers for LAN traffic? Say at 4 ms RTT (to allow for WiFi in both directions, or would that be 8ms?) the BDP for 1 Gbps would be:

(1000^3) * (4 / 1000) / (1024^2) = 3.81 MB

Which appears to be roughly in the ballpark, except that our 4 MB limit applies to truesize, with the data actually queued being closer to 50% of truesize*, so for single-stream throughput 4 MB could already limit TCP transfers inside the LAN (at least for classic Reno TCP). According to Appenzeller et al., for a router we might get away with dividing by the square root of N (N being the number of parallel flows), but for our LAN transfer that only helps if we say goodbye to single-flow throughput. From, say, 4 parallel flows on (sqrt(4) = 2 :wink:, countering the difference between truesize and queued data) we should be good even for Reno, no?
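A minimal sketch of that reasoning, taking the 3.81 MB figure from the post as given and assuming that only about 50% of the memory limit holds queued packet data. The function names and the 0.5 default are illustrative assumptions, not anything taken from fq_codel itself.

```python
import math


def buffer_needed_mb(bdp_mb: float, flows: int) -> float:
    """Buffer estimate after dividing the BDP by sqrt(N) parallel flows."""
    return bdp_mb / math.sqrt(flows)


def usable_mb(memory_limit_mb: float, payload_fraction: float = 0.5) -> float:
    """Share of the memory limit assumed to hold actual packet data."""
    return memory_limit_mb * payload_fraction


def limit_is_enough(bdp_mb: float, flows: int, memory_limit_mb: float) -> bool:
    """True if the usable part of the limit covers the sqrt(N) estimate."""
    return buffer_needed_mb(bdp_mb, flows) <= usable_mb(memory_limit_mb)


for flows in (1, 4):
    needed = buffer_needed_mb(3.81, flows)
    ok = limit_is_enough(3.81, flows, 4.0)
    print(f"{flows} flow(s): need {needed:.2f} MB, "
          f"usable {usable_mb(4.0):.2f} MB, enough: {ok}")
```

Under these assumptions a single flow needs more than the roughly 2 MB of usable space, while 4 flows fit just inside it.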

But by the same logic 4 MB for WAN traffic is likely too tight, for maximizing single flow throughput over a 100ms internet-scale connection we would need:

(1000^3) * (100 / 1000) / (1024^2) = 95.36 MB

which seems excessive... indicating that the formulation probably is too sloppy/approximate.
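A small sketch for re-running these numbers. It first reproduces the formula exactly as posted, then shows one possible source of the sloppiness: if 1000^3 is read as bits per second, the posted result is in mebibits, and dividing by 8 gives bytes. That reading is an assumption on my part; the function name is made up for this example.

```python
MIB = 1024 ** 2


def bdp_bits(rate_bits_per_s: float, rtt_ms: float) -> float:
    """Bandwidth-delay product in bits for a given rate and RTT."""
    return rate_bits_per_s * rtt_ms / 1000


RATE = 1000 ** 3  # 1 Gbps, taken as bits per second

for rtt_ms in (4, 100):
    bits = bdp_bits(RATE, rtt_ms)
    as_posted = bits / MIB        # formula as written in the post
    in_bytes = bits / 8 / MIB     # with a bit-to-byte conversion added
    print(f"RTT {rtt_ms} ms: as posted {as_posted:.2f}, "
          f"in MiB of bytes {in_bytes:.2f}")
```

With the conversion, the 100 ms case lands below the 32 MB limit discussed in this thread rather than far above it.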

The fun thing with cake is that we can actually see the instantaneous size used... (I have only a ~105/36 Mbps link and even with the default 4 MB I see full saturation for the typical short RTTs of speedtest servers, but I note almost all speedtests by now default to using a few parallel flows)
Doing a single-flow speedtest (speedtest.net) against a server in Ashburn VA, I see nothing getting close to the expected throughput (but I have little insight into my ISP's peering with those speedtest.net nodes in Ashburn...), and this is with cake configured with memlimit 32Mb.

*) This is however only a rough approximation. As I do not want to dive into the kernel, I would guess that SKB sizes are powers of two, so for a 1514-byte ethernet packet we will need at least a 2K (2048 B) SKB. The expansion factor might then be more like 2048/1514 = 1.3527, or inversely 1514/2048 = 0.7393, so out of 4 MB, 4 * 1514/2048 = 2.96 MB would hold queued packet data, but 50% is easier to think about :wink:
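The footnote's guess can be sketched as follows. It assumes, as the post does, that each packet occupies a buffer rounded up to the next power of two; real kernel truesize accounting involves more than this and is not modeled here. The helper names are made up for this example.

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def payload_mb(memory_limit_mb: float, packet_bytes: int) -> float:
    """Packet data that fits if each packet uses a power-of-two buffer."""
    buffer_bytes = next_power_of_two(packet_bytes)
    return memory_limit_mb * packet_bytes / buffer_bytes


packet = 1514  # full-size ethernet frame, as in the post
print(next_power_of_two(packet))          # buffer size in bytes
print(round(payload_mb(4.0, packet), 2))  # MB of packet data out of 4 MB
```

Smaller packets make the ratio worse under this model: a packet just over a power of two wastes almost half of its buffer, which is one reason 50% is a reasonable figure to think with.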

Thank you @moeller0, good explanation. I use SQM only for WAN traffic, not LAN or WLAN.

Reading all the responses, it seems this patch is a good compromise, in particular for devices with limited memory. There is a workaround: set the memory limit using tc (memory_limit for fq_codel, memlimit for cake).

I will close this thread if there are no further comments on improving the patch.

Recently I wrote this, more or less confirming that with cake we can get very close to the sqrt(flows) BDP and still preserve pretty good rtt_fairness, at least on a known 27ms path:

I had wanted to point out that what you see in cake's memory usage is slow start overshoot. This gets bigger and bigger the higher the memory limit is. :frowning:

Which would be on the order of 2-4 times of the real BDP, I would guess, wildly guesstimating the amount of data in flight at the point in time the first drop/CE signal gets back to the sender?

I would have to look at it harder. cake is def better than codel in this case, but it could be better.

I also see a lot of folk putting in a stupidly low packet limit in the field (google wifi did this), which also truncates slow start. The mikrotik thing: it turns out that SFQ is used a LOT, but it has trouble scaling past 300mbit for single flows, and people were copy/pasting the old packet limit into an fq_codel upgrade. And so it goes.

It has been really great to be able to simulate a zillion different common ISP plans and see the results in real time. I do have to make what tests I am actually running more visible, but the 27ms one over here is golden: https://payne.taht.net - I think there is plenty more room to improve things between 100mbit and 2gbit, now.
