Limitations on connections in ISP equipment (was LEDE packet per second performance versus other firmware)

Very useful information. Previously I really only paid attention to the per-PID list and the "usr" %.

This is the Tomato router (whose top output is the same format as the Debian output I am used to). Note that the problem server is currently off, but I want to take another look at this later:
CPU: 0% usr 0% sys 0% nic 94% idle 0% io 0% irq 4% sirq

I also have some odd version of top that was apparently installed on my Arch desktop, though I don't remember doing anything like that. I think this version of top might have been bundled with XFCE (not that it is relevant, but if anyone knows which version of top this is, let me know).

top - 17:22:22 up  2:28,  1 user,  load average: 0.39, 0.66, 0.77
Tasks: 227 total,   1 running, 142 sleeping,   0 stopped,   0 zombie
%Cpu0  :   1.3/1.3     3[||                                                   ]
%Cpu1  :   0.7/3.4     4[||                                                   ]
%Cpu2  :   1.3/2.6     4[||                                                   ]
%Cpu3  :   0.7/8.7     9[|||||                                                ]
%Cpu4  :   0.0/4.1     4[||                                                   ]
%Cpu5  :   2.0/1.3     3[||                                                   ]
%Cpu6  :   9.2/5.3    14[||||||||                                             ]
%Cpu7  :   0.7/4.6     5[||                                                   ]
GiB Mem : 57.5/7.722    [                                                     ]
GiB Swap:  0.0/0.000    [                                                     ]

Basically, what @dlakelan explained:

Here are the upper two output lines of "top" run on my router with not much going on:

Mem: 45764K used, 14280K free, 1180K shrd, 3848K buff, 9132K cached
CPU: 0% usr 0% sys 0% nic 98% idle 0% io 0% irq 1% sirq

and here with using a demanding test (flent's rrul_cs8 over the router's 5GHz radio):
Mem: 44720K used, 15324K free, 1064K shrd, 2676K buff, 6952K cached
CPU: 0% usr 0% sys 0% nic 1% idle 0% io 0% irq 97% sirq

This router is operating close to its capabilities once both up- and downlink are saturated, but for my 50/10 link this still works out okay, just barely. Now you can use "top -d 1" to get updates every second (instead of every 5? seconds). If you see 0% idle too often, your router is simply trying too hard. "Too often" obviously is a bit subjective, but please note that traffic shaping will generally not work too well once there are no CPU cycles left...
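If you'd rather sample this without watching top interactively, here is a minimal sketch (assuming a BusyBox-style shell and the usual /proc/stat field order: user nice system idle iowait irq softirq) that reads /proc/stat twice, one second apart, and prints the idle and sirq share of that interval:

# read the aggregate "cpu" line, wait one second, read it again
read -r _ u1 n1 s1 i1 w1 q1 sq1 rest < /proc/stat
sleep 1
read -r _ u2 n2 s2 i2 w2 q2 sq2 rest < /proc/stat
# deltas over the interval, then percentages
total=$(( (u2+n2+s2+i2+w2+q2+sq2) - (u1+n1+s1+i1+w1+q1+sq1) ))
echo "idle: $(( 100 * (i2 - i1) / total ))%  sirq: $(( 100 * (sq2 - sq1) / total ))%"

Run it in a loop during a load test; if the sirq share pins near 100% while idle sits at 0%, the CPU is the bottleneck.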

I'm easily maxing out a 250/250 Mbit connection with an Archer C2600. Hell, my 9-year-old WNDR3700 could hit something like 180 Mbit, I think. So yeah, no need to go x86 :wink:

It's not so much a strict need, as it's by far the best use of time and money, particularly since OP has spare x86 boxes sitting around.

I doubt that. I have an Archer C7 v2 and an Archer C5 v1 here that are not able to completely manage SQM on a 100/40 Mbit connection (100% sirq usage). And they both have a more powerful CPU than your WNDR3700.

Now I'm using a spare Atom N270 UTM box. It needs 13 to 19 watts and is enough until hyper super mega vectoring (250 Mbit) appears in my area. Since I have up to 40 devices in my network, I'm happy to have switched to x86.

Yes, the key isn't really how much a router can push through on a single connection with no queue management or QoS and with hardware-offloaded NAT. The real question is whether the router can handle real-world loads while doing a good job of QoS and keeping your VOIP calls from falling apart :wink: and how much specialized fiddling around you have to do to get there.

There are many advantages to x86, and they aren't all about packets per second or bits per second; the typical speed test, with one machine pulling down a single large stream, isn't really a good benchmark for a small business with multiple devices.

Consider the workload when you come back from a multi-day business trip where you've been working offline on your laptop a lot. You plug in the laptop and open the lid. The first thing it does is try to download 2000 email messages, 95% of which are spam and the rest of which are important and have multiple attachments; simultaneously it tries to synchronize a whole directory full of code files to Google Drive (or some similar task). Or you do an "apt-get update; apt-get upgrade" on your desktop Linux box and have to download 1500 deb files. Does your network fall over and your whole office lose its VOIP calls for 15 minutes because the router can't keep up with opening and closing connection after connection?

I keep posting this link around here but I think it's great that this guy went to the trouble of actually collecting numbers, and the numbers are pretty startling:

Those Netgear Nighthawks still go for $150 or $200 on Amazon, depending on which one you buy. Here are his speed results (damn, I wish he hadn't used a stupid 3D Excel bar chart though).

It suggests that a very high-end, expensive consumer router, when synchronizing multiple medium-size files (a large directory full of, say, 100 kB files), pushes something like 30% of what it can achieve in a simple single-stream speed test. The x86 box pushes ~100%. The older Linksys pushes something like 5%, if it doesn't just fall over and reboot.

EDIT: in an era where an x86 box meant spending $800-1500 for a micro-ATX mini-tower with 150 W power consumption and two spinning hard disks... little embedded routers made a LOT of sense. In an era where there are mini PCs that cost $200, have dual NICs built in, and are fanless with no moving parts, the decision landscape is very different. I think the optimal modern setup is a low-end x86 wired-only router plus an array of several 802.11ac access points, and this is true even in many home scenarios, but particularly in small-business scenarios.


"Small business" is basicaly what im driving here. My own network with my x86 as core router, one archer c5v1 and an older wdr3600 as a wds bridge and up to 15 devices. Lot of devices streaming, gaming and downloading ect. Then i have my completely free hotspot with 6 aps (all meshing and inside a single l2tp tunnel into Freifunk to "anonymise" my own uplink) managing 10-30 clients. (All depending on time of the day. I provide free internet for the "poor" here and if i have that flea market once a month directly infront of the house). 100/40mbit are enough to get 70% sirq on an old atom n270.

Just don't get an Atom 230/330. Systems based on them consume around 30-40 watts regardless of load, because no power saving is available.

Check out those netdev_budget sysctls I linked to above. By giving the machine longer to process packets and letting it process more per sirq, you can often reduce the overhead associated with servicing sirqs and might drop your sirq load somewhat, giving you more time for shaping or whatever. Or not, depending on the scenario. But it's worth a look. On my machine I use:

net.core.netdev_budget = 800
net.core.netdev_budget_usecs = 5000

and with a J1900 it hits about 40% sirq pushing 500-800 Mbit in the dslreports speed test, with HFSC traffic shaping. Before I had those in place it was hitting substantially higher sirq levels; I can't remember the exact numbers.

Which file do I have to edit? And how can I find out the right settings and see the default ones? I don't think I can just copy-paste your settings, since we have different CPUs. Is there no way to let the kernel decide automatically? I don't really like to mess around with kernel settings.

sysctl -a 2>&1 | grep budget will show your current settings.

Edit /etc/sysctl.conf and add the two lines I have above, then run sysctl -p (as root) to reread the file.
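For example, a minimal sketch assuming the standard /etc/sysctl.conf location (run as root):

# append the two budget settings and reload them
cat >> /etc/sysctl.conf <<'EOF'
net.core.netdev_budget = 800
net.core.netdev_budget_usecs = 5000
EOF
sysctl -p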

These are "budgets" so basically the maximum time or number of packets that are allowed to be processed in one sirq. You wouldn't want to set them enormously high but I think you can safely bump up to my settings, I think defaults are like 300 packets and 2000 microseconds. Your processor is slower, so it'll do fewer packets in the same time budget. If anything you might want to budget a little higher due to the slower processor.

That usecs budget is only available in later kernels, 4.12 or later I think. I'm running 4.14.12.


Thank you. Well, I'm running stable LEDE with the 4.4.92 kernel. Aren't snapshots at 4.9.xx at the moment? Then I will have to wait or try a custom build. I will bookmark this for later.

Yeah, you can do the netdev_budget version, but not the _usecs version, I think; still worth a try. Basically what it does is, under high load, allow your sirq to process more packets at once, thereby decreasing the number of sirqs you have to perform and reducing the overhead of context switching.
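One way to check whether the budget is actually being exhausted is /proc/net/softnet_stat; on the kernels I'm familiar with, each line is one CPU, the first hex column is packets processed, and the third is the "time squeeze" counter, i.e. how often a sirq had to stop early because it ran out of budget. A rough sketch, assuming that column layout:

# decode columns 1 and 3 of /proc/net/softnet_stat (hex) per CPU
cpu=0
while read -r processed dropped squeezed rest; do
  echo "cpu$cpu: processed=$((0x$processed)) squeezed=$((0x$squeezed))"
  cpu=$((cpu+1))
done < /proc/net/softnet_stat

If the squeezed counter keeps climbing during a load test, raising netdev_budget (and _usecs, where available) is more likely to pay off.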

After replicating the bottleneck today, the CPU stats during the bottleneck are as follows. This confirms it is not a CPU issue. However, it was worthwhile to double-check this and learn how to read the CPU load. I am very interested to see whether a dedicated hardware router (referred to as an x86 router throughout this thread) solves the issue.

Mem: 36816K used, 218760K free, 0K shrd, 5108K buff, 14372K cached
CPU:   0% usr   1% sys   0% nic  94% idle   0% io   0% irq   3% sirq
Load average: 0.00 0.00 0.00

On a related note, the hardware (RT-N66U) and the Tomato software are confirmed working fine, as I was able to tax the CPU slightly with an FTP upload (one that does not bottleneck the connection):

Mem: 36472K used, 219104K free, 0K shrd, 5108K buff, 14372K cached
CPU: 0% usr 0% sys 0% nic 65% idle 0% io 0% irq 33% sirq

It will be a few days before I can test, since I bought two Gigabit Intel NIC cards instead of using the Fast Ethernet cards, of which I found many. Honestly though, in my case it probably wouldn't have mattered, since I have limited use for LAN connections beyond the 100/50 WAN connection and already use a few Fast Ethernet WRT54GL switches.

Intel NICs are just very stable and reliable and have various hardware offload features that make them work very well in a router situation. So it's a good idea.

I'm suspicious that you're experiencing a serious network issue while your "top" stats say you have 94% idle and only 3% sirq. What are the symptoms that occur during this "bottleneck" and how are you able to trigger it?

A server spewing TCP and UDP connections causes the issue. When this is turned on without rate-limiting the number of TCP and UDP connections (currently set to 100 and 300), every other machine on the network cannot load webpages, do nslookups or tracerts, or otherwise use the WAN connection, and a web server fails to respond to outside queries. Sometimes, for example, an nslookup will go through instead of failing completely, or a webpage might load, but it takes an extended period of time and the HTML does not load completely. Some things I have tried include accessing websites via IP address, directly querying a particular DNS server instead of using the router's dnsmasq, and testing the connection via OpenVPN and an SSH SOCKS proxy (established SSH connections appear to be okay). Some things I plan to try before installing the x86 router are reducing connection timeouts, limiting connections to 2,000, and trying to access internal LAN addresses while the bottleneck is taking place.

Right now the Tomato router is at about 1,750 connections. I have noticed that when this issue occurs, the connection count spikes to about 2,400, so it is possible that roughly 2,000 connections is the entire issue here, but I do not know, since I do not often look at the connection count. I tried limiting the connections to 1,000 and regular network functionality stopped working.
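On a typical Linux router you can read the live connection-tracking count and its ceiling directly; a sketch, with the caveat that Tomato's older kernels usually expose these under the ip_conntrack names instead of nf_conntrack:

# current number of tracked connections vs. the configured maximum
cat /proc/sys/net/netfilter/nf_conntrack_count
cat /proc/sys/net/netfilter/nf_conntrack_max
# the entries themselves, one line per tracked connection
head /proc/net/nf_conntrack
# on older 2.6-era kernels the paths are usually
# /proc/sys/net/ipv4/netfilter/ip_conntrack_count and /proc/net/ip_conntrack

Watching the count while the bridge server is running would confirm whether the spikes are actually hitting the table's maximum.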

It might well just be that it's making way, way too many connections and eating up router NAT table entries as well as bandwidth. You might get good results from setting up LEDE with SQM, or by using iptables rules to limit the rate at which this device can make connections. Not sure if Tomato really offers you those things, but ultimately I think you need to move beyond Tomato whichever way you go.

What does this server do that it needs to open thousands of TCP and UDP connections simultaneously? Is that legit stuff it's doing or is it infected with a botnet?

Legitimate but not necessary. I run a bridge (which is hopefully used by people in Iran, China, and Turkey) to access the internet (but not an exit), since I have plenty of bandwidth that I do not come close to using. So again, very unnecessary, but I try to do my small part to bring about the end of regimes in other parts of the world while learning good material along the way.

Right now I just rate-limit everything on the individual server with iptables. Theoretically one can edit or add iptables rules directly in Tomato. I also tried increasing something called a "hash table size" from 2048 to 4096, but this did nothing and I did not know what was actually being changed.

Do you know where "NAT table rules" might be stored on any Linux-based router, or where these can be viewed?

iptables --table nat --list
only shows firewall rules (aka port forwarding rules)

You want to adjust sysctl settings. No idea how that works on Tomato, but on a regular Linux system you'd edit something like /etc/sysctl.conf.

https://www.kernel.org/doc/Documentation/networking/nf_conntrack-sysctl.txt

documents the sysctls related to connection tracking. You'd probably do well with something like:

net.netfilter.nf_conntrack_max = 20000
net.netfilter.nf_conntrack_buckets = 5000

More or less, nf_conntrack_max says how many simultaneous connections you can have in your connection tracking table, and the number of buckets should be about 1/4 of that (for quick lookup of connection information).

You also would do well to limit the connections your gateway box is allowed to use, and its bandwidth usage should go into a lower tier in your QoS.

for example

iptables -A FORWARD -p tcp --syn -m connlimit --connlimit-above 2000 --connlimit-mask 32 -j REJECT --reject-with tcp-reset

I think that will prevent more than 2000 connections coming from any one machine on your LAN. With 20k connections in your conntrack_max, this means no single machine can use more than 10% of the total connections.
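If you also want to cap how fast it can open new connections (not just how many it can hold open at once), a hashlimit rule along these lines should work; the 192.168.1.50 address and the 20/second rate are placeholders for your bridge box and whatever rate you consider reasonable:

iptables -A FORWARD -s 192.168.1.50 -p tcp --syn -m hashlimit --hashlimit-name bridge-syn --hashlimit-mode srcip --hashlimit-above 20/second --hashlimit-burst 100 -j DROP

A similar rule matching new UDP flows (e.g. with -m conntrack --ctstate NEW) would cover the UDP side.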

IPFire has various QoS settings, and you could create a lower-priority tier exclusively for use by this machine. When you need bandwidth, all the users of the bridge server will just wait until you're done. This can be done in LEDE too, but it really requires a custom config; there's no easy GUI or anything like that. There were some older qos scripts, but it seems they've been deprecated in favor of SQM, which handles the typical home-router case well but doesn't really have this kind of customizability (that I know of).

Another thing I'd do if I were you in this situation is put that onion router or whatever it is into a separate VLAN and treat it like a guest network. Firewall it off from your main LAN so it just gets access to the internet connection.

You might think about putting in a managed switch. If you don't have tons of cash to put towards all this, you could do fine with something like:

Connect your router LAN to port 1, put your regular LAN desktop machines on, say, ports 2-6 (or hang further switches off them), put your guest/onion router on port 7, and hang a PoE switch off port 8 for any VOIP experiments.

Make port 1 a tagged member of all the VLANs, make ports 2-6 untagged members of VLAN 1, make port 7 untagged in VLAN 10 for guests, and put VOIP in, say, VLAN 2. Prioritize the VOIP VLAN highest, VLAN 1 medium, and the guest VLAN lowest. You can also probably configure bandwidth limits on ports, so you could limit port 7 to, say, 30 Mbit so it can never saturate your connection. It's one thing to donate some resources; it's another to have that donation completely eliminate usability for your main purpose.
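If you end up doing the VLANs on the LEDE router's built-in switch instead of (or in addition to) an external managed switch, the same scheme looks roughly like this in /etc/config/network: the CPU port (0 here) is tagged on every VLAN, a few ports carry the untagged LAN, one port carries VOIP, and one carries the guest VLAN. This is only a sketch; the switch name, the CPU port, and the physical port numbering all differ per device, and the port assignments below are illustrative:

config switch_vlan
	option device 'switch0'
	option vlan '1'
	option ports '0t 1 2 3'

config switch_vlan
	option device 'switch0'
	option vlan '2'
	option ports '0t 4'

config switch_vlan
	option device 'switch0'
	option vlan '10'
	option ports '0t 5'

Each VLAN then shows up on the router as eth0.1, eth0.2, and eth0.10, which you can attach to separate interfaces and firewall zones so the guest VLAN only ever reaches the WAN.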
