Recommendation for 10G NAT

Hey friends, how are you doing?
I am quite a big fan of OpenWrt and have been using it in my home environment for almost 10 years.
I am working as a network engineering consultant, and in my company we are facing an issue now.
We have purchased the wrong router (a $24k Juniper MX series one) for our case, and now, because of a time-critical project, we have to provide a solution as soon as possible.
So I was thinking about something. In our use case we have more or less 700 VLANs and one public IP address.
The only thing the router has to do is NAT from the private LAN space to the available public IP address using dynamic PAT (like in common home environments), and also DHCP for 65000 clients scattered over all these 700 VLANs. And that's it. These VLANs are configured as /26 subnets each. Also, a /48 IPv6 prefix is provided and should be distributed to the clients. The WAN interface is configured with a static address and uses a static route to the next-hop upstream router. Furthermore, I perform inter-VLAN routing from the LAN zone subnets to the WAN interface. No further features are necessary.
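
As a rough sanity check on the numbers above, here is a minimal Python sketch of the IPv4 address capacity. It assumes each subnet loses its network and broadcast addresses plus one gateway address for the router (the gateway reservation is an assumption, not something stated in the post), and the 10.0.0.0 block is only a placeholder.

```python
import ipaddress

def dhcp_capacity(vlans: int, prefixlen: int, reserved_per_vlan: int = 1) -> int:
    """Addresses left for DHCP clients across `vlans` equally sized subnets."""
    net = ipaddress.ip_network(f"10.0.0.0/{prefixlen}")  # placeholder block
    # Subtract network address, broadcast address, and the assumed gateway.
    usable = net.num_addresses - 2 - reserved_per_vlan
    return vlans * usable

print(dhcp_capacity(700, 26))  # 700 subnets with 61 client addresses each
```

Under these assumptions 700 /26 subnets hold 42,700 leases, which is below the 65000 clients mentioned, so the prefix length or the client count may deserve a second look.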

Now I need some help with a proper hardware recommendation. Currently we've got a 6-core Xeon Socket 1151 CPU available with a Supermicro X11 Mini-ITX motherboard, plus an HP NC523SFP NIC. I want to set up link aggregation on both interfaces and create VLAN interfaces on the bonding interface.

Is this configuration sufficient to perform the required tasks? And is there anything else I have to take into consideration for this setup?
Might I run into issues due to the high number of clients and PAT? Is there a risk of running out of ports? Does mixed IPv6/IPv4 addressing help me in this case?

Thanks a lot in advance, guys!

It should work. Whether it reaches the full 10G speed is a question that can only be answered by testing, I think, but it has a good chance. Enable receive packet steering and irqbalance, and do your VLANs on top of the bond as you suggest. That should spread the workload across the cores very well.
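
For reference, receive packet steering is configured per receive queue by writing a hexadecimal CPU bitmask to `/sys/class/net/<iface>/queues/rx-<n>/rps_cpus`. The following is a minimal Python sketch of that step, assuming a 6-core machine; the interface name `eth0` is a placeholder and the write itself needs root.

```python
from pathlib import Path

def rps_mask(cpus) -> str:
    """Hex bitmask with one bit set per CPU index (fine for up to 32 CPUs)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")

def enable_rps(iface: str, cpus, sysfs: str = "/sys/class/net") -> list:
    """Write the mask to every rx queue of `iface`; returns the files touched."""
    mask = rps_mask(cpus)
    touched = []
    for queue in sorted(Path(sysfs, iface, "queues").glob("rx-*")):
        target = queue / "rps_cpus"
        target.write_text(mask)
        touched.append(str(target))
    return touched

print(rps_mask(range(6)))  # mask covering CPUs 0-5
# enable_rps("eth0", range(6))  # placeholder interface name, requires root
```

On a bond, the mask would presumably be applied to each physical member interface, since that is where packets are received.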

I'd recommend switching over to nftables so you can customize the firewall nicely, but it's not absolutely necessary. Still, with 700 VLANs and maybe various inter-VLAN routing rules, it could be better to have one great firewall script. There are some active threads on nftables and it seems to work flawlessly on current OpenWrt.

I'm always amazed these Juniper type devices are so popular. A Linux server seems to be a better solution in general.

Also, you might consider running the Kea DHCP server for a network this size. dnsmasq is not designed for this.

I know it's FreeBSD but you can probably get a few hints about hardware limitations and design

Ahh, nm... You're asking about "just" 10G :wink:

Be aware that packet processing adds a lot of load (overhead); I would estimate that your average throughput would drop ~30-60% or so compared to unprocessed forwarding. You'll also need to do a lot of tweaking to get it running somewhat optimally, and that includes disabling/removing patches in OpenWrt to get an idea of what's going on in terms of performance, as many things are stripped out by default.

@dlakelan
You buy networking gear because of SLA, support etc., not because you want some random whitebox that you "hope" will do the trick. When it crashes and burns (which will most likely happen at some point), it's not going to be a fun experience.

Thanks a lot, friends! The plan was to use the Juniper router, and to use the server that is now left over as an isc-dhcp-server DHCP server with Debian. Now plans have changed, and it's going to be used for OpenWrt. The max. bandwidth is 10G, but the average link utilization is around 1Gbps with peaks of 2Gbps.
Is there some good way to measure the current flow rate through an interface?

Furthermore, we were able to reduce the number of VLANs to 31.
Regarding DHCP, I think you are right, I haven't seen dnsmasq in such an environment. Is there some good tool to benchmark DHCP servers? I'd like to simulate some things before going into production.
I know perfdhcp from the "kea-admin" package on Ubuntu and Debian, but I didn't find any template or anything else to properly stress test the DHCP server.
Also, the max. number of clients is 60k (limited by the subnets), and the daily average is 15k.

I would recommend looking at a distro/OS that's not mainly targeting embedded devices but that's up to you.

I guess you could write up something relatively easily using dhcping https://www.mavetju.org/unix/dhcping-man.php
What do you mean by "flow rate"? Do you want something like ntop(ng) which will need a lot of horsepower, netflow exporter etc?
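
If plain per-interface throughput is all that is meant, the kernel's byte counters under `/sys/class/net/<iface>/statistics/` can be sampled twice and divided by the interval. A minimal Python sketch, with the interface name left to the caller and the one-second interval chosen arbitrarily:

```python
import time
from pathlib import Path

def read_bytes(iface: str, direction: str = "rx", sysfs: str = "/sys/class/net") -> int:
    """Read the cumulative rx_bytes or tx_bytes counter of an interface."""
    return int(Path(sysfs, iface, "statistics", f"{direction}_bytes").read_text())

def rate_bps(first: int, second: int, seconds: float) -> float:
    """Average bits per second between two byte-counter samples."""
    return (second - first) * 8 / seconds

def sample(iface: str, interval: float = 1.0, direction: str = "rx") -> float:
    """Measure the average rate on `iface` over `interval` seconds."""
    before = read_bytes(iface, direction)
    time.sleep(interval)
    return rate_bps(before, read_bytes(iface, direction), interval)

# Example with made-up counter values: 125 MB in one second is 1 Gbps.
print(rate_bps(0, 125_000_000, 1.0))
```

This only gives totals per interface; per-flow visibility is where tools like ntop(ng) or a netflow exporter come in.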

This server will handle that in spades. I agree that a more "full" distro like Debian might be a good alternative. OpenWrt is designed for embedded devices with a few tens of megabytes of flash and ~50-100MB of RAM.

This is like the "No one ever got fired for buying IBM" approach. On the other hand, when Google came along it was standard to buy all your "heavy servers" from HP Enterprise or Dell Enterprise or whatever, for similar reasons. But they took the approach "let's buy craploads of cheap white boxes and they'll fail daily, but our architecture is so heavily redundant that it won't matter". As a router, those $24k Juniper devices could be replaced at similar cost with $1k devices, and you could buy 4 of them and still save 83% of the cost, IF you have the brainpower to make it work. I think the real reason people buy Juniper is because they don't hire the engineering talent that could design that 4x redundant box solution. That makes fine sense for a door and window manufacturer, but I've never understood why ISPs or such would go that route.

Of course switches are a different story. There the hardware itself is extremely specialized, with the switching fabric, and if you want to switch 10Gb or 40Gb or 100Gb you'll just need to fork out the cash.

If you seriously have that many clients...then this won't work without more Public IPs. A single IP only provides ~64k TCP and UDP ports...and ~2k are reserved by the router's kernel.

You need a much larger "NAT pool" of SRC addresses than a single Public IP.
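
To put rough numbers on that, here is a minimal Python sketch of the port budget. It takes the ~64k and ~2k figures from this post at face value and treats ports as a fixed pool per public IP; that is a conservative assumption, since a NAT implementation may be able to reuse a source port toward different destinations. The 256 ports per client is an assumed budget, not a measured one.

```python
import math

# Per protocol and per public IP, using the figures quoted in the post above.
USABLE_PORTS = 64_000 - 2_000

def ports_per_client(clients: int, public_ips: int = 1) -> float:
    """Average number of source ports available to each client."""
    return USABLE_PORTS * public_ips / clients

def public_ips_needed(clients: int, ports_each: int) -> int:
    """Public IPs required to give every client `ports_each` ports."""
    return math.ceil(clients * ports_each / USABLE_PORTS)

print(ports_per_client(15_000))        # average daily client count from the thread
print(public_ips_needed(60_000, 256))  # maximum client count, assumed port budget
```

With one public IP and 15k clients this leaves only about four ports per client per protocol, which is why a larger pool of source addresses comes up.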

Thanks a lot. I think I have to address this issue soon; I've ordered an IPv6 /48 subnet because of it.
What might happen if I use dnsmasq? Does anyone have experience with 10000+ clients?
Is it worth a try? I personally love isc-dhcp-server; is it capable of handling this?

I'd say give it a miss...

Have you considered switching to an ipv6 only LAN? I ran that in a home office environment for over a year and it was totally fine. (Using DNS64 and NAT64 on the router, with tayga for the NAT64)

It is, honestly, a LOT easier to work with just one protocol that actually functions properly rather than a mess of broken NAT bullshit and DHCP configs and etc.

Even if there are a few VLANs where you absolutely need ipv4 due to some older proprietary software or something, isolating those to a small subset shrinks the complexity a lot.
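
For readers unfamiliar with how DNS64 and NAT64 fit together: DNS64 answers AAAA queries for IPv4-only hosts with an IPv6 address that embeds the IPv4 address inside a NAT64 prefix, and the NAT64 translator maps it back. A minimal Python sketch of that mapping, assuming the RFC 6052 well-known prefix `64:ff9b::/96` (a tayga setup may well use a different prefix):

```python
import ipaddress

# RFC 6052 well-known prefix; an assumption here, a site may use its own /96.
WELL_KNOWN_PREFIX = ipaddress.ip_network("64:ff9b::/96")

def synthesize(ipv4: str, prefix=WELL_KNOWN_PREFIX) -> str:
    """Embed an IPv4 address in the low 32 bits of a /96 NAT64 prefix."""
    v4 = ipaddress.IPv4Address(ipv4)
    return str(ipaddress.IPv6Address(int(prefix.network_address) | int(v4)))

def extract(ipv6: str) -> str:
    """Recover the IPv4 address from the low 32 bits of a synthesized address."""
    low32 = int(ipaddress.IPv6Address(ipv6)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(low32))

addr = synthesize("192.0.2.33")  # 192.0.2.33 is a documentation address
print(addr, extract(addr))
```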

Thanks a lot, I will try that out.
After all these suggestions, I'm thinking about using Debian as a router for this simple scenario, OR using isc-dhcp-server on OpenWrt instead of the default one.

Using dnsmasq and stress-testing it with perfdhcp, I got a drop rate of around 49% beyond 25 leases per second.
Using isc-dhcp-server, it works flawlessly at up to 1500 leases per second.

With dnsmasq it looks like this, even if I set up the whole 10.0.0.0/8 space for DHCP:

Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPOFFER(eth0) 10.128.30.35 24:9f:b2:44:5f:75
Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPREQUEST(eth0) 10.128.30.35 2c:84:2c:c1:2a:b8
Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPNAK(eth0) 10.128.30.35 2c:84:2c:c1:2a:b8 no leases left
Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPDISCOVER(eth0) 2e:a0:69:26:a6:98
Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPOFFER(eth0) 10.128.30.35 2e:a0:69:26:a6:98
Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPREQUEST(eth0) 10.128.30.35 d2:fc:e3:d8:bc:7f
Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPNAK(eth0) 10.128.30.35 d2:fc:e3:d8:bc:7f no leases left
Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPDISCOVER(eth0) f8:32:92:61:ae:03
Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPOFFER(eth0) 10.128.30.35 f8:32:92:61:ae:03
Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPDISCOVER(eth0) 27:97:61:ac:fb:f8
Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPOFFER(eth0) 10.128.30.35 27:97:61:ac:fb:f8
Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPREQUEST(eth0) 10.128.30.35 03:92:d7:c6:9b:4b
Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPNAK(eth0) 10.128.30.35 03:92:d7:c6:9b:4b no leases left
Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPDISCOVER(eth0) 58:a2:e0:a1:27:75
Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPOFFER(eth0) 10.128.30.35 58:a2:e0:a1:27:75
Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPREQUEST(eth0) 10.128.30.35 b6:55:ba:87:e7:10
Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPNAK(eth0) 10.128.30.35 b6:55:ba:87:e7:10 no leases left
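
A log like this is easier to judge when tallied. The following is a minimal Python sketch that counts message types and offered addresses; the regular expression assumes exactly the syslog format shown above, and the excerpt is the first five lines of that log.

```python
import re
from collections import Counter

# Matches "... dnsmasq-dhcp[PID]: TYPE(iface) first-field ..." as in the log above.
LINE = re.compile(r"dnsmasq-dhcp\[\d+\]: (DHCP[A-Z]+)\(\S+\) (\S+)")

def tally(lines):
    """Count DHCP message types and how often each address was offered."""
    kinds, offered = Counter(), Counter()
    for line in lines:
        match = LINE.search(line)
        if match is None:
            continue
        kind, first_field = match.groups()
        kinds[kind] += 1
        if kind == "DHCPOFFER":
            offered[first_field] += 1  # first field of an OFFER is the address
    return kinds, offered

excerpt = [
    "Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPOFFER(eth0) 10.128.30.35 24:9f:b2:44:5f:75",
    "Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPREQUEST(eth0) 10.128.30.35 2c:84:2c:c1:2a:b8",
    "Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPNAK(eth0) 10.128.30.35 2c:84:2c:c1:2a:b8 no leases left",
    "Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPDISCOVER(eth0) 2e:a0:69:26:a6:98",
    "Mon Dec  6 12:53:47 2021 daemon.info dnsmasq-dhcp[3852]: DHCPOFFER(eth0) 10.128.30.35 2e:a0:69:26:a6:98",
]
kinds, offered = tally(excerpt)
print(dict(kinds), dict(offered))
```

In the log above, every DHCPOFFER carries the same address, 10.128.30.35, for different MAC addresses, which may be worth looking into alongside the "no leases left" replies.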

No, they could not. If you really think you can do that, then you definitely should. You would get every single ISP in the world as a customer and make a fortune.

But there are use cases which cannot be solved by a $24k router. NATing 65k clients to a single address is one such case.

You realize that the Juniper devices run Linux internally, right?

I'm not talking about their 100G or 400G stuff with fancy hardware offloading (but aren't those a lot more expensive than $24k?). For 10Gbps or less routing, I promise you a $1k or $2k box can do what their $24k box can do. Except connect you to their support line.

And if the box is $1k, then you can easily have 2 of them running simultaneously with hot failover and a third unplugged in the rack for when one of them dies of a hardware fault. That would cost you $75k with Juniper gear and $4k with a white box solution.

Not all cars are the same just because they have 4 wheels... I would highly recommend you follow bmork's suggestion if you think that's possible.

As a sidenote, many Juniper devices run Junos, which is based on FreeBSD, not Linux :wink:
https://www.juniper.net/documentation/us/en/software/junos/junos-install-upgrade/topics/topic-map/junos-os-overview.html

No, but all cars sold today can transport 1-2 people at 40MPH. Routing 10Gbps is a little like that. Routing 100+ is NOT, just like track racing a car requires something nonstandard.

I'm surprised anyone thinks this is controversial. There are plenty of places that route stuff with cheaper boxes. Buying Juniper for this level of stuff is about buying a service contract, so that someone will promise to show up with a new box in less than 12 hours.

The lesson is that "No One Ever Got Fired For Buying Juniper/IBM/Cisco/Microsoft" is a powerful sales technique even at the low end. I'm not even saying it's wrong; for many businesses a $24k router is cheaper than hiring an engineer who could make you a $3k redundant solution.

I'm just surprised there aren't more people doing what the OP is doing where the talent is already available.

Hmm, our MXes still run FreeBSD on the routing engine :slight_smile: Dunno what they run on the different line cards, but I guess it's more than likely that some of them run Linux.

This doesn't matter. It's like saying the Qualcomm SDK is really OpenWrt.

In that price range I assume we're talking about something like the MX204, which is quite a decent box with 400Gbps bandwidth. AFAIK, this small box also comes with their Trio chipset "with fancy hardware offloading". You pay for the integrated silicon and software, not for the 1U server chassis.

I guess it's the "JunOS Evolved" line: https://www.juniper.net/documentation/us/en/software/junos/overview-evo/topics/concept/evo-overview.html

Maybe that's not super common. I'm not a guy who specs and buys Juniper gear, and I'm happy to be ignorant of their full product line; I just went on the basis of what the website says.

Yeah, at 400Gbps I'm with you 100% that is definitely not white box territory. I'm not a purchaser of high end networking gear, so maybe this $24k box they're talking about is that level of thing, but then why buy it for what OP said was:

It seems quite overkill to buy a 400Gbps routing engine for 1-2Gbps of real load. A $300 supermicro board can handle that.

I currently have a Juniper MX150 box with useless licenses, so the cost was about $24k with licenses.
It had some FreeBSD-based OS installed, but I've upgraded to the latest one, which is JunOS Evolved, based on Linux.

You could also swap out Debian for VyOS if you like. It's based on Debian bullseye as well, comes with the isc-dhcp server, and is a lot more specific to routing and firewalling. Might be worth a look :slight_smile:

They have also used nftables for NAT since v1.3 :wink: