Ubiquiti EdgeRouter X, Loading OpenWrt and performance numbers

This seems clearly wrong for WANs slower than 200Mbps, unless I'm missing something? The CPU easily has the power to handle, say, 100Mbps, and cake gives smooth-as-butter results at that rate.

So the question is: if your WAN is faster than about 200Mbps, should you

  1. Use an ERX with cake set to 200Mbps?
  2. Use an ERX with HWFO and no SQM?
  3. Use something else that has more CPU for cake?
  4. Something else?

A lot of it comes down to how much faster your WAN is and how much tolerance for bufferbloat you have. If you mostly download videos in batches and then watch them off your LAN, you'd probably want option 2. If you have to talk to people on your VOIP phone you will DEFINITELY want option 1 or 3.

In my opinion bufferbloat control is one of the main things a router should do, and cake or HFSC + fq_codel are the only schemes I'd put up with. So if you care about latency (basically, if you are a gamer or a VOIP user, or you watch an IPTV service or similar), it comes down to choosing enough horsepower to run one of those schemes at the bandwidth rates you actually have.

The thing is, to throttle your WAN requires you to drop packets. You can do it by policing the packets in a smart switch, but then the smart switch itself is interfering with your measurements (it's a Heisenberg Uncertainty Principle kind of thing).

Hard coded my core switch down to 100Mbps. My internet is CenturyLink Fiber 1000/1000. I can get 900Mbps+ on both up/download. Also used the advanced settings on dslreports' speed test to get high-resolution bufferbloat. I love it when a hypothesis is tested, and fails, even at 100Mbps. SQM wins!

HW @ 100Mbps http://www.dslreports.com/speedtest/44463330
HW @ 100Mbps http://www.dslreports.com/speedtest/44463377
HW @ 100Mbps http://www.dslreports.com/speedtest/44463780

(Rebooted to make sure settings are correct, also confirmed by CPU load)
piece_of_cake @ 100Mbps http://www.dslreports.com/speedtest/44463083
piece_of_cake @ 100Mbps http://www.dslreports.com/speedtest/44463140
piece_of_cake @ 100Mbps http://www.dslreports.com/speedtest/44463197

root@OpenWrt:~# cat /etc/config/sqm

config queue 'eth0'
        option qdisc_advanced '0'
        option linklayer 'none'
        option interface 'eth0.201'
        option verbosity '5'
        option debug_logging '0'
        option qdisc 'cake'
        option script 'piece_of_cake.qos'
        option download '200000'
        option upload '200000'
        option enabled '1'

Again rebooted to make sure these settings are applied correctly
Cake, simple.qos @ 100Mbps http://www.dslreports.com/speedtest/44463574
Cake, simple.qos @ 100Mbps http://www.dslreports.com/speedtest/44463620
Cake, simple.qos @ 100Mbps http://www.dslreports.com/speedtest/44463675

root@OpenWrt:~# cat /etc/config/sqm

config queue 'eth0'
        option qdisc_advanced '0'
        option linklayer 'none'
        option interface 'eth0.201'
        option verbosity '5'
        option debug_logging '0'
        option qdisc 'cake'
        option download '200000'
        option upload '200000'
        option enabled '1'
        option script 'simple.qos'

If you throttle at your switch to 100Mbps, you should really set cake's upload and download to about 95000, not 200000, at which point those few lag spikes on upload will probably go away.
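To make that arithmetic concrete, here is a minimal Python sketch that derives the SQM rates from the throttled link speed. The 95% factor is only the rule of thumb from this post, and both function names are made up for illustration; they are not part of sqm-scripts.

```python
def shaper_rate_kbit(link_mbps, percent=95):
    """Return a shaper rate in kbit/s a little below the link rate.

    percent=95 is the rule of thumb from this thread, not a hard rule.
    link_mbps is expected to be a whole number of Mbps.
    """
    return link_mbps * 1000 * percent // 100


def sqm_rate_options(link_mbps, percent=95):
    """Format the download/upload lines as they appear in /etc/config/sqm."""
    rate = shaper_rate_kbit(link_mbps, percent)
    return [f"option download '{rate}'", f"option upload '{rate}'"]


# A switch port throttled to 100Mbps
for line in sqm_rate_options(100):
    print(line)
```

For a 100Mbps link this prints `option download '95000'` and `option upload '95000'`, which are the values suggested above.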

It's always going to be when the bandwidth is too much for cake that you'll have a problem with cake. So long as CPU idle stays above 5% at all times, cake or other shapers are going to give the very best responsiveness; HW acceleration will be way worse.

When your bandwidth is very high, though, you'll have to throttle it back so the CPU can keep up with cake, or get more CPU.
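One way to check the 5% idle figure is to sample `/proc/stat` twice while a speed test is running and compare the counters. The sketch below is a minimal Python version. The function names are my own, and it assumes the usual Linux `/proc/stat` column order (user, nice, system, idle, iowait, irq, softirq, steal). It reports per-core figures as well as the aggregate, since one busy core can hide behind a healthy-looking average.

```python
import time


def parse_proc_stat(text):
    """Return {cpu_name: (idle_ticks, total_ticks)} from /proc/stat contents."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts or not parts[0].startswith("cpu"):
            continue
        # Columns: user, nice, system, idle, iowait, irq, softirq, steal
        ticks = [int(x) for x in parts[1:9]]
        # Count iowait as idle time when the kernel reports it
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
        counters[parts[0]] = (idle, sum(ticks))
    return counters


def idle_percent(before, after):
    """Percent of time each CPU was idle between two snapshots."""
    result = {}
    for name, (idle1, total1) in after.items():
        idle0, total0 = before[name]
        elapsed = total1 - total0
        result[name] = 100.0 * (idle1 - idle0) / elapsed if elapsed else 100.0
    return result


def sample_idle(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart (Linux only)."""
    with open(path) as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_stat(f.read())
    return idle_percent(before, after)
```

Calling `sample_idle()` on a Linux box returns a dict such as `{"cpu": ..., "cpu0": ..., "cpu1": ...}`. Any entry that drops below 5 during the test suggests the shaper rate is too high for that CPU.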

Hey all, glad to see some active discussion about this stuff

I'm looking for a new NAT router to connect to my DOCSIS 3.1 modem

My download speed is half a gigabit - so I'm concerned about the ERX's ability to perform, given my requirements:

  • QoS at line rate
  • efficient & low power

With stock software and config, the smart queuing is good for what.. 200 Mbps at best?

@Lochnair from the ubnt forums is why I'm even considering the ERX:

I spent a good part of the morning today reading through these entire threads:

So there's this thing called Qualcomm Fast Path (SFE), and it does HW accelerated QoS. Some people in that thread have achieved ~500 Mbps speeds with QoS

If that's possible I would like to pick up an ERX

@SIG I've been working on setting up an espressobin as a hot-spare router. I have gigabit line rate and I want something that works at a large fraction of that and can run various important infrastructure on my network when the main router is down. Anyway this morning I got it working at least enough to test, and did some tests with my custom HFSC shaper. It shapes at least 450 Mbps of iperf3 client output no problem (so both generating and shaping packets)

You can buy a $50 board, ~ $10 micro-SD, ~ $10 power supply, and stick it in a plastic food tub with holes drilled in it and put it in a closet (or download a 3D-printable case and head over to a public library, or buy a small metal project box or whatnot). It's way more capable than the ERX, if only because you can use the SD card to install tons of stuff and it has a gig of RAM. It's unbrickable because you can always yank the SD card and rewrite the whole thing. I'm not running OpenWrt but it is listed as supported in the OpenWrt table of hardware.

If you want something under $100 that's the way I'd go based on my experience. The ERX seems reasonable, but it's not quite in the same speed class with SQM/QoS.

QoS precludes using acceleration (hardware and software), as it always needs to have the full netfilter stack (the slow path) to be in charge of every single packet, rather than being able to classify them as being part of a safe flow and offload them. While a good offloading mechanism will cope with QoS in the sense that it will hand over control to the QoS mechanism, you'll lose the speedup of offloading that way - so you do need a platform that is powerful enough to cope with the system load without relying on offloading and the additional tasks of actually doing SQM, that leaves you with x86 and maybe mvebu (yes, the NSS cores on ipq80xx also provide a QoS implementation in hardware (firmware), but that isn't available for OpenWrt yet).

My understanding is that software flow offload does work with shaping, as it still hands the packet to the output queue, but you won't be able to use DSCP tagging using iptables unless you do it before the flow-offload command.

If you have a proper DOCSIS 3.1 deployment, it might have AQM already enabled and you may not need to shape on your router to address bufferbloat. At least that's the theory and I would test it before spending a bunch of money to get something that will handle SQM at those rates.

Yes, I believe it's mandatory in the DOCSIS 3.1 standard. On the other hand, the algorithm they use allows substantially more bufferbloat than cake does. On the third hand, it's still not a huge amount: 20ms or something like that is typical, enough that VOIP jitter buffers can usually handle it. Games might be a different story.

In any case, it seems like the two nice price points right now are espressobin at $50 + case + power supply for anything less than say 500Mbps, and x86 mini PC at around $200 for over 500 Mbps

For wifi consider an external access point, separate, replaceable, expandable, and you can situate it separately from the router to get better signal.

@dlakelan ESPRESSObin is a tempting alternative - going to consider it. How do you think it's going to hold up with long term reliability and support? I'd have to get a good heatsink for it

@slh While I understand what you're saying, how do you explain the people seeing these insane throughputs with SQM+SFE?

Theory is nice, ain't it? However, I don't know of any proper deployment of any DOCSIS system that's not in a CableLabs' lab. I see about 250ms while loaded

I'm looking to go very deep down the rabbit hole regarding the manual fine-tuning of the latest and greatest QoS/SQM software

I need the first hop behind the DOCSIS 3.1 modem to be SQM protecting bandwidth sharing among my family

To bring this back on topic regarding ERX+SQM+SFE:

Is this Qualcomm Fast Path thing worth getting into with the ERX? If not, I cannot consider the ERX as a viable option. I read there's an issue with UDP offloading (is that issue with ingress or egress?)

From what I've read on the ubnt forums - the currently-used Cavium offloading is better.. but it can't help with SQM

In this case espressobin is the minimum and x86 if you can afford it. The hardware fastpath stuff simply isn't compatible with QoS at all

It's been running on my desk for several days continuously and isn't even very warm. I'm going to print a little case for it at my library mostly because I want to try their printer. Otherwise I'd probably throw it in a Ziploc tub and drill some holes in the case for airflow and shove it in my closet and forget it.

It's very low power at idle. Even running a 20Mbps 4k video stream is basically idle for this board.

As we run down the rabbit hole: I have an original APU board, the crappy one, with the Realtek NICs and only a dual core. It looks like the Realtek NICs, or something deeper, are not taking advantage of the two cores, so a single core running cake maxes out around 300Mbps.

https://www.pcengines.ch/apu.htm

Stock APU board: http://www.dslreports.com/speedtest/44581168
APU board with Cake: http://www.dslreports.com/speedtest/44581272

root@OpenWrt:~# cat /etc/config/sqm

config queue 'eth1'
        option interface 'eth1'
        option script 'simple.qos'
        option qdisc_advanced '0'
        option linklayer 'none'
        option enabled '1'
        option debug_logging '0'
        option verbosity '5'
        option qdisc 'cake'
        option download '300000'
        option upload '300000'

Thanks for the additional data point. I thought cake in general didn't take advantage of multiple cores?

I think that's right at least for single instances. I'm not sure if separate instances might have separate threads?

Keep in mind that a router doesn't only have to deal with cake, there's also the need to serve the incoming hardware IRQs, do the actual routing, NAT, wireless, etc. If you have a computationally heavy process pegging one core, it's good to have another one 'free' for the normal business (meaning at least two cores really make sense on a router, especially for VPN, cake or similar uses).

Thank you all for this thread. It made me go and buy one to replace my E4200v2 Linksys as my home router.

Here are some performance numbers regarding this router with OpenVPN.

Remote MacBook Pro (OpenVPN/iperf3 client) > (tcp over wan) via EdgeRouterX (OpenVPN server) > iMac at home (iperf3 server)

iperf3 -c 10.66.77.22
Connecting to host 10.66.77.22, port 5201
[  5] local 10.2.1.2 port 51385 connected to 10.66.77.22 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  2.99 MBytes  25.0 Mbits/sec
[  5]   1.00-2.00   sec  3.07 MBytes  25.9 Mbits/sec
[  5]   2.00-3.00   sec  2.92 MBytes  24.5 Mbits/sec
[  5]   3.00-4.00   sec  3.11 MBytes  26.1 Mbits/sec
[  5]   4.00-5.00   sec  2.62 MBytes  21.9 Mbits/sec
[  5]   5.00-6.00   sec  2.79 MBytes  23.4 Mbits/sec
[  5]   6.00-7.00   sec  2.96 MBytes  24.8 Mbits/sec
[  5]   7.00-8.00   sec  2.81 MBytes  23.6 Mbits/sec
[  5]   8.00-9.00   sec  2.79 MBytes  23.4 Mbits/sec
[  5]   9.00-10.00  sec  2.96 MBytes  24.9 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  29.0 MBytes  24.3 Mbits/sec                  sender
[  5]   0.00-10.19  sec  28.4 MBytes  23.4 Mbits/sec                  receiver

iperf Done.

Note: I have FTTH connection at home.

For comparison, on the Linksys, I was getting slightly slower speeds.

[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  25.9 MBytes  21.7 Mbits/sec                  sender
[  5]   0.00-10.14  sec  25.5 MBytes  21.1 Mbits/sec                  receiver
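For a sense of scale, the two iperf3 summaries can be compared directly. Here is a small Python sketch using the sender and receiver figures quoted above (the function name is made up for illustration):

```python
def percent_faster(new_mbps, old_mbps):
    """Relative throughput gain of new over old, in percent."""
    return (new_mbps - old_mbps) / old_mbps * 100.0


# (ERX, Linksys) in Mbits/sec, taken from the iperf3 summary lines above
results = {"sender": (24.3, 21.7), "receiver": (23.4, 21.1)}

for side, (erx, linksys) in results.items():
    print(f"{side}: ERX is {percent_faster(erx, linksys):.1f}% faster")
```

This prints 12.0% for the sender and 10.9% for the receiver. Each figure comes from a single 10-second run, so treat the gap as indicative only.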

Next thing on my list, is looking into switching from OpenVPN to Wireguard for better performance.

The latest version of the Espressobin is available with an enclosure and a faster 1.2GHz processor from Amazon for $79.

One other person reported some instability on that v7 espressobin. I'd be interested to hear from anyone who tries it, in a separate thread.

I own an ERX and an Asus ac68u, which have different architectures. The ac68u has a CPU similar to the one in the espressobin, so I thought it might be relevant to bring over posts I've made in another thread.

I wanted to compare sqm performance in these routers so I've run a couple of tests.

The Asus ac68u has a dual core ARM, which I've tried at frequencies between 400 MHz and 1400 MHz. Results are posted in this thread

This is a massive thread.. glad I found it..

Asking this here for its relevance to this device.. When trying to make the serial connection..

https://www.db9-pinout.com/

I need pins 2, 3, and 5 to make the serial connection work right?

Not having success and wondering where I'm doing whatever wrong..