Since the question of which hardware to buy for an X Mbit/s line with OpenVPN / SQM comes up fairly often: would it make sense to provide this information as a graphic or a table in the wiki?
Such a graphic or table could help users choose the right device for their use case.
Quick and dirty draft (no guarantee of correctness):
In terms of routing performance, mvebu should be significantly above ipq806x (close to 1 Gbit/s line speed), at least for as long as ipq806x's NSS cores aren't supported in OpenWrt; the same applies to SQM. For VPN use, ipq806x and mvebu should be quite similar.
On x86_64 hardware, I'd further split into with and without hardware AES support (e.g. AES-NI) when you get to VPN performance.
Also, I'm guessing you'll find that you need a minimum of "a core per thing", perhaps plus one.
I like the "blob" view, or a variant of it, or using ranges of some sort, since there certainly will be people who say "well, the graph said that I could get 300 Mbps with an IPQ806x and it hits 100% on a 250 Mbps line".
root@OpenWrt:~# openssl speed -evp aes-256-cbc
Doing aes-256-cbc for 3s on 16 size blocks: 85740063 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 28905987 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 7736828 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 2011451 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 254323 aes-256-cbc's in 3.00s
OpenSSL 1.0.2p 14 Aug 2018
built on: reproducible build, date unspecified
options:bn(64,64) rc4(16x,int) des(idx,cisc,2,int) aes(partial) blowfish(idx)
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
aes-256-cbc 457280.34k 616661.06k 660209.32k 686575.27k 694471.34k
root@OpenWrt:~#
Orange Pi Zero Plus H5 Quad-core 64-bit Cortex-A53
root@OpenWrt:~# openssl speed -evp aes-256-cbc
Doing aes-256-cbc for 3s on 16 size blocks: 16699551 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 9812427 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 3634138 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 1061078 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 139464 aes-256-cbc's in 3.00s
OpenSSL 1.0.2p 14 Aug 2018
built on: reproducible build, date unspecified
options:bn(64,64) rc4(ptr,char) des(idx,cisc,2,int) aes(partial) blowfish(ptr)
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
aes-256-cbc 89064.27k 209331.78k 310113.11k 362181.29k 380829.70k
root@OpenWrt:~#
While I don't think that this was intended to be a thread around determining "the numbers", the idea that there is a consistent benchmark for "VPN performance" is valuable over "with XXXXX VPN provider I'm getting YY Mbps throughput". Especially with WireGuard and OpenVPN presently using very different ciphers when it comes to computational speed (and OpenVPN likely to be able to use the "faster" ciphers when OpenSSL v1.1 becomes widely available), I'd favor "encryption speed" measures over in situ measurements of VPN throughput.
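To put the "encryption speed" idea into numbers, here is a minimal sketch that converts the `openssl speed` summary figures (which are in 1000s of bytes per second) into Mbit/s. The function name and the choice of the 1024-byte column are my own assumptions; the two values are copied from the two runs quoted above. The result is raw single-core cipher throughput only, so treat it as an upper bound rather than an expected VPN rate.

```python
def kbytes_to_mbit(kbytes_per_s: float) -> float:
    """Convert openssl's '1000s of bytes per second' figure to Mbit/s."""
    return kbytes_per_s * 1000 * 8 / 1_000_000


# 1024-byte column of the aes-256-cbc summary lines quoted above.
RESULTS_1024B = {
    "first run above": 686575.27,
    "second run above": 362181.29,
}

for label, rate in RESULTS_1024B.items():
    # Raw cipher throughput; real VPN throughput will be lower.
    print(f"{label}: ~{kbytes_to_mbit(rate):.0f} Mbit/s raw AES-256-CBC")
```

A figure like this only compares how fast devices encrypt; it says nothing about packet handling overhead.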
While encryption does affect speed, it's not the primary bottleneck. Test it yourself by turning off encryption.
As far as I can tell, what does affect speed is interfacing with the kernel through the TUN/TAP interface. IPsec is orders of magnitude faster, and I reckon WireGuard is as well, although I haven't tested it.
root@syno1:/# openssl speed -evp aes-256-cbc
Doing aes-256-cbc for 3s on 16 size blocks: 7155616 aes-256-cbc's in 2.98s
Doing aes-256-cbc for 3s on 64 size blocks: 2144111 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 576481 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 145743 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 18331 aes-256-cbc's in 3.00s
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
aes-256-cbc 38419.41k 45741.03k 49193.05k 49746.94k 50055.85k
I humbly suggest a "speedtester" script/URL that benchmarks devices, optionally submits the results (via JSON?), and plots them dynamically server-side.
New device results get marked and, shazam, the proof is in the pudding! There are too many variables to advise on chipset alone: a crappy heatsink, etc.
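As a rough illustration of the client side of such a "speedtester", the sketch below parses the summary table that `openssl speed` prints and turns it into JSON. The function name `parse_openssl_speed` and the JSON layout are hypothetical; no submission endpoint exists, so the sketch only prints the payload. The sample text is the summary from the `syno1` run quoted above.

```python
import json
import re


def parse_openssl_speed(output: str) -> dict:
    """Extract per-block-size throughput (1000s of bytes/s) from `openssl speed` output."""
    lines = output.splitlines()
    for i, line in enumerate(lines[:-1]):
        if line.startswith("type"):
            # Header looks like: "type 16 bytes 64 bytes 256 bytes ..."
            sizes = [int(n) for n in re.findall(r"(\d+) bytes", line)]
            # Next line looks like: "aes-256-cbc 38419.41k 45741.03k ..."
            fields = lines[i + 1].split()
            rates = [float(f.rstrip("k")) for f in fields[1:]]
            return {"cipher": fields[0], "kbytes_per_s": dict(zip(sizes, rates))}
    raise ValueError("no summary table found in openssl output")


SAMPLE = """The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
aes-256-cbc 38419.41k 45741.03k 49193.05k 49746.94k 50055.85k
"""

result = parse_openssl_speed(SAMPLE)
# Hypothetical payload; json.dumps turns the integer block sizes into string keys.
print(json.dumps(result, indent=2))
```

A real script would also have to record the device model, OpenWrt version and OpenSSL version, since those change the numbers as much as the chipset does.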
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 270951 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 261191 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 228262 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 153599 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 33611 aes-256-cbc's in 3.00s
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
aes-256-cbc 1445.07k 5572.07k 19478.36k 52428.46k 91780.44k
I recently got an internet upgrade to 100/40 Mbit/s from my ISP, so I thought it was time to upgrade my old TP-Link WDR3600 (ar71xx) to a Zyxel NBG6617 (ipq40xx) for its better SQM performance (at least that's what I expected).
Now the graph shows both on the same level when it comes to SQM performance. Is this really true, or am I misunderstanding something?
I'm thinking of taking this to the next level by adding performance indicator numbers to the data entries, so that users can easily filter the devices according to their criteria.
If you were to choose 3 to 5 performance indicators that should help users search for a device suitable for their needs, which indicators would those be?
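To make the filtering idea concrete, here is a minimal sketch with a single indicator. The `DeviceEntry` class, the field name `aes256cbc_kbytes_per_s` and the threshold are all hypothetical and not part of any existing data entry schema; the two values are the 1024-byte `openssl speed` figures posted earlier in the thread, with the device labels as I read them from those posts.

```python
from dataclasses import dataclass


@dataclass
class DeviceEntry:
    name: str
    # Hypothetical indicator: openssl speed aes-256-cbc, 1024-byte blocks,
    # in 1000s of bytes per second.
    aes256cbc_kbytes_per_s: float


def filter_devices(entries, min_aes_kbytes_per_s):
    """Return the entries that meet the minimum encryption speed."""
    return [e for e in entries if e.aes256cbc_kbytes_per_s >= min_aes_kbytes_per_s]


ENTRIES = [
    DeviceEntry("Orange Pi Zero Plus (H5)", 362181.29),
    DeviceEntry("syno1", 49746.94),
]

# Illustrative threshold only; a real filter would combine several indicators.
names = [e.name for e in filter_devices(ENTRIES, 100_000)]
print(names)
```

Further indicators (routing throughput, SQM throughput and so on) would simply be additional fields with their own thresholds.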