Which hardware for x Mbit OpenVPN

While encryption does affect speed, it's not the primary speed blocker. Test it yourself by turning off encryption.

As far as I can tell, what does affect speed is interfacing with the kernel through the TUN/TAP interface. IPsec is orders of magnitude faster, and I reckon WireGuard is as well, although I haven't tested it.

ipq8065

root@syno1:/# openssl speed -evp aes-256-cbc
Doing aes-256-cbc for 3s on 16 size blocks: 7155616 aes-256-cbc's in 2.98s
Doing aes-256-cbc for 3s on 64 size blocks: 2144111 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 576481 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 145743 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 18331 aes-256-cbc's in 3.00s

The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc      38419.41k    45741.03k    49193.05k    49746.94k    50055.85k

I humbly suggest a "speedtester" script/URL that benchmarks devices, optionally submits the results (via JSON?), and plots them dynamically server-side.

Each new device gets marked as it comes in and, shazam, the proof is in the pudding! There are too many variables (a crappy heatsink, etc.) to advise on chipset alone.
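As a rough illustration of the "benchmark and submit via JSON" idea, here is a minimal sketch that turns the summary table printed by `openssl speed` into a JSON payload. The function name `parse_summary`, the payload layout, and the `device` field are assumptions made for illustration; no submission endpoint exists in this thread, so the sketch only builds the payload and does not send it anywhere.

```python
import json

def parse_summary(header: str, row: str) -> dict:
    """Parse the 'type ...' header and the cipher row of `openssl speed`.

    Returns the cipher name and a mapping of block size (as a string)
    to throughput in 1000s of bytes per second.
    """
    # Block sizes are the purely numeric tokens in the header line.
    sizes = [tok for tok in header.split() if tok.isdigit()]
    tokens = row.split()
    # The last len(sizes) tokens are the values; anything before is the name
    # (some openssl versions print the name with a space, e.g. "aes-256 cbc").
    values = tokens[-len(sizes):]
    name = " ".join(tokens[:-len(sizes)])
    rates = {size: float(val.rstrip("k")) for size, val in zip(sizes, values)}
    return {"cipher": name, "kbytes_per_s": rates}

# Lines copied from the ipq8065 result above.
header = "type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes"
row = "aes-256-cbc      38419.41k    45741.03k    49193.05k    49746.94k    50055.85k"

# Hypothetical payload layout; "device" is just a free-form label here.
payload = {"device": "ipq8065", "result": parse_summary(header, row)}
print(json.dumps(payload, indent=2))
```

A real script would also need to record the OpenSSL version and whether `-elapsed` was used, since both change the numbers considerably.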

With mvebu (WRT3200ACM) I was able to go over your x86-64 benchmark:

Doing aes-256-cbc for 3s on 16 size blocks: 290904 aes-256-cbc's in 0.05s
Doing aes-256-cbc for 3s on 64 size blocks: 285983 aes-256-cbc's in 0.09s
Doing aes-256-cbc for 3s on 256 size blocks: 247111 aes-256-cbc's in 0.05s
Doing aes-256-cbc for 3s on 1024 size blocks: 162460 aes-256-cbc's in 0.11s
Doing aes-256-cbc for 3s on 8192 size blocks: 33972 aes-256-cbc's in 0.02s
OpenSSL 1.0.2p  14 Aug 2018
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) blowfish(ptr)
compiler: ccache_cc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DZLIB_SHARED -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -I/home/rmandrad/openwrt/staging_dir/target-arm_cortex-a9+vfpv3_musl_eabi/usr/include -I/home/rmandrad/openwrt/staging_dir/target-arm_cortex-a9+vfpv3_musl_eabi/include -I/home/rmandrad/openwrt/staging_dir/toolchain-arm_cortex-a9+vfpv3_gcc-8.2.0_musl_eabi/usr/include -I/home/rmandrad/openwrt/staging_dir/toolchain-arm_cortex-a9+vfpv3_gcc-8.2.0_musl_eabi/include/fortify -I/home/rmandrad/openwrt/staging_dir/toolchain-arm_cortex-a9+vfpv3_gcc-8.2.0_musl_eabi/include -znow -zrelro -DHAVE_CRYPTODEV -DUSE_CRYPTODEV_DIGESTS -DOPENSSL_NO_ERR -DTERMIOS -pipe -mcpu=cortex-a9 -mfpu=vfpv3-d16 -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -mfloat-abi=hard -fmacro-prefix-map=/home/rmandrad/openwrt/build_dir/target-arm_cortex-a9+vfpv3_musl_eabi/openssl-1.0.2p=openssl-1.0.2p -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -Wl,-z,now -Wl,-z,relro -O3 -fpic -I/home/rmandrad/openwrt/package/libs/openssl/include -ffunction-sections -fdata-sections -fomit-frame-pointer -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc      93089.28k   203365.69k  1265208.32k  1512354.91k 13914931.20k

Please add the -elapsed parameter, or the results are inaccurate: by default openssl speed divides by user CPU time rather than wall-clock time.

here you go

You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 270951 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 261191 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 228262 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 153599 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 33611 aes-256-cbc's in 3.00s

The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc       1445.07k     5572.07k    19478.36k    52428.46k    91780.44k
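To see why the run without `-elapsed` looked so fast, the summary figures can be recomputed from the "Doing ..." lines: throughput is block size times block count divided by the reported time. The sketch below does that for the 16-byte line of both runs; the helper name `throughput_k` and the regular expression are assumptions for illustration, not part of openssl.

```python
import re

# Matches lines such as:
#   Doing aes-256-cbc for 3s on 16 size blocks: 290904 aes-256-cbc's in 0.05s
LINE_RE = re.compile(
    r"on (?P<size>\d+) size blocks: (?P<count>\d+) .* in (?P<secs>[\d.]+)s"
)

def throughput_k(line: str) -> float:
    """Return 1000s of bytes per second for one 'Doing ...' line."""
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError(f"unrecognized line: {line!r}")
    size = int(m["size"])      # bytes per block
    count = int(m["count"])    # blocks processed
    secs = float(m["secs"])    # time openssl divided by
    return size * count / secs / 1000.0

# First run: time is user CPU time (0.05s), although the test ran for 3s.
cpu_time = "Doing aes-256-cbc for 3s on 16 size blocks: 290904 aes-256-cbc's in 0.05s"
# Second run: time is wall-clock time because of -elapsed.
elapsed = "Doing aes-256-cbc for 3s on 16 size blocks: 270951 aes-256-cbc's in 3.00s"

print(f"{throughput_k(cpu_time):.2f}k")  # 93089.28k, the inflated figure
print(f"{throughput_k(elapsed):.2f}k")   # 1445.07k, the realistic figure
```

The block counts are nearly the same in both runs; only the divisor differs, which is what inflates the first table.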

Also from a rango (WRT3200ACM), but with PR1547 in play, which would indicate there are some extra cycles to be had:

********  Test openssl  *********
Tue Jan  8 10:09:24 MST 2019

*********************************
DISTRIB_ID='OpenWrt'
DISTRIB_RELEASE='SNAPSHOT'
DISTRIB_REVISION='r9007-529c95cc15'
DISTRIB_TARGET='mvebu/cortexa9'
DISTRIB_ARCH='arm_cortex-a9_vfpv3'
DISTRIB_DESCRIPTION='OpenWrt SNAPSHOT r9007-529c95cc15'
DISTRIB_TAINTS='no-all busybox'
Linux bsaedgy 4.14.91 #0 SMP Mon Jan 7 16:13:59 2019 armv7l GNU/Linux

*********************************
(devcrypto) /dev/crypto engine
 [DES-CBC, DES-EDE3-CBC, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CTR, AES-192-CTR, AES-256-CTR, AES-128-ECB, AES-192-ECB, AES-256-ECB, MD5, SHA1, SHA224, SHA256, SHA384, SHA512]
     [ available ]
(dynamic) Dynamic engine loading support
     [ unavailable ]

*********************************

Running *--> time -v openssl speed -elapsed -evp AES-256-CBC <--*

You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 291174 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 284503 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 248323 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 162513 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 34142 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 16384 size blocks: 18064 aes-256-cbc's in 3.00s
OpenSSL 1.1.1a  20 Nov 2018
built on: Thu Jan  1 00:00:01 1970 UTC
options:bn(64,32) rc4(char) des(long) aes(partial) blowfish(ptr) 
compiler: ccache_cc -fPIC -pthread -Wa,--noexecstack -Wall -O3 -pipe -mcpu=cortex-a9 -mfpu=vfpv3-d16 -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -mfloat-abi=hard -Wformat -Werror=format-security -fstack-protector -D_FORTIFY_SOURCE=1 -Wl,-z,now -Wl,-z,relro -O3 -fpic -ffunction-sections -fdata-sections -znow -zrelro -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DZLIB -DZLIB_SHARED -DNDEBUG -DOPENSSL_PREFER_CHACHA_OVER_GCM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-256-cbc       1552.93k     6069.40k    21190.23k    55471.10k    93230.42k    98653.53k
	Command being timed: "openssl speed -elapsed -evp AES-256-CBC"
	User time (seconds): 0.41
	System time (seconds): 4.88
	Percent of CPU this job got: 29%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0m 18.03s
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 13440
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 0
	Minor (reclaiming a frame) page faults: 167
	Voluntary context switches: 1038976
	Involuntary context switches: 108
	Swaps: 0
	File system inputs: 0
	File system outputs: 0
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0
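
The `time -v` figures above can be cross-checked by hand. This minimal sketch recomputes the CPU share from the user, system, and elapsed times in the log; the function names are made up for illustration. Reading the low share as "the process mostly waited on the crypto engine" is an interpretation, not something the log states.

```python
def cpu_share(user_s: float, system_s: float, elapsed_s: float) -> float:
    """Fraction of wall-clock time the process spent on the CPU."""
    return (user_s + system_s) / elapsed_s

def kernel_fraction(user_s: float, system_s: float) -> float:
    """Fraction of the consumed CPU time that was spent in the kernel."""
    return system_s / (user_s + system_s)

# Values copied from the time -v output above.
user_s = 0.41      # User time (seconds)
system_s = 4.88    # System time (seconds)
elapsed_s = 18.03  # Elapsed (wall clock) time

print(f"CPU share:       {cpu_share(user_s, system_s, elapsed_s):.0%}")  # 29%
print(f"Kernel fraction: {kernel_fraction(user_s, system_s):.0%}")       # 92%
```

The 29% agrees with the "Percent of CPU this job got" line, and almost all of that CPU time is system time rather than user time.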

@rmandra, on my WRT3200ACM the results are different:

OpenWrt
Model	Linksys WRT3200ACM
Architecture	ARMv7 Processor rev 1 (v7l)
Firmware Version	OpenWrt SNAPSHOT r9008-ff62e83211 / LuCI Master (git-19.007.66460-4edac36)
Kernel Version	4.14.91

root@OpenWrt:~# openssl speed aes-256-cbc
Doing aes-256 cbc for 3s on 16 size blocks: 7878406 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 64 size blocks: 2259065 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 256 size blocks: 590554 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 1024 size blocks: 149235 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 8192 size blocks: 18732 aes-256 cbc's in 2.99s
OpenSSL 1.0.2p  14 Aug 2018
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) blowfish(ptr) 
compiler: arm-openwrt-linux-muslgnueabi-gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H 
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256 cbc      42018.17k    48193.39k    50393.94k    50938.88k    51321.92k
  
root@OpenWrt:~# openssl speed aes-128-cbc

OpenSSL 1.0.2p  14 Aug 2018
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) blowfish(ptr) 
compiler: arm-openwrt-linux-muslgnueabi-gcc -I. -I.. -I../include  -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H 
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128 cbc      59213.73k    66783.47k    69758.20k    70582.61k    70836.22k
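
Since the thread is about Mbit/s, it can help to convert the openssl figures (1000s of bytes per second) into Mbit/s. The sketch below does this for the 8192-byte column of the two tables above; the function name is an assumption for illustration. The result is only a ceiling for the cipher on its own: as noted earlier in the thread, encryption is not the main bottleneck, so real OpenVPN throughput will be well below it.

```python
def kbytes_to_mbit(k_bytes_per_s: float) -> float:
    """Convert openssl's '1000s of bytes per second' into Mbit/s."""
    return k_bytes_per_s * 1000 * 8 / 1_000_000

# 8192-byte column from the WRT3200ACM tables above.
aes256 = 51321.92
aes128 = 70836.22

print(f"aes-256-cbc: {kbytes_to_mbit(aes256):.1f} Mbit/s")  # 410.6 Mbit/s
print(f"aes-128-cbc: {kbytes_to_mbit(aes128):.1f} Mbit/s")  # 566.7 Mbit/s
```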

This graphic raises more questions than it answers.

I recently got an internet upgrade to 100/40 Mbit/s from my ISP, so I thought it was time to upgrade my old TP-Link WDR3600 (ar71xx) to a ZyXEL NBG6617 (ipq40xx) for the better SQM performance (at least that's what I expected).

Now the graph shows both on the same level when it comes to SQM performance. Is this really true, or am I misunderstanding something?

I don't think the graph is too precise.

@mezo
It will be faster, but probably not by much: ipq4*** isn't that fast per core, but if you combine 4....

I'm thinking of taking this to the next level by adding performance indicator numbers to the data entries, so that users can easily filter the devices according to their criteria.

If you were to choose 3 to 5 performance indicators to help users search for a device suitable for their needs, which indicators would they be?

Well, in the case of SQM performance, it would be great to filter devices by WAN download/upload rate.

The TL-WDR3600 can still cope with 100/40 Mbit/s, but not much more (I'd say 120-150 Mbit/s at most; it regularly hits 0% idle under load, even without SQM). The ipq40xx SoCs have more headroom, especially as they have multiple cores: one serving the Ethernet IRQs, another taking care of WLAN, the third doing PPPoE, the fourth SQM, etc.

Alternatives for graphs I can offer:


Offline generated

  • more / better options for visualization
  • updates depending on the one who created the graphic


wiki integrated

https://openwrt.org/inbox/openvpn_performance


Questions to you, the forum and wiki users:

  • Which graphic do you prefer and why: Offline generated or wiki integrated?
  • Are the numbers shown reasonable? If not, what numbers shall be shown?
  • Does the split in up/download make sense?
  • Any other targets to be added?
  • Any other performance criteria to be added (perhaps in a separate graphic?)
  • Other comments?
  • Sketches of how such graphics should look are also welcome

Please let me know your thoughts.

No. x86 with AES-NI will probably do 300 Mbps, without it 100 Mbps; I'd be surprised if mvebu were doing 100-200.

For example https://forum.netgate.com/topic/103216/pfsense-hardware-for-home-router-openvpn-performance/2

My espressobin will do about 400 Mbps routing with SQM. I can't see it suddenly being twice as fast with OpenVPN turned on and SQM turned off. Encryption and moving packets to and from userspace are easily going to be more demanding than SQM.

My J1900 will route with SQM at about 900 Mbps, but the site above suggests 80-100 Mbps for OpenVPN.

Wiki, because it's easier to maintain/update and to check revisions.

No point, if the difference is less than the measurement error.
Added "Up/Down" note.

Optimized the color a bit.

The lighter color for the columns is good!

@ all:

  • Feel free to add other relevant targets or accompanying text to https://openwrt.org/inbox/openvpn_performance, or adapt the values to more realistic ones
  • Feel free to create similar graphics in the inbox for other performance indicators, which allow the user to get a first rough direction on which target to select for his requirements

Coming back to https://openwrt.org/inbox/openvpn_performance

  1. Are any more comments, additions, or edits necessary before we move this page to the final location?
  2. Where shall we put it?
    2.1) https://openwrt.org/docs/guide-user/services/vpn/openvpn/performance
    2.2) https://openwrt.org/docs/guide-user/perf_and_log/openvpn_performance

My vote would be 2.1

Would a similar graphic make sense for WireGuard? Does anybody have some data on this?

Page moved to https://openwrt.org/docs/guide-user/services/vpn/openvpn/performance

Feel free to add / update the data.
