I recently adopted OpenWRT for a secure travel router project, using a Raspberry Pi CM4 on a DFRobot dual-GbE carrier board. I have SMB set up for media sharing off the SD card, and OpenVPN (PIA) set as the only routable connection allowed through the firewall for the LAN network.
I have seen some older posts indicating close to 100mbps OpenVPN performance on the Pi4. Currently, even with tuning I am only seeing 38mbps via speedtest-cli (running directly from OpenWRT to eliminate any routing overhead). The tuning parameters I have added are:
compress (no compression)
I also tried adding "fast-io" and toying with compression (both lzo and lz4) options to no effect.
Is there something I'm missing, a driver, setting, etc.? Or is the 38mbps I'm seeing more or less expected from the Pi4's SOC?
Could IRQBalance help? Is it possible OpenVPN/SSL is sticking to the same CPU as NIC traffic?
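To make the IRQ question concrete, here is a minimal sketch of how the per-CPU interrupt spread could be checked. It assumes the usual Linux /proc/interrupts layout and that the NIC rows mention the interface name; the "eth0" name, the sample numbers, and the function name are all made up for illustration.

```python
# Made-up sample in the usual /proc/interrupts layout (not real data).
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 11:      90210      88123      87990      90011     GICv2  30 Level     arch_timer
 39:     512000          0          0          0     GICv2 189 Level     eth0
 40:     498000          0          0          0     GICv2 190 Level     eth0
ERR:          0
"""


def per_cpu_irq_counts(interrupts_text, needle):
    """Sum interrupt counts per CPU for rows whose description contains needle."""
    lines = [ln for ln in interrupts_text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = [0] * ncpu
    for line in lines[1:]:
        fields = line.split()
        counts = fields[1:ncpu + 1]
        # Skip short rows such as ERR that lack per-CPU columns.
        if len(counts) < ncpu or not all(c.isdigit() for c in counts):
            continue
        if needle in " ".join(fields[ncpu + 1:]):
            for cpu, count in enumerate(counts):
                totals[cpu] += int(count)
    return totals


print(per_cpu_irq_counts(SAMPLE, "eth0"))
```

On the router itself the text would come from reading /proc/interrupts instead of the sample. If nearly all of the NIC's interrupts sit on one CPU and OpenVPN (which is single-threaded) is pinned to the same one, that would support the contention theory.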
The main workload for OpenVPN is encryption and compression, both heavily dependent on CPU and memory speed. The command openssl speed aes will give you insight into the processing power of the CPU.
An example of my SOC:
The 'numbers' are in 1000s of bytes per second processed.
type            16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128 cbc    20903.87k    22645.43k    23465.61k    23576.58k    23661.23k
aes-192 cbc    18097.18k    19552.68k    19998.89k    20114.77k    20149.59k
aes-256 cbc    16147.54k    17266.71k    17638.49k    17737.39k    17760.26k
A comparison with a Core i7 laptop:
The 'numbers' are in 1000s of bytes per second processed.
type            16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128 cbc   183127.75k   194145.97k   189470.38k   192785.41k   198868.99k   200447.32k
aes-192 cbc   165710.87k   171087.18k   170660.61k   171523.07k   172512.60k   172321.45k
aes-256 cbc   145358.28k   148898.03k   148209.92k   146003.63k   144547.84k   144703.49k
This means roughly 16 MB per second for an ARM A20 chip and 145 MB per second for a Core i7 computer (taking the aes-256 cbc rows).
16 MB/s is 128 Mbps and 145 MB/s is 1160 Mbps.
And this is just the processing power for encryption. You will also have to add the overhead for traffic handling and compression.
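As a rough sketch of that conversion: openssl speed reports thousands of bytes per second, so multiplying by 8 and dividing by 1000 gives megabits per second. The function name is just mine for illustration, and the inputs are the aes-256 cbc figures from the two tables above.

```python
def openssl_k_to_mbps(k_bytes_per_s):
    """Convert an openssl speed figure (1000s of bytes/s) to megabits/s."""
    bytes_per_s = k_bytes_per_s * 1000
    bits_per_s = bytes_per_s * 8
    return bits_per_s / 1_000_000


# aes-256 cbc, 16-byte blocks, from the two benchmark tables above.
for label, figure in [("A20", 16147.54), ("Core i7", 145358.28)]:
    print(f"{label}: {openssl_k_to_mbps(figure):.0f} Mbps")
```

This is an upper bound for a single core doing nothing but AES; real tunnel throughput will be lower.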
To add - I am able to route full gigabit (940mbps) between the Pi's two interfaces without the VPN connected, as well as max out my internet connection (~450mbps) with speedtest-cli. I can certainly accept an answer that the combined overhead of the two services results in a large reduction in performance. However, seeing less than 1/10th of the speed suggested by the OpenSSL benchmark (and 1/20th of the non-tunneled routing speed) suggests that I may be missing something.
Are you using active cooling? I am on the DFRobot customized build (21.02.0), which is one version behind, and my numbers are considerably worse than yours... though I still feel like OpenVPN could perform better.
Unfortunately PIA doesn't offer WireGuard outside of their official client (without some fairly expert / officially unsupported configuration) - but TCP is something I can do.
Changing over to TCP has massively improved throughput 2.5-3x! I am now seeing 75-90mbps over the tunnel.
It seems either this build or the Pi itself may struggle with UDP. I even made sure flood protection was off to be sure that wasn't an issue. I would be interested to see if there is anything I can look at to improve UDP performance, but I am more or less satisfied, given that my throughput is much higher than I could expect from any public/hotel wifi.
Interestingly, speedtest-cli seems to be having peering issues over the TCP VPN, and is sending me to hosts with 90-100+ms ping that are 300-3000km away from my VPN host. I'd be curious to see if this is due to slower TCP session negotiation and only a single ping being sent, or some other configuration issue with the CLI version. When run from the browser, though, I see ping times consistent with UDP tunnels (~+10ms compared to VPN off) and around 90mbps.
For anyone else setting this up on an RPi: I dropped my tuning parameters as well. Commenting out snd/rcvbuf and txqueuelen gave a modest improvement over TCP.
Back to tuning! Thanks for pointing me in the right direction.
WireGuard is actually remarkably simple to set up, and you don't need an "official client" application from PIA. Basically you just need a few bits of basic information: the endpoint domain name or IP and port, the peer public key, the preshared key (if any), and your tunnel IP. You'll create a key pair on your side and upload the public key from that pair to the PIA system. I don't know how easy or hard PIA makes this process (getting that info and providing your key), but you don't need to be an expert to set this up.
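To show how little information is involved, here is a minimal sketch that renders those pieces into a wg-quick style config. Everything passed in is a placeholder (keys, address, endpoint), the function name is mine, and routing all traffic via AllowedIPs = 0.0.0.0/0 is an assumption about a travel-router setup. On OpenWrt you would normally enter the same values through UCI/LuCI rather than a file like this.

```python
def render_wg_config(private_key, address, peer_public_key, endpoint,
                     preshared_key=None, keepalive_s=25):
    """Render a wg-quick style config from the basic tunnel parameters."""
    lines = [
        "[Interface]",
        f"PrivateKey = {private_key}",   # your side of the key pair
        f"Address = {address}",          # tunnel IP assigned by the provider
        "",
        "[Peer]",
        f"PublicKey = {peer_public_key}",
    ]
    if preshared_key:
        lines.append(f"PresharedKey = {preshared_key}")
    lines += [
        f"Endpoint = {endpoint}",        # host:port of the VPN server
        "AllowedIPs = 0.0.0.0/0",        # assumption: send all IPv4 via the tunnel
        f"PersistentKeepalive = {keepalive_s}",
    ]
    return "\n".join(lines) + "\n"


# Placeholder values only; none of these come from PIA.
print(render_wg_config("<your-private-key>", "10.0.0.2/32",
                       "<peer-public-key>", "vpn.example.net:51820"))
```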
However, if you are happy with OpenVPN now that you are getting better performance there, you can stick with that option.
From what I have read, there are issues with both tunnel setup and keepalive, both of which require scripting.
The keepalive portion is easy enough, just need to send pings to PIA's gateway every so often. The bigger issue is they destroy your public keys after seeing no sessions for more than X minutes.
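For the keepalive part, something along these lines would probably do. It is a minimal sketch: the gateway address and interval are placeholders I picked, the function names are mine, and it shells out to the system ping with -c and -W, which I am assuming the router's ping supports.

```python
import subprocess
import time


def ping_once(host, timeout_s=2):
    """Send one ICMP echo via the system ping; True if a reply came back."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def keepalive(gateway, interval_s=60, rounds=None,
              ping=ping_once, sleep=time.sleep):
    """Ping the gateway every interval_s seconds; return the failure count.

    rounds=None loops forever; a number limits the loop (handy for testing).
    """
    failures = 0
    done = 0
    while rounds is None or done < rounds:
        if not ping(gateway):
            failures += 1
        done += 1
        if rounds is None or done < rounds:
            sleep(interval_s)
    return failures
```

Calling keepalive("10.0.0.1") from a startup script would then keep the session warm, with 10.0.0.1 standing in for whatever the tunnel gateway actually is. This does nothing about the key expiry problem, which is the harder part.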
There is a public GitHub repository (https://github.com/hsand/pia-wg) for generating WireGuard config files. However, given PIA's policy on key destruction, I'd either have to invoke this manually every time I set up the travel router, or hardcode my credentials into the Python file and create a startup task or something.
For a home setup I could see this working, since downtime would be limited to router reboots and wouldn't expire my keys. With the ephemeral nature of the travel router, though, I'd need to go the extra step of adapting those WireGuard scripts (which rely on PIA's proprietary API) to run automatically, which seems like maybe a bridge too far.
Thank you all for your help on this. I would love to hear any additional recommendations on performance tuning, and will definitely be looking to switch to a provider with better support for WireGuard (I've heard Nord is good) - but for now 70-90mbps should be good!