Wireguard performance

FYI - I have created https://openwrt.org/inbox/wireguard_performance (similar to https://openwrt.org/inbox/openvpn_performance).

Feel free to add more data, it's easy.

Cool!
Question: does Up/Down denote bidirectional saturating traffic, or the "uni-directional" maximum for each direction? By "uni-directional" I mean that the required reverse traffic (ACKs and such) is present, but there is no considerable data traffic in the reverse direction.

Despite a fair amount of knowledge of networking, I was never able to set up WireGuard. The interface gets RX but no actual traffic happens.

Same as in the OpenVPN performance page: unidirectional

Since WireGuard can reach the limits of GigE with an x86_64/AMD64 and AES-NI, from my WIP notes on testing:

Technology Limits

Packets vs. Bits

Commercial routers are generally rated in terms of packets per second (PPS), rather than bits per second of throughput. This is probably a better metric when one considers that the majority of the processing load is related to understanding the headers and making decisions about them, rather than managing the payload itself. The payload is generally copied to a buffer from the interface and later written out to another interface without modification (the exceptions are the headers and checksums, especially when NAT is in play).

As many home users think in terms of bandwidth rather than switching speed, and their packets are often large, at or near the MTU of their upstream link, these tests will use bandwidth as a measure.

Ethernet interfaces are left at their default MTU of 1500.
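
To make the packets-versus-bits distinction concrete, here is a minimal Python sketch. It assumes a hypothetical router whose forwarding cost is purely per packet; the 100 kpps budget and the function name are made up for illustration and are not measured values.

```python
def throughput_mbps(pps: float, packet_bytes: int) -> float:
    """Layer-3 throughput in Mbps for a given packet rate and packet size."""
    return pps * packet_bytes * 8 / 1e6

# Hypothetical per-packet budget, chosen only for illustration.
PPS_BUDGET = 100_000

for size in (64, 512, 1500):
    print(f"{size:>5} byte packets: {throughput_mbps(PPS_BUDGET, size):7.1f} Mbps")
```

The same packet budget yields very different bandwidth figures depending on packet size, which is why a bandwidth number measured with MTU-sized packets is close to a best case.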

GigE Throughput

A GigE link can't provide a throughput of 1,000 Mbps using TCP or UDP. There are headers, inter-packet gaps, and other overhead at the various layers that limit throughput. For typical IPv4 links, 940-950 Mbps is the highest achievable throughput for GigE without using "jumbo frames". See, for example, http://rickardnobel.se/actual-throughput-on-gigabit-ethernet/

For the purposes of this discussion, the rough numbers of

  • TCP -- 940 Mbps throughput limit
  • UDP -- 950 Mbps throughput limit

will be used.

IPv6 vs. IPv4 and Other Effects

IPv6 has larger headers than IPv4 (a 40-byte base header versus 20 bytes for IPv4 without options), so an IPv6 link will have slightly lower throughput than the IPv4 links tested here.

VLAN tagging, QinQ, or the like often add a few bytes to the on-wire packet. These impacts are on the order of a percent and are not examined in this study. For example, an 802.1Q (VLAN) tag adds 4 bytes to the over 1500 bytes of a "full" Ethernet frame, a fraction of a percent.

This isn't a "scholarly research paper", but more intended to provide general guidance. If you're within, say, 10% of the limits, you're probably too close for robust operation.

WireGuard

WireGuard adds its own encapsulation, which reduces the achievable bandwidth further.

WireGuard sets the interface MTU to 1420 by default. This reduces the throughput by a factor of roughly 1420/1500 ≈ 95% (ignoring fragmentation overhead).

  • WireGuard -- 900 Mbps throughput limit
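
For reference, a rough Python sketch of where the 80 bytes between 1500 and 1420 go. It assumes the usual WireGuard data-packet layout (a 16-byte message header plus a 16-byte authentication tag, carried over UDP); the constant and function names are mine. The 1420 figure corresponds to an IPv6 outer header.

```python
LINK_MTU = 1500
UDP_HEADER = 8
WG_HEADER = 16    # type/reserved (4) + receiver index (4) + counter (8)
WG_AUTH_TAG = 16  # Poly1305 authentication tag

def tunnel_mtu(outer_ip_header: int) -> int:
    """Largest inner packet that fits in one LINK_MTU-sized outer packet."""
    return LINK_MTU - outer_ip_header - UDP_HEADER - WG_HEADER - WG_AUTH_TAG

print("IPv6 outer:", tunnel_mtu(40))
print("IPv4 outer:", tunnel_mtu(20))
print(f"MTU ratio with IPv6 outer: {tunnel_mtu(40) / LINK_MTU:.1%}")
```

Applying that ratio to the 940-950 Mbps figures above lands at roughly 890-900 Mbps, in line with the rough limit quoted.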

I've written a quick n' dirty tutorial here which might help

Great summary. The actual limits are relatively easy to calculate though:

IPv4, Ethernet, TCP/IPv4 goodput:
1000 * ((1500-20-20)/(1500+38)) = 949.28 Mbps
IPv6, Ethernet, TCP/IPv6 goodput:
1000 * ((1500-40-20)/(1500+38)) = 936.28 Mbps
That is a reduction by ~1.3 percentage-points. Any additional "games" like VLAN tags or RFC 1323 timestamps will result in lower throughput.
e.g. for IPv4
1000 * ((1500-20-20-12)/(1500+38+4)) = 939.04 Mbps
and IPv6
1000 * ((1500-40-20-12)/(1500+38+4)) = 926.07 Mbps

Any VPN will essentially add one more layer of TCP/IP (or UDP/IP) headers as well as its own headers...
With a payload MTU of 1420 on an MTU 1500 carrier, I would expect
1000 * ((1420-20-20)/(1500+38)) = 897.27 Mbps for IPv4
1000 * ((1420-40-20)/(1500+38)) = 884.27 Mbps for IPv6
as best-case payload goodput through the VPN tunnel.
That essentially confirms your numbers (but also shows how to calculate them)
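
The calculations above can be wrapped in a small Python helper. This is only a sketch of the same arithmetic; the function and parameter names are my own, and it assumes full-size packets and an otherwise idle link.

```python
ETH_OVERHEAD = 38  # per-frame Ethernet overhead in bytes (the "+38" above)

def goodput_mbps(payload_mtu, ip_header, tcp_header=20, tcp_options=0,
                 link_mtu=1500, extra_l2=0, line_rate_mbps=1000.0):
    """Best-case TCP goodput in Mbps.

    payload_mtu: MTU seen by the TCP/IP stack (1500 plain, 1420 in the tunnel)
    link_mtu:    MTU of the underlying Ethernet link
    extra_l2:    extra on-wire bytes per frame, e.g. 4 for a VLAN tag
    """
    tcp_payload = payload_mtu - ip_header - tcp_header - tcp_options
    on_wire = link_mtu + ETH_OVERHEAD + extra_l2
    return line_rate_mbps * tcp_payload / on_wire

cases = {
    "IPv4": goodput_mbps(1500, 20),
    "IPv6": goodput_mbps(1500, 40),
    "IPv4 + VLAN + timestamps": goodput_mbps(1500, 20, tcp_options=12, extra_l2=4),
    "IPv6 + VLAN + timestamps": goodput_mbps(1500, 40, tcp_options=12, extra_l2=4),
    "IPv4 inside MTU 1420 tunnel": goodput_mbps(1420, 20),
    "IPv6 inside MTU 1420 tunnel": goodput_mbps(1420, 40),
}
for name, value in cases.items():
    print(f"{name:<30} {value:7.2f} Mbps")
```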

Why 1500 + 38? Well, that is simply the sum of:
7 byte preamble
1 byte start of frame delimiter
6 bytes destination MAC address
6 byte source MAC address
2 byte Ethertype header
4 byte frame check sequence
12 byte equivalent inter frame gap
7+1+6+6+2+4+12 = 38 (see e.g. https://en.wikipedia.org/wiki/Ethernet_frame)
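
The same breakdown as a short Python sketch, in case you want to play with other MTUs. The names are mine; the 9000-byte MTU is just a commonly used jumbo-frame size, included for comparison.

```python
# Per-frame Ethernet overhead in bytes, as listed above.
ETH_OVERHEAD_BYTES = {
    "preamble": 7,
    "start of frame delimiter": 1,
    "destination MAC": 6,
    "source MAC": 6,
    "Ethertype": 2,
    "frame check sequence": 4,
    "inter-frame gap": 12,
}
VLAN_TAG = 4  # 802.1Q tag

def wire_efficiency(mtu: int, vlan: bool = False) -> float:
    """Fraction of on-wire bytes that carry layer-3 data."""
    overhead = sum(ETH_OVERHEAD_BYTES.values()) + (VLAN_TAG if vlan else 0)
    return mtu / (mtu + overhead)

print(sum(ETH_OVERHEAD_BYTES.values()))        # total per-frame overhead
print(f"{wire_efficiency(1500):.2%}")          # standard MTU
print(f"{wire_efficiency(1500, vlan=True):.2%}")
print(f"{wire_efficiency(9000):.2%}")          # jumbo frames
```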

GL.iNet B1300 (IPQ4029) as receiver with a desktop computer as sender could reach 320 Mbit/s.
B1300 as receiver with an iPhone 8 as sender could reach 220 Mbit/s.

According to the result table from MikroTik, about 80 kpps is sufficient to reach Gigabit speeds at 1500 MTU.

Ethernet test results
RB750GL AR7242 1G all port test

| Mode | Configuration | 1518 byte kpps | 1518 byte Mbps | 512 byte kpps | 512 byte Mbps | 64 byte kpps | 64 byte Mbps |
|---|---|---|---|---|---|---|---|
| Bridging | none (fast path) | 81.2 | 986.1 | 178.4 | 730.9 | 194 | 99.3 |
| Bridging | 25 bridge filter rules | 51.5 | 625.5 | 52.3 | 214.3 | 53.7 | 27.5 |
| Routing | none (fast path) | 81.2 | 986.1 | 167 | 684 | 183.7 | 94.1 |
| Routing | 25 simple queues | 81.2 | 986.1 | 88.5 | 362.4 | 92.8 | 47.5 |
| Routing | 25 ip filter rules | 37.6 | 457 | 38.4 | 157.1 | 37.5 | 19.2 |
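
The 81.2 kpps figure can be cross-checked with a little arithmetic. This Python sketch assumes the frame sizes in the table include the Ethernet header and FCS (1518 = 1500 + 18), so that only the preamble, start-of-frame delimiter, and inter-frame gap (20 bytes) are extra on the wire; the function names are mine.

```python
WIRE_EXTRA = 20  # preamble + SFD (8) + inter-frame gap (12), not part of the frame size

def line_rate_kpps(frame_bytes: int, line_rate_bps: float = 1e9) -> float:
    """Maximum frames per second (in thousands) a link can carry."""
    return line_rate_bps / ((frame_bytes + WIRE_EXTRA) * 8) / 1e3

def frame_mbps(kpps: float, frame_bytes: int) -> float:
    """Mbps counted over whole frames, which appears to be how the table counts."""
    return kpps * frame_bytes * 8 / 1e3

for size in (1518, 512, 64):
    print(f"{size:>5} byte frames: {line_rate_kpps(size):8.1f} kpps at GigE line rate")

print(f"81.2 kpps of 1518 byte frames = {frame_mbps(81.2, 1518):.1f} Mbps")
```

Under that assumption the 1518-byte fast-path rows sit at line rate, while the 64-byte rows are far below what the wire could carry, so there the router's packet rate is the limit rather than the link.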