A WireGuard comparison DB

It's a bit surprising: the two CPUs belong to the same family and the J1900 has the higher clock rate, yet its result is slightly slower than the N2930 (mine is on a Jetway NF9HG-N2930 industrial ITX board).

Doesn't work

root@QNAP:~# cd /tmp
root@QNAP:/tmp# ./clean-up.sh
Cannot remove namespace file "/var/run/netns/wg-bench": No such file or directory
Cannot find device "wg-bench"
Cannot find device "wg-bench-wg"
root@QNAP:/tmp# ./setup-netns.sh
Cannot open init namespace: No such file or directory
RTNETLINK answers: Invalid argument
setting the network namespace "wg-bench" failed: Invalid argument
setting the network namespace "wg-bench" failed: Invalid argument
setting the network namespace "wg-bench" failed: Invalid argument
setting the network namespace "wg-bench" failed: Invalid argument
setting the network namespace "wg-bench" failed: Invalid argument
setting the network namespace "wg-bench" failed: Invalid argument
setting the network namespace "wg-bench" failed: Invalid argument
root@QNAP:/tmp#
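
As far as I can tell, `Cannot open init namespace` is what `ip netns` prints when it cannot open /proc/self/ns/net, which would suggest this QNAP kernel was built without network namespace support. That is an assumption based on the log above, not something confirmed on this device. A minimal sketch to check for the file (the function name is made up for illustration):

```python
import os


def has_net_namespaces(proc_root="/proc"):
    """Return True if <proc_root>/self/ns/net exists.

    Assumption: a missing entry means the kernel lacks network
    namespace support, so setup-netns.sh cannot work there.
    """
    # lexists() avoids following the special /proc link.
    return os.path.lexists(os.path.join(proc_root, "self", "ns", "net"))


if __name__ == "__main__":
    if has_net_namespaces():
        print("network namespaces look available")
    else:
        print("no /proc/self/ns/net: kernel probably lacks netns support")
```

If the file is missing, the benchmark scripts cannot run on that kernel as they are, since they depend on the `wg-bench` namespace.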

ubus call system board

root@OpenWrt:~# ubus call system board
{
    "kernel": "6.1.86",
    "hostname": "OpenWrt",
    "system": "ARMv8 Processor rev 0",
    "model": "Bananapi BPI-R4",
    "board_name": "bananapi,bpi-r4",
    "rootfs_type": "squashfs",
    "release": {
            "distribution": "OpenWrt",
            "version": "SNAPSHOT",
            "revision": "r25942-12137cb460",
            "target": "mediatek/filogic",
            "description": "OpenWrt SNAPSHOT r25942-12137cb460"
    }
}

./benchmark.sh

root@OpenWrt:~/wg-bench# ./benchmark.sh
Connecting to host 169.254.200.2, port 5201
[ 5] local 169.254.200.1 port 41866 connected to 169.254.200.2 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 144 MBytes 1.21 Gbits/sec 0 1.58 MBytes
[ 5] 1.00-2.00 sec 140 MBytes 1.17 Gbits/sec 0 1.75 MBytes
[ 5] 2.00-3.00 sec 137 MBytes 1.15 Gbits/sec 0 1.99 MBytes
[ 5] 3.00-4.00 sec 138 MBytes 1.16 Gbits/sec 0 2.09 MBytes
[ 5] 4.00-5.00 sec 136 MBytes 1.14 Gbits/sec 0 2.09 MBytes
[ 5] 5.00-6.00 sec 137 MBytes 1.15 Gbits/sec 0 2.32 MBytes
[ 5] 6.00-7.00 sec 137 MBytes 1.15 Gbits/sec 0 2.45 MBytes
[ 5] 7.00-8.00 sec 137 MBytes 1.15 Gbits/sec 0 2.45 MBytes
[ 5] 8.00-9.00 sec 136 MBytes 1.14 Gbits/sec 0 2.45 MBytes
[ 5] 9.00-10.00 sec 136 MBytes 1.14 Gbits/sec 0 2.45 MBytes


[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 1.35 GBytes 1.16 Gbits/sec 0 sender
[ 5] 0.00-10.00 sec 1.34 GBytes 1.15 Gbits/sec receiver

./benchmark.sh -R

root@OpenWrt:~/wg-bench# ./benchmark.sh -R
Connecting to host 169.254.200.2, port 5201
Reverse mode, remote host 169.254.200.2 is sending
[ 5] local 169.254.200.1 port 39390 connected to 169.254.200.2 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 147 MBytes 1.23 Gbits/sec
[ 5] 1.00-2.00 sec 149 MBytes 1.25 Gbits/sec
[ 5] 2.00-3.00 sec 147 MBytes 1.23 Gbits/sec
[ 5] 3.00-4.00 sec 149 MBytes 1.25 Gbits/sec
[ 5] 4.00-5.00 sec 149 MBytes 1.25 Gbits/sec
[ 5] 5.00-6.00 sec 148 MBytes 1.25 Gbits/sec
[ 5] 6.00-7.00 sec 148 MBytes 1.24 Gbits/sec
[ 5] 7.00-8.00 sec 146 MBytes 1.23 Gbits/sec
[ 5] 8.00-9.00 sec 151 MBytes 1.26 Gbits/sec
[ 5] 9.00-10.00 sec 151 MBytes 1.26 Gbits/sec


[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 1.45 GBytes 1.25 Gbits/sec 0 sender
[ 5] 0.00-10.00 sec 1.45 GBytes 1.25 Gbits/sec receiver

root@OpenWrt:~#

Now it is better :)
|fujitsu-futro-s920 | AMD GX-415GA SOC with Radeon(tm) HD Graphics | 23.05.3 | 799 Mbps|

ubus call system board
{
        "kernel": "5.15.150",
        "hostname": "OpenWrt",
        "system": "AMD GX-415GA SOC with Radeon(tm) HD Graphics",
        "model": "FUJITSU FUTRO S920",
        "board_name": "fujitsu-futro-s920",
        "rootfs_type": "ext4",
        "release": {
                "distribution": "OpenWrt",
                "version": "23.05.3",
                "revision": "r23809-234f1a2efa",
                "target": "x86/64",
                "description": "OpenWrt 23.05.3 r23809-234f1a2efa"
        }
}

[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  94.1 MBytes   789 Mbits/sec    0    502 KBytes
[  5]   1.00-2.00   sec  95.2 MBytes   799 Mbits/sec    0    538 KBytes
[  5]   2.00-3.00   sec  96.0 MBytes   805 Mbits/sec    0    573 KBytes
[  5]   3.00-4.00   sec  95.8 MBytes   803 Mbits/sec    0    573 KBytes
[  5]   4.00-5.00   sec  94.8 MBytes   795 Mbits/sec    0    573 KBytes
[  5]   5.00-6.00   sec  95.2 MBytes   799 Mbits/sec    0    573 KBytes
[  5]   6.00-7.00   sec  94.8 MBytes   795 Mbits/sec    0    573 KBytes
[  5]   7.00-8.00   sec  95.6 MBytes   802 Mbits/sec    0    573 KBytes
[  5]   8.00-9.00   sec  96.0 MBytes   805 Mbits/sec    0    573 KBytes
[  5]   9.00-10.00  sec  95.4 MBytes   800 Mbits/sec    0    573 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   953 MBytes   799 Mbits/sec    0             sender
[  5]   0.00-10.00  sec   952 MBytes   798 Mbits/sec                  receiver
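
In case it helps others submit results, here is a rough sketch that builds a table row like the one in this post from the `ubus call system board` JSON and the measured sender bitrate. The column order (board_name, system, version, Mbps) is inferred from that single row, and `table_row` is a hypothetical helper, not part of the benchmark scripts.

```python
import json


def table_row(board_json: str, mbps: float) -> str:
    """Format one comparison-table row from `ubus call system board` JSON."""
    board = json.loads(board_json)
    return "|{} | {} | {} | {:.0f} Mbps|".format(
        board["board_name"],
        board["system"],
        board["release"]["version"],
        mbps,
    )


# Trimmed sample holding only the fields the row needs.
sample = """
{
    "system": "AMD GX-415GA SOC with Radeon(tm) HD Graphics",
    "board_name": "fujitsu-futro-s920",
    "release": {"version": "23.05.3"}
}
"""

print(table_row(sample, 799))
# |fujitsu-futro-s920 | AMD GX-415GA SOC with Radeon(tm) HD Graphics | 23.05.3 | 799 Mbps|
```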

With a 50% clock rate jump and 2 more cores, this result is a lot better.

@daniel

There seems to be a regression with kernel 6.6 and before I report it I'd like to know how widespread the issue is. So could a few people compare a snapshot or two with kernel 6.1 to kernel 6.6?

My images were compiled with the same packages and commits. No settings were changed.

r26156-fb2475e6bd (kernel 6.1.89)

root@GL-MT6000:~/wg-bench# ./benchmark.sh
Connecting to host 169.254.200.2, port 5201
[  5] local 169.254.200.1 port 55986 connected to 169.254.200.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  95.6 MBytes   801 Mbits/sec    0    492 KBytes
[  5]   1.00-2.00   sec  95.8 MBytes   803 Mbits/sec    0    516 KBytes
[  5]   2.00-3.00   sec  95.1 MBytes   798 Mbits/sec    0    540 KBytes
[  5]   3.00-4.00   sec  95.6 MBytes   802 Mbits/sec    0    540 KBytes
[  5]   4.00-5.00   sec  96.2 MBytes   807 Mbits/sec    0    566 KBytes
[  5]   5.00-6.00   sec  96.6 MBytes   810 Mbits/sec    0    566 KBytes
[  5]   6.00-7.00   sec  95.2 MBytes   799 Mbits/sec    0    566 KBytes
[  5]   7.00-8.00   sec  96.0 MBytes   805 Mbits/sec    0    566 KBytes
[  5]   8.00-9.00   sec  96.0 MBytes   805 Mbits/sec    0    566 KBytes
[  5]   9.00-10.00  sec  96.0 MBytes   805 Mbits/sec    0    566 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   958 MBytes   804 Mbits/sec    0             sender
[  5]   0.00-10.00  sec   957 MBytes   803 Mbits/sec                  receiver

r26156-fb2475e6bd (kernel 6.6.29)

root@GL-MT6000:~/wg-bench# ./benchmark.sh
Connecting to host 169.254.200.2, port 5201
[  5] local 169.254.200.1 port 40992 connected to 169.254.200.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  91.8 MBytes   769 Mbits/sec    0    434 KBytes
[  5]   1.00-2.00   sec  91.8 MBytes   770 Mbits/sec    0    537 KBytes
[  5]   2.00-3.00   sec  90.6 MBytes   760 Mbits/sec    0    564 KBytes
[  5]   3.00-4.00   sec  90.6 MBytes   760 Mbits/sec    0    564 KBytes
[  5]   4.00-5.00   sec  88.9 MBytes   745 Mbits/sec    0    633 KBytes
[  5]   5.00-6.00   sec  88.2 MBytes   741 Mbits/sec    0    633 KBytes
[  5]   6.00-7.00   sec  88.9 MBytes   745 Mbits/sec    0    633 KBytes
[  5]   7.00-8.00   sec  88.8 MBytes   744 Mbits/sec    0    633 KBytes
[  5]   8.00-9.00   sec  89.0 MBytes   747 Mbits/sec    0    633 KBytes
[  5]   9.00-10.00  sec  89.2 MBytes   749 Mbits/sec    0    633 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   898 MBytes   753 Mbits/sec    0             sender
[  5]   0.00-10.00  sec   897 MBytes   752 Mbits/sec                  receiver

r26199+5-0b0e3e22f8 (kernel 6.1.89)

root@GL-MT6000:~/wg-bench# ./benchmark.sh
Connecting to host 169.254.200.2, port 5201
[  5] local 169.254.200.1 port 40830 connected to 169.254.200.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  95.4 MBytes   799 Mbits/sec    0    481 KBytes
[  5]   1.00-2.00   sec  96.0 MBytes   805 Mbits/sec    0    502 KBytes
[  5]   2.00-3.00   sec  95.5 MBytes   801 Mbits/sec    0    502 KBytes
[  5]   3.00-4.00   sec  94.8 MBytes   795 Mbits/sec    0    529 KBytes
[  5]   4.00-5.00   sec  96.0 MBytes   805 Mbits/sec    0    529 KBytes
[  5]   5.00-6.00   sec  95.9 MBytes   804 Mbits/sec    0    529 KBytes
[  5]   6.00-7.00   sec  95.9 MBytes   804 Mbits/sec    0    553 KBytes
[  5]   7.00-8.00   sec  97.0 MBytes   814 Mbits/sec    0    553 KBytes
[  5]   8.00-9.00   sec  96.1 MBytes   807 Mbits/sec    0    553 KBytes
[  5]   9.00-10.00  sec  96.0 MBytes   805 Mbits/sec    0    553 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   958 MBytes   804 Mbits/sec    0             sender
[  5]   0.00-10.00  sec   957 MBytes   803 Mbits/sec                  receiver

r26199+5-0b0e3e22f8 (kernel 6.6.30)

root@GL-MT6000:~/wg-bench# ./benchmark.sh
Connecting to host 169.254.200.2, port 5201
[  5] local 169.254.200.1 port 55376 connected to 169.254.200.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  92.9 MBytes   778 Mbits/sec    0    564 KBytes
[  5]   1.00-2.00   sec  90.0 MBytes   755 Mbits/sec    0    564 KBytes
[  5]   2.00-3.00   sec  90.5 MBytes   759 Mbits/sec    0    564 KBytes
[  5]   3.00-4.00   sec  90.5 MBytes   759 Mbits/sec    0    564 KBytes
[  5]   4.00-5.00   sec  91.5 MBytes   768 Mbits/sec    0    632 KBytes
[  5]   5.00-6.00   sec  89.1 MBytes   748 Mbits/sec    0    632 KBytes
[  5]   6.00-7.00   sec  88.5 MBytes   742 Mbits/sec    0    632 KBytes
[  5]   7.00-8.00   sec  88.0 MBytes   739 Mbits/sec    0    632 KBytes
[  5]   8.00-9.00   sec  88.4 MBytes   741 Mbits/sec    0    632 KBytes
[  5]   9.00-10.00  sec  89.0 MBytes   746 Mbits/sec    0    632 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   898 MBytes   753 Mbits/sec    0             sender
[  5]   0.00-10.00  sec   897 MBytes   752 Mbits/sec                  receiver
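
For anyone comparing their own 6.1 and 6.6 runs, a small sketch that pulls the sender average out of saved iperf3 text output and prints the relative change. The helper names are made up for illustration, and the regex only assumes the summary-line layout shown in the logs above.

```python
import re

# Matches the iperf3 summary line that ends in "sender", for example:
# [  5]   0.00-10.00  sec   958 MBytes   804 Mbits/sec    0             sender
SUMMARY = re.compile(r"([\d.]+)\s+([KMG])bits/sec\b.*\bsender\s*$")
SCALE = {"K": 1e-3, "M": 1.0, "G": 1e3}  # convert to Mbit/s


def sender_mbps(output: str) -> float:
    """Return the sender-side average bitrate in Mbit/s."""
    for line in output.splitlines():
        m = SUMMARY.search(line)
        if m:
            return float(m.group(1)) * SCALE[m.group(2)]
    raise ValueError("no sender summary line found")


def change_percent(old: float, new: float) -> float:
    """Relative change from old to new, in percent (negative = slower)."""
    return (new - old) / old * 100.0


# Sender summary lines copied from the r26156 runs above.
k61 = "[  5]   0.00-10.00  sec   958 MBytes   804 Mbits/sec    0             sender"
k66 = "[  5]   0.00-10.00  sec   898 MBytes   753 Mbits/sec    0             sender"

print(f"{change_percent(sender_mbps(k61), sender_mbps(k66)):.1f}%")  # -6.3%
```

Both snapshot pairs above give the same sender averages (804 vs 753 Mbits/sec), so the drop works out to roughly 6% either way.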

Just to back this up: I see the same regression on my GL-MT6000. When I ran this script last week, 6.1.88 averaged 805 Mbps; the new snapshot on 6.6.30 averages 755 Mbps.

Meanwhile performance for all other tests (SQM, Ksmbd, etc.) is the same or better.

This CPU should have better numbers with WireGuard; also, Intel QAT should speed up WireGuard a lot.

By default OpenWrt doesn't have QAT compiled in AFAIK.

ubus call system board
{
	"kernel": "6.6.30",
	"hostname": "Dell_x64",
	"system": "Intel(R) Core(TM) i5-7500T CPU @ 2.70GHz",
	"model": "QEMU Standard PC (i440FX + PIIX, 1996)",
	"board_name": "qemu-standard-pc-i440fx-piix-1996",
	"rootfs_type": "ext4",
	"release": {
		"distribution": "OpenWrt",
		"version": "SNAPSHOT",
		"revision": "r0+26238-b1cb9a0713",
		"target": "x86/64",
		"description": "OpenWrt SNAPSHOT r0+26238-b1cb9a0713"
	}
}
root@Dell_x64:/tmp/wg-bench# ./benchmark.sh 
Connecting to host 169.254.200.2, port 5201
[  5] local 169.254.200.1 port 36754 connected to 169.254.200.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   385 MBytes  3.22 Gbits/sec    0    991 KBytes       
[  5]   1.00-2.00   sec   389 MBytes  3.26 Gbits/sec    0   1.12 MBytes       
[  5]   2.00-3.00   sec   390 MBytes  3.27 Gbits/sec    0   1.12 MBytes       
[  5]   3.00-4.00   sec   391 MBytes  3.28 Gbits/sec    0   1.26 MBytes       
[  5]   4.00-5.00   sec   386 MBytes  3.24 Gbits/sec    0   1.26 MBytes       
[  5]   5.00-6.00   sec   374 MBytes  3.14 Gbits/sec    0   1.26 MBytes       
[  5]   6.00-7.00   sec   388 MBytes  3.26 Gbits/sec    0   1.26 MBytes       
[  5]   7.00-8.00   sec   382 MBytes  3.21 Gbits/sec    0   2.36 MBytes       
[  5]   8.00-9.00   sec   387 MBytes  3.25 Gbits/sec    0   2.36 MBytes       
[  5]   9.00-10.00  sec   387 MBytes  3.25 Gbits/sec    0   2.48 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  3.77 GBytes  3.24 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  3.77 GBytes  3.23 Gbits/sec                  receiver

iperf Done.

root@Dell_x64:/tmp/wg-bench# ./benchmark.sh -R
Connecting to host 169.254.200.2, port 5201
Reverse mode, remote host 169.254.200.2 is sending
[  5] local 169.254.200.1 port 51960 connected to 169.254.200.2 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   343 MBytes  2.88 Gbits/sec                  
[  5]   1.00-2.00   sec   350 MBytes  2.94 Gbits/sec                  
[  5]   2.00-3.00   sec   349 MBytes  2.93 Gbits/sec                  
[  5]   3.00-4.00   sec   349 MBytes  2.93 Gbits/sec                  
[  5]   4.00-5.00   sec   350 MBytes  2.93 Gbits/sec                  
[  5]   5.00-6.00   sec   355 MBytes  2.98 Gbits/sec                  
[  5]   6.00-7.00   sec   352 MBytes  2.95 Gbits/sec                  
[  5]   7.00-8.00   sec   356 MBytes  2.99 Gbits/sec                  
[  5]   8.00-9.00   sec   358 MBytes  3.00 Gbits/sec                  
[  5]   9.00-10.00  sec   357 MBytes  3.00 Gbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  3.44 GBytes  2.95 Gbits/sec   63             sender
[  5]   0.00-10.00  sec  3.44 GBytes  2.95 Gbits/sec                  receiver

iperf Done.

I see that you are running QEMU, so how many cores/threads are in use? Or did you give all resources to the VM?

P.S. Is the snapshot on kernel 6.6 now, or did you build it manually?

All CPU resources given, using Proxmox.

Not got it into production just yet. Still tinkering.

Built manually.

Thanks for the confirmation. Wow, comparing the two 4C4T x86-64 CPUs, the i5-7500T loses to the N100, and not by just a little. Quite a surprise to me.

Not really. QAT doesn't offer any acceleration for WireGuard.

Off-topic, but anyway: QAT is pretty slow at AES compared to the CPU, so I'm running my C3558 SSD NAS with QAT disabled. With QAT it's so slow that I can't saturate these drives, while CPU encryption has no problem. I've read somewhere that higher core count Atom parts have faster QAT, so maybe it makes sense to enable it there.

Yes, it does ChaCha20, but only in gen4 QAT engines according to the info I have now found, so not for 2***/3*** Atoms. This means it is worthless for us, since we don't have cheap hardware with those accelerators.

Yeah. The concern is that if it impacts the GL-MT6000 this much then it might be even worse for lower-end routers. But so far nobody else has shared benchmarks from both 6.1 and 6.6.

It's probably safe to assume that the issue can be seen with all MediaTek devices, so maybe @nbd would be able to look into this.

Does anyone have statistics on how the Intel J4125/N100 performs with WireGuard? I'm looking to maximize WireGuard throughput on a 1 Gbps internet connection.

You didn't check the table? There is one for N100 already.
For the J4125, let me find some time to run the test; I've got a mini PC but haven't tested it yet. However, you can see that even my super old Celeron N2930 can do 700+ Mbps, and the J4125 is much faster, so I don't think it will be less than 1 Gbps.

Kernel 6.6 needs manual compiling, and not many people want to (or know how to) do it. I am also waiting for the master snapshot to move to 6.6.

I somehow missed the N100 in the table, thanks for pointing it out.
BTW, I found some reports via Google that say the J4125 can't handle 1 Gbps WireGuard (though they are running OPNsense, so I'm not sure if this is relevant).