What is the size of the firmware you sysupgrade with, and how much RAM does your R4S have?
Please explain why you ask this, as I fail to understand the reason.
Size of firmware is around 128MB. 4GB RAM R4S. My SD Card is 32GB.
I used @1715173329's PR code to build firmware on the latest trunk, and it works very well.
I also tried setting up NIC bonding with the two onboard ports; an iperf3 speed test reaches > 1.8 Gbps. The r8168 driver that @BotoX noted above is about 200-300 Mbps faster than the in-kernel r8169 driver.
iperf3 -V -i 5 -t 30 -c 192.168.16.11
iperf 3.9
Linux NanopiR4S 5.4.99 #0 SMP PREEMPT Fri Feb 19 01:25:49 2021 aarch64
Control connection MSS 1448
Time: Fri, 19 Feb 2021 12:03:52 UTC
Connecting to host 192.168.16.11, port 5201
Cookie: qylwd6hdwtjsutpso2b3rsny2i72xtgavsgs
TCP MSS: 1448 (default)
[ 5] local 192.168.16.1 port 53742 connected to 192.168.16.11 port 5201
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 30 second test, tos 0
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-5.00 sec 1.06 GBytes 1.82 Gbits/sec 2159 390 KBytes
[ 5] 5.00-10.00 sec 1.07 GBytes 1.85 Gbits/sec 978 382 KBytes
[ 5] 10.00-15.00 sec 1.05 GBytes 1.80 Gbits/sec 247 379 KBytes
[ 5] 15.00-20.00 sec 1.09 GBytes 1.87 Gbits/sec 165 410 KBytes
[ 5] 20.00-25.00 sec 1.05 GBytes 1.80 Gbits/sec 73 376 KBytes
[ 5] 25.00-30.00 sec 1.09 GBytes 1.87 Gbits/sec 60 379 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.00 sec 6.41 GBytes 1.84 Gbits/sec 3682 sender
[ 5] 0.00-30.00 sec 6.41 GBytes 1.84 Gbits/sec receiver
CPU Utilization: local/sender 17.0% (0.2%u/16.8%s), remote/receiver 19.6% (1.9%u/17.7%s)
snd_tcp_congestion bbr
rcv_tcp_congestion bbr
Can someone please share the output from cryptsetup benchmark?
Here is the result from my board, overclocked to 2 GHz with the CPU governor set to performance:
cryptsetup benchmark
# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1 207721 iterations per second for 256-bit key
PBKDF2-sha256 331408 iterations per second for 256-bit key
PBKDF2-sha512 279173 iterations per second for 256-bit key
PBKDF2-ripemd160 N/A
PBKDF2-whirlpool 164870 iterations per second for 256-bit key
argon2i 4 iterations, 296217 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
argon2id 4 iterations, 358570 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
# Algorithm | Key | Encryption | Decryption
aes-cbc 128b 84.4 MiB/s 86.6 MiB/s
serpent-cbc 128b 47.5 MiB/s 47.5 MiB/s
twofish-cbc 128b 77.2 MiB/s 74.0 MiB/s
aes-cbc 256b 65.6 MiB/s 66.7 MiB/s
serpent-cbc 256b 48.0 MiB/s 47.4 MiB/s
twofish-cbc 256b 77.8 MiB/s 73.9 MiB/s
aes-xts 256b 90.5 MiB/s 95.0 MiB/s
serpent-xts 256b 48.4 MiB/s 52.0 MiB/s
twofish-xts 256b 82.9 MiB/s 86.1 MiB/s
aes-xts 512b 71.8 MiB/s 71.4 MiB/s
serpent-xts 512b 51.6 MiB/s 52.0 MiB/s
twofish-xts 512b 87.8 MiB/s 86.0 MiB/s
Not too shabby, but obviously without crypto acceleration. Thanks!
Thanks for your amazing work
I just set up a NanoPi R4S as an 'in place replacement' for the 'internet box' provided by my ISP (for a 10G-EPON fiber connection).
I'm brand new to OpenWrt, so I still need to learn lots of things.
I generated my image based on the fork available on GitHub: 1715173329 / openwrt-official
Just FYI: I face an issue with the wan interface not acquiring an IPv6 from the ISP. But as soon as I set the wan interface into promiscuous mode, it's working well.
I posted more details, context, logs, etc... here:
Note that I use the basic r8169 driver, not the 8168-8.048.03 Realtek kernel module yet (for the simple reason that I don't yet know how to build my image with this driver).
Again: many thanks for your work and for sharing it.
Why is the optimization for RK3399 set to cortex-a73.cortex-a53 rather than cortex-a72.cortex-a53?
@nouknouk: "I face an issue with the wan interface not acquiring an IPv6 from the ISP. But as soon as I set the wan interface into promiscuous mode, it's working well."
Some more tests on the VLAN support issue on the WAN interface:
- same issue with a rebuilt FriendlyWrt image (based on kernel 5.10)
- same issue with a rebuilt OpenWrt image with the r8168-8.048.03 Realtek kernel module
The performance / load difference between r8168 and r8169 may result from the fact that r8168 has interrupt coalescing enabled by default (the drawback is that this increases latency), while r8169 does not.
To deal with this you can either use ethtool to enable IRQ coalescing with r8169, or, better and easier (from kernel 5.10):
echo 20000 > /sys/class/net/<interface>/gro_flush_timeout
echo 1 > /sys/class/net/<interface>/napi_defer_hard_irqs
Enabling TSO may also provide a benefit. It's disabled by default because of hardware bugs on some chip versions.
ethtool -K <interface> sg on tso on
In general r8169 has the more modern design and a much smaller memory footprint. On the other hand r8168 has a lot of undocumented magic that may help to work around problematic board / BIOS / network chip version combinations.
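For anyone who wants to try the two sysfs settings above on a specific port, here is a minimal sketch that wraps them in a function. The function name `apply_napi_tuning`, the `SYSFS_NET` override (there only so the function can be dry-run against a scratch directory) and the `eth0` interface name are assumptions for illustration, not something taken from the driver or from OpenWrt.

```shell
# Sketch: apply the two NAPI tunables from the post above to one interface.
# SYSFS_NET is a hypothetical override for dry runs; by default the
# function writes to the real /sys/class/net tree (needs root).
apply_napi_tuning() {
    iface="$1"
    base="${SYSFS_NET:-/sys/class/net}/$iface"
    if [ ! -d "$base" ]; then
        echo "no such interface: $iface" >&2
        return 1
    fi
    # gro_flush_timeout is given in nanoseconds (20000 ns = 20 us).
    echo 20000 > "$base/gro_flush_timeout" || return 1
    echo 1 > "$base/napi_defer_hard_irqs" || return 1
}

# Example, with eth0 standing in for whichever port uses r8169:
#   apply_napi_tuning eth0
#   ethtool -K eth0 sg on tso on
```

These values do not survive a reboot, so they would have to be reapplied from a startup script if they turn out to help.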
The result on my R4S seems promising. It is overclocked to 2.2 GHz/1.8 GHz and uses the following kernel config:
CONFIG_CRYPTO_DEV_ROCKCHIP=y
CONFIG_HW_RANDOM_ROCKCHIP=y
[root@R4S:/tmp/downloads]$ cryptsetup benchmark
# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1 227951 iterations per second for 256-bit key
PBKDF2-sha256 442064 iterations per second for 256-bit key
PBKDF2-sha512 382134 iterations per second for 256-bit key
PBKDF2-ripemd160 N/A
PBKDF2-whirlpool 160627 iterations per second for 256-bit key
argon2i 4 iterations, 298677 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
argon2id 4 iterations, 320365 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
# Algorithm | Key | Encryption | Decryption
aes-cbc 128b 653.1 MiB/s 907.7 MiB/s
serpent-cbc 128b 51.2 MiB/s 55.5 MiB/s
twofish-cbc 128b 82.5 MiB/s 90.4 MiB/s
aes-cbc 256b 558.5 MiB/s 800.9 MiB/s
serpent-cbc 256b 51.4 MiB/s 55.7 MiB/s
twofish-cbc 256b 83.3 MiB/s 89.5 MiB/s
aes-xts 256b 736.0 MiB/s 734.9 MiB/s
serpent-xts 256b 56.2 MiB/s 56.6 MiB/s
twofish-xts 256b 95.8 MiB/s 93.6 MiB/s
aes-xts 512b 661.7 MiB/s 669.0 MiB/s
serpent-xts 512b 56.4 MiB/s 56.4 MiB/s
twofish-xts 512b 95.7 MiB/s 93.5 MiB/s
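To put these numbers next to the earlier non-accelerated benchmark in this thread, a quick ratio helps. This is only a rough sketch: the helper name `speedup` is made up, and the two boards run at different clocks with different builds, so the ratios are indicative at best.

```shell
# Ratio between two cryptsetup throughput figures (both in MiB/s).
speedup() {
    # $1 = baseline MiB/s, $2 = current MiB/s
    awk -v base="$1" -v cur="$2" 'BEGIN { printf "%.1fx\n", cur / base }'
}

speedup 84.4 653.1   # aes-cbc 128b encryption     -> 7.7x
speedup 90.5 736.0   # aes-xts 256b encryption     -> 8.1x
speedup 47.5 51.2    # serpent-cbc 128b encryption -> 1.1x
```

AES improves by roughly 8x while serpent barely moves, which suggests the gain is specific to AES rather than a side effect of the higher clock.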
The OpenSSL result doesn't seem right here.
[root@R4S:/tmp/downloads]$ openssl engine -t -c
(dynamic) Dynamic engine loading support
[ unavailable ]
(devcrypto) /dev/crypto engine
[ available ]
And the speed test:
[root@R4S:/tmp/downloads]$ time openssl speed -evp aes-128-cbc
Doing aes-128-cbc for 3s on 16 size blocks: 50776931 aes-128-cbc's in 2.96s
Doing aes-128-cbc for 3s on 64 size blocks: 34280079 aes-128-cbc's in 2.99s
Doing aes-128-cbc for 3s on 256 size blocks: 14864233 aes-128-cbc's in 2.97s
Doing aes-128-cbc for 3s on 1024 size blocks: 4466057 aes-128-cbc's in 2.96s
Doing aes-128-cbc for 3s on 8192 size blocks: 599587 aes-128-cbc's in 2.96s
Doing aes-128-cbc for 3s on 16384 size blocks: 300172 aes-128-cbc's in 2.97s
OpenSSL 1.1.1k 25 Mar 2021
built on: Mon Mar 29 20:26:27 2021 UTC
options:bn(64,64) rc4(char) des(int) aes(partial) blowfish(ptr)
compiler: aarch64-openwrt-linux-musl-gcc -fPIC -pthread -Wa,--noexecstack -Wall -O3 -pipe -march=armv8-a+crypto+crc -mabi=lp64 -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -Wformat -Werror=format-security -fstack-protector -D_FORTIFY_SOURCE=1 -Wl,-z,now -Wl,-z,relro -O3 -fPIC -ffunction-sections -fdata-sections -znow -zrelro -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DVPAES_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG -DOPENSSL_PREFER_CHACHA_OVER_GCM
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-128-cbc 274469.90k 733754.20k 1281226.82k 1545014.31k 1659397.54k 1655898.33k
real 0m18.039s
user 0m17.818s
sys 0m0.098s
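One way to read the table: each figure is the operation count times the block size, divided by the seconds from the matching "Doing ..." line, in thousands of bytes per second. A minimal sketch of that arithmetic follows; the helper name `openssl_kbytes_per_sec` is made up for illustration.

```shell
# Recompute an "openssl speed" table entry from its "Doing ..." line.
# The result is in thousands of bytes per second, as in the table.
openssl_kbytes_per_sec() {
    # $1 = number of operations, $2 = block size in bytes, $3 = seconds
    awk -v ops="$1" -v size="$2" -v secs="$3" \
        'BEGIN { printf "%.2f\n", ops * size / secs / 1000 }'
}

openssl_kbytes_per_sec 300172 16384 2.97    # -> 1655898.33 (16384-byte column)
openssl_kbytes_per_sec 50776931 16 2.96     # -> 274469.90  (16-byte column)
```

As a side note, almost all of the 18 s of wall time in the `time` output is user time. That would be consistent with the cipher running in userspace on the CPU rather than through /dev/crypto, though this is an inference from the timing, not something the output states.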
I just received the 1GB version and would like to run some benchmarks. I have iperf3 -s running on my laptop, and when I run iperf3 in client mode on the device I get really slow speeds. I'm connected via a short Ethernet cable directly to the device. When I run it vice versa (server on the device and laptop in client mode), I get ~950/450. I have SQM disabled.
I'm using this image: https://github.com/quintus-lab/NanoPi-R4S-OpenWRT
root@OpenWrt:~# iperf3 -c 192.168.1.143 -f M
Connecting to host 192.168.1.143, port 5201
[ 5] local 192.168.1.1 port 58832 connected to 192.168.1.143 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 49.1 MBytes 49.1 MBytes/sec 0 282 KBytes
[ 5] 1.00-2.00 sec 48.4 MBytes 48.4 MBytes/sec 0 294 KBytes
[ 5] 2.00-3.00 sec 48.1 MBytes 48.1 MBytes/sec 0 297 KBytes
[ 5] 3.00-4.00 sec 48.3 MBytes 48.3 MBytes/sec 0 288 KBytes
[ 5] 4.00-5.00 sec 48.6 MBytes 48.6 MBytes/sec 0 282 KBytes
[ 5] 5.00-6.00 sec 48.1 MBytes 48.1 MBytes/sec 0 277 KBytes
[ 5] 6.00-7.00 sec 48.4 MBytes 48.4 MBytes/sec 0 245 KBytes
[ 5] 7.00-8.00 sec 48.3 MBytes 48.3 MBytes/sec 0 279 KBytes
[ 5] 8.00-9.00 sec 47.9 MBytes 47.9 MBytes/sec 0 274 KBytes
[ 5] 9.00-10.00 sec 48.7 MBytes 48.7 MBytes/sec 0 282 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 484 MBytes 48.4 MBytes/sec 0 sender
[ 5] 0.00-10.00 sec 482 MBytes 48.2 MBytes/sec receiver
root@OpenWrt:~# iperf3 -c 192.168.1.143 -f M -R
Connecting to host 192.168.1.143, port 5201
Reverse mode, remote host 192.168.1.143 is sending
[ 5] local 192.168.1.1 port 58836 connected to 192.168.1.143 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 112 MBytes 112 MBytes/sec
[ 5] 1.00-2.00 sec 113 MBytes 113 MBytes/sec
[ 5] 2.00-3.00 sec 113 MBytes 113 MBytes/sec
[ 5] 3.00-4.00 sec 113 MBytes 113 MBytes/sec
[ 5] 4.00-5.00 sec 113 MBytes 113 MBytes/sec
[ 5] 5.00-6.00 sec 113 MBytes 113 MBytes/sec
[ 5] 6.00-7.00 sec 113 MBytes 113 MBytes/sec
[ 5] 7.00-8.00 sec 113 MBytes 113 MBytes/sec
[ 5] 8.00-9.00 sec 113 MBytes 113 MBytes/sec
[ 5] 9.00-10.00 sec 113 MBytes 113 MBytes/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.00 sec 1.10 GBytes 113 MBytes/sec sender
[ 5] 0.00-10.00 sec 1.10 GBytes 113 MBytes/sec receiver
Am I doing this wrong? I'm extremely new to OpenWrt and networking in general. Any help would be appreciated!
EDIT: I just realized this is benchmarking on the device itself which is probably limited by the SD card? How should I properly run these benchmarks?
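Note that the `-f M` flag in the commands above makes iperf3 report MBytes/sec rather than Mbits/sec, so these figures are not directly comparable to the Mbits/sec numbers quoted elsewhere in the thread. A small conversion sketch follows. The helper name `mbytes_to_mbits` is made up, and the code assumes iperf3 counts MBytes in units of 1024^2 bytes and Mbits in units of 10^6 bits, which matches the other iperf3 outputs posted here.

```shell
# Convert an iperf3 "-f M" figure (MBytes/sec) to Mbits/sec.
# Assumption: 1 MByte = 1024 * 1024 bytes, 1 Mbit = 1000000 bits.
mbytes_to_mbits() {
    awk -v mb="$1" 'BEGIN { printf "%.0f\n", mb * 1024 * 1024 * 8 / 1000000 }'
}

mbytes_to_mbits 48.4   # device sending        -> 406
mbytes_to_mbits 113    # device receiving (-R) -> 948
```

So the receive direction is already close to gigabit line rate, and only the direction where the device sends is slow.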
@Fauks Ideally you should use two PCs with your device (the R4S) between them, or one PC with two interfaces. The idea is to send from one PC and receive on the other, with the traffic passing through the device.
That said, your measurements look quite low. The repository where you got the image has some benchmarks with different iperf3 settings; maybe try those?
@xiaobo Thanks for the images. I took the last one and noticed that the WAN and LAN ports are inverted?
I'd like to try building my own image; if someone has any documentation about the process, I'd appreciate it.
I also noticed that the MRs for this device were closed, but the changes (unless I read it wrong) are already present in the mainline kernel and U-Boot. Is that the case?
I've been messing around with a few different images, and I've settled on "ImmortalWrt" - Slim.img from: https://github.com/klever1988/nanopi-openwrt which has been awesome.
I think I figured out what my problem was: I had the LLA overhead set to 22. I'm on a network with another router locally, so I was not sure what I needed. After disabling it completely, I now get ~950 Mbps up or down. This is with SQM enabled using Cake - Piece of Cake.qos. I have the server running on the device and the client running on my computer. I'm not able to test using another machine here, but I will once I finish wiring my home.
Using CPUFreq, I have the two main cores locked at 1200 MHz and the smaller 4 cores locked to ~800 MHz, with the schedutil governor on both. With anything less, speeds start to drop, although that doesn't seem bad considering the cores can clock much higher.
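To double-check which governor and frequency limits are actually in effect, the standard cpufreq sysfs files can be read directly. This is a minimal sketch: the function name `show_cpufreq` and the `CPUFREQ_ROOT` override (there only for dry runs against a scratch directory) are made up for illustration.

```shell
# Sketch: print governor and frequency limits for each cpufreq policy.
# CPUFREQ_ROOT is a hypothetical override; by default the real sysfs
# tree is read.
show_cpufreq() {
    root="${CPUFREQ_ROOT:-/sys/devices/system/cpu/cpufreq}"
    for p in "$root"/policy*; do
        # Skip if the glob did not match or the policy is incomplete.
        [ -r "$p/scaling_governor" ] || continue
        gov="$(cat "$p/scaling_governor")"
        min_khz="$(cat "$p/scaling_min_freq")"
        max_khz="$(cat "$p/scaling_max_freq")"
        # sysfs reports kHz; print MHz for readability.
        echo "${p##*/}: governor=$gov min=$((min_khz / 1000))MHz max=$((max_khz / 1000))MHz"
    done
}

# Example: show_cpufreq
```

If a cluster is really locked, its min and max should print the same value.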
Am I doing this right? Here is my output:
root@DCGateway:~# tc -d qdisc
qdisc noqueue 0: dev lo root refcnt 2
qdisc mq 0: dev eth0 root
qdisc fq_codel 0: dev eth0 parent :1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64
qdisc cake 8007: dev eth1 root refcnt 2 bandwidth 1Gbit besteffort triple-isolate nonat nowash no-ack-filter split-gso rtt 100ms raw overhead 0
qdisc ingress ffff: dev eth1 parent ffff:fff1 ----------------
qdisc fq_codel 0: dev ztks5427oj root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 4Mb ecn drop_batch 64
qdisc noqueue 0: dev br-lan root refcnt 2
qdisc cake 8008: dev ifb4eth1 root refcnt 2 bandwidth 1Gbit besteffort triple-isolate nonat wash no-ack-filter split-gso rtt 100ms raw overhead 0
c:\iperf>iperf3 -c 10.10.10.1
Connecting to host 10.10.10.1, port 5201
[ 5] local 10.10.10.143 port 64789 connected to 10.10.10.1 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 110 MBytes 920 Mbits/sec
[ 5] 1.00-2.00 sec 113 MBytes 949 Mbits/sec
[ 5] 2.00-3.00 sec 113 MBytes 949 Mbits/sec
[ 5] 3.00-4.00 sec 113 MBytes 949 Mbits/sec
[ 5] 4.00-5.00 sec 113 MBytes 949 Mbits/sec
[ 5] 5.00-6.00 sec 113 MBytes 949 Mbits/sec
[ 5] 6.00-7.00 sec 113 MBytes 949 Mbits/sec
[ 5] 7.00-8.00 sec 113 MBytes 949 Mbits/sec
[ 5] 8.00-9.00 sec 113 MBytes 949 Mbits/sec
[ 5] 9.00-10.00 sec 114 MBytes 958 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.00 sec 1.10 GBytes 947 Mbits/sec sender
[ 5] 0.00-10.06 sec 1.10 GBytes 941 Mbits/sec receiver
iperf Done.
c:\iperf>iperf3 -c 10.10.10.1 -R
Connecting to host 10.10.10.1, port 5201
Reverse mode, remote host 10.10.10.1 is sending
[ 5] local 10.10.10.143 port 64795 connected to 10.10.10.1 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 111 MBytes 929 Mbits/sec
[ 5] 1.00-2.00 sec 112 MBytes 936 Mbits/sec
[ 5] 2.00-3.00 sec 111 MBytes 930 Mbits/sec
[ 5] 3.00-4.00 sec 111 MBytes 935 Mbits/sec
[ 5] 4.00-5.00 sec 110 MBytes 925 Mbits/sec
[ 5] 5.00-6.00 sec 111 MBytes 928 Mbits/sec
[ 5] 6.00-7.00 sec 110 MBytes 924 Mbits/sec
[ 5] 7.00-8.00 sec 110 MBytes 926 Mbits/sec
[ 5] 8.00-9.00 sec 111 MBytes 928 Mbits/sec
[ 5] 9.00-10.00 sec 111 MBytes 933 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.05 sec 1.09 GBytes 927 Mbits/sec 0 sender
[ 5] 0.00-10.00 sec 1.08 GBytes 929 Mbits/sec receiver
iperf Done.
I can't seem to get the LAN interface working with the 5.10 kernel. Could someone share the kernel configuration for it? (I think it's the one that goes via the PCIe bus.)
Neither dmesg nor lspci says anything about it.
Could someone do a benchmark of SQM at 1 Gbit/s bidirectional, WITHOUT an accompanying WAN -> LAN workload?
This is for an L2 transparent SQM bridge https://apenwarr.ca/log/?m=201808#openwrt which in my previous testing requires less CPU than a full-on router.
I have made an OpenWrt 21.02 build for R2S / R4S from vanilla OpenWrt + rockchip patches from ImmortalWrt + r8168 driver.
If you want to give it a try: https://github.com/anaelorlinski/OpenWRT-Rockchip/releases