The biggest problem I see with those (e.g. CESA or NSS on QCA platforms) is not necessarily the performance (although it would be for the NSS case), but the compatibility: these hardware implementations remain exotic and need special support for openssl crypto engines. This is fine if available, but easily breaks, especially once your chosen platform becomes less common. x86's AES-NI has the advantage of being too big to ignore, making support for it ubiquitous. The (non-mandatory, hello RPi…) AES for ARMv8 might become common enough in the future as well; CESA/NSS probably not.
That is a good point, which I had not researched in detail. To the best of your knowledge, what is the current state of support in openssl in OpenWrt for a) Marvell CESA, b) Qualcomm NSS and c) AES for ARMv8?
I have not looked into NSS a lot, beyond the fact that it is used by IPQ806x to offload packet inspection/switching to it. Is QCA NSS a general purpose crypto engine, similar to Marvell's CESA, that can be used by user-space applications like openssl for hardware acceleration of AES? If so, can you elaborate a bit on your comment above on why you see NSS having a performance issue compared to CESA?
I seem to remember that it's technically available, but non-default and not quite trivial to set up either. It doesn't particularly help that mvebu is effectively dead as a consumer routing platform due to the state of mwlwifi (ARMADA 3070/ 7040 and 8040 are mostly wired-only, high priced devices).
AFAIK that's the current state, the NSS cores can do this - but re-writing it in a way that OpenSSL can really make use of it at full speed (and faster than in software) will be difficult, particularly for ipq806x (apparently easier for ipq807x), and non-proprietary (by this I mean cooperating with upstream OpenSSL, without needing a ton of out-of-tree patches) cryptoengine drivers for OpenSSL don't exist yet.
I don't know, but this is probably most likely to see 'usable' support.
Ergo, the only targets where this will (for the most part) "just work™" are x86_64 and potentially ARMv8 (except for the RPi4, because Broadcom (and maybe some other chip makers too) didn't license AES acceleration).
Yes, I recently found that out about the RPi4. A real pity.
So based on this analysis, do you know of any ARMv8 WiFi routers, which are currently supported, or in the pipeline of being supported, by OpenWrt?
Can anybody on this forum share, or point me in the right direction on, what is needed to set up openssl in OpenWrt to utilize the CESA crypto accelerators in Marvell-based routers?
Alternatively, if anybody with a Linksys WRT series router has figured out how to accelerate OpenVPN using the CESA crypto accelerators, could you guide me through it, please? Thanks.
Do not(!) ask me about AES acceleration for mt7622bv though; while I'm interested in the device, I don't know (and AES acceleration is not a deciding factor for me). Support for this device is still very new, so there are still some pending issues, but those seem to be rather straightforward.
--
In terms of CPU performance ipq8074 should be faster, but no ipq807x devices are supported yet (and it doesn't seem to be within range either) - and we'll have to see how far it will go without NSS support.
The CPU of the Linksys E8450 seems to have the AES feature:
root@OpenWrt:~# cat /proc/cpuinfo
processor : 0
BogoMIPS : 25.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4
processor : 1
BogoMIPS : 25.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4
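Checking the Features line by eye works, but it can also be scripted. The following is a minimal sketch, assuming the /proc/cpuinfo layout pasted above; the function names and the trimmed sample are made up for illustration. It only tells you that the CPU advertises the flag, not that any given OpenSSL build actually uses it.

```python
def cpu_features(cpuinfo_text):
    """Return one set of feature flags per 'Features' line found."""
    per_cpu = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            per_cpu.append(set(value.split()))
    return per_cpu


def all_cores_have(cpuinfo_text, flag):
    """True if every listed core advertises the flag (and at least one core exists)."""
    per_cpu = cpu_features(cpuinfo_text)
    return bool(per_cpu) and all(flag in flags for flags in per_cpu)


# Trimmed sample based on the two-core E8450 output above.
SAMPLE = """\
processor : 0
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
processor : 1
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
"""

print(all_cores_have(SAMPLE, "aes"))
```

On a device with Python available, the same function could be fed the contents of /proc/cpuinfo instead of the sample string.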
Pinky promise, I won't
Jokes aside, thanks for bringing this device and the mt7622bv cpu to my attention. Certainly interesting device.
Another interesting observation is that there are a large number of ARMv8-based Broadcom BCM4906/8 routers on the market, which have both AES support and BCM4365/66 WiFi chipsets, which are supported by the brcmfmac driver. In theory, they should all be supportable by OpenWrt. In practice, I do not see a single one currently supported by OpenWrt! Am I just missing them in the TOH?
If not, do you know what is the potential issue in their lack of support? The usual reason for devices not being supported is that developers are not buying it, or they are expensive, or they are niche, or they are not available, or they have no WiFi driver support, etc.
But in this case, these types of routers are available from almost all the major vendors, e.g. Linksys (EA9300, EA9350, EA9400, EA9500, etc), TP-Link (A20, C2300, C4000, C5400, etc), Asus (AC86U, AC2900, AC5300, etc), Netgear (R7900P, R7960P, R8000P, etc), etc. across multiple price points, in multiple markets. They appear to all have upstream kernel support for the ARMv8 CPU as well as upstream WiFi driver support through the brcmfmac driver.
But still zero such devices in TOH of OpenWrt. Are there any technical issues which are constraining support for Broadcom-based ARMv8 based WiFi routers released from 2016 onwards, which may explain this pattern, while on the other hand a Mediatek-based ARMv8 based WiFi router like the E8450/RT3200 that you have linked above, is released in 2020/21 and is chugging along quite well for OpenWrt support?
What is it that I am perhaps missing, which may explain this dichotomy?
Nice! Thanks for sharing, @jiegec.
Since you are one of the few out here who has this router with OpenWrt loaded on it, would you perhaps also be kind enough to run some benchmarks for OpenSSL, if possible with AES and without AES support, and share them? Would appreciate it.
Broadcom is… 'difficult'…
…but there is quite some work in progress (but not finished yet), https://git.openwrt.org/?p=openwrt/openwrt.git;a=tree;f=target/linux/bcm4908/image;hb=HEAD
openssl from opkg install openssl-util:
root@OpenWrt:~# openssl speed aes-256-cbc
Doing aes-256 cbc for 3s on 16 size blocks: 4852594 aes-256 cbc's in 2.97s
Doing aes-256 cbc for 3s on 64 size blocks: 1306710 aes-256 cbc's in 2.99s
Doing aes-256 cbc for 3s on 256 size blocks: 333596 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 1024 size blocks: 83857 aes-256 cbc's in 2.99s
Doing aes-256 cbc for 3s on 8192 size blocks: 10495 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 16384 size blocks: 5249 aes-256 cbc's in 3.00s
OpenSSL 1.1.1j 16 Feb 2021
built on: Tue Mar 16 11:27:55 2021 UTC
options:bn(64,64) rc4(char) des(int) aes(partial) blowfish(ptr)
compiler: aarch64-openwrt-linux-musl-gcc -fPIC -pthread -Wa,--noexecstack -Wall -O3 -Os -pipe -mcpu=cortex-a53 -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -Wformat -Werror=format-security -fstack-protector -D_FORTIFY_SOURCE=1 -Wl,-z,now -Wl,-z,relro -fPIC -ffunction-sections -fdata-sections -znow -zrelro -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DVPAES_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG -DOPENSSL_SMALL_FOOTPRINT
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-256 cbc 26141.92k 27969.71k 28466.86k 28718.92k 28658.35k 28666.54k
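For anyone wondering how the summary row relates to the "Doing ..." lines: each figure is simply block size times operation count, divided by the measured time, in thousands of bytes per second. A small sketch of that arithmetic follows; the regular expression and function name are my own and only assume the line format shown above.

```python
import re

# Matches e.g. "... on 16 size blocks: 4852594 aes-256 cbc's in 2.97s"
DOING = re.compile(r"on (\d+) size blocks: (\d+) .* in ([\d.]+)s")


def throughput_k(line):
    """Return throughput in 1000s of bytes per second for one 'Doing' line."""
    match = DOING.search(line)
    if match is None:
        raise ValueError("not an openssl speed 'Doing' line: %r" % line)
    block_size = int(match.group(1))
    count = int(match.group(2))
    seconds = float(match.group(3))
    return block_size * count / seconds / 1000.0


LINE_16 = "Doing aes-256 cbc for 3s on 16 size blocks: 4852594 aes-256 cbc's in 2.97s"
LINE_16384 = "Doing aes-256 cbc for 3s on 16384 size blocks: 5249 aes-256 cbc's in 3.00s"

print("%.2fk" % throughput_k(LINE_16))
print("%.2fk" % throughput_k(LINE_16384))
```

The two printed values line up with the first and last columns of the summary row, so the table carries no information beyond the raw counts.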
Manually compiled from another arm64 machine (w/ VPAES_ASM):
root@OpenWrt:~# ./openssl-static speed aes-256-cbc
Doing aes-256 cbc for 3s on 16 size blocks: 6532145 aes-256 cbc's in 2.98s
Doing aes-256 cbc for 3s on 64 size blocks: 1740426 aes-256 cbc's in 2.99s
Doing aes-256 cbc for 3s on 256 size blocks: 445092 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 1024 size blocks: 111835 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 8192 size blocks: 14007 aes-256 cbc's in 3.00s
Doing aes-256 cbc for 3s on 16384 size blocks: 7000 aes-256 cbc's in 2.99s
OpenSSL 1.1.1j 16 Feb 2021
built on: Thu Mar 18 08:16:31 2021 UTC
options:bn(64,64) rc4(char) des(int) aes(partial) idea(int) blowfish(ptr)
compiler: gcc -pthread -Wa,--noexecstack -Wall -O3 -DOPENSSL_USE_NODELETE -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DVPAES_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-256 cbc 35071.92k 37253.27k 37981.18k 38173.01k 38248.45k 38357.19k
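To put the two runs side by side: the compiler lines differ mainly in that the OpenWrt package adds -Os and -DOPENSSL_SMALL_FOOTPRINT, while the manual build uses plain -O3; both list VPAES_ASM. The sketch below just computes the per-block-size ratio from the two summary rows. Attributing the difference to those flags is my assumption, not something measured here.

```python
import re


def parse_row(row):
    """Extract the per-block-size figures (1000s of bytes/s) from a summary row."""
    return [float(x) for x in re.findall(r"(\d+(?:\.\d+)?)k\b", row)]


def ratios(new_row, old_row):
    """Element-wise speedup of new_row over old_row."""
    new, old = parse_row(new_row), parse_row(old_row)
    if len(new) != len(old):
        raise ValueError("rows cover a different number of block sizes")
    return [n / o for n, o in zip(new, old)]


OPENWRT_BUILD = "aes-256 cbc 26141.92k 27969.71k 28466.86k 28718.92k 28658.35k 28666.54k"
MANUAL_BUILD = "aes-256 cbc 35071.92k 37253.27k 37981.18k 38173.01k 38248.45k 38357.19k"

for ratio in ratios(MANUAL_BUILD, OPENWRT_BUILD):
    print("%.2fx" % ratio)
```

The ratio stays at roughly a third faster across all block sizes, which is a modest, uniform gain rather than the jump one would expect from switching a hardware path on or off.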
That's less than 2% of the throughput of my Apple M1 MacBook Air, by the way.
Thanks for the prompt feedback.
I am not an expert, however the results look positively anemic. Not sure if AES instruction set is being used.
How does one verify that AES instruction set is indeed being used by OpenSSL? Would 'openssl engine -t -c' reveal it, as shown in the OpenWrt documentation for crypto engines?
Just for comparison (Xiaomi Mi AIoT Router AX3600, ipq8071a, 4*1.38GHz, cortex a53/ ARMv8, with the rather unusable and unoptimized OEM firmware):
# cat /proc/cpuinfo
processor : 0
BogoMIPS : 38.40
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4
processor : 1
BogoMIPS : 38.40
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4
processor : 2
BogoMIPS : 38.40
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4
processor : 3
BogoMIPS : 38.40
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 4
# cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq
1017600
# cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq
1382400
# openssl engine -t -c
(dynamic) Dynamic engine loading support
[ unavailable ]
# openssl speed aes-256-cbc
Doing aes-256 cbc for 3s on 16 size blocks: 4630455 aes-256 cbc's in 2.86s
Doing aes-256 cbc for 3s on 64 size blocks: 1313578 aes-256 cbc's in 2.73s
Doing aes-256 cbc for 3s on 256 size blocks: 332944 aes-256 cbc's in 2.59s
Doing aes-256 cbc for 3s on 1024 size blocks: 83464 aes-256 cbc's in 2.70s
Doing aes-256 cbc for 3s on 8192 size blocks: 10427 aes-256 cbc's in 2.75s
OpenSSL 1.0.2q 20 Nov 2018
built on: reproducible build, date unspecified
options:bn(64,64) rc4(ptr,char) des(idx,cisc,2,int) aes(partial) blowfish(ptr)
compiler: aarch64-openwrt-linux-gcc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -I/home/jenkins/romdaily_new_openwrt/system/staging_dir/target-aarch64-openwrt-linux_musl/usr/include -I/home/jenkins/romdaily_new_openwrt/system/staging_dir/target-aarch64-openwrt-linux_musl/include -I/home/jenkins/Xiaoqiangtoolchain/toolchain/external_toolchain/toolchain-aarch64_cortex-a53_gcc-5.5.0_musl//usr/include -I/home/jenkins/Xiaoqiangtoolchain/toolchain/external_toolchain/toolchain-aarch64_cortex-a53_gcc-5.5.0_musl//include -specs=/home/jenkins/romdaily_new_openwrt/system/include/hardened-ld-pie.specs -znow -zrelro -DOPENSSL_SMALL_FOOTPRINT -DMIWIFI_FEATURE -DHAVE_CRYPTODEV -DOPENSSL_NO_ERR -DTERMIOS -Os -pipe -march=armv8-a -mcpu=cortex-a53+crypto -fno-caller-saves -Wformat -fpic -fstack-protector -D_FORTIFY_SOURCE=2 -Wl,-z,now -Wl,-z,relro -fpic -I/home/jenkins/romdaily_new_openwrt/system/package/libs/openssl/include -ffunction-sections -fdata-sections -fomit-frame-pointer -Wall -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes
aes-256 cbc 25904.64k 30794.50k 32908.75k 31654.49k 31061.09k
ipq8074a would be clocked at 4*2.2 GHz, cortex a53/ ARMv8
Here are the results from IPQ8065 (1.7 GHz, 2 cores), with no NSS support and no AES, since it is 32-bit ARMv7-A:
openssl speed -evp aes-256-cbc
Doing aes-256-cbc for 3s on 16 size blocks: 6794589 aes-256-cbc's in 2.95s
Doing aes-256-cbc for 3s on 64 size blocks: 2381327 aes-256-cbc's in 2.99s
Doing aes-256-cbc for 3s on 256 size blocks: 634833 aes-256-cbc's in 2.97s
Doing aes-256-cbc for 3s on 1024 size blocks: 161890 aes-256-cbc's in 2.99s
Doing aes-256-cbc for 3s on 8192 size blocks: 20269 aes-256-cbc's in 2.98s
Doing aes-256-cbc for 3s on 16384 size blocks: 10131 aes-256-cbc's in 3.00s
OpenSSL 1.1.1i 8 Dec 2020
built on: Fri Jan 22 23:53:44 2021 UTC
options:bn(64,32) rc4(char) des(long) aes(partial) blowfish(ptr)
compiler: ccache_cc -fPIC -pthread -Wa,--noexecstack -Wall -O3 -Os -pipe -fno-caller-saves -fno-plt -fhonour-copts -Wno-error=unused-but-set-variable -Wno-error=unused-result -mfloat-abi=hard -Wformat -Werror=format-security -fstack-protector -D_FORTIFY_SOURCE=1 -Wl,-z,now -Wl,-z,relro -fpic -ffunction-sections -fdata-sections -znow -zrelro -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG -DOPENSSL_PREFER_CHACHA_OVER_GCM -DOPENSSL_SMALL_FOOTPRINT
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-256-cbc 36852.01k 50971.55k 54719.61k 55443.26k 55719.34k 55328.77k
cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq
384000
cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq
1725000
cat /proc/cpuinfo
processor : 0
model name : ARMv7 Processor rev 0 (v7l)
BogoMIPS : 12.50
Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32
CPU implementer : 0x51
CPU architecture: 7
CPU variant : 0x2
CPU part : 0x04d
CPU revision : 0
processor : 1
model name : ARMv7 Processor rev 0 (v7l)
BogoMIPS : 26.04
Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32
CPU implementer : 0x51
CPU architecture: 7
CPU variant : 0x2
CPU part : 0x04d
CPU revision : 0
Based on the above, I am not confident that the previous benchmarks from the E8450 and the Xiaomi AX3600 are using AES.
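One rough way to sanity-check that suspicion is to convert the figures into clock cycles per byte. The sketch below assumes the benchmark ran on a single core at the cpuinfo_max_freq value (which the kernel reports in kHz); neither assumption is confirmed in this thread, and the E8450 is left out because its clock was not posted. A result in the tens of cycles per byte would normally point to a software code path rather than dedicated AES instructions. One possible explanation, not verified here: the E8450 and AX3600 runs were invoked without -evp, and in OpenSSL 1.1.1 and earlier that form of openssl speed times the low-level AES routines rather than the EVP layer, which is where accelerated implementations get selected.

```python
def cycles_per_byte(max_freq_khz, throughput_k):
    """Clock cycles spent per byte, assuming one core running at max_freq_khz.

    max_freq_khz: value read from cpuinfo_max_freq (kHz).
    throughput_k: openssl speed figure, in 1000s of bytes per second.
    """
    if throughput_k <= 0:
        raise ValueError("throughput must be positive")
    return (max_freq_khz * 1000.0) / (throughput_k * 1000.0)


# 8192-byte column and cpuinfo_max_freq values taken from the posts above.
RUNS = {
    "AX3600 (ipq8071a, no -evp)": (1382400, 31061.09),
    "IPQ8065 (ARMv7, -evp)": (1725000, 55719.34),
}

for name, (freq_khz, speed_k) in RUNS.items():
    print("%s: %.1f cycles/byte" % (name, cycles_per_byte(freq_khz, speed_k)))
```

Re-running the ARMv8 benchmarks as openssl speed -evp aes-256-cbc would show directly whether the numbers change.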
Yes, I am recommending that you go for something with AES-NI or an equivalent; that would be either x86_64 or ARMv8. The penalty for shunting the data over the bus to the crypto silicon makes dedicated crypto hardware perform worse than AES-NI for small, synchronous operations.
Correct. It's also a pita to integrate - it required a lot of work to port the code, including a large number of patches, most of it to kernel code, along with crypto drivers, contiguous memory drivers and others, as well as a port of a patched openssl version designed to work with the hardware.
As @slh pointed out, actually using it is also non-trivial and the configuration of the hardware itself, while complicated, is the least of the issues.
Using Intel QuickAssist requires a patched asynchronous version of openssl, which was also a gigantic pita to compile and get working (for some reason Intel likes to write software designed for embedded systems that simply cannot be cross-compiled; pretty bizarre and yes, I pointed this out to the Intel folk responsible for maintaining the software). Using QuickAssist in nginx requires significant patches to nginx as well. It's not an "out of the box" experience by any means.
If you're curious to look at what it takes to get hardware like this working, the code is here
For typical OpenWrt synchronous workloads on smallish buffers (something like OpenVPN), performance using the crypto hardware on AES-CBC was about 70% - 80% of the performance of AES-NI. For larger buffers, the performance started to approach parity. For multiple (36+ threads) asynchronous operations, the speed was about 10x as fast as you'd get using AES-NI.
It would be a real hassle to give you benchmarks, as I compiled the AES acceleration out of the Intel QuickAssist drivers. I'd need to recompile a half dozen kernel modules to be able to get you a benchmark.
The performance on RSA is very good, particularly signing operations, which perform much better than software regardless of whether it's synchronous or not.
On-core AES-NI, definitely; no doubt in my mind.
@slh and @jiegec, can you run "cat /proc/crypto" and share what is the priority you get under aes section?
For the benchmark I posted above, the priority is 100, which indicates no AES and no crypto engines, as below.
cat /proc/crypto
name : aes
driver : aes-generic
module : kernel
**priority : 100**
refcnt : 7
selftest : passed
internal : no
type : cipher
blocksize : 16
min keysize : 16
max keysize : 32
If the AES instruction set were being used on your router, one would expect the priority to be > 100.
ax3600/ ipq8071a:
# cat /proc/crypto
name : hmac(sha512)
driver : nss-hmac-sha512
module : qca_nss_cfi_cryptoapi
priority : 1000
refcnt : 1
selftest : passed
internal : no
type : ahash
async : yes
blocksize : 128
digestsize : 64
name : hmac(sha384)
driver : nss-hmac-sha384
module : qca_nss_cfi_cryptoapi
priority : 1000
refcnt : 1
selftest : passed
internal : no
type : ahash
async : yes
blocksize : 128
digestsize : 48
name : hmac(sha256)
driver : nss-hmac-sha256
module : qca_nss_cfi_cryptoapi
priority : 1000
refcnt : 1
selftest : passed
internal : no
type : ahash
async : yes
blocksize : 64
digestsize : 32
name : hmac(sha1)
driver : nss-hmac-sha1
module : qca_nss_cfi_cryptoapi
priority : 1000
refcnt : 1
selftest : passed
internal : no
type : ahash
async : yes
blocksize : 64
digestsize : 20
name : hmac(md5)
driver : nss-hmac-md5
module : qca_nss_cfi_cryptoapi
priority : 1000
refcnt : 1
selftest : passed
internal : no
type : ahash
async : yes
blocksize : 64
digestsize : 16
name : sha512
driver : nss-sha512
module : qca_nss_cfi_cryptoapi
priority : 1000
refcnt : 1
selftest : passed
internal : no
type : ahash
async : yes
blocksize : 128
digestsize : 64
name : sha384
driver : nss-sha384
module : qca_nss_cfi_cryptoapi
priority : 1000
refcnt : 1
selftest : passed
internal : no
type : ahash
async : yes
blocksize : 128
digestsize : 48
name : sha256
driver : nss-sha256
module : qca_nss_cfi_cryptoapi
priority : 1000
refcnt : 1
selftest : passed
internal : no
type : ahash
async : yes
blocksize : 64
digestsize : 32
name : sha224
driver : nss-sha224
module : qca_nss_cfi_cryptoapi
priority : 1000
refcnt : 1
selftest : passed
internal : no
type : ahash
async : yes
blocksize : 64
digestsize : 28
name : sha1
driver : nss-sha1
module : qca_nss_cfi_cryptoapi
priority : 1000
refcnt : 1
selftest : passed
internal : no
type : ahash
async : yes
blocksize : 64
digestsize : 20
name : md5
driver : nss-md5
module : qca_nss_cfi_cryptoapi
priority : 1000
refcnt : 1
selftest : passed
internal : no
type : ahash
async : yes
blocksize : 64
digestsize : 16
name : gcm(aes)
driver : nss-gcm
module : qca_nss_cfi_cryptoapi
priority : 10000
refcnt : 1
selftest : passed
internal : no
type : aead
async : yes
blocksize : 16
ivsize : 12
maxauthsize : 16
geniv : <none>
name : seqiv(rfc4106(gcm(aes)))
driver : nss-rfc4106-gcm
module : qca_nss_cfi_cryptoapi
priority : 10000
refcnt : 1
selftest : passed
internal : no
type : aead
async : yes
blocksize : 16
ivsize : 8
maxauthsize : 16
geniv : <none>
name : rfc4106(gcm(aes))
driver : nss-rfc4106-gcm
module : qca_nss_cfi_cryptoapi
priority : 10000
refcnt : 1
selftest : passed
internal : no
type : aead
async : yes
blocksize : 16
ivsize : 8
maxauthsize : 16
geniv : <none>
name : authenc(hmac(sha256),cbc(des3_ede))
driver : nss-hmac-sha256-cbc-3des
module : qca_nss_cfi_cryptoapi
priority : 300
refcnt : 1
selftest : passed
internal : no
type : aead
async : yes
blocksize : 8
ivsize : 8
maxauthsize : 32
geniv : <none>
name : authenc(hmac(sha1),cbc(des3_ede))
driver : nss-hmac-sha1-cbc-3des
module : qca_nss_cfi_cryptoapi
priority : 300
refcnt : 1
selftest : passed
internal : no
type : aead
async : yes
blocksize : 8
ivsize : 8
maxauthsize : 20
geniv : <none>
name : authenc(hmac(sha256),cbc(aes))
driver : nss-hmac-sha256-cbc-aes
module : qca_nss_cfi_cryptoapi
priority : 10000
refcnt : 1
selftest : passed
internal : no
type : aead
async : yes
blocksize : 16
ivsize : 16
maxauthsize : 32
geniv : <none>
name : authenc(hmac(sha1),cbc(aes))
driver : nss-hmac-sha1-cbc-aes
module : qca_nss_cfi_cryptoapi
priority : 10000
refcnt : 1
selftest : passed
internal : no
type : aead
async : yes
blocksize : 16
ivsize : 16
maxauthsize : 20
geniv : <none>
name : echainiv(authenc(hmac(sha256),cbc(des3_ede)))
driver : nss-hmac-sha256-cbc-3des
module : qca_nss_cfi_cryptoapi
priority : 300
refcnt : 1
selftest : passed
internal : no
type : aead
async : yes
blocksize : 8
ivsize : 8
maxauthsize : 32
geniv : <none>
name : echainiv(authenc(hmac(sha1),cbc(des3_ede)))
driver : nss-hmac-sha1-cbc-3des
module : qca_nss_cfi_cryptoapi
priority : 300
refcnt : 1
selftest : passed
internal : no
type : aead
async : yes
blocksize : 8
ivsize : 8
maxauthsize : 20
geniv : <none>
name : echainiv(authenc(hmac(md5),cbc(des3_ede)))
driver : nss-hmac-md5-cbc-3des
module : qca_nss_cfi_cryptoapi
priority : 300
refcnt : 1
selftest : passed
internal : no
type : aead
async : yes
blocksize : 8
ivsize : 8
maxauthsize : 16
geniv : <none>
name : seqiv(authenc(hmac(sha256),rfc3686(ctr(aes))))
driver : nss-hmac-sha256-rfc3686-ctr-aes
module : qca_nss_cfi_cryptoapi
priority : 10000
refcnt : 1
selftest : passed
internal : no
type : aead
async : yes
blocksize : 16
ivsize : 8
maxauthsize : 32
geniv : <none>
name : echainiv(authenc(hmac(sha256),cbc(aes)))
driver : nss-hmac-sha256-cbc-aes
module : qca_nss_cfi_cryptoapi
priority : 10000
refcnt : 1
selftest : passed
internal : no
type : aead
async : yes
blocksize : 16
ivsize : 16
maxauthsize : 32
geniv : <none>
name : seqiv(authenc(hmac(sha1),rfc3686(ctr(aes))))
driver : nss-hmac-sha1-rfc3686-ctr-aes
module : qca_nss_cfi_cryptoapi
priority : 10000
refcnt : 1
selftest : passed
internal : no
type : aead
async : yes
blocksize : 16
ivsize : 8
maxauthsize : 20
geniv : <none>
name : seqiv(authenc(hmac(md5),rfc3686(ctr(aes))))
driver : nss-hmac-md5-rfc3686-ctr-aes
module : qca_nss_cfi_cryptoapi
priority : 10000
refcnt : 1
selftest : passed
internal : no
type : aead
async : yes
blocksize : 16
ivsize : 8
maxauthsize : 16
geniv : <none>
name : echainiv(authenc(hmac(sha1),cbc(aes)))
driver : nss-hmac-sha1-cbc-aes
module : qca_nss_cfi_cryptoapi
priority : 10000
refcnt : 1
selftest : passed
internal : no
type : aead
async : yes
blocksize : 16
ivsize : 16
maxauthsize : 20
geniv : <none>
name : echainiv(authenc(hmac(md5),cbc(aes)))
driver : nss-hmac-md5-cbc-aes
module : qca_nss_cfi_cryptoapi
priority : 10000
refcnt : 1
selftest : passed
internal : no
type : aead
async : yes
blocksize : 16
ivsize : 16
maxauthsize : 16
geniv : <none>
name : cbc(des3_ede)
driver : nss-cbc-des-ede
module : qca_nss_cfi_cryptoapi
priority : 10000
refcnt : 1
selftest : passed
internal : no
type : ablkcipher
async : yes
blocksize : 8
min keysize : 24
max keysize : 24
ivsize : 8
geniv : <default>
name : ecb(aes)
driver : nss-ecb-aes
module : qca_nss_cfi_cryptoapi
priority : 10000
refcnt : 1
selftest : passed
internal : no
type : ablkcipher
async : yes
blocksize : 16
min keysize : 16
max keysize : 32
ivsize : 0
geniv : <default>
name : rfc3686(ctr(aes))
driver : nss-rfc3686-ctr-aes
module : qca_nss_cfi_cryptoapi
priority : 30000
refcnt : 1
selftest : passed
internal : no
type : ablkcipher
async : yes
blocksize : 16
min keysize : 20
max keysize : 36
ivsize : 8
geniv : seqiv
name : cbc(aes)
driver : nss-cbc-aes
module : qca_nss_cfi_cryptoapi
priority : 10000
refcnt : 1
selftest : passed
internal : no
type : ablkcipher
async : yes
blocksize : 16
min keysize : 16
max keysize : 32
ivsize : 16
geniv : <default>
name : hmac(sha512)
driver : hmac(sha512-generic)
module : kernel
priority : 0
refcnt : 1
selftest : passed
internal : no
type : shash
blocksize : 128
digestsize : 64
name : hmac(sha384)
driver : hmac(sha384-generic)
module : kernel
priority : 0
refcnt : 1
selftest : passed
internal : no
type : shash
blocksize : 128
digestsize : 48
name : hmac(sha256)
driver : hmac(sha256-generic)
module : kernel
priority : 0
refcnt : 1
selftest : passed
internal : no
type : shash
blocksize : 64
digestsize : 32
name : cbc(cipher_null)
driver : cbc(cipher_null-generic)
module : cbc
priority : 0
refcnt : 1
selftest : passed
internal : no
type : blkcipher
blocksize : 1
min keysize : 0
max keysize : 0
ivsize : 1
geniv : <default>
name : cbc(aes)
driver : cbc(aes-generic)
module : cbc
priority : 100
refcnt : 1
selftest : passed
internal : no
type : blkcipher
blocksize : 16
min keysize : 16
max keysize : 32
ivsize : 16
geniv : <default>
name : hmac(sha1)
driver : hmac(sha1-generic)
module : kernel
priority : 0
refcnt : 1
selftest : passed
internal : no
type : shash
blocksize : 64
digestsize : 20
name : sha1
driver : sha1-generic
module : sha1_generic
priority : 0
refcnt : 1
selftest : passed
internal : no
type : shash
blocksize : 64
digestsize : 20
name : hmac(md5)
driver : hmac(md5-generic)
module : kernel
priority : 0
refcnt : 1
selftest : passed
internal : no
type : shash
blocksize : 64
digestsize : 16
name : cbc(des3_ede)
driver : cbc(des3_ede-generic)
module : cbc
priority : 100
refcnt : 1
selftest : passed
internal : no
type : blkcipher
blocksize : 8
min keysize : 24
max keysize : 24
ivsize : 8
geniv : <default>
name : cbc(des)
driver : cbc(des-generic)
module : cbc
priority : 100
refcnt : 1
selftest : passed
internal : no
type : blkcipher
blocksize : 8
min keysize : 8
max keysize : 8
ivsize : 8
geniv : <default>
name : md5
driver : md5-generic
module : md5
priority : 0
refcnt : 1
selftest : passed
internal : no
type : shash
blocksize : 64
digestsize : 16
name : des3_ede
driver : des3_ede-generic
module : des_generic
priority : 100
refcnt : 1
selftest : passed
internal : no
type : cipher
blocksize : 8
min keysize : 24
max keysize : 24
name : des
driver : des-generic
module : des_generic
priority : 100
refcnt : 1
selftest : passed
internal : no
type : cipher
blocksize : 8
min keysize : 8
max keysize : 8
name : ghash
driver : ghash-generic
module : kernel
priority : 100
refcnt : 1
selftest : passed
internal : no
type : shash
blocksize : 16
digestsize : 16
name : jitterentropy_rng
driver : jitterentropy_rng
module : kernel
priority : 100
refcnt : 1
selftest : passed
internal : no
type : rng
seedsize : 0
name : stdrng
driver : drbg_nopr_hmac_sha256
module : kernel
priority : 207
refcnt : 1
selftest : passed
internal : no
type : rng
seedsize : 0
name : stdrng
driver : drbg_nopr_hmac_sha512
module : kernel
priority : 206
refcnt : 1
selftest : passed
internal : no
type : rng
seedsize : 0
name : stdrng
driver : drbg_nopr_hmac_sha384
module : kernel
priority : 205
refcnt : 1
selftest : passed
internal : no
type : rng
seedsize : 0
name : stdrng
driver : drbg_nopr_hmac_sha1
module : kernel
priority : 204
refcnt : 1
selftest : passed
internal : no
type : rng
seedsize : 0
name : stdrng
driver : drbg_pr_hmac_sha256
module : kernel
priority : 203
refcnt : 1
selftest : passed
internal : no
type : rng
seedsize : 0
name : stdrng
driver : drbg_pr_hmac_sha512
module : kernel
priority : 202
refcnt : 1
selftest : passed
internal : no
type : rng
seedsize : 0
name : stdrng
driver : drbg_pr_hmac_sha384
module : kernel
priority : 201
refcnt : 1
selftest : passed
internal : no
type : rng
seedsize : 0
name : stdrng
driver : drbg_pr_hmac_sha1
module : kernel
priority : 200
refcnt : 1
selftest : passed
internal : no
type : rng
seedsize : 0
name : xz
driver : xz-generic
module : kernel
priority : 0
refcnt : 2
selftest : passed
internal : no
type : compression
name : lzo
driver : lzo-generic
module : kernel
priority : 0
refcnt : 2
selftest : passed
internal : no
type : compression
name : crc32c
driver : crc32c-generic
module : kernel
priority : 100
refcnt : 2
selftest : passed
internal : no
type : shash
blocksize : 1
digestsize : 4
name : deflate
driver : deflate-generic
module : kernel
priority : 0
refcnt : 2
selftest : passed
internal : no
type : compression
name : ecb(arc4)
driver : ecb(arc4)-generic
module : kernel
priority : 100
refcnt : 1
selftest : passed
internal : no
type : blkcipher
blocksize : 1
min keysize : 1
max keysize : 256
ivsize : 0
geniv : <default>
name : arc4
driver : arc4-generic
module : kernel
priority : 0
refcnt : 1
selftest : passed
internal : no
type : cipher
blocksize : 1
min keysize : 1
max keysize : 256
name : aes
driver : aes-generic
module : kernel
priority : 100
refcnt : 2
selftest : passed
internal : no
type : cipher
blocksize : 16
min keysize : 16
max keysize : 32
name : sha384
driver : sha384-generic
module : kernel
priority : 0
refcnt : 1
selftest : passed
internal : no
type : shash
blocksize : 128
digestsize : 48
name : sha512
driver : sha512-generic
module : kernel
priority : 0
refcnt : 1
selftest : passed
internal : no
type : shash
blocksize : 128
digestsize : 64
name : sha224
driver : sha224-generic
module : kernel
priority : 0
refcnt : 1
selftest : passed
internal : no
type : shash
blocksize : 64
digestsize : 28
name : sha256
driver : sha256-generic
module : kernel
priority : 0
refcnt : 1
selftest : passed
internal : no
type : shash
blocksize : 64
digestsize : 32
name : digest_null
driver : digest_null-generic
module : kernel
priority : 0
refcnt : 1
selftest : passed
internal : no
type : shash
blocksize : 1
digestsize : 0
name : compress_null
driver : compress_null-generic
module : kernel
priority : 0
refcnt : 1
selftest : passed
internal : no
type : compression
name : ecb(cipher_null)
driver : ecb-cipher_null
module : kernel
priority : 100
refcnt : 1
selftest : passed
internal : no
type : blkcipher
blocksize : 1
min keysize : 0
max keysize : 0
ivsize : 0
geniv : <default>
name : cipher_null
driver : cipher_null-generic
module : kernel
priority : 0
refcnt : 1
selftest : passed
internal : no
type : cipher
blocksize : 1
min keysize : 0
max keysize : 0
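A listing this long is easier to read once it is reduced to "which driver wins for a given algorithm name". The kernel picks the registered implementation with the highest priority, so a small sketch like the one below (function names are made up, the sample holds just two entries copied from the listing) can answer that. Keep in mind that /proc/crypto describes the kernel crypto API only; a user-space OpenSSL reaches these drivers only through an engine such as devcrypto or afalg, while CPU AES instructions are used by OpenSSL directly and never show up here.

```python
def parse_proc_crypto(text):
    """Split /proc/crypto-style text into a list of dicts, one per entry.

    A new entry starts at every 'name' line, so this works whether or not
    the blank separator lines survived copy and paste.
    """
    entries, current = [], {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "name" and current:
            entries.append(current)
            current = {}
        current[key] = value
    if current:
        entries.append(current)
    return entries


def preferred_driver(entries, name):
    """Return (driver, priority) of the highest-priority entry for name, or None."""
    candidates = [e for e in entries if e.get("name") == name and "priority" in e]
    if not candidates:
        return None
    best = max(candidates, key=lambda e: int(e["priority"]))
    return best["driver"], int(best["priority"])


SAMPLE = """\
name : cbc(aes)
driver : nss-cbc-aes
module : qca_nss_cfi_cryptoapi
priority : 10000
type : ablkcipher
name : cbc(aes)
driver : cbc(aes-generic)
module : cbc
priority : 100
type : blkcipher
"""

print(preferred_driver(parse_proc_crypto(SAMPLE), "cbc(aes)"))
```

For the ax3600 listing this selects the NSS driver for cbc(aes), which says the kernel would offload that algorithm; it says nothing about what the openssl speed runs earlier in the thread exercised.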
There are some benchmarks in this thread
Those benchmarks are for the C2000 SoCs. I updated the code for the C3000 SoCs, which perform better than the benchmarks in that thread
Thanks a ton for sharing your detailed insights as well as the rough benchmarks above! You have certainly enlightened me today. Very kind of you. Appreciate it.
Of course, please ignore. I was just hoping to get whatever you had off the top of your mind, which you have already done above.
Thanks for confirming it.
Do you think you might have any insight on how to measure these benchmarks of on-core AES-NI performance impact on openssl? If you'll see in the thread above, I am struggling a bit to do so, since the benchmarks with AES-NI support seem to be poorer than those without AES-NI. It feels to me that either we have a measurement issue, or AES is somehow not being invoked. It is not clear how to figure it out.
openssl speed -elapsed -evp aes-128-cbc-hmac-sha1
Or, with AES-NI enabled:
openssl speed -elapsed -evp aes-128-cbc
With AES-NI disabled:
OPENSSL_ia32cap="~0x200000200000000" openssl speed -elapsed -evp aes-128-cbc
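In case the magic number in the last command looks opaque: OPENSSL_ia32cap is a bit mask over the x86 capability vector, and a leading ~ tells OpenSSL to clear the listed bits instead of setting them. The sketch below just decodes which bits that constant names; the two labels follow the OpenSSL documentation for this variable, and everything else (names, structure) is illustrative. This variable is x86-only; on ARM the analogous knob is OPENSSL_armcap, whose bit layout is different and not covered here.

```python
MASK = 0x200000200000000

# Bit meanings as given in the OPENSSL_ia32cap documentation.
KNOWN_BITS = {33: "PCLMULQDQ", 57: "AES-NI"}


def set_bits(value):
    """Return the positions of all 1-bits in a 64-bit value, lowest first."""
    return [bit for bit in range(64) if (value >> bit) & 1]


def describe(mask):
    """Name each set bit, falling back to 'bit N' for ones not listed above."""
    return [KNOWN_BITS.get(bit, "bit %d" % bit) for bit in set_bits(mask)]


print(set_bits(MASK))
print(describe(MASK))
```

So the "disabled" run masks out both AES-NI and the carry-less multiply used for GCM, which is why the same constant is commonly used for before/after comparisons on x86.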