I've been experiencing crashes with openssl and openvpn under certain circumstances. This affects the R7800 (using master) but not another device running openwrt (a lantiq/mips based router, also on master). I have recompiled master with the only modification being to enable openssl-util and libopenssl, and the issue remains. hnyman's build also exhibits this issue on my unit.
It can be reproduced using "openssl speed dsa", which kills the session, whereas "openssl speed rsa" fails in a different way. OpenVPN is unpredictable, occasionally crashing with "bus error", though more often it simply reboots the router.
root@LEDE:~# openssl speed dsa
Doing 512 bit sign dsa's for 10s: 30303 512 bit DSA signs in 9.79s
Doing 512 bit verify dsa's for 10s: 35511 512 bit DSA verify in 9.88s
Doing 1024 bit sign dsa's for 10s: 15462 1024 bit DSA signs in 9.84s
Doing 1024 bit verify dsa's for 10s: 16674 1024 bit DSA verify in 9.89s
Doing 2048 bit sign dsa's for 10s: packet_write_wait: Connection to 172.18.0.1 port 22: Broken pipe
root@LEDE:~# openssl speed rsa
Doing 512 bit private rsa's for 10s: 24375 512 bit private RSA's in 9.48s
Doing 512 bit public rsa's for 10s: 291839 512 bit public RSA's in 9.45s
Doing 1024 bit private rsa's for 10s: 6554 1024 bit private RSA's in 9.42s
Doing 1024 bit public rsa's for 10s: 157651 1024 bit public RSA's in 9.51s
Doing 2048 bit private rsa's for 10s: 1196 2048 bit private RSA's in 9.55s
Doing 2048 bit public rsa's for 10s: RSA verify failure
3069887736:error:0407006A:lib(4):func(112):reason(106):NA:0:
3069887736:error:04067072:lib(4):func(103):reason(114):NA:0:
1 2048 bit public RSA's in 0.31s
Doing 4096 bit private rsa's for 10s: 221 4096 bit private RSA's in 9.53s
Doing 4096 bit public rsa's for 10s:
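The two error lines in the RSA run show only numbers, presumably because this build leaves out OpenSSL's error strings. As a rough sketch, assuming the OpenSSL 1.0.x packing of error codes (8 bits of library, 12 bits of function, 12 bits of reason), the hex code can be split with shell arithmetic. decode_openssl_err is a made-up helper name, not part of OpenSSL:

```shell
# Split a packed OpenSSL 1.0.x error code (hex, without the 0x prefix)
# into its library, function and reason fields.
# Assumption: 1.0.x layout, i.e. lib << 24 | func << 12 | reason.
decode_openssl_err() {
    code=$((0x$1))
    echo "lib=$(( (code >> 24) & 0xff )) func=$(( (code >> 12) & 0xfff )) reason=$(( code & 0xfff ))"
}
```

Called as "decode_openssl_err 0407006A" it prints lib=4 func=112 reason=106, which matches the lib(4):func(112):reason(106) fields in the output above. If I recall the 1.0.x headers correctly, lib 4 is the RSA library and reasons 106 and 114 are "block type is not 01" and "padding check failed", which would point at the arithmetic producing a wrong result rather than at a bad key.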
When I run it under gdb (which I am not experienced with), an error is always reported in /usr/bin/openssl. The following is from "openssl speed aes" (the rsa/dsa output is the same):
from /opt/r7800/r5645/lede/scripts/../staging_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/root-ipq806x/lib/ld-musl-armhf.so.1
(gdb) c
Continuing.
Program received signal SIGILL, Illegal instruction.
_armv7_tick () at armv4cpuid.S:94
94 mrrc p15,1,r0,r1,c14 @ CNTVCT
(gdb) bt
#0 _armv7_tick () at armv4cpuid.S:94
#1 0xb6e353b0 in OPENSSL_cpuid_setup () at armcap.c:157
#2 0xb6fde374 in do_init_fini (p=0xb6f31a60) at ldso/dynlink.c:1280
#3 0xb6faacd4 in __libc_start_main (main=0x23c6c <main>, argc=3,
argv=0xbefffde4) at src/env/__libc_start_main.c:71
#4 0x000242b0 in _start_c (p=<optimized out>) at crt/crt1.c:17
#5 0x00024284 in _start ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) c
Continuing.
[Inferior 1 (process 6887) exited normally]
(gdb)
Could someone be kind enough to double check either on another R7800 or ARM based unit?
The illegal instruction signal is expected: the OPENSSL_cpuid_setup() function probes the ARM processor in various ways and catches the resulting illegal instruction signals whenever it tries to use functionality the processor does not support.
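Since those SIGILLs are part of normal start-up, gdb can be told to pass them through without stopping, so that only the real fault interrupts the run. A minimal sketch, assuming gdb is started directly on the binary; gdb_ignore_sigill and the GDB override variable are made-up names for illustration:

```shell
# Run a program under gdb without stopping at the expected SIGILL probes.
# "handle SIGILL nostop noprint pass" tells gdb to deliver the signal to
# the program silently; GDB may be set to use a different gdb binary.
gdb_ignore_sigill() {
    "${GDB:-gdb}" -ex 'handle SIGILL nostop noprint pass' -ex run --args "$@"
}
```

For example "gdb_ignore_sigill /usr/bin/openssl speed dsa2048". With a gdbserver setup the same handle line can be typed at the (gdb) prompt before continuing.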
I was able to catch 2 bus errors in gdb, both with the same issue as follows:
Program received signal SIGBUS, Bus error.
bn_mul8x_mont_neon () at armv4-mont.S:407
407 vmlal.u32 q13,d29,d7[1]
(gdb) bt
#0 bn_mul8x_mont_neon () at armv4-mont.S:407
#1 0xb6e5adb4 in BN_mod_mul_montgomery (r=0x256051, a=0xb86ce917, b=0x22adc3,
mont=0xf2f55237, ctx=0x7a00e3e3) at bn_mont.c:140
#2 0x0021e678 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
I recompiled openssl with the "no-asm" option and the issue seems to be mitigated, though unfortunately with a big hit on performance.
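A possible way to check the NEON code path without rebuilding, assuming this OpenSSL build honours the OPENSSL_armcap environment variable (as far as I know, armcap.c in 1.0.2 reads it at the start of OPENSSL_cpuid_setup()): setting it to 0 masks all detected ARM capabilities, so bn_mul8x_mont_neon should not be selected. run_without_armcap is a made-up helper name:

```shell
# Run a command with OpenSSL's ARM capability mask forced to 0.
# Assumption: the installed libcrypto reads OPENSSL_armcap; if it does
# not, the variable is simply ignored and nothing changes.
run_without_armcap() {
    OPENSSL_armcap=0 "$@"
}
```

For example "run_without_armcap openssl speed dsa2048". If the crash disappears with the mask set and returns without it, that would narrow the problem to the capability-gated assembly rather than to all of the asm code.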
I would appreciate it if someone could test on their R7800. Unless I have a peculiar hardware issue, running "openssl speed dsa2048" should cause a crash.
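To make results from different units easier to compare, the tests could be scripted along these lines. This is only a sketch: speed_check, LOG and OPENSSL are made-up names, and the default log path is a placeholder that should sit on persistent storage, since /tmp is RAM-backed on OpenWrt and is lost on reboot.

```shell
# Run each "openssl speed" target in turn and log the outcome.
# LOG and OPENSSL can be overridden; both defaults are placeholders.
speed_check() {
    log="${LOG:-/root/speed_check.log}"
    for target in "$@"; do
        # Record the target and flush to storage before starting, so the
        # log still names the test that was running if the router reboots.
        echo "START $target" >> "$log"
        sync
        if out=$("${OPENSSL:-openssl}" speed "$target" 2>&1); then
            status=0
        else
            status=$?
        fi
        # A run counts as failed on a non-zero exit status or when the
        # output contains "failure" (as in "RSA verify failure").
        if [ "$status" -ne 0 ] || printf '%s\n' "$out" | grep -q 'failure'; then
            echo "FAIL $target (exit status $status)" >> "$log"
        else
            echo "OK $target" >> "$log"
        fi
    done
}
```

For example "speed_check dsa1024 dsa2048 rsa2048". A START line with no matching OK or FAIL line after a reboot indicates the test that took the router down.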
Could you try what I did with "openssl speed dsa" and gdb, if possible, and let me know whether you see the same issue I reported? If you can, please also compile and test openssl with the no-asm option enabled to see whether the issue still occurs; if you cannot, I can link you to my personal build.
Thank you, I will try the command as suggested a little later. I don't want the router to reboot whilst I am working. If it does happen, I will take you up on your build offer.
Your issue seems to be very similar to mine. Here is my custom build with the offending openssl asm code disabled; all other asm is left enabled so as not to impact performance as much. Can you try it and let me know whether the rsa and dsa commands now work? Please back up and familiarise yourself with the TFTP recovery method before installing this build, as it is heavily customised from 17.01.4.