[Solved] OpenSSL crashing on R7800 (openssl speed rsa/dsa)

I've been experiencing crashes with OpenSSL and OpenVPN under certain circumstances. This affects my R7800 (running master) but not another device running OpenWrt (a lantiq/MIPS-based router, also on master). I have recompiled master with the only modification being to enable openssl-util and libopenssl, and the issue remains. hnyman's build also exhibits this issue on my unit.
It can be reproduced using "openssl speed dsa", which drops the SSH connection at the 2048-bit sign test; "openssl speed rsa" fails differently, with an RSA verify failure. OpenVPN is unpredictable: it occasionally crashes with "bus error", though more often it simply reboots the router.

root@LEDE:~# openssl speed dsa
Doing 512 bit sign dsa's for 10s: 30303 512 bit DSA signs in 9.79s
Doing 512 bit verify dsa's for 10s: 35511 512 bit DSA verify in 9.88s
Doing 1024 bit sign dsa's for 10s: 15462 1024 bit DSA signs in 9.84s
Doing 1024 bit verify dsa's for 10s: 16674 1024 bit DSA verify in 9.89s
Doing 2048 bit sign dsa's for 10s: packet_write_wait: Connection to 172.18.0.1 port 22: Broken pipe

root@LEDE:~# openssl speed rsa
Doing 512 bit private rsa's for 10s: 24375 512 bit private RSA's in 9.48s
Doing 512 bit public rsa's for 10s: 291839 512 bit public RSA's in 9.45s
Doing 1024 bit private rsa's for 10s: 6554 1024 bit private RSA's in 9.42s
Doing 1024 bit public rsa's for 10s: 157651 1024 bit public RSA's in 9.51s
Doing 2048 bit private rsa's for 10s: 1196 2048 bit private RSA's in 9.55s
Doing 2048 bit public rsa's for 10s: RSA verify failure
3069887736:error:0407006A:lib(4):func(112):reason(106):NA:0:
3069887736:error:04067072:lib(4):func(103):reason(114):NA:0:
1 2048 bit public RSA's in 0.31s
Doing 4096 bit private rsa's for 10s: 221 4096 bit private RSA's in 9.53s
Doing 4096 bit public rsa's for 10s: 

Under gdb (which I am not experienced with), /usr/bin/openssl always stops with an error. The following is from "openssl speed aes" (the rsa/dsa output is the same):

from /opt/r7800/r5645/lede/scripts/../staging_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/root-ipq806x/lib/ld-musl-armhf.so.1
(gdb) c
Continuing.

Program received signal SIGILL, Illegal instruction.
_armv7_tick () at armv4cpuid.S:94
94		mrrc	p15,1,r0,r1,c14		@ CNTVCT
(gdb) bt
#0  _armv7_tick () at armv4cpuid.S:94
#1  0xb6e353b0 in OPENSSL_cpuid_setup () at armcap.c:157
#2  0xb6fde374 in do_init_fini (p=0xb6f31a60) at ldso/dynlink.c:1280
#3  0xb6faacd4 in __libc_start_main (main=0x23c6c <main>, argc=3, 
    argv=0xbefffde4) at src/env/__libc_start_main.c:71
#4  0x000242b0 in _start_c (p=<optimized out>) at crt/crt1.c:17
#5  0x00024284 in _start ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) c
Continuing.
[Inferior 1 (process 6887) exited normally]
(gdb)

Could someone be kind enough to double check either on another R7800 or ARM based unit?

The illegal instruction signal is expected: the OPENSSL_cpuid_setup() function probes the ARM processor in various ways and catches the resulting illegal instruction signal whenever it tries an instruction the processor does not support.
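
If the SIGILL stops get in the way while hunting the real crash, gdb can be told to pass them straight through to the program. A minimal sketch, assuming gdb and openssl are both available on the router; the helper name and the /tmp path are placeholders I made up:

```shell
# Hypothetical helper: write a gdb command file that lets the
# capability-probe SIGILLs through without stopping or printing.
write_gdb_script() {
    cat > "$1" <<'EOF'
handle SIGILL nostop noprint pass
run
bt
EOF
}

write_gdb_script /tmp/openssl-probe.gdb

# Then run the failing test under gdb with it, for example:
#   gdb -batch -x /tmp/openssl-probe.gdb --args openssl speed dsa2048
```

With the probes out of the way, the first stop gdb reports should be the actual fault rather than the startup check.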

Both errors reported by your speed test can be decoded on a desktop using the openssl errstr command:

$ openssl errstr 0407006A
error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01
$ openssl errstr 04067072
error:04067072:rsa routines:rsa_ossl_public_decrypt:padding check failed

These errors are unusual and might hint at some underlying hardware problem.
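
For what it's worth, the lib/func/reason numbers in the raw speed output are just the bit fields of those hex codes. A small sketch, assuming the packing used by OpenSSL 1.0.x and 1.1.x (8-bit library, 12-bit function, 12-bit reason); decode_err is a made-up helper name:

```shell
# Split a packed OpenSSL error code (hex, without the 0x prefix)
# into its three fields.
# Assumed layout: bits 24-31 library, bits 12-23 function, bits 0-11 reason.
decode_err() {
    code=$((0x$1))
    echo "lib=$(( (code >> 24) & 0xff )) func=$(( (code >> 12) & 0xfff )) reason=$(( code & 0xfff ))"
}

decode_err 0407006A
decode_err 04067072
```

The results agree with the lib(4):func(112):reason(106) and lib(4):func(103):reason(114) fields printed on the router, so the two errstr lookups do refer to the same errors.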

I was able to catch two bus errors in gdb, both at the same location:

Program received signal SIGBUS, Bus error.
bn_mul8x_mont_neon () at armv4-mont.S:407
407 vmlal.u32 q13,d29,d7[1]
(gdb) bt
#0 bn_mul8x_mont_neon () at armv4-mont.S:407
#1 0xb6e5adb4 in BN_mod_mul_montgomery (r=0x256051, a=0xb86ce917, b=0x22adc3,
mont=0xf2f55237, ctx=0x7a00e3e3) at bn_mont.c:140
#2 0x0021e678 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

I recompiled openssl with the "no-asm" option and the issue seems to be mitigated, though unfortunately with a big hit on performance.
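
A cheaper way to test the same theory than rebuilding might be to switch the NEON path off at run time. As far as I know, the ARM capability setup in OpenSSL 1.0.x and 1.1.x honours an OPENSSL_armcap environment variable and skips the probing when it is set, with 0 clearing all capability bits including NEON. Treat that as an assumption and check it against armcap.c in your tree; with_armcap is a hypothetical wrapper name:

```shell
# Hypothetical wrapper: run a command with OPENSSL_armcap set to the
# given value. The variable is only set for that one command.
# Assumption: this OpenSSL build reads OPENSSL_armcap at startup.
with_armcap() {
    cap="$1"
    shift
    OPENSSL_armcap="$cap" "$@"
}

# Usage on the router (not run here, the test takes a while):
#   with_armcap 0 openssl speed dsa2048
```

If the crash disappears with the value 0 and comes back without it, that would point at the NEON Montgomery multiplication path seen in the backtrace rather than at OpenSSL as a whole.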

I would appreciate it if someone could test on their R7800. Unless my unit has a peculiar hardware issue, running "openssl speed dsa2048" should cause a crash.

No issues with openssl speed tests on my C2600 (IPQ8064 vs your IPQ8065) running my own build.

No crash on stable 17.01.

No crash on R7800 running r5740-012d20eebe

Thanks a lot for the help, looks like a replacement is needed.

I am seeing errors with my R7800 when using some openssl commands, for example:

openssl dhparam -out dh2048.pem 2048

It will either segfault gracefully, or in some cases take down the router.

Could you repeat what I did with "openssl speed dsa" and gdb, if possible, and let me know whether you see the same issue I reported? If you can, please also compile and test with the no-asm option enabled in openssl to see whether the issue still occurs. If you cannot build it yourself, I can link you to my personal build.

Thank you, I will try the command as suggested a little later. Don't want the router to reboot whilst I am working. If it does happen I will take you up on your build offer.

It's kind of interesting: 512-bit and 1024-bit sign and verify work, but 2048-bit takes down the router. It fails in the sign phase.

Which version, 17.01.4 or another?

Your issue seems very similar to mine. Here is my custom build with only the offending openssl asm code disabled; all other asm is left enabled to limit the performance impact. Can you try it and let me know whether the rsa and dsa commands now work? Please back up and familiarise yourself with the TFTP recovery method before installing this build, as it is heavily customised from 17.01.4.

I have the same issue with OpenSSL.

The command 'openssl speed dsa2048' crashes my router.

The only version stable enough for me is LEDE Reboot 17.01-SNAPSHOT r3971, where I can install a list of packages without a reboot.

@clayface Do you still have a custom build with the openssl asm code disabled?
