[Solved] [master] mt7621 with kernel 4.14: Dropbear doesn't play ball

I just built an image for my DIR-860L from master with kernel 4.14 which runs fine (thanks @blogic and @nbd). However, Dropbear does not let me in. I first thought it was because I forgot to enable ECC support, so I built a new image, flashed it (have LuCI built-in), but no dice.

SSH with debugging prints the info below, but eventually hangs without any response whatsoever from the router. There appears to be some communication with Dropbear, but it then hangs pretty quickly at "expecting SSH2_MSG_KEX_ECDH_REPLY". Restarting the Dropbear service through LuCI or fiddling with the options (re-enabling/disabling password and root logins etc.) does not seem to elicit any response at all.

For the record: key-only SSH logins to all other clients in my network still work, so this does not look like a bug on the client's end.

OpenSSH_7.4p1 Debian-10+deb9u2, OpenSSL 1.0.2l  25 May 2017
debug1: Reading configuration data /home/anonymous/.ssh/config
debug1: /home/anonymous/.ssh/config line 67: Applying options for zeus
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: Applying options for *
debug2: resolving "zeus" port 22
debug2: ssh_connect_direct: needpriv 0
debug1: Connecting to zeus [10.0.0.1] port 22.
debug1: Connection established.
debug1: identity file /home/anonymous/.ssh/id_rsa-zeus type 1
debug1: key_load_public: No such file or directory
debug1: identity file /home/anonymous/.ssh/id_rsa-zeus-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_7.4p1 Debian-10+deb9u2
debug1: Remote protocol version 2.0, remote software version dropbear
debug1: no match: dropbear
debug2: fd 3 setting O_NONBLOCK
debug1: Authenticating to zeus:22 as 'root'
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug2: local client KEXINIT proposal
debug2: KEX algorithms: curve25519-sha256,curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha256,diffie-hellman-group14-sha1,ext-info-c
debug2: host key algorithms: ssh-rsa-cert-v01@openssh.com,rsa-sha2-512,rsa-sha2-256,ssh-rsa,ecdsa-sha2-nistp256-cert-v01@openssh.com,ecdsa-sha2-nistp384-cert-v01@openssh.com,ecdsa-sha2-nistp521-cert-v01@openssh.com,ssh-ed25519-cert-v01@openssh.com,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,ssh-ed25519
debug2: ciphers ctos: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com,aes128-cbc,aes192-cbc,aes256-cbc
debug2: ciphers stoc: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com,aes128-cbc,aes192-cbc,aes256-cbc
debug2: MACs ctos: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: MACs stoc: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: compression ctos: none,zlib@openssh.com,zlib
debug2: compression stoc: none,zlib@openssh.com,zlib
debug2: languages ctos: 
debug2: languages stoc: 
debug2: first_kex_follows 0 
debug2: reserved 0 
debug2: peer server KEXINIT proposal
debug2: KEX algorithms: curve25519-sha256@libssh.org,ecdh-sha2-nistp521,ecdh-sha2-nistp384,ecdh-sha2-nistp256,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1,kexguess2@matt.ucc.asn.au
debug2: host key algorithms: ssh-rsa
debug2: ciphers ctos: aes128-ctr,aes256-ctr
debug2: ciphers stoc: aes128-ctr,aes256-ctr
debug2: MACs ctos: hmac-sha1,hmac-sha2-256
debug2: MACs stoc: hmac-sha1,hmac-sha2-256
debug2: compression ctos: none
debug2: compression stoc: none
debug2: languages ctos: 
debug2: languages stoc: 
debug2: first_kex_follows 0 
debug2: reserved 0 
debug1: kex: algorithm: curve25519-sha256@libssh.org
debug1: kex: host key algorithm: ssh-rsa
debug1: kex: server->client cipher: aes128-ctr MAC: hmac-sha2-256 compression: none
debug1: kex: client->server cipher: aes128-ctr MAC: hmac-sha2-256 compression: none
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY

Try making your client connect using regular DH and see if it still fails.

Which GCC version are you using? Last I checked, the same problem happens when Dropbear is compiled with version 7. It's fine with versions <= 6.3.

Is that -c cipher_spec? If so, what do I need to specify to use Diffie-Hellman? diffie-hellman-group14-sha1,diffie-hellman-group1-sha1? Because that doesn't seem to work with -c.

I'm on GCC 7.x, but afaik that bug applied to ar71xx? Or is ramips affected as well?

I have encountered the same problem on my builds for ramips, ar71xx and lantiq, so I think it's some sort of mips24kc issue.
Since you have luci, try to remove the installed dropbear package and then install the one from LEDE repo. You should have ssh after a reboot.

Thanks. I suppose 18.03 will use GCC 6 for MIPS, so that problem shouldn't present itself... I have also found @wongsyrone's Dropbear-test repo but it's not clear what exactly makes that work on MIPS where vanilla dropbear breaks.

Seems you need to edit your ssh_config file instead. Look for KexAlgorithms. -c is for encryption.
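
Something along these lines should force classic DH for that one host. This is only a sketch: it assumes the host alias "zeus" from the debug log above, and it picks diffie-hellman-group14-sha1 because that algorithm shows up in both the client and the server KEXINIT proposals.

```
# ~/.ssh/config (sketch; "zeus" is the host alias seen in the debug log)
Host zeus
    # Skip curve25519/ECDH and negotiate plain Diffie-Hellman instead.
    # diffie-hellman-group14-sha1 is offered by both sides in the log above.
    KexAlgorithms diffie-hellman-group14-sha1
```

For a one-off test without touching the config file, the same option can be passed on the command line with -o KexAlgorithms=diffie-hellman-group14-sha1. If the connection then gets past the key exchange, the stall is specific to the ECDH code path.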

Well it looks like it's a GCC bug with -Os and not a Dropbear issue.

Edit: I just built with -O2 and Dropbear is responsive again. That feels pretty conclusive.

@neheb: I added KexAlgorithms to my ~/.ssh/config prior to flashing a recompiled image, but that didn't make any difference. Thanks for the tips though.

I would rather say it's a libtom* bug than a GCC bug, since it works well after upgrading the libs.

You can check the commit in my dropbear-test repo.

@wongsyrone I just rebuilt with -Os and the GCC patch that Felix pushed to master. Dropbear is working again. Maybe the newer lib provided some (unintended) workaround, but it does look like it was a broader problem (that looks like it's fixed now).

I'm using GCC 7.3. I've seen reports about ar71xx being affected by such issues, but not ramips... I'll try and see what -O2 does instead of -Os.

I also saw this commit in Matthias' git tree: https://git.openwrt.org/?p=openwrt/staging/mkresin.git;a=commit;h=849306d0308e3439bcb67531e613682b0ed8aef9
I have googled the KEX stall and it seems on some systems the MTU setting might have something to do with that. So I'm going to check that as well.