WireGuard link between two OpenWrt routers gets stuck when the client tries to reconnect

Hello,
I have 3 locations:
1 - WG "server" with a real (public) IP. Router model: Beeline SmartBox GIGA, architecture: MediaTek MT7621 ver:1 eco:3,
firmware version: OpenWrt SNAPSHOT r16435-f2c8c62d98
2 - WG "client" with no real IP, mobile link via modem. TP-Link TL-WR1043N/ND v1, architecture: Atheros AR9132 rev 2, firmware version: OpenWrt 18.06.9 r8077-7cbbab7246 / LuCI openwrt-18.06 branch (git-20.319.49209-ab22243)
3 - another WG "client" with no real IP, mobile link via modem. Keenetic KN-1210 with the latest stock firmware.

I have a problem when client #2 drops its mobile connection (which is not very stable) and tries to re-establish the WG tunnel to the server. It seems that the server just ignores its packets, while client #3 doesn't suffer from this issue.
A capture on the "server" shows only client #2's attempts to initiate the tunnel, with no response:

root@server-router:~# tcpdump -n -i any port 50812
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked v1), capture size 262144 bytes
20:45:31.456585 IP 188.170.xx.xxx.54748 > server_real_ip.50812: UDP, length 148
20:45:37.216781 IP 188.170.xx.xxx.54748 > server_real_ip.50812: UDP, length 148
20:45:42.976432 IP 188.170.xx.xxx.54748 > server_real_ip.50812: UDP, length 148
20:45:45.929195 IP 176.59.x.xxx.48950 > server_real_ip.50812: UDP, length 32
20:45:48.728613 IP 188.170.xx.xxx.54748 > server_real_ip.50812: UDP, length 148
20:45:53.848533 IP 188.170.xx.xxx.54748 > server_real_ip.50812: UDP, length 148
20:45:59.609683 IP 188.170.xx.xxx.54748 > server_real_ip.50812: UDP, length 148
20:46:05.361391 IP 188.170.xx.xxx.54748 > server_real_ip.50812: UDP, length 148
20:46:11.123037 IP 188.170.xx.xxx.54748 > server_real_ip.50812: UDP, length 148
20:46:12.570872 IP 176.59.x.xxx.48950 > server_real_ip.50812: UDP, length 32
^C
10 packets captured
10 packets received by filter

In the capture above, client #2 (address 188.170.xx.xxx) tries to initiate the tunnel, while client #3 (address 176.59.x.xxx) is just performing a re-handshake.
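
To read a capture like this one more quickly, the UDP payload length can be mapped to the likely WireGuard message type. The sizes below come from the WireGuard protocol (148 bytes for a handshake initiation, 92 for a response, 32 for an empty keepalive); the function names are made up for illustration, and the length alone is only a heuristic. By size, the 32-byte packets from client #3 look more like keepalives on an established session than a handshake.

```shell
#!/bin/sh
# Map a WireGuard UDP payload length to the likely message type.
# Length alone is a heuristic: 64 bytes can be a cookie reply or a
# small data packet, so that case is labelled as ambiguous.
wg_msg_type() {
    case "$1" in
        148) echo "handshake-initiation" ;;
        92)  echo "handshake-response" ;;
        64)  echo "cookie-reply-or-data" ;;
        32)  echo "keepalive" ;;
        *)   echo "data-or-other" ;;
    esac
}

# Read `tcpdump -n` lines on stdin and print "source type" for each
# UDP packet. Expected line shape (as in the capture above):
#   <time> IP <src> > <dst>: UDP, length <n>
classify_capture() {
    while read -r _ts _ip src _gt _dst proto _lenword len _rest; do
        [ "$proto" = "UDP," ] || continue
        echo "$src $(wg_msg_type "$len")"
    done
}
```

On the router this could be used as `tcpdump -n -l -i any port 50812 | classify_capture` (`-l` makes tcpdump line-buffered so results appear as packets arrive).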

Only a reboot of the server router helps.

Here are my configs in the working state. When the server doesn't reply, the only difference is that no bytes are sent/received:
Server:

interface: wg0
  public key: q959...
  private key: (hidden)
  listening port: 50812

peer: nLr7...
  preshared key: (hidden)
  endpoint: 188.170.xx.xxx:42889
  allowed ips: 172.22.0.33/32, 192.168.33.0/24, 192.168.8.0/24
  latest handshake: 19 seconds ago
  transfer: 24.40 MiB received, 144.60 MiB sent

peer: CguL...
  preshared key: (hidden)
  endpoint: 176.59.x.xxx:48950
  allowed ips: 172.22.0.2/32, 192.168.0.0/24
  latest handshake: 46 seconds ago
  transfer: 2.40 MiB received, 177.13 KiB sent

Client #2

root@client2-router:~# wg
interface: wgclient
  public key: nLr7...
  private key: (hidden)
  listening port: 51520

peer: q959...
  preshared key: (hidden)
  endpoint: server_real_ip:50812
  allowed ips: 0.0.0.0/0
  latest handshake: 1 minute, 52 seconds ago
  transfer: 144.54 MiB received, 24.09 MiB sent
  persistent keepalive: every 25 seconds

Any suggestions would be much appreciated.

You can adjust wireguard_watchdog to restart the VPN interface.
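
For illustration only, here is a minimal sketch of the general idea behind such a watchdog. It is not the actual wireguard_watchdog script. It assumes the `<public-key> <epoch-seconds>` output of `wg show <iface> latest-handshakes`; the function name, the 180-second threshold and the interface name `wgclient` are assumptions for this example.

```shell
#!/bin/sh
# Print the public keys of peers whose latest handshake is older than
# max_age seconds. Stdin is expected to look like the output of
# `wg show <iface> latest-handshakes`: "<public-key><TAB><epoch-seconds>".
# An epoch of 0 (no handshake yet) ends up older than any threshold.
stale_peers() {
    now=$1
    max_age=$2
    while read -r key last; do
        # Skip blank or malformed lines.
        [ -n "$key" ] && [ -n "$last" ] || continue
        age=$((now - last))
        if [ "$age" -gt "$max_age" ]; then
            echo "$key"
        fi
    done
}

# Hypothetical wiring on the router (kept as a comment, not executed here):
#   if [ -n "$(wg show wgclient latest-handshakes | stale_peers "$(date +%s)" 180)" ]; then
#       ifdown wgclient
#       ifup wgclient
#   fi
```

Run from cron every minute or so, this would bounce the interface only when no recent handshake has completed, rather than on a fixed schedule.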

Thank you very much for the link.
I could do similar things with my own scripts, but it is great that someone has already written such a tool.
But anyway, I'm curious what is happening under the hood in my case: what makes the WG server side ignore a client whose connection dropped?

It seems to be a protocol-specific issue which is best to discuss upstream.

Also remove the listening port from the clients.

Well, it may be a protocol-specific thing, but then why doesn't client #3, with the Keenetic's standard firmware module, suffer from this? In my case the tunnel gets stuck only between the two OpenWrt devices.

In fact, there is nothing about a listening port in the client and server configs in the web interface. On the server the field shows the greyed-out default port value, and on the client the greyed-out string "random"; this is also the default in the OpenWrt firmware.
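
A quick way to double-check this from the command line, rather than from the web interface, is to look for a `listen_port` option in the interface section. The helper below is a hypothetical sketch that parses `uci export network` output; the function name is made up for this example.

```shell
#!/bin/sh
# Print the listen_port set on interface $1, reading `uci export network`
# output from stdin. Prints nothing if the option is absent.
listen_port_of() {
    awk -v want="'$1'" '
        # Track whether we are inside the wanted "config interface" section.
        $1 == "config" { in_iface = ($2 == "interface" && $3 == want) }
        in_iface && $1 == "option" && $2 == "listen_port" {
            gsub("\047", "", $3)   # strip the single quotes around the value
            print $3
        }
    '
}

# Hypothetical use on the router (not executed here):
#   uci export network | listen_port_of wgclient
# A single option can also be queried directly:
#   uci -q get network.wgclient.listen_port
```

Empty output means no port is pinned in the config, so the interface picks a random one when it comes up.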

Can you post the uci export network from the server and the problematic client?

Sure.

Server:

root@server-router:~# uci export network
package network

config interface 'loopback'
	option ifname 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option packet_steering '1'
	option ula_prefix 'fdb6:8c97:e778::/48'

config interface 'lan'
	option type 'bridge'
	option ifname 'lan1 lan2'
	option proto 'static'
	option netmask '255.255.255.0'
	option ipaddr '192.168.63.1'
	option delegate '0'

config interface 'wan'
	option ifname 'wan'
	option proto 'dhcp'
	option macaddr '74:EA:3A:D1:2D:16'
	option delegate '0'
	option peerdns '0'
	list dns '1.1.1.1'
	list dns '8.8.8.8'

config interface 'wg0'
	option proto 'wireguard'
	option delegate '0'
	option listen_port '50812'
	list addresses '172.22.0.1/32'
	option private_key 'ODQV...'

config wireguard_wg0
	option description 'spasrouter'
	option public_key 'nLr7...'
	option preshared_key 'J6wV...'
	list allowed_ips '172.22.0.33/32'
	list allowed_ips '192.168.33.0/24'
	list allowed_ips '192.168.8.0/24'
	option route_allowed_ips '1'

config wireguard_wg0
	option description 'porkshus'
	option public_key 'CguL...'
	option preshared_key 'zLi8...'
	list allowed_ips '172.22.0.2/32'
	list allowed_ips '192.168.0.0/24'
	option route_allowed_ips '1'

config interface 'antizapret'
	option proto 'none'
	option ifname 'tun0'
	option delegate '0'

Client:

package network

config interface 'loopback'
	option ifname 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fd96:2765:595e::/48'

config interface 'lan'
	option type 'bridge'
	option ifname 'eth0.1'
	option proto 'static'
	option netmask '255.255.255.0'
	option delegate '0'
	option ipaddr '192.168.33.1'

config interface 'wan'
	option ifname 'eth0.2'
	option proto 'dhcp'
	option metric '100'
	option peerdns '0'
	option auto '0'
	option delegate '0'

config interface 'hilink'
	option ifname 'eth1'
	option proto 'dhcp'
	option peerdns '0'
	option metric '100'
	option delegate '0'

config switch
	option name 'switch0'
	option reset '1'
	option enable_vlan '1'

config switch_vlan
	option device 'switch0'
	option vlan '1'
	option ports '1 2 3 4 5t'

config switch_vlan
	option device 'switch0'
	option vlan '2'
	option ports '0 5t'

config interface 'wgclient'
	option proto 'wireguard'
	option private_key 'AMWh...'
	list addresses '172.22.0.33/32'
	option delegate '0'

config wireguard_wgclient
	option description 'a33wgsrv'
	option public_key 'q959...'
	option preshared_key 'J6wV...'
	option endpoint_host 'server_real_ip'
	option endpoint_port '50812'
	option persistent_keepalive '25'
	option route_allowed_ips '1'
	list allowed_ips '0.0.0.0/0'

Looks good, my mistake!

Thank you very much for your efforts, mate.

Client #3 is probably unaffected because of differences in the protocol implementation or the hardware.
The WireGuard developers should know more about this kind of detail.