[Solved] WireGuard: no handshake if the client endpoint changes IP

[Solution] The issue is caused by PBR, which creates a broken routing table at startup. My WAN link is PPPoE-based. See https://github.com/openwrt/packages/issues/25616

Dear All,

I configured a Wireguard VPN server on my router in order to have access to my home network.

Everything works great except for one tedious issue: when the client endpoint's IP changes from the one first used to connect, the VPN server doesn't complete the handshake. The only way to get the handshake through is to restart the WireGuard server interface.
It's as if WireGuard doesn't forget the first endpoint IP it saw, even after a long timeout.

Any ideas? What can I investigate or change in my configuration?

uci show network
...
network.wgsrv=interface
network.wgsrv.proto='wireguard'
network.wgsrv.private_key='BLABLA'
network.wgsrv.listen_port='51820'
network.wgsrv.addresses='192.168.79.1/24'
network.wgsrv.dns='192.168.1.1'
network.@wireguard_wgsrv[0]=wireguard_wgsrv
network.@wireguard_wgsrv[0].description='pixel7'
network.@wireguard_wgsrv[0].public_key='BLABLA'
network.@wireguard_wgsrv[0].private_key='BLABLA'
network.@wireguard_wgsrv[0].route_allowed_ips='1'
network.@wireguard_wgsrv[0].allowed_ips='192.168.79.2/32'
...
uci show firewall
firewall.@defaults[0]=defaults
firewall.@defaults[0].input='REJECT'
firewall.@defaults[0].output='ACCEPT'
firewall.@defaults[0].forward='REJECT'
firewall.@defaults[0].synflood_protect='1'
firewall.@defaults[0].flow_offloading='1'
firewall.@defaults[0].flow_offloading_hw='1'
firewall.@defaults[0].offload_delay='1'
firewall.@defaults[0].ofd_packets='32'
firewall.@defaults[0].ofd_proto='{udp}'
firewall.@zone[0]=zone
firewall.@zone[0].name='wan'
firewall.@zone[0].input='REJECT'
firewall.@zone[0].output='ACCEPT'
firewall.@zone[0].forward='REJECT'
firewall.@zone[0].masq='1'
firewall.@zone[0].mtu_fix='1'
firewall.@zone[0].network='wan' 'wan6' 'wan836'
firewall.@zone[1]=zone
firewall.@zone[1].name='lan'
firewall.@zone[1].input='ACCEPT'
firewall.@zone[1].output='ACCEPT'
firewall.@zone[1].forward='ACCEPT'
firewall.@zone[1].network='lan'
firewall.@zone[3]=zone
firewall.@zone[3].name='wgsrv'
firewall.@zone[3].input='ACCEPT'
firewall.@zone[3].output='ACCEPT'
firewall.@zone[3].forward='ACCEPT'
firewall.@zone[3].network='wgsrv'
firewall.@forwarding[0]=forwarding
firewall.@forwarding[0].src='lan'
firewall.@forwarding[0].dest='wan'
...
firewall.@rule[9]=rule
firewall.@rule[9].name='Allow-Wireguard'
firewall.@rule[9].proto='udp'
firewall.@rule[9].src='*'
firewall.@rule[9].target='ACCEPT'
firewall.@rule[9].dest_port='51820'
firewall.@rule[10]=rule
firewall.@rule[10].name='Allow-Wireguard'
firewall.@rule[10].proto='udp'
firewall.@rule[10].target='ACCEPT'
firewall.@rule[10].dest='*'
firewall.@rule[10].src_port='51820'
firewall.@forwarding[2]=forwarding
firewall.@forwarding[2].src='wgsrv'
firewall.@forwarding[2].dest='lan'
firewall.@forwarding[3]=forwarding
firewall.@forwarding[3].src='wgsrv'
firewall.@forwarding[3].dest='wan'
firewall.@forwarding[4]=forwarding
firewall.@forwarding[4].src='lan'
firewall.@forwarding[4].dest='wgsrv'
...

Thanks in advance

As far as I know it is up to the client to announce its new IP address.

Reduce the timeout?

Which timeout?
I already tried setting persistent_keepalive to 15 seconds, but nothing changed.

Can you explain in more detail?
Thanks.

I mean that the server does not play a role: the client must announce its new IP to the server, and it looks like the client does not do that.

What happens if you restart the client, does it work then?

Or are you saying that the server's IP address changes?

Ignore this post! PBR creates an ip rule for the WireGuard port, but after removing that rule the issue remains.
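For anyone checking the same thing, here is a rough sketch of how to spot a policy rule that matches the WireGuard port. It only filters the text printed by `ip rule show`. The helper name `rules_for_port` is made up for this example, and it assumes the full iproute2 `ip` binary, which prints port-based rules with an `sport` or `dport` selector.

```shell
# Print policy-routing rules that match a given source or destination port.
# Reads `ip rule show` output on stdin; $1 is the port (default 51820).
# Hypothetical helper, not part of PBR or OpenWrt.
rules_for_port() {
    port="${1:-51820}"
    # Match "sport 51820" or "dport 51820" as a whole token,
    # so that e.g. port 5182 or 518200 is not reported.
    grep -E "(sport|dport) ${port}( |\$)"
}

# Typical use on the router:
#   ip rule show | rules_for_port 51820
```

If a rule shows up, the table named after `lookup` on that line is the one to inspect next with `ip route show table <name>`.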

Restarting the client changes nothing.
I noticed:

11:42:43.682609 phy1-ap0 P   IP 192.168.1.230.52944 > 192.168.1.1.51820: UDP, length 148
11:42:43.682609 br-lan In  IP 192.168.1.230.52944 > 192.168.1.1.51820: UDP, length 148
11:42:43.683738 pppoe-wan Out IP 213.45.12.113.51820 > 192.168.1.230.52944: UDP, length 92
11:42:48.821457 phy1-ap0 P   IP 192.168.1.230.52944 > 192.168.1.1.51820: UDP, length 148
11:42:48.821457 br-lan In  IP 192.168.1.230.52944 > 192.168.1.1.51820: UDP, length 148
11:42:48.822579 pppoe-wan Out IP MY_PUBLIC_ADDRESS.51820 > 192.168.1.230.52944: UDP, length 92

This is after moving the client inside the network.

The client is 192.168.1.230 (an Android phone); the VPN server is 192.168.1.1.
The WireGuard server receives the packet on the 192.168.1.1 interface, but the handshake answer goes out on the pppoe-wan interface (with the public address).
Very strange.
I don't have any NAT rules, and the server is bound to 0.0.0.0:

netstat -lnu|grep 51820
udp        0      0 0.0.0.0:51820           0.0.0.0:*                           
udp        0      0 :::51820                :::*          

You cannot run a client from inside your network; you must use your phone on cellular.

Inside your network you will run into routing issues.

I can connect from inside. I discovered that PBR creates a static route that uses the public WAN interface. However, the issue is not at this level.

After removing the PBR rule and/or using the public IP (instead of the internal IP), I still face the issue.

Connecting using the internal IP (after removing the pbr rule):

peer: BLABLA
endpoint: 192.168.1.230:36393
allowed ips: 192.168.79.2/32
latest handshake: 9 minutes, 5 seconds ago
transfer: 1.39 MiB received, 19.59 MiB sent
persistent keepalive: every 15 seconds
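To compare the peer state before and after the phone switches network, something like the sketch below can help. It reads the tab-separated output of `wg show wgsrv dump` and prints the endpoint and the handshake age for each peer. `peer_status` is a made-up helper name; the column order is the one documented for `wg show <interface> dump`.

```shell
# Summarise peers from `wg show <iface> dump` (read on stdin).
# Peer lines have 8 tab-separated fields:
#   public-key, preshared-key, endpoint, allowed-ips,
#   latest-handshake (epoch seconds), rx bytes, tx bytes, keepalive
# The first line describes the interface (4 fields) and is skipped.
# $1 is the current epoch time; it defaults to `date +%s`.
peer_status() {
    now="${1:-$(date +%s)}"
    awk -F '\t' -v now="$now" '
        NF == 8 {
            if ($5 == 0) age = "never"
            else         age = (now - $5) "s ago"
            printf "endpoint=%s handshake=%s rx=%s tx=%s\n", $3, age, $6, $7
        }
    '
}

# Typical use on the router:
#   wg show wgsrv dump | peer_status
```

If the endpoint stays on the old address and the handshake age keeps growing while rx increases, the server is seeing the packets but its replies are not reaching the client.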

Then, switching to another IP (from my mobile operator):

tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
12:16:58.683277 pppoe-wan In  IP 78.xxx.xxx.xxx.31216 > 87.yyy.yyy.yyy.51820: UDP, length 148
12:17:03.976854 pppoe-wan In  IP 78.xxx.xxx.xxx.31216 > 87.yyy.yyy.yyy.51820: UDP, length 148
12:17:09.273097 pppoe-wan In  IP 78.xxx.xxx.xxx.31216 > 87.yyy.yyy.yyy.51820: UDP, length 148
12:17:14.392930 pppoe-wan In  IP 78.xxx.xxx.xxx.31216 > 87.yyy.yyy.yyy.51820: UDP, length 148
12:17:19.692642 pppoe-wan In  IP 78.xxx.xxx.xxx.31216 > 87.yyy.yyy.yyy.51820: UDP, length 148
12:17:25.002159 pppoe-wan In  IP 78.xxx.xxx.xxx.31216 > 87.yyy.yyy.yyy.51820: UDP, length 148

The server's received counter increases, but it doesn't update the endpoint IP and doesn't send anything.
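A note for reading these captures: the UDP payload length identifies the WireGuard message type. 148 bytes is a handshake initiation, 92 a handshake response, 64 a cookie reply and 32 a keepalive. So the capture above shows only initiations arriving and no response leaving. A small sketch that labels tcpdump lines this way (`label_wg_packets` is a hypothetical helper name; anything with another length is simply reported as transport data):

```shell
# Label WireGuard packets in tcpdump output by UDP payload length.
# Fixed sizes from the WireGuard protocol:
#   148 = handshake initiation, 92 = handshake response,
#    64 = cookie reply,         32 = keepalive (empty data packet)
label_wg_packets() {
    awk '
        /UDP, length/ {
            len = $NF
            if      (len == 148) kind = "handshake-initiation"
            else if (len == 92)  kind = "handshake-response"
            else if (len == 64)  kind = "cookie-reply"
            else if (len == 32)  kind = "keepalive"
            else                 kind = "transport-data"
            print kind ": " $0
        }
    '
}

# Two sample lines taken from the captures in this thread:
label_wg_packets <<'EOF'
12:16:58.683277 pppoe-wan In  IP 78.xxx.xxx.xxx.31216 > 87.yyy.yyy.yyy.51820: UDP, length 148
11:42:43.683738 pppoe-wan Out IP 213.45.12.113.51820 > 192.168.1.230.52944: UDP, length 92
EOF

# Live use on the router (-l keeps tcpdump line-buffered in a pipe):
#   tcpdump -lni any udp port 51820 | label_wg_packets
```

Each line is printed with its message type in front, which makes it easy to see on which interface the response leaves, if it leaves at all.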

ciao

luigi