Wireguard config with "endpoint_host" not working with luci on 21.02.0-rc4

rwalli · August 2, 2021, 3:40pm

Hello!

Can't configure wireguard peers with option "endpoint" (client) on a ramips/mt7621 device on 21.02.0-rc4 with luci.

Error: Network device is not present

command line configuration works on all devices and releases.
configuration with luci works on all devices with 21.02.0-rc3.
configuration with luci works on non-dsa devices and all releases.

anyone else here with this error, or did I miss something...

rwalli · August 6, 2021, 1:38pm

As there is no answer I guess there is something wrong with my config.
It would be very helpful if someone with an mt7621 device could confirm that "wireguard-client" is working on 21.02.0-rc4.

to check:
append this to /etc/config/network:

config interface 'wg_test'
	option proto 'wireguard'
	list addresses '192.168.68.1/32'
	option private_key 'SP+tbIpPdi7ZjzivKPsVEo4zSTTlZoEaFCUe0R+jm3U='

config wireguard_wg_test
	option public_key 'l8SGbfG+oZIJrc8EQcDx0q7iAHDETF6RPgaYXOqZVVE='
	option endpoint_host '10.11.12.13'
	option endpoint_port '1234'

and see if an interface is created:

/etc/init.d/network restart
ifconfig wg_test

cesarvog · August 6, 2021, 3:36pm

I'm no specialist, but here is my working rc4 config for comparison (this is to connect to a VPN provider):

config interface 'wg0'
	option proto 'wireguard'
	option listen_port '51820'
	option peerdns '0'
	option private_key 'my_private_key'
	list addresses '10.12.12.13/24'
	list dns '1.1.1.1'

config wireguard_wg0 'wgclient'
	option public_key 'my_public_key'
	option persistent_keepalive '25'
	option description 'WG'
	option endpoint_host '11.12.13.14'
	option endpoint_port '1443'
	list allowed_ips '0.0.0.0/0'

rwalli · August 6, 2021, 5:04pm

thank you for your config but unfortunately it does not work for me.
on which mt7621 device is this?

cesarvog · August 6, 2021, 8:42pm

I posted my config just for reference, in case your problem was caused by misconfiguration; I'm not running the WireGuard package on a MediaTek-based device myself. The posted config works for me on all versions of 21.02 made available thus far, on both an x86/64 machine and a GL-MV1000 (mvebu-based). Since it does not seem to work on your mt7621-based device, maybe something is amiss on that platform, either in the WG package itself or in one of its dependencies.

rwalli · August 28, 2021, 2:07pm

Finally I got a second device for testing and found out that the problem occurs with the combination of option endpoint_host and the lack of an option gateway in /etc/config/network.

Can someone please confirm that WireGuard interface creation fails on an mt7621 device when option gateway is commented out and a WireGuard peer uses option endpoint_host?

Thank you in advance,

Robert

vgaetera · August 28, 2021, 3:20pm

Sounds like you need to increase metric on the upstream interface.

rwalli · August 28, 2021, 4:49pm

I don't have an upstream interface on this device; the default route is advertised via OSPF.
But even if I add the default route from the command line, the WireGuard interface is not created.

PS: I guess it's a bug because it is easy to reproduce:

install a fresh openwrt-rc4 with luci-proto-wireguard on mt7621
add a wireguard-peer with endpoint_host

-> no wireguard interface is created.

after adding a default route to /etc/config/network
the wireguard interface is created.
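
The workaround above (adding a default route to /etc/config/network) could be scripted roughly as follows. This is a minimal sketch, not a confirmed fix: the function name add_default_route is made up for illustration, and the interface name and gateway address are placeholders that have to be replaced with values from the actual network.

```shell
# Sketch: add a static default route to /etc/config/network via uci.
# add_default_route is a hypothetical helper; both arguments are
# placeholders supplied by the caller.
add_default_route() {
    iface="$1"      # logical interface the route belongs to, e.g. lan
    gateway="$2"    # next-hop address on that interface

    uci add network route >/dev/null &&
    uci set "network.@route[-1].interface=$iface" &&
    uci set "network.@route[-1].target=0.0.0.0/0" &&
    uci set "network.@route[-1].gateway=$gateway" &&
    uci commit network
}
```

Calling it as add_default_route lan 192.168.1.1 (example values only) and then running /etc/init.d/network restart would correspond to the manual steps described above.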

iplaywithtoys · July 16, 2023, 4:00pm

Confirmed. Can reproduce. I've been tinkering with OSPF in OpenWRT 22.03.5 in an EVE-NG lab and wanted to introduce WireGuard into the mix. Been tearing out what little hair remains trying to work out why WG doesn't bring up wg0 despite the WG configuration being accurate. Finally resorted to Google and stumbled upon this thread.

With routes advertised by OSPF (using frr-ospfd) and no explicit static route to the WG peer configured in /etc/config/network, whether by option gateway or config route, the WireGuard interface does not stay up, and logread shows wg0 being torn down again immediately after creation.

Adding a static route to the WG peer into /etc/config/network magically allows wg0 to appear and stay persistent. As my lab is intentionally not using static routes, this is... suboptimal.

Don't know if it's tied to OSPF, or any dynamic routing, or simply the absence of an explicitly-defined static route, but it's definitely repeatable. Also don't know if it's a fault in OpenWRT or a fault in WireGuard.

lleachii · July 17, 2023, 7:47am

Can you show an example?

I believe I had a slightly related issue on IPv6. In my case I had 2 upstream IPv6 interfaces; but needed to use the interface Wireguard didn't prefer. In my case, I created a static route with a lower metric.

iplaywithtoys · July 17, 2023, 7:52am

Can do. It'll have to be later today. I'll grab the config from my lab.

vgaetera · July 17, 2023, 10:34am

Increase verbosity and insert debugging and logging instructions in the relevant script:
https://github.com/openwrt/openwrt/blob/master/package/network/utils/wireguard-tools/files/wireguard.sh

iplaywithtoys · July 17, 2023, 12:31pm

With static route for remote WG host:

/etc/config/network

[...]
config interface 'wan'
	option device 'eth1'
	option proto 'static'
	option ipaddr '172.16.0.26'
	option netmask '255.255.255.252'
[...]

config interface 'wg0'
        option proto 'wireguard'
        option private_key 'private key'
        option listen_port '51280'
        list addresses '192.0.2.2'

config wireguard_wg0
        option description 'Left'
        list allowed_ips '172.30.0.0/24'
        option route_allowed_ips '1'
        option endpoint_host '10.0.0.26'
        option endpoint_port '51280'
        option public_key 'peer public key'

config route
        option interface 'wan'
        option target '10.0.0.26'
        option gateway '172.16.0.25'

Interface status from ip link
wg0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN qlen 1000

Routing table

default via 172.16.0.25 dev eth1  metric 20 - OSPF advertised default route
[...]
10.0.0.26 via 172.16.0.25 dev eth1 - static route to remote WG host

Without static route for remote WG host:

Interface status
No wg0 interface appears in ip link output

Routing table

default via 172.16.0.25 dev eth1  metric 20 - OSPF advertised default route
[...]
10.0.0.24/30 via 172.16.0.25 dev eth1  metric 20  - OSPF advertised route to remote WG host

Still working on gathering some debug and log evidence; I'll add it here once I can get it.
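
Both routing tables above contain a route that covers the peer address 10.0.0.26: the static host route in the first case and the OSPF-advertised 10.0.0.24/30 in the second. A small sketch to double-check that kind of containment by hand follows; the helper names ip_to_int and in_cidr are made up for illustration, and it handles IPv4 only.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if address $1 lies inside the CIDR block $2.
in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    len=${2#*/}
    if [ "$len" -eq 0 ]; then
        return 0         # a /0 block contains every address
    fi
    mask=$(( (0xFFFFFFFF << (32 - $len)) & 0xFFFFFFFF ))
    [ $(( $addr & $mask )) -eq $(( $net & $mask )) ]
}

in_cidr 10.0.0.26 10.0.0.24/30 && echo "peer is covered by the OSPF route"
```

So, as far as the kernel routing table is concerned, the peer is reachable in both cases; the difference is only where the route comes from.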

iplaywithtoys · July 17, 2023, 8:59pm

This is disappointing. I figured there was enough there to guide my searching, but I have to cry uncle at this one.

The thread "How enhance the wireguard log output?" (#4 by vgaetera) appears to suggest I need to roll my own image if I want to gather logs and debugging detail.

Ditto the thread "Wireguard debug log info?" (#2 by vgaetera).

I note, however, that /sys/kernel/debug/dynamic_debug/ does appear to exist, and appears to contain a single file - control - with the contents # filename:lineno [module]function flags format

https://www.wireguard.com/quickstart/#debug-info appears to suggest that if "your kernel supports dynamic debugging" then additional information might be obtainable. However, echo module wireguard +p > /sys/kernel/debug/dynamic_debug/control does not appear to change anything; the contents of that file appear to remain fixed at # filename:lineno [module]function flags format

export LOG_LEVEL=verbose does not appear to change any of the generated information, as best I can tell.

As for the wireguard.sh script, if you have suggestions for suitable additions I'm all ears. So to speak. I've been staring at the script for a couple hours now, but .sh scripting is not (yet) my forté.

If it comes to it I can make do with static routes but it'd be nice to get it working with dynamic routes.

iplaywithtoys · July 18, 2023, 8:30am

That's great, thank you. I'll dive into that later today and see what I can find.

For the avoidance of doubt, when I previously wrote "this is disappointing" I was not referring to your response; rather, I was referring to my inability to find the information that your response suggested.

I'm accustomed to being able to search for stuff and find it, so when I encounter something I can't search for and find, it takes me by surprise.

egc · July 18, 2023, 9:54am

iplaywithtoys:

config interface 'wg0'
        option proto 'wireguard'
        option private_key 'private key'
        option listen_port '51280'
        list addresses '192.0.2.2'

config wireguard_wg0
        option description 'Left'
        list allowed_ips '172.30.0.0/24'
        option route_allowed_ips '1'
        option endpoint_host '10.0.0.26'
        option endpoint_port '51280'
        option public_key 'peer public key'

Your list addresses entry is outside the allowed IPs, which is unusual.
Furthermore, this address is not a private address, which is also unusual.
To get a route to the WG subnet via the tunnel, it is better to use a list addresses entry with a /24 prefix.

Not sure if this is related to your problem though.

iplaywithtoys · July 18, 2023, 10:17am

Traffic destined for the peer's tunnel address is not needed, so it's not necessary (in this situation) to include it. The only traffic which needs to go through the tunnel for this scenario is traffic destined for the remote subnet 172.30.0.0/24.

There may be other scenarios in which the peer needs to be reached at its tunnel address, and so the peer's tunnel address would need to be in the AllowedIPs directive, but this is not one of them.

On the address not being private: 192.0.2.2 is in a range reserved for documentation. See https://datatracker.ietf.org/doc/html/rfc5737

On using a /24: only if I need an entire 256-address block for the tunnel itself. Point-to-point addressing is also viable, and is common practice.

As for whether this is related to my problem: unfortunately, it is not. The proximate cause of the problem is the absence of an explicitly-defined static route to the peer, not the choice of IP addressing for the tunnel.

If the static route is present, then the wg0 interface stays up.

If the static route is not present (e.g. because it's not needed when using dynamic routing) then the wg0 interface does not stay up.

I'll gather the evidence suggested by @vgaetera in a bit, and then see where we go from there.

mk24 · July 18, 2023, 11:38am

Point-to-point WireGuard can have allowed_ips set to 0.0.0.0/0 on both ends, with routing controlled externally.

When a Wireguard interface has more than one peer, allowed_ips become important. A non-overlapping set of per-peer allowed_ips is needed for the kernel driver to determine which peer to send an outgoing packet to. There really isn't much provision to configure that dynamically, but if you stay with point to point links it should not be necessary.
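
The non-overlap requirement for per-peer allowed_ips can be checked with a few lines of shell. This is a minimal IPv4-only sketch; the helper names ip_to_int and cidr_overlap are made up for illustration, and 172.31.0.0/24 is a hypothetical second peer that does not appear in this thread.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if the two IPv4 CIDR blocks share at least one address.
cidr_overlap() {
    a_net=$(ip_to_int "${1%/*}"); a_len=${1#*/}
    b_net=$(ip_to_int "${2%/*}"); b_len=${2#*/}
    # Two blocks overlap exactly when they agree on the shorter prefix.
    if [ "$a_len" -lt "$b_len" ]; then len=$a_len; else len=$b_len; fi
    if [ "$len" -eq 0 ]; then
        return 0         # a /0 block overlaps everything
    fi
    mask=$(( (0xFFFFFFFF << (32 - $len)) & 0xFFFFFFFF ))
    [ $(( $a_net & $mask )) -eq $(( $b_net & $mask )) ]
}

# 172.30.0.0/24 is from the config above; 172.31.0.0/24 is hypothetical.
cidr_overlap 172.30.0.0/24 172.31.0.0/24 || echo "no overlap, usable for two peers"
```

A peer with 0.0.0.0/0 overlaps every other entry, which is why that setting only makes sense on a point-to-point link with a single peer.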

I don't think the wireguard script has any provision to bring down the interface when it sees something that it doesn't like; maybe OSPF is doing that.

iplaywithtoys · July 18, 2023, 11:44am

Anything's possible. Hopefully the additional logging might shed some light on the behaviour.

iplaywithtoys · July 18, 2023, 12:50pm

The above code sent me down a rabbit hole of learning about other capabilities of sed beyond the ones I already knew, and also learning about output redirection. I've ordered a copy of the O'Reilly book for sed & awk, along with the O'Reilly book about shell scripting. Should be some fun bedtime reading! So thank you for the nudge towards some more learning.

Contents excerpt of /lib/netifd/proto/wireguard.sh before applying the suggested change:

[ -n "$INCLUDE_ONLY" ] || {
        . /lib/functions.sh
        . ../netifd-proto.sh
        init_proto "$@"
}

proto_wireguard_init_config() {
        proto_config_add_string "private_key"
        proto_config_add_int "listen_port"
        proto_config_add_int "mtu"
        proto_config_add_string "fwmark"
        available=1
        no_proto_task=1
}

Contents excerpt of /lib/netifd/proto/wireguard.sh after applying the suggested change:

[ -n "$INCLUDE_ONLY" ] || {
        . /lib/functions.sh
        . ../netifd-proto.sh
        init_proto "$@"
}
set -x -v
exec &> /tmp/wireguard.log

proto_wireguard_init_config() {
        proto_config_add_string "private_key"
        proto_config_add_int "listen_port"
        proto_config_add_int "mtu"
        proto_config_add_string "fwmark"
        available=1
        no_proto_task=1
}

With a static route to the WG peer defined, and before making the above change to the wireguard.sh script, the wg0 interface appears as expected when issuing /etc/init.d/network restart:

# logread -e wg0
Tue Jul 18 12:44:55 2023 daemon.notice netifd: Interface 'wg0' is setting up now
Tue Jul 18 12:44:55 2023 daemon.notice netifd: Interface 'wg0' is now up
Tue Jul 18 12:44:55 2023 daemon.notice netifd: Network device 'wg0' link is up

Still with a static route defined, but after making the above change to the script, the wg0 interface disappears upon restarting the network stack:

# logread -e wg0
Tue Jul 18 12:44:55 2023 daemon.notice netifd: Interface 'wg0' is setting up now
Tue Jul 18 12:44:55 2023 daemon.notice netifd: Interface 'wg0' is now up
Tue Jul 18 12:44:55 2023 daemon.notice netifd: Network device 'wg0' link is up
Tue Jul 18 12:45:57 2023 daemon.notice netifd: Interface 'wg0' has lost the connection
Tue Jul 18 12:45:57 2023 daemon.notice netifd: Network device 'wg0' link is down
Tue Jul 18 12:45:57 2023 daemon.notice netifd: wg0 (3076): exec &> /tmp/wireguard.log
Tue Jul 18 12:45:57 2023 daemon.notice netifd: wg0 (3076): + exec
Tue Jul 18 12:45:57 2023 daemon.notice netifd: Interface 'wg0' is now down
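
To boil a log like the one above down to the last reported state of the interface, something along these lines might help. It is a sketch only: last_iface_state is a made-up helper name, and it relies on the "Interface '...' is now ..." wording seen in the netifd lines above.

```shell
# Print the last state netifd reported for an interface ("up", "down", ...),
# or "unknown" if no matching line was seen. Log lines are read from stdin.
last_iface_state() {
    awk -v pat="Interface '$1' is now " '
        index($0, pat) { state = substr($0, index($0, pat) + length(pat)) }
        END { print (state == "" ? "unknown" : state) }
    '
}
```

On the router this would be used as logread -e wg0 | last_iface_state wg0.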

The file /tmp/wireguard.log exists, and is 768 lines long. Would you like me to post the entire contents, or extract certain parts of it?
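
The log above shows wg0 going down only after the tracing lines were added, even with the static route in place. A possible explanation, offered as an assumption rather than a confirmed diagnosis, is that netifd reads the handler script's standard output, so exec &> (which redirects stdout as well as stderr) gets in the way. The sketch below shows a variant that sends only stderr, where set -x writes its trace, to the log file. run_handler is a hypothetical stand-in for the script body, not code from wireguard.sh.

```shell
TRACE_LOG="${TRACE_LOG:-/tmp/wireguard.log}"

# Hypothetical stand-in for the handler script body, run in a subshell.
run_handler() (
    exec 2>>"$TRACE_LOG"    # only stderr goes to the log file
    set -x                  # xtrace output is written to stderr
    echo 'handler output'   # stdout still reaches the caller
)

captured=$(run_handler)
echo "caller received: $captured"
```

Applied to the script excerpt above, that would mean replacing the two added lines with exec 2>>/tmp/wireguard.log followed by set -x, leaving stdout untouched.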