Interface not found: bonding VXLAN over WireGuard

I'm trying to bond two different WireGuard links for link aggregation (using VXLAN over WireGuard).

It works if I do it manually, but it doesn't work with UCI.

I created the two vxlan interfaces:

config interface 'vxlan0'
	option proto 'vxlan'
	option peeraddr '192.168.50.1'
	option tunlink 'wg0'
	option port '4789'
	option vid '30'

config interface 'vxlan1'
	option proto 'vxlan'
	option peeraddr '192.168.60.1'
	option tunlink 'wg1'
	option port '4789'
	option vid '31'

Then I added the bond0 interface

config interface 'bond0'
	option proto 'bonding'
	option slaves 'vxlan0 vxlan1'

config interface 'bonded'
	option proto 'static'
	option ifname 'bond0'
	option ipaddr '192.168.110.10'
	option netmask '255.255.255.0'

The issue is that the vxlan* interfaces are not enslaved, since they are not up yet when netifd runs the bonding setup:

Tue Mar  9 16:52:59 2021 daemon.notice netifd: Interface 'bond0' is setting up now
Tue Mar  9 16:52:59 2021 kern.info kernel: [57449.299599] IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
Tue Mar  9 16:52:59 2021 kern.info kernel: [57449.302957] 8021q: adding VLAN 0 to HW filter on device bond0
Tue Mar  9 16:52:59 2021 daemon.notice netifd: Interface 'bonded' is enabled
Tue Mar  9 16:52:59 2021 daemon.notice netifd: Interface 'bonded' is setting up now
Tue Mar  9 16:52:59 2021 daemon.info avahi-daemon[2513]: Joining mDNS multicast group on interface bond0.IPv4 with address 192.168.110.10.
Tue Mar  9 16:52:59 2021 daemon.info avahi-daemon[2513]: New relevant interface bond0.IPv4 for mDNS.
Tue Mar  9 16:52:59 2021 daemon.info avahi-daemon[2513]: Registering new address record for 192.168.110.10 on bond0.IPv4.
Tue Mar  9 16:52:59 2021 daemon.notice netifd: Interface 'bonded' is now up
Tue Mar  9 16:52:59 2021 daemon.notice netifd: Interface 'vxlan0' is setting up now
Tue Mar  9 16:52:59 2021 daemon.notice netifd: Interface 'vxlan1' is setting up now
Tue Mar  9 16:52:59 2021 kern.info kernel: [57449.322017] bonding: bonding-bond0 is being created...
Tue Mar  9 16:52:59 2021 daemon.notice netifd: bond0 (18261): bond0 ERROR IN CONFIGURATION - vxlan0: No such device
Tue Mar  9 16:52:59 2021 daemon.notice netifd: Interface 'vxlan1' is now down
Tue Mar  9 16:52:59 2021 daemon.notice netifd: Interface 'vxlan0' is now down
Tue Mar  9 16:52:59 2021 kern.info kernel: [57449.538764] bonding: bonding-bond0 is being deleted...
Tue Mar  9 16:52:59 2021 kern.info kernel: [57449.539842] bonding-bond0 (unregistering): Released all slaves
Tue Mar  9 16:52:59 2021 user.notice root: bonding_teardown(bond0):
Tue Mar  9 16:52:59 2021 daemon.notice netifd: Interface 'bond0' is now down
Tue Mar  9 16:53:14 2021 daemon.notice netifd: Interface 'vxlan1' is setting up now
Tue Mar  9 16:53:14 2021 daemon.notice netifd: Interface 'vxlan0' is setting up now
Tue Mar  9 16:53:15 2021 daemon.notice netifd: Interface 'vxlan1' is now up
Tue Mar  9 16:53:15 2021 daemon.notice netifd: Interface 'vxlan0' is now up
Tue Mar  9 16:53:15 2021 daemon.notice netifd: tunnel 'vxlan1' link is up
Tue Mar  9 16:53:15 2021 daemon.notice netifd: tunnel 'vxlan0' link is up
Tue Mar  9 16:53:16 2021 daemon.info avahi-daemon[2513]: Joining mDNS multicast group on interface vxlan0.IPv6 with address fe80::7421:d6ff:fe24:eb30.
Tue Mar  9 16:53:16 2021 daemon.info avahi-daemon[2513]: New relevant interface vxlan0.IPv6 for mDNS.
Tue Mar  9 16:53:16 2021 daemon.info avahi-daemon[2513]: Registering new address record for fe80::7421:d6ff:fe24:eb30 on vxlan0.*.
Tue Mar  9 16:53:16 2021 daemon.info avahi-daemon[2513]: Joining mDNS multicast group on interface vxlan1.IPv6 with address fe80::f4e7:7cff:fe0f:8760.
Tue Mar  9 16:53:16 2021 daemon.info avahi-daemon[2513]: New relevant interface vxlan1.IPv6 for mDNS.
Tue Mar  9 16:53:16 2021 daemon.info avahi-daemon[2513]: Registering new address record for fe80::f4e7:7cff:fe0f:8760 on vxlan1.*.

If I issue the commands manually, it works:

$ ip l add vxlan0 type vxlan id 30 dstport 4789 remote 192.168.50.1
$ ip l add vxlan1 type vxlan id 31 dstport 4789 remote 192.168.60.1
$ ip l add bond0 type bond
$ ip l set vxlan0 master bond0
$ ip l set vxlan1 master bond0
$ ip l set bond0 up
$ ip a add dev bond0 192.168.110.10/24
$ ping 192.168.110.1
PING 192.168.110.1 (192.168.110.1): 56 data bytes
64 bytes from 192.168.110.1: seq=0 ttl=64 time=119.478 ms
64 bytes from 192.168.110.1: seq=1 ttl=64 time=60.506 ms
64 bytes from 192.168.110.1: seq=2 ttl=64 time=56.796 ms
$ ip -d l show bond0
179: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 16:c8:5d:4a:3e:e9 brd ff:ff:ff:ff:ff:ff promiscuity 0 
    bond mode balance-rr miimon 0 updelay 0 downdelay 0 use_carrier 1 arp_interval 0 arp_validate none arp_all_targets any primary_reselect always fail_over_mac none xmit_hash_policy layer2 resend_igmp 1 num_grat_arp 1 all_slaves_active 0 min_links 0 lp_interval 1 packets_per_slave 1 lacp_rate slow ad_select stable tlb_dynamic_lb 1 addrgenmode eui64 numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535 

Is there any way to delay the setup of the bond0 interface until the vxlan interfaces are up?

Not possible. WireGuard is a Layer 3 interface anyway.

You probably didn't read the whole post. It works when configured manually, using VXLAN over WireGuard.

I clarified it in the post, thank you.

I read it.

I'm also surprised, since the kernel won't allow a point-to-point interface to be bonded, but perhaps vxlan is different and can be bonded.

Do the enslaving in a hotplug script when the device comes up, e.g. /etc/hotplug.d/iface/99-vxlan-enslave.
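
Something along these lines (untested sketch; the interface names are taken from your config, and bond0 is assumed to already exist):

#!/bin/sh
# /etc/hotplug.d/iface/99-vxlan-enslave (sketch)
# Enslave the vxlan devices into bond0 once netifd reports them up,
# instead of listing them as slaves in the bond config.
[ "$ACTION" = "ifup" ] || exit 0

case "$INTERFACE" in
	vxlan0|vxlan1)
		# $DEVICE is the Linux netdev behind the logical interface
		ip link set "$DEVICE" down
		ip link set "$DEVICE" master bond0
		ip link set "$DEVICE" up
		;;
esac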


I followed this advice https://www.reddit.com/r/WireGuard/comments/jphwi3/bonding_wireguard_interfaces/gbescx6?utm_source=share&utm_medium=web2x&context=3


You need to add a layer between the WireGuard interface and the bond.

Yes, this is possible. Any interface will do. As you read, you're not bonding the WireGuard interface itself.

Just make sure its traffic is routed to the interface you create. Simple.


The issue is that UCI/netifd is not waiting for the vxlan interfaces to be up, so the enslaving fails.

  • Use another device type
  • Add a custom command (it should run later in the process) :wink:

I already gave you the solution.

Do the enslaving in a hotplug script: /etc/hotplug.d/iface/99-vxlan-enslave.


Many years ago I used to bond two internet links together. This required a kernel modification to allow point-to-point interfaces to be bonded. I posted the whole implementation to my GitHub. If you look here

You will find a bunch of scripts that set up a bond0 interface and enslave the tun interfaces after they come up.

A lot of it should be applicable to what you're trying to do and can provide some inspiration, even though I haven't used it for years and it's originally based on OpenWrt 15.05.


Yes, I just wanted to make sure that there wasn't any way of doing it with UCI config alone, and that there wasn't some kind of bug in the configuration process.

No bug. You just have an unusual edge case that's not easily supported.

I believe it's justified to rely on hotplug.
Netifd may not be smart enough to handle race conditions that well.

In theory, you can declare the interfaces using UCI with disabled autostart.
Then activate them with ifup triggered by a hotplug event.
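
For example (a rough, untested sketch): give bond0 the generic option auto '0' so netifd leaves it alone at boot, then bring it up from a hotplug script once both vxlan interfaces report up:

config interface 'bond0'
	option proto 'bonding'
	option slaves 'vxlan0 vxlan1'
	option auto '0'

#!/bin/sh
# /etc/hotplug.d/iface/98-bond0-up (sketch)
[ "$ACTION" = "ifup" ] || exit 0

case "$INTERFACE" in
	vxlan0|vxlan1)
		# start bond0 only once both vxlan interfaces are up
		if ifstatus vxlan0 | grep -q '"up": true' && \
		   ifstatus vxlan1 | grep -q '"up": true'; then
			ifup bond0
		fi
		;;
esac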

Now the issue is that when the vxlan interfaces are created by netifd they are also brought up, which makes them unusable as bond slaves. Creating and enslaving them manually works; setting them down and then enslaving them is not recoverable. I will try to use hotplug on the WireGuard interfaces to create and enslave them.

Do yourself a favour and look at the github repo I linked you.

I had to do exactly the same thing: wait for the interfaces to come up, create a bond interface, and enslave the two OpenVPN tunnel interfaces. It worked 24/7 for 4 years, so the code is robust.

It's a somewhat more complex use case than yours, since the two VPN links were conceptually one large pipe, so the two streams needed to be reassembled by a bonding driver on a server in a data centre. But the sequence of steps is going to be very similar to what you're trying to do.

As far as I recall, I actually had to use sysfs for some of the steps, as ifenslave and the command-line tools misbehaved.
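
From memory, the sysfs bonding interface looked roughly like this (sketch using your interface names; a slave has to be down before it can be added):

$ ip link set vxlan0 down
$ echo "+vxlan0" > /sys/class/net/bond0/bonding/slaves	# add a slave
$ ip link set bond0 up
$ cat /sys/class/net/bond0/bonding/slaves	# check what is currently enslaved
$ echo "-vxlan0" > /sys/class/net/bond0/bonding/slaves	# remove it again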

I created a hotplug script on the two WireGuard connections so that it

  • adds the vxlan tunnel over WireGuard
  • sets bond0 as the master of the vxlan tunnel

Right now it seems to work properly
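
In essence it does something like this (simplified sketch; it assumes the proto 'vxlan' sections are no longer in /etc/config/network and that the 'bonded' interface still assigns the address on bond0):

#!/bin/sh
# /etc/hotplug.d/iface/99-wg-vxlan-bond (sketch)
[ "$ACTION" = "ifup" ] || exit 0

case "$INTERFACE" in
	wg0) VXIF=vxlan0; VID=30; PEER=192.168.50.1 ;;
	wg1) VXIF=vxlan1; VID=31; PEER=192.168.60.1 ;;
	*) exit 0 ;;
esac

# make sure the bond exists
ip link show bond0 >/dev/null 2>&1 || ip link add bond0 type bond

# recreate the vxlan over the wireguard device and enslave it while it is still down
ip link del "$VXIF" 2>/dev/null
ip link add "$VXIF" type vxlan id "$VID" dstport 4789 remote "$PEER" dev "$DEVICE"
ip link set "$VXIF" master bond0
ip link set "$VXIF" up
ip link set bond0 up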


Hi there, can you share the script and how to auto-run it at boot?

Now I have 2 WANs.

One will have a direct internet connection.
The other one will use WireGuard.

It works for me using hotplug.

The problem is monitoring the bond slaves and removing a slave from the bond as soon as one of the tunnels goes down. WireGuard never pulls the interface down, even if it is unable to reconnect, so hotplug doesn't help there.

Monitoring the wg endpoints from both ends works, but if the server has multiple clients over the same bonding slaves, it wouldn't work out the way we want.

Since VXLAN is L2, ping is not going to help.

Adding IPs for ARP monitoring of the bond slaves also didn't help, as the underlying interfaces (wg) don't support ethtool.

Slaves in RR mode cause 50% packet loss if one of the tunnels is disconnected.

Any ideas?


You should be able to handle link down with wireguard_watchdog.
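
For instance, something along these lines run from cron could react to stale handshakes (untested sketch; the 150-second threshold and the wg-to-vxlan pairing are assumptions):

#!/bin/sh
# Pull a vxlan slave out of bond0 while its wireguard tunnel looks dead,
# and put it back once the handshake is fresh again.

check() {
	wgif="$1"; vxif="$2"
	slaves=/sys/class/net/bond0/bonding/slaves
	now=$(date +%s)
	last=$(wg show "$wgif" latest-handshakes | awk '{print $2; exit}')
	if [ $((now - ${last:-0})) -gt 150 ]; then
		# handshake is stale: take the slave out so balance-rr stops using it
		grep -q "$vxif" "$slaves" && echo "-$vxif" > "$slaves"
	else
		# handshake is fresh: make sure the slave is back in the bond
		grep -q "$vxif" "$slaves" || {
			ip link set "$vxif" down
			echo "+$vxif" > "$slaves"
			ip link set "$vxif" up
		}
	fi
}

check wg0 vxlan0
check wg1 vxlan1

Run it every minute or so from /etc/crontabs/root.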
