Fail-over with mwan3 works only when unplugging the cable

Hello everyone,

I have installed mwan3 to implement fail-over across three internet connections. It works great when I unplug or re-plug the WAN cables. But if WAN access is lost for another reason, for instance when internet connectivity is lost upstream while the cable stays connected, then:

  • :white_check_mark: The corresponding WAN interface is correctly marked as down in the MultiWAN status
  • :white_check_mark: From the router, I can ping any internet IP, so the fail-over worked somehow
  • :cross_mark: However, I cannot ping any internet IP from the clients behind the router

In case it makes a difference, I do not use the default LAN interface and bridge created by OpenWrt (192.168.1.0/24); I have defined my own (V)LAN networks. All three WAN interfaces belong to the “WAN” firewall zone with masquerading enabled, and internet access from the LAN networks works fine when I unplug the other cables. It is an IPv4-only setup.

I have tried to troubleshoot, but the mechanisms used and well documented by mwan3 are beyond my understanding. I thought that, given the specific symptoms, someone might point me in the right direction. I would be grateful for any hint.

Please post the output of

head -v -n -0 /etc/config/mwan3; head -v -n -0 /etc/config/network;\
head -v -n -0 /etc/config/firewall

Redact the sensitive info (MACs, public IPs, keys...).

Hello, I have the same problem. My config is:

head -v -n -0 /etc/config/mwan3; head -v -n -0 /etc/config/network
==> /etc/config/mwan3 <==

config globals 'globals'
option mmx_mask '0x3F00'

config interface 'wan'
option enabled '1'
list track_ip '1.0.0.1'
list track_ip '1.1.1.1'
list track_ip '208.67.222.222'
list track_ip '208.67.220.220'
option family 'ipv4'
option reliability '1'
option initial_state 'online'
option track_method 'ping'
option count '1'
option size '56'
option max_ttl '60'
option timeout '4'
option interval '10'
option failure_interval '5'
option recovery_interval '5'
option down '5'
option up '5'

config interface 'wanb'
option enabled '1'
list track_ip '1.0.0.1'
list track_ip '1.1.1.1'
list track_ip '208.67.222.222'
list track_ip '208.67.220.220'
option family 'ipv4'
option reliability '1'
option initial_state 'online'
option track_method 'ping'
option count '1'
option size '56'
option max_ttl '60'
option timeout '4'
option interval '10'
option failure_interval '5'
option recovery_interval '5'
option down '5'
option up '5'

config member 'wan_m1_w3'
option interface 'wan'
option metric '1'
option weight '1'

config member 'wanb_m1_w2'
option interface 'wanb'
option metric '1'
option weight '1'

config policy 'balanced'
list use_member 'wan_m1_w3'
list use_member 'wanb_m1_w2'
option last_resort 'unreachable'

config rule 'https'
option sticky '1'
option dest_port '443'
option proto 'tcp'
option use_policy 'balanced'
option family 'ipv4'

config rule 'default_rule_v4'
option dest_ip '0.0.0.0/0'
option use_policy 'balanced'
option family 'ipv4'
option proto 'all'
option sticky '0'

==> /etc/config/network <==

config interface 'loopback'
option device 'lo'
option proto 'static'
option ipaddr '127.0.0.1'
option netmask '255.0.0.0'

config globals 'globals'
option ula_prefix 'fd41:5890:ffab::/48'
option packet_steering '1'

config device
option name 'br-lan'
option type 'bridge'
list ports 'lan2'
list ports 'lan3'

config interface 'lan'
option device 'br-lan'
option proto 'static'
option ipaddr '192.168.77.1'
option netmask '255.255.255.0'
option ip6assign '60'

config interface 'wan'
option device 'wan'
option proto 'dhcp'
option metric '10'

config device
option name 'lan1'

config interface 'wanb'
option proto 'dhcp'
option device 'lan1'
option metric '20'

Thanks a lot for any help.

@pavelgl I did not manage to get the information you asked for yet, sorry. I have only intermittent access to the device. Hopefully the information provided by @jaimedb will help. Otherwise, I will post mine after the holiday. Thanks for your help!

This cannot be debugged without posting the requested information.
It would also be good to see the OpenWrt version used.

ubus call system board

The same in what way?
You are using mwan3 in load balancing, not failover mode.
When a failure occurs, can you ping public IPs from the router itself?
The network and mwan3 configurations look correct, but we haven't seen the firewall settings.

ubus call system board
cat /etc/config/firewall
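
On the load-balancing remark: in mwan3, members with a lower metric are used first, and members that share the same metric are balanced by weight. A minimal sketch of that rule follows; `policy_mode` is a hypothetical helper written for illustration, not part of mwan3.

```shell
# Classify a mwan3 policy from the metrics of its members.
# Assumption: only the members with the lowest metric carry traffic, and
# traffic is split between them by weight when several share that metric.
policy_mode() {
    lowest=""
    ties=0
    for metric in "$@"; do
        if [ -z "$lowest" ] || [ "$metric" -lt "$lowest" ]; then
            lowest=$metric
            ties=1
        elif [ "$metric" -eq "$lowest" ]; then
            ties=$((ties + 1))
        fi
    done
    if [ "$ties" -gt 1 ]; then
        echo "load-balancing"
    else
        echo "failover"
    fi
}

policy_mode 1 1        # both members of the 'balanced' policy use metric 1
policy_mode 10 20 30   # three members with distinct metrics
```

The first call prints load-balancing and the second prints failover. Giving wanb a higher metric than wan would turn the posted 'balanced' policy into a failover policy.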

Hi @pavelgl ,

thanks a lot for your time. Here is the requested information.

head -v -n -0 /etc/config/mwan3; head -v -n -0 /etc/config/network; head -v -n -0 /etc/config/firewall
==> /etc/config/mwan3 <==

config globals 'globals'
        option mmx_mask '0x3F00'

config rule 'default_rule_v4'
        option dest_ip '0.0.0.0/0'
        option use_policy 'failover'
        option family 'ipv4'
        option proto 'all'
        option sticky '0'

config rule 'default_rule_v6'
        option dest_ip '::/0'
        option use_policy 'failover'
        option family 'ipv6'
        option proto 'all'
        option sticky '0'

config interface 'wana'
        option initial_state 'online'
        option family 'ipv4'
        option track_method 'ping'
        option reliability '1'
        option count '1'
        option size '56'
        option max_ttl '60'
        option timeout '4'
        option interval '10'
        option failure_interval '5'
        option recovery_interval '5'
        option down '5'
        option up '5'
        option enabled '1'
        list track_ip '...'
        list track_ip '...'
        list track_ip '...'
        list track_ip '...'
        list track_ip '...'

config interface 'wanb'
        option initial_state 'online'
        option family 'ipv4'
        option track_method 'ping'
        option reliability '1'
        option count '1'
        option size '56'
        option max_ttl '60'
        option timeout '4'
        option interval '10'
        option failure_interval '5'
        option recovery_interval '5'
        option down '5'
        option up '5'
        option enabled '1'
        list track_ip '...'
        list track_ip '...'
        list track_ip '...'
        list track_ip '...'
        list track_ip '...'

config interface 'wanc'
        option enabled '1'
        option initial_state 'online'
        option family 'ipv4'
        option track_method 'ping'
        option reliability '1'
        option count '1'
        option size '56'
        option max_ttl '60'
        option timeout '4'
        option interval '10'
        option failure_interval '5'
        option recovery_interval '5'
        option down '5'
        option up '5'
        list track_ip '...'
        list track_ip '...'
        list track_ip '...'
        list track_ip '...'
        list track_ip '...'

config member 'wana_m10_w1'
        option interface 'wana'
        option metric '10'
        option weight '1'

config member 'wanb_m20_w1'
        option interface 'wanb'
        option metric '20'
        option weight '1'

config member 'wanc_m30_w1'
        option interface 'wanc'
        option metric '30'
        option weight '1'

config policy 'failover'
        list use_member 'wana_m10_w1'
        list use_member 'wanb_m20_w1'
        list use_member 'wanc_m30_w1'
        option last_resort 'unreachable'

config policy 'wana_only'
        list use_member 'wana_m10_w1'
        option last_resort 'unreachable'

config policy 'wanb_only'
        option last_resort 'unreachable'
        list use_member 'wanb_m20_w1'

config policy 'wanc_only'
        list use_member 'wanc_m30_w1'
        option last_resort 'unreachable'

==> /etc/config/network <==

config interface 'loopback'
        option device 'lo'
        option proto 'static'
        option ipaddr '127.0.0.1'
        option netmask '255.0.0.0'

config globals 'globals'
        option ula_prefix 'fde1:e009:68f4::/48'
        option packet_steering '1'

config device
        option name 'br-lan'
        option type 'bridge'
        list ports 'eth0'

config interface 'lan'
        option device 'br-lan'
        option proto 'static'
        option ipaddr '192.168.1.1'
        option netmask '255.255.255.0'
        option ip6assign '60'

config device
        option name 'eth7'

config device
        option type 'bridge'
        option name 'br-lan2'
        list ports 'eth7'

config bridge-vlan
        option device 'br-lan2'
        option vlan '1'
        list ports 'eth7:t'

config bridge-vlan
        option device 'br-lan2'
        option vlan '2'
        list ports 'eth7:t'

config interface 'vlan1'
        option proto 'static'
        option device 'br-lan2.1'
        option ipaddr '10.1.1.254'
        option netmask '255.255.255.0'

config interface 'vlan2'
        option proto 'static'
        option device 'br-lan2.2'
        option ipaddr '10.1.2.254'
        option netmask '255.255.255.0'

config interface 'vlan6'
        option proto 'static'
        option device 'br-lan2.6'
        option ipaddr '10.1.6.254'
        option netmask '255.255.255.0'

config bridge-vlan
        option device 'br-lan2'
        option vlan '3'
        list ports 'eth7:t'

config interface 'vlan3'
        option proto 'static'
        option device 'br-lan2.3'
        option ipaddr '10.1.3.253'
        option netmask '255.255.255.0'

config route
        option interface 'vlan2'
        option target '10.4.0.0/16'
        option gateway '10.1.2.12'

config bridge-vlan
        option device 'br-lan2'
        option vlan '6'
        list ports 'eth7:t'

config bridge-vlan
        option device 'br-lan2'
        option vlan '10'
        list ports 'eth7:t'

config interface 'vlan10'
        option proto 'static'
        option device 'br-lan2.10'
        option ipaddr '10.1.10.253'
        option netmask '255.255.255.0'

config route
        option interface 'vlan2'
        option target '10.5.0.0/16'
        option gateway '10.1.2.12'

config interface 'wanb'
        option proto 'dhcp'
        option device 'eth2'
        option metric '20'

config interface 'wanc'
        option proto 'dhcp'
        option device 'eth3'
        option metric '30'

config interface 'wana'
        option proto 'dhcp'
        option device 'eth1'
        option metric '10'

==> /etc/config/firewall <==

config defaults
        option input 'REJECT'
        option output 'ACCEPT'
        option forward 'REJECT'
        option synflood_protect '1'

config zone
        option name 'wan'
        option input 'REJECT'
        option output 'ACCEPT'
        option forward 'REJECT'
        option mtu_fix '1'
        option masq '1'
        list network 'wanb'
        list network 'wanc'
        list network 'wana'

config rule
        option name 'Allow-DHCP-Renew'
        option src 'wan'
        option proto 'udp'
        option dest_port '68'
        option target 'ACCEPT'
        option family 'ipv4'

config rule
        option name 'Allow-Ping'
        option src 'wan'
        option proto 'icmp'
        option icmp_type 'echo-request'
        option family 'ipv4'
        option target 'ACCEPT'

config rule
        option name 'Allow-IGMP'
        option src 'wan'
        option proto 'igmp'
        option family 'ipv4'
        option target 'ACCEPT'

config rule
        option name 'Allow-DHCPv6'
        option src 'wan'
        option proto 'udp'
        option dest_port '546'
        option family 'ipv6'
        option target 'ACCEPT'

config rule
        option name 'Allow-MLD'
        option src 'wan'
        option proto 'icmp'
        option src_ip 'fe80::/10'
        list icmp_type '130/0'
        list icmp_type '131/0'
        list icmp_type '132/0'
        list icmp_type '143/0'
        option family 'ipv6'
        option target 'ACCEPT'

config rule
        option name 'Allow-ICMPv6-Input'
        option src 'wan'
        option proto 'icmp'
        list icmp_type 'echo-request'
        list icmp_type 'echo-reply'
        list icmp_type 'destination-unreachable'
        list icmp_type 'packet-too-big'
        list icmp_type 'time-exceeded'
        list icmp_type 'bad-header'
        list icmp_type 'unknown-header-type'
        list icmp_type 'router-solicitation'
        list icmp_type 'neighbour-solicitation'
        list icmp_type 'router-advertisement'
        list icmp_type 'neighbour-advertisement'
        option limit '1000/sec'
        option family 'ipv6'
        option target 'ACCEPT'

config rule
        option name 'Allow-ICMPv6-Forward'
        option src 'wan'
        option dest '*'
        option proto 'icmp'
        list icmp_type 'echo-request'
        list icmp_type 'echo-reply'
        list icmp_type 'destination-unreachable'
        list icmp_type 'packet-too-big'
        list icmp_type 'time-exceeded'
        list icmp_type 'bad-header'
        list icmp_type 'unknown-header-type'
        option limit '1000/sec'
        option family 'ipv6'
        option target 'ACCEPT'

config zone
        option name 'admin'
        option input 'ACCEPT'
        option output 'ACCEPT'
        option forward 'ACCEPT'

config zone
        option name 'vlan10'
        option input 'ACCEPT'
        option output 'ACCEPT'
        option forward 'ACCEPT'
        list network 'vlan10'

config zone
        option name 'vlan1'
        option input 'ACCEPT'
        option output 'ACCEPT'
        option forward 'ACCEPT'
        list network 'vlan1'

config zone
        option name 'vlan2'
        option input 'ACCEPT'
        option output 'ACCEPT'
        option forward 'ACCEPT'
        list network 'vlan2'

config zone
        option name 'vlan3'
        option input 'ACCEPT'
        option output 'ACCEPT'
        option forward 'ACCEPT'
        list network 'vlan3'

config zone
        option name 'vlan6'
        option input 'ACCEPT'
        option output 'ACCEPT'
        option forward 'ACCEPT'
        list network 'vlan6'

config forwarding
        option src 'vlan1'
        option dest 'vlan6'

config zone
        option name 'lan'
        option input 'ACCEPT'
        option output 'ACCEPT'
        option forward 'ACCEPT'
        list network 'lan'

config forwarding
        option src 'vlan1'
        option dest 'vlan2'

config forwarding
        option src 'vlan1'
        option dest 'wan'

config forwarding
        option src 'vlan10'
        option dest 'wan'

config forwarding
        option src 'lan'
        option dest 'vlan6'

config forwarding
        option src 'vlan2'
        option dest 'vlan1'

config forwarding
        option src 'vlan2'
        option dest 'wan'

config forwarding
        option src 'vlan6'
        option dest 'wan'

Unfortunately, nothing in the posted configs could explain the described behavior.

Just to be clear, you are not establishing multiple connections to the same ISP and wan[abc] are not on the same subnet, right?

Thanks a lot for looking into it. All 3 WAN interfaces are in different subnets. If I remember correctly, 192.168.2, 192.168.6 and 192.168.178. They all use a different router and ISP. Unfortunately, this means that I have a double NAT: one in OpenWrt and one in the ISP router. But since the failover works when I unplug a cable, I don’t think that is the issue.

Is there anything I can do to pinpoint the issue?

Not such a trivial task.

So there is no sensitive information.
Let's see the output of:

ubus call system board
ip ru; ip -4 ro li ta all

Run these commands (if it hasn't already been done):

mwan3 stop
opkg remove iptables-zz-legacy ip6tables-zz-legacy --force-depends
opkg update; opkg install iptables-nft; mwan3 start

Verify that all interfaces are up and running and the failover policy uses wana at 100%.

mwan3 interfaces; mwan3 policies | grep failover -A1

Check the returned IP address (do not post it).

wget http://ipecho.net/plain -qO- ; echo

Simulate wana outage.

nft insert rule inet fw4 output oifname "eth1" meta nfproto ipv4 meta l4proto icmp counter drop

Wait a while and verify that wana is down and the failover policy uses wanb at 100%.

mwan3 interfaces; mwan3 policies | grep failover -A1

Check if the IP address has changed.

wget http://ipecho.net/plain -qO- ; echo

Check the connection from a lan client.

If it doesn't work, don't disconnect cables, but run mwan3 ifdown wana.
If there is no change, try ifdown wana.

Restart the firewall service to restore the settings.

I did some more experiments today. In summary, it is useful to monitor the state of the mwan3 interfaces using:

$ mwan3 interfaces

e.g.:

Interface status:
interface wana is online and tracking is active (online 00h:00m:57s, uptime 00h:00m:57s)
interface wanb is online and tracking is active (online 00h:00m:52s, uptime 00h:00m:53s)
interface wanc is online and tracking is active (online 00h:00m:51s, uptime 00h:00m:53s)

Interface status:
interface wana is disconnecting and tracking is active
interface wanb is online and tracking is active (online 00h:03m:35s, uptime 00h:03m:36s)
interface wanc is online and tracking is active (online 00h:03m:34s, uptime 00h:03m:36s)

Interface status:
interface wana is offline and tracking is active
interface wanb is online and tracking is active (online 00h:08m:15s, uptime 00h:21m:45s)
interface wanc is online and tracking is active (online 00h:08m:15s, uptime 00h:21m:45s)
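
To follow the transitions more easily, the status lines can be reduced to one "name state" pair per interface. This is only a sketch: `mwan3_states` is a hypothetical helper, and it assumes the line format shown in the samples above.

```shell
# Reduce `mwan3 interfaces` output (read from stdin) to "name state" pairs.
# Assumption: status lines look like "interface <name> is <state> and ...".
mwan3_states() {
    awk '$1 == "interface" && $3 == "is" { print $2, $4 }'
}

# Demo on two of the sample lines above; on the router one would run
#   mwan3 interfaces | mwan3_states
printf '%s\n' \
    'Interface status:' \
    'interface wana is disconnecting and tracking is active' \
    'interface wanb is online and tracking is active (online 00h:03m:35s, uptime 00h:03m:36s)' \
    | mwan3_states
```

Running it in a loop together with `date +%T` gives a simple timeline of when each interface changes state.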

When blocking internet traffic on the upstream "wana" router:

  • the transition from "online" to "disconnecting" takes around 10-20 seconds for wana, which is consistent with my settings (1 ping, ping interval 10 seconds, ping timeout 4 seconds).
  • in the "disconnecting" state, LAN clients have no internet access
  • the transition from "disconnecting" to "offline" takes around 2 minutes
  • once "wana" is offline, traffic goes through "wanb" as expected

When unplugging the "wana" upstream cable:

  • the transition from "online" to "disabled and tracking is paused (23)" occurs within a few seconds
  • LAN clients have almost no interruption of internet access

So the main question is why it takes so long to go from "disconnecting" to "offline".
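
One possible explanation, as a back-of-the-envelope estimate rather than a confirmed diagnosis: if the tracker pings the track IPs one after another and every unanswered ping blocks for the full timeout, a failed round with five track IPs and timeout '4' takes about 20 seconds before failure_interval even starts. The sketch below does that arithmetic; `disconnecting_seconds` is a hypothetical helper, and the tracking behaviour it models is an assumption.

```shell
# Rough estimate of how long an interface stays in "disconnecting".
# Assumptions (not confirmed in this thread): track IPs are pinged one after
# another, each unanswered ping blocks for the full timeout (count '1'), and
# the first failed round moves the interface to "disconnecting", so
# (down - 1) further failed rounds are needed to reach "offline".
disconnecting_seconds() {
    track_ips=$1
    timeout=$2
    failure_interval=$3
    down=$4
    round=$((track_ips * timeout))
    echo $(( (down - 1) * (failure_interval + round) ))
}

# Five track IPs, timeout 4, failure_interval 5, down 5 (the wana settings).
disconnecting_seconds 5 4 5 5
```

This prints 100, which is in the same range as the roughly 2 minutes observed. If the assumption holds, fewer track IPs, a shorter timeout or a lower down value should shorten the "disconnecting" phase.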

Also, I am not sure how I should configure "Flush conntrack table" ("Flush global firewall conntrack table on interface events") for my 3 mwan3 interfaces:

  • empty (current state)
  • ifup (netifd)
  • ifdown (netifd)
  • connected (mwan3)
  • disconnected (mwan3)

I guess it also influences the failover time from the client perspective.
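
For reference, a sketch of how one of these events could be set from the command line. It assumes the LuCI field is stored as a 'flush_conntrack' list option on each mwan3 interface section, which should be checked with `uci show mwan3` first. `print_flush_cmds` is a hypothetical helper that only prints the commands so they can be reviewed before being run.

```shell
# Print (do not run) the uci commands that would add one flush_conntrack
# event to each listed mwan3 interface.
# Assumption: the option is a list named 'flush_conntrack' in /etc/config/mwan3.
print_flush_cmds() {
    event=$1
    shift
    for iface in "$@"; do
        printf "uci add_list mwan3.%s.flush_conntrack='%s'\n" "$iface" "$event"
    done
    echo "uci commit mwan3"
}

print_flush_cmds disconnected wana wanb wanc
```

Choosing 'disconnected' here is a guess: the idea is that existing client connections get re-routed as soon as mwan3 declares the link offline. Since the option flushes the global conntrack table, connections over the healthy links would probably be reset as well.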

Pavel, I will send you more information by PM. Thanks again for your help!