Mwan3 port to nftables

Thanks for the detailed instructions. I've applied them (and I also updated to the newest OpenWrt).

I will be able to do some more debugging in the evening, but the first tests show the following.
My user script contains just 'exit 0', so no additional magic :wink:

  1. DNS issues: nextdns uses a kind of round robin but always connects to dns.nextdns.io (and all of those are public IPs).
    Sat May 9 10:02:37 2026 daemon.notice nextdns[5425]: Connected 38.175.123.217:443 (con=2ms tls=11ms, TCP, TLS13)

but I think that its behavior might be connected to the way I'm testing 'wan fail':

# wan fail
iptables -I FORWARD -o eth1 -j DROP; iptables -I OUTPUT -o eth1 -j DROP
# wan restore
iptables -D FORWARD -o eth1 -j DROP; iptables -D OUTPUT -o eth1 -j DROP

It's not as direct as ifdown wan, and maybe it's a nextdns issue with handling connection failure. Just an idea.

For now I'm switching tests to pure ifdown.
I've also added CIDR of wg_awh to 'bypass mwan3' subnet.

  1. After reboot:
    [wan working]
    Tailscale routes properly but rather due to daemon
    mwan3 starts properly
    wireguard [wg_awh] connects via vpn_unl instead of default wan connection

(for unknown reason ip route shows endpoint of 'wg_awh' tunnel routed via 'vpn_unl' interface)

  1. ifdown wan
    Tailscale routed properly (no change - but nft list set still empty)
    failover happened as expected (in terms of connectivity)
    wg_awh connected to endpoint... (ip route now shows proper gateway for it - backup lte)

  2. ifup wan
    Tailscale routed properly (no change, pings getting lower) nft list still empty:

 nft list set inet mwan3 mwan3_custom_v4
table inet mwan3 {
        set mwan3_custom_v4 {
                type ipv4_addr
                flags interval
                auto-merge
        }
}

failover happened as expected
wg_awh is connected properly to wan this time and soon after handshake I can see ping went down.

I'm not checking all the wg interfaces, as I have limited time in the morning.

What is strange is that there's no place I could define vpn_unl for wg_awh, it seems to be populated automatically.

root@wrt:~# ip rule list
0:      from all lookup local
1001:   from all iif eth1 lookup 1
1002:   from all iif VLANs.1 lookup 2
1003:   from all iif vpn_unl lookup 3
1101:   from all fwmark 0x100/0x3f00 lookup 1
1102:   from all fwmark 0x200/0x3f00 lookup 2
1103:   from all fwmark 0x300/0x3f00 lookup 3
1161:   from all fwmark 0x3d00/0x3f00 blackhole
1162:   from all fwmark 0x3e00/0x3f00 unreachable
1201:   from all fwmark 0x100/0x3f00 unreachable
1202:   from all fwmark 0x200/0x3f00 unreachable
1203:   from all fwmark 0x300/0x3f00 unreachable
5210:   from all fwmark 0x80000/0xff0000 lookup main
5230:   from all fwmark 0x80000/0xff0000 lookup default
5250:   from all fwmark 0x80000/0xff0000 unreachable
5270:   from all lookup 52
32766:  from all lookup main
32767:  from all lookup default
90029:  from all iif lo lookup tailscale

 nft list chain inet mwan3 mwan3_rules
table inet mwan3 {
        chain mwan3_rules {
                meta mark & 0x00ff0000 == 0x00080000 meta mark & 0x00003f00 == 0x00000000 jump mwan3_policy_failover
                ip saddr 192.168.40.0/24 ip daddr 0.0.0.0/0 meta mark & 0x00003f00 == 0x00000000 jump mwan3_policy_wan_only
                ip daddr 0.0.0.0/0 meta mark & 0x00003f00 == 0x00000000 jump mwan3_policy_failover
        }
}

# nft list set inet mwan3 mwan3_custom_v4
table inet mwan3 {
        set mwan3_custom_v4 {
                type ipv4_addr
                flags interval
                auto-merge
        }
}

I've seen the nft set populated between some of the reboots, but can't see it populated e.g. on tailscale restart.
(Why reboot? just to make sure it will work after powerfail just the same ;-))

# ps w |grep mwan3
 2205 root      1724 S    {mwan3track} /bin/sh /usr/sbin/mwan3track wan
 2206 root      1724 S    {mwan3track} /bin/sh /usr/sbin/mwan3track wan_lte
 2207 root      1720 S    {mwan3track} /bin/sh /usr/sbin/mwan3track vpn_unl
22358 root      1724 S    {mwan3track} /bin/sh /usr/sbin/mwan3track wan

Also, I've noticed that nextdns sometimes tends to 'catch' an endpoint via the old gateway and stick to it. So I might add a restart anyway to prevent slow DNS :wink:

Sat May  9 10:08:44 2026 daemon.info dnsmasq[1]: using 15 more local addresses
Sat May  9 10:08:44 2026 daemon.notice nextdns[5425]: Received signal: broken pipe (ignored)
Sat May  9 10:08:44 2026 daemon.notice nextdns[5425]: Connected 217.146.13.3:443 (con=46ms tls=145ms, TCP, TLS13)
Sat May  9 10:08:45 2026 user.notice firewall: Reloading firewall due to ifup of wan (eth1)

Will try to take a deeper look in the evening.

Main concern is vpn_unl - do I need to add 'default gateway' to that interface if I want to route only specific traffic via it?
It seems removing that resolves a lot of issues.

Also, I've noticed there's some error in rtmon (this is after reboot, but I'd also noticed it before deleting one of the interfaces from mwan3).

Sat May  9 10:00:12 2026 user.notice SQM: Stopping SQM on eth1
Sat May  9 10:00:13 2026 user.notice SQM: Starting SQM script: piece_of_cake.qos on eth1, in: 303000 Kbps, out: 30000 Kbps
Sat May  9 10:00:13 2026 user.notice SQM: piece_of_cake.qos was started on eth1 successfully
Sat May  9 10:00:13 2026 user.notice firewall: Reloading firewall due to ifup of wan (eth1)
Sat May  9 10:00:14 2026 daemon.info procd: Instance mwan3::rtmon_ipv4 s in a crash loop 6 crashes, 0 seconds since last crash
Sat May  9 10:00:14 2026 daemon.info procd: Instance mwan3::rtmon_ipv6 s in a crash loop 6 crashes, 0 seconds since last crash
Sat May  9 10:00:15 2026 daemon.notice nextdns[5425]: Network change detected: ifb4eth1 fe80::2050:99ff:fe8d:9eb9/64 added
Sat May  9 10:00:15 2026 user.notice mwan3-hotplug[9436]: Execute ifup event on interface wan (eth1)
Sat May  9 10:00:16 2026 daemon.notice procd: /etc/rc.d/S96led: setting up led WAN
Sat May  9 10:00:16 2026 daemon.notice procd: /etc/rc.d/S96led: setting up led LAN
Sat May  9 10:00:16 2026 daemon.notice procd: /etc/rc.d/S96led: setting up led WLAN
Sat May  9 10:00:17 2026 daemon.notice procd: /etc/rc.d/S96led: setting up led SYS

There's a bunch of things happening here, but the most important thing to sort out first is why mwan3rtmon is in a crash loop. That should not be happening and that means mwan3 won't work properly. Pointless testing until you've sorted that out.

If mwan3rtmon is in a crash loop, the routes won't get mirrored between table 52 and the custom set, and pppoe and vpn routes won't work properly either.
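
A quick way to spot this condition is to filter the system log for the procd message quoted above. This is only a sketch: the function name crash_looping_instances is made up for illustration, and it assumes the log line format shown in the log excerpt in this thread.

```shell
# Print each procd instance reported as crash-looping, once.
# Reads syslog text on stdin so it can be fed from `logread`.
crash_looping_instances() {
    # Matches lines such as:
    #   ... procd: Instance mwan3::rtmon_ipv4 s in a crash loop 6 crashes, ...
    sed -n 's/.*procd: Instance \([^ ]*\) .*in a crash loop.*/\1/p' | sort -u
}

# Assumed usage on the router:
#   logread | crash_looping_instances
```

No output means procd has not logged a crash loop for any instance since the log buffer was last cleared.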

I tested the new version I put up for you and that's the version that's actually running on my own machine at the moment, so I don't think the crash is related to the small patch I made to it.

You need to check that you installed it correctly. I assume you cloned the repo or downloaded it using wget and then copied the patched version over to your router. Is the executable bit set on it?

root@openwrt:~# ls -l $(which mwan3rtmon)
-rwxr-xr-x 1 root root 17147 May  9 03:39 /usr/sbin/mwan3rtmon
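
The same check can be scripted so that a missing file and a missing executable bit are reported separately. A minimal sketch; check_executable is a hypothetical helper name, and the path in the usage comment is the one from the listing above.

```shell
# Report whether a file exists and has the executable bit set.
# Returns 0 if executable, 1 if present but not executable, 2 if missing.
check_executable() {
    file=$1
    if [ ! -e "$file" ]; then
        echo "missing: $file"
        return 2
    elif [ ! -x "$file" ]; then
        echo "not executable: $file (fix with: chmod +x $file)"
        return 1
    fi
    echo "ok: $file"
}

# Assumed usage on the router:
#   check_executable /usr/sbin/mwan3rtmon
```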

Shame on me, I didn't check that. I've made it executable and restarted, and will debug later in the evening.

You should put the script at the end of the setup document into your mwan3.user instead, the one that does mwan3_flush_marked_conntrack.


This is a very narrow and poor test of "wan down". A real WAN failure typically means the link drops or the upstream gateway stops responding, which also triggers route withdrawal, neighbour table changes, and interface carrier events. None of that happens here.

  • iptables and nftables operate in different hooks. mwan3 is nftables-based; the interaction between an iptables DROP rule and mwan3's nftables policy is not well-defined and could produce misleading results
  • Dropping OUTPUT on the interface also blocks mwan3 tracking pings, which is probably intentional, but it's a blunt instrument. It also drops any other locally-originated traffic on that interface simultaneously.
  • The restore is symmetric, so recovery is instant and clean, which is also not realistic. Real WAN recovery involves DHCP renegotiation, route re-advertisement, neighbour discovery, etc.

Much better.

That was because mwan3rtmon was not running and replicating the tailscale table 52 peers into the custom set.

It's also loaded at mwan3 start, so if there's anything in the table 52 at mwan3 startup, it will get loaded into the set. Most of the time it's probably empty I think, since mwan3 starts up before tailscale and it might just catch one or two tailscale routes that get added. The set won't get dynamically updated to mirror tailscale route add and delete events without mwan3rtmon running.
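
To tell at a glance whether the custom set is being mirrored, the set listing can be tested for an elements line. This is a rough sketch that assumes the `nft list set` output format shown in this thread; set_has_elements is a made-up helper name.

```shell
# Exit 0 if the nft set listing on stdin contains an "elements =" line,
# exit 1 if the set is empty.
set_has_elements() {
    grep -q 'elements[[:space:]]*='
}

# Assumed usage on the router:
#   if nft list set inet mwan3 mwan3_custom_v4 | set_has_elements; then
#       echo "custom set populated"
#   else
#       echo "custom set empty; compare with: ip route show table 52"
#   fi
```

An empty set while `ip route show table 52` lists routes would point at mwan3rtmon not running, as described above.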

Live conntrack entry. Use the mwan3_flush_marked_conntrack mwan3.user script in the setup document. That will fix it as it will cause instant failover. You're using a connection oriented protocol, so constant dns traffic will keep the conntrack entry live unless you flush it.

vpn_unl is a wireguard interface. The default dev vpn_unl metric 201 route in the main table is not what causes all traffic to use vpn_unl. mwan3's mark-based policy rules control that. That default route is required for mwan3 to know vpn_unl is a usable WAN interface and to populate its per-interface routing table (table 3). Without it, mwan3 has no gateway information for the interface and can't route anything via it. So yes, you must add it. If it's not there, the routing health tab will show an unhealthy interface.

When you say removing it "resolves a lot of issues", that's almost certainly because removing it takes vpn_unl out of mwan3's WAN pool entirely, so traffic that was being incorrectly directed to it stops going there. It's fixing the symptom by disabling the interface, not fixing the underlying policy configuration.

If you want only specific traffic routed via vpn_unl, the default route stays, and the vpn_unl_only policy, which is already configured, is what achieves that. The question is whether your rules correctly steer only the intended traffic into that policy: that's more likely to be the source of your issues.
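
For reference, the mark-based steering can be read straight off the ip rule list posted earlier: the bits under mask 0x3f00 select the per-interface table, with 0x3d00 and 0x3e00 reserved for blackhole and unreachable. The sketch below only decodes a mark according to those posted rules; mwan3_mark_info is a hypothetical name, and other installations may use a different mask.

```shell
# Decode which mwan3 routing decision a firewall mark carries,
# assuming the 0x3f00 mask seen in the posted `ip rule list` output.
mwan3_mark_info() {
    mark=$(( $1 ))
    id=$(( (mark & 0x3f00) >> 8 ))
    case $id in
        0)  echo "unmarked (no mwan3 decision yet)" ;;
        61) echo "blackhole" ;;      # 0x3d00
        62) echo "unreachable" ;;    # 0x3e00
        *)  echo "table $id" ;;      # e.g. 0x300 -> table 3
    esac
}

# Example: mwan3_mark_info 0x300
```

In the posted rules table 3 belongs to vpn_unl, so only packets whose mark decodes to "table 3" are sent through it, regardless of the metric 201 default route in the main table.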

One observation: your diagnostic output shows vpn_unl status=unknown score=0 tracking=disabled, and the mwan3 config for vpn_unl has no track_ip entries. Without track IPs, mwan3 can't ping-check the interface, so it never transitions from unknown to online. That means mwan3 won't fail over if vpn_unl goes down, and it may also affect whether traffic is steered to it at all depending on initial_state. That's a separate problem from your default route question, but potentially the more important one if you're seeing unexpected behaviour
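
A small sketch for finding interfaces in that state from the UCI config. It assumes the usual `uci show` output format (`mwan3.<name>=interface`, `mwan3.<name>.track_ip='...'`); the helper name is invented for illustration.

```shell
# List mwan3 interfaces that have no track_ip entries.
# Reads `uci show mwan3` output on stdin.
interfaces_without_track_ip() {
    awk -F'[.=]' '
        # Section header: mwan3.<name>=interface
        $3 == "interface" && NF == 3 { seen[$2] = 1 }
        # Option line: mwan3.<name>.track_ip=...
        $3 == "track_ip"             { tracked[$2] = 1 }
        END { for (i in seen) if (!(i in tracked)) print i }
    ' | sort
}

# Assumed usage on the router:
#   uci show mwan3 | interfaces_without_track_ip
# A track IP could then be added with something like:
#   uci add_list mwan3.vpn_unl.track_ip='<address reachable through the tunnel>'
#   uci commit mwan3
```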

Thanks for the answers about default gateway. I will add tracking to vpn_unl too.

I've tested scenarios I had used before. Failover works and tunnels failover despite the IP route shown in CLI.

There was one situation when I was reconfiguring interfaces (reducing their number), and when a policy in mwan3 was assigned to an interface that was disabled, the rules for Tailscale disappeared and never came back. [nft list set inet mwan3 mwan3_custom_v4 was empty]
At that point I had to restart the Tailscale service. I couldn't reproduce this; I will try again in the future and report if I manage to find the scenario.

Other than that (and just to make the setup more robust), I might add a cron job that restarts Tailscale when no routes are detected.

There are still concerns I have:

1. (Not directly connected with mwan3, more like WireGuard?) Why does one of the WireGuard (VPN) interfaces with the lowest metric seem to pin itself as the route for endpoints that are not configured for this service?

I can see that right after reboot there seems to be traffic there for a few seconds (while firewall rules reload and mwan3 configuration applies):

# ip route |grep vpn_
default dev vpn_unl proto static scope link metric 201 
10.103.[proper endpoint for vpn_unl interface] dev vpn_unl proto static scope link metric 201 
65.[endpoint from different interface: wg_vps peer1] dev vpn_unl proto static scope link metric 201 
95.[endpoint from different interface: wg_vps peer2] dev vpn_unl proto static scope link metric 201 

(I do not have 10. /8 network anymore)
My idea is to remove those routes with a script - but better than that would be to configure it more properly...

2. Way of testing:
[here were longer observations about iptables vs ifdown, but you've already answered, I won't be using iptables for tests]

3. mwan3 interface configuration:

Apart from a pure WAN setup (load balancing / failover, like WAN and WAN_LTE in my case), which other interfaces should be configured in mwan3?

Only those I want to route traffic through? Only public ones?


Properly working script proof :wink:

# nft list set inet mwan3 mwan3_custom_v4
table inet mwan3 {
        set mwan3_custom_v4 {
                type ipv4_addr
                flags interval
                auto-merge
                elements = { 100.83.[redacted, all entries from tailscale].100.0/22 }
        }
}

Side note: I tried to recreate the default IPv6 rule, but LuCI does not allow ::/0, even though IPv6 is enabled in the first field.


IPv6 is not a priority for me right now since it’s not routed to LAN anyway.

Having used the "old" mwan3 (iptables) intensively, usually with three WANs (WAN-LAN, WAN-LTE, WAN-WIFI), I am following this thread and have to say that switchover testing using "ifdown/ifup" is a rather trivial test, because it generates a lot of noise on the OpenWrt box itself, triggering simple failover logic and some standard routing decisions that are not related to mwan3. My most realistic test scenario was to silently interfere with the upstream connection, i.e. blocking the mwan3 pings one hop upstream (e.g. at the ISP router), or covering the OpenWrt antennas for WAN-WIFI. The goal was to keep the OpenWrt environment itself absolutely normal, so that only the worst case, detection of a broken connection through unsuccessful pings, triggers the switchover. This also covered one of the worst real-world scenarios I had: an LTE modem being stuck without any error indication. Just my few cents.

If you fiddle with the rt_table_lookup entries in /etc/config/mwan3 by adding or removing them and only do a reload, not a restart, this might happen. I've added a SIGHUP handler to force mwan3rtmon to reload config when mwan3 is reloaded too. Not sure if this is what happened here, but adding and removing rt_table_lookup should not be a regular thing. Just restart mwan3 if you do change those entries for any reason

Don't use hammers to fix symptoms. If it recurs, see if it's the issue I described above. If it is, wait until I've published the mwan3rtmon that has the SIGHUP handler in it. This is an uncommon enough scenario that you can just restart mwan3 if you change the rt_table_lookup option while waiting for the version with the SIGHUP handler in it.

This is a wireguard question rather than an mwan3 one, but wireguard must route its peer's UDP endpoint addresses (the real internet addresses it sends handshake packets to) via the main routing table, not through the tunnel itself in order to avoid a routing loop. So when wg_vps comes up, netifd/wireguard installs host routes for its peer endpoints via whatever the current default route is.

At boot, vpn_unl has metric 201, so if it comes up before the physical WAN interfaces are fully established in mwan3, it temporarily is the best default. So wg_vps's peer endpoint routes get pinned to vpn_unl. The result is wg_vps trying to reach its peers through vpn_unl, which is itself a wireguard tunnel, a potential loop.

This resolves within a few seconds because once mwan3 finishes loading and the physical WAN interfaces take their correct roles, the endpoint routes get replaced with correct ones.

It's a boot-ordering race between wireguard interfaces that have implicit routing dependencies. The fix (if it's causing actual problems beyond the brief transient at boot) is to ensure wireguard interfaces that are VPN clients have their peer endpoint routes explicitly pinned to the physical WAN interface rather than relying on whatever the current default is. That means setting the route_allowed_ips peer option or using a pre-up script.

If you're not seeing real connectivity problems, just the transient state in ip route, it's not worth fixing.
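
To see whether any peer endpoint routes are still pinned to the tunnel after boot has settled, the main routing table can be filtered by device. A sketch only, assuming the `ip route` line format shown in the question; routes_via_dev is a hypothetical helper.

```shell
# Print the destinations of all non-default routes that go out via
# the given device. Reads `ip route` output on stdin.
routes_via_dev() {
    dev=$1
    awk -v dev="$dev" '
        $1 != "default" {
            for (i = 2; i < NF; i++)
                if ($i == "dev" && $(i + 1) == dev) { print $1; break }
        }'
}

# Assumed usage on the router, a minute or so after boot:
#   ip route | routes_via_dev vpn_unl
```

If the output still lists endpoints of other WireGuard interfaces at that point, the transient has not cleared and the boot-ordering race is worth a closer look.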

Leave it blank: that's the actual wildcard match. The validation is interpreting ::/0 as a network prefix rather than as the "match any address" shorthand.

Leave destination blank, and the luci app will omit the field from the UCI config entirely, which mwan3 treats as match-all, identical in effect to ::/0
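
For illustration, this is roughly what such a rule looks like in /etc/config/mwan3 once the destination is left out. The generator below is only a sketch: the function name, the rule name default_rule_v6 and the choice of the failover policy are assumptions, not taken from anyone's actual config.

```shell
# Print a UCI rule stanza with no dest_ip option, i.e. one that
# matches any destination for the given address family.
emit_match_all_rule() {
    name=$1
    family=$2
    policy=$3
    cat <<EOF
config rule '$name'
    option family '$family'
    option use_policy '$policy'
EOF
}

# Example: emit_match_all_rule default_rule_v6 ipv6 failover
```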


It is.

That's a better way to do it than on the openwrt router, yes

mwan3 version 3.6.1 up for testing

This completes the functional extension of the legacy mwan3 feature list rt_table_lookup <tableid>.

Extends the legacy mwan3 custom sets that were previously only loaded statically during start_service() and reload_service() from the tables defined in the UCI global config list option rt_table_lookup to be fully dynamic, using mwan3rtmon to listen for and to add and remove routes from the custom sets in response to RTM_NEWROUTE and RTM_DELROUTE events on the underlying tables.

Also adds a SIGHUP handler to mwan3rtmon to cause it to flush and repopulate these custom sets on mwan3 reload, ensuring that their contents remain in sync with any newly added or removed list rt_table_lookup <tableid> options.

What this means in practice is that a 3rd party package that dynamically adds and deletes custom routes to a routing table can have these routes mirrored in real-time to the mwan3_custom_v4/v6, enabling mwan3 rules to be bypassed for these routes.
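
To make the idea concrete, here is a rough illustration of the mirroring concept using the text output of `ip monitor route`. It is not how mwan3rtmon is implemented; the function name is invented, the commands are only printed rather than executed, and special route types and auto-merged intervals are ignored.

```shell
# Translate `ip -4 monitor route` lines for one routing table into the
# nft commands that would mirror them into a set (printed, not run).
route_events_to_nft() {
    table=$1
    setname=$2
    awk -v table="$table" -v setname="$setname" '
        {
            op = "add"; dst = $1
            # Deletions are prefixed with the word "Deleted".
            if ($1 == "Deleted") { op = "delete"; dst = $2 }
            # Only mirror events for the requested table; skip default routes.
            for (i = 1; i < NF; i++)
                if ($i == "table" && $(i + 1) == table && dst != "default") {
                    printf "nft %s element inet mwan3 %s { %s }\n", op, setname, dst
                    break
                }
        }'
}

# Assumed usage on the router (dry run, prints commands only):
#   ip -4 monitor route | route_events_to_nft 52 mwan3_custom_v4
```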


In case anyone is curious...

I guess at this point calling it a "port" is no longer really accurate. 4,500 lines of code added to the core and 3,600 lines to the luci app.

git diff --numstat 400258063 98891e123 \
  | awk '
      BEGIN { FS="\t"; print "| Filename | Lines Added | Lines Deleted |\n|---|---|---|" }
      { f=$3; sub(/.*\//, "", f); print "| "f" | "$1" | "$2" |"; a+=$1; d+=$2 }
      END   { print "| **Total** | **"a"** | **"d"** |" }
    '
| Filename | Lines Added | Lines Deleted |
|---|---|---|
| Makefile | 116 | 12 |
| mwan3 | 3 | 0 |
| {15-mwan3 => 25-mwan3} | 11 | 9 |
| {16-mwan3-user => 26-mwan3-user} | 0 | 0 |
| mwan3 | 149 | 29 |
| mwan3-remove-firewall-include | 5 | 0 |
| common.sh | 274 | 9 |
| mwan3-migrate-ipset-v4.sh | 81 | 0 |
| mwan3-skeleton.nft | 99 | 0 |
| mwan3.sh | 1185 | 562 |
| mwan3 | 12 | 23 |
| mwan3-diag | 424 | 0 |
| mwan3-lb-test | 739 | 0 |
| mwan3rtmon | 685 | 181 |
| mwan3track | 100 | 25 |
| mwan3 | 613 | 46 |
| sockopt_wrap.c | 13 | 8 |
| **Total** | **4509** | **904** |

git diff --numstat bee369c7f fc4746883 \
  | awk '
      BEGIN { FS="\t"; print "| Filename | Lines Added | Lines Deleted |\n|---|---|---|" }
      { f=$3; sub(/.*\//, "", f); print "| "f" | "$1" | "$2" |"; a+=$1; d+=$2 }
      END   { print "| **Total** | **"a"** | **"d"** |" }
    '
| Filename | Lines Added | Lines Deleted |
|---|---|---|
| configuration.js | 344 | 0 |
| globals.js | 134 | 5 |
| interface.js | 18 | 1 |
| ipset.js | 220 | 0 |
| member.js | 3 | 0 |
| policy.js | 816 | 17 |
| rule.js | 230 | 14 |
| simulator.js | 661 | 0 |
| detail.js | 191 | 11 |
| ipsets.js | 274 | 0 |
| overview.js | 180 | 58 |
| routing.js | 290 | 0 |
| troubleshooting.js | 97 | 10 |
| 90_mwan3.js | 41 | 74 |
| luci-mwan3 | 59 | 29 |
| luci-app-mwan3.json | 40 | 0 |
| luci-app-mwan3.json | 35 | 3 |
| **Total** | **3633** | **222** |

Is it possible that mwan3's management of Tailscale routes changed the behavior of incoming traffic from Tailscale?
I've noticed that traffic from Tailscale clients doesn't reach the internal network.

Apart from that, everything works as expected so far. Great amount of work.


I'll answer myself: I needed to add the subnet managed by Tailscale to the bypass_network list in the mwan3 config file.
Routes defined here were properly picked up:

Mon May 11 14:02:34 2026 user.notice mwan3-init[25561]: Adding bypass_network 10.x.x.0/24 to mwan3_dynamic_v4 set
Mon May 11 14:02:34 2026 user.notice mwan3-init[25561]: Adding bypass_network 10.y.y.0/24 to mwan3_dynamic_v4 set
Mon May 11 14:02:34 2026 user.notice mwan3-init[25561]: Adding bypass_network 100.x.x.x/16 to mwan3_dynamic_v4 set

I am not and don't want to be an expert on using Tailscale - are these routes that are separate from the ones listed in lookup table 52?

Worth noting too that when you previously had that very broad /8 netmask, this would have ensured that everything in 10.x.x.x was added to the mwan3 connected sets, which would have exactly the same effect as what you're doing here, adding these networks individually to the bypass_networks.

I don't have /8 anymore.
Those masks are from 2 sources:
10.x are two different wireguards subnets
100.x is from tailscale

Tailscale routes are generally not visible in ip route or route -n,
but all of them are visible in ip route show table 52

I'm not an expert either (as you can probably already see), but I want to help make it fool-proof :wink:

I know you don't now, but you did previously. So everything within 10.0.0.0/8 would have previously been considered to be a directly connected network, and thus bypassed mwan3 (as directly connected networks should).

Now that you've removed the /8, the smaller networks attached to your wireguard interfaces should still be in connected networks (assuming mwan3 is managing those wg interfaces). For example, if one of your wg interfaces is 10.1.1.1/24 then 10.1.1.0/24 will be in connected networks, as it is essentially a directly connected network.
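
The arithmetic behind that example (interface address to connected network) can be sketched in plain shell. cidr_network is a hypothetical helper, IPv4 only, and it assumes the shell does 64-bit arithmetic.

```shell
# Print the network an IPv4 interface address belongs to,
# e.g. 10.1.1.1/24 -> 10.1.1.0/24.
cidr_network() {
    addr=${1%/*}
    bits=${1#*/}
    # Split the dotted quad into the positional parameters $1..$4.
    old_ifs=$IFS; IFS=.
    set -- $addr
    IFS=$old_ifs
    ip=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    mask=$(( (0xffffffff << (32 - bits)) & 0xffffffff ))
    net=$(( ip & mask ))
    echo "$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))/$bits"
}

# Example: cidr_network 10.1.1.1/24
```

The same calculation shows why the old /8 was so broad: any 10.x.x.x address with an 8-bit prefix collapses to 10.0.0.0/8.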

You should probably PM me your /etc/config/network and your tailscale config if you need more help, otherwise I'm just guessing...

mwan3 version 3.6.2 up for testing

3.6.2 is a maintenance release: a set of defensive edge-case fixes and correctness improvements.

  • Binds the mwan3rtmon route listener before the initial netlink dumps to close a narrow startup race window, and replaces main_route_cache with an on-demand kernel query to eliminate a class of cache-drift failures that could only manifest if route events arrived during the dump phase.

  • Aligns shell and mwan3rtmon custom-set filtering so both paths apply identical exclusions.

  • Adds a dormant is_default_route guard to populate_connected_set as a forward-compatibility precaution.

  • Clamps check_quality to 0 when the configured track method cannot produce quality samples, preventing an arithmetic error in the unusual case where check_quality is paired with a non-ping method.

  • Fixes mwan3_track_clean which targeted incorrect paths and was a no-op

  • Tightens ip rule deletion at stop_service to use content-based matching rather than a priority-range regex, which matters only when rule bases are configured outside the default 1000-3999 band.


I installed this version, but I still have to restart mwan3 after reboot to get it to work. Any clue?

Sounds like a timing problem. Check the log for possible restarts by procd, i.e. "logread | grep mwan3". What does it say?

And what does "ls /etc/rc.d" say? (startup sequence)
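
To make the startup sequence easier to read, the S## start links can be turned into a sorted priority list. A small sketch, assuming the standard S<number><name> naming in /etc/rc.d; start_order is a made-up helper.

```shell
# Print "<priority> <service>" for each start link (S##name),
# lowest priority number (earliest start) first.
# Reads one file name per line on stdin.
start_order() {
    sed -n 's/^S\([0-9][0-9]*\)\(.*\)$/\1 \2/p' | sort -n
}

# Assumed usage on the router:
#   ls /etc/rc.d | start_order | grep -E 'network|firewall|mwan3|tailscale'
```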

This version contained no changes that could introduce that behaviour if it didn't exist already.

Can you reproduce this every time or was it a one-off?

Can you be more specific about what "not working" actually means?

If you can reproduce it, then send me the output of mwan3-diag, taken after reboot and before you restart it, then another mwan3-diag output after you've restarted it.