OpenWrt 21.02 MWAN3 load balancing not working and connectivity lags

Hello,
I recently upgraded a Netgear R6220 from OpenWrt 19.07 to 21.02 and have run into some problems:

I use mwan3 to load balance flows between two 100 Mbps links (in the same subnet). On 19.07 it worked great: new connections were assigned to both links, so a multithreaded speedtest, torrents, etc. gave me 200 Mbps. On 21.02 I get only 100 Mbps. Failover works: if one of the links is disconnected, flows are assigned to the other one, but both can't be used simultaneously. My configs are at the bottom of this post.

Since the upgrade, the Internet feels laggy as well. Sometimes pages load very slowly. It looks like DNS resolution or establishing TCP connections is very slow.

Thank you for the help.

root@Normalnie:/etc/config# cat network

config interface 'loopback'
        option device 'lo'
        option proto 'static'
        option ipaddr '127.0.0.1'
        option netmask '255.0.0.0'

config globals 'globals'
        option packet_steering '1'
        option ula_prefix 'fd30:63fe:3ace::/48'

config device
        option name 'br-lan'
        option type 'bridge'
        list ports 'lan3'
        list ports 'lan4'

config interface 'lan'
        option device 'br-lan'
        option proto 'static'
        option ipaddr '192.168.1.1'
        option netmask '255.255.255.0'
        option ip6assign '60'

config interface 'wan2'
        option proto 'dhcp'
        option device 'lan1'
        option hostname '*'
        option metric '20'

config interface 'wan3'
        option proto 'dhcp'
        option device 'lan2'
        option hostname '*'
        option metric '30'

config device
        option name 'wan'
        option macaddr 'f0:76:1c:9f:f3:4c'

config device
        option name 'lan1'
        option macaddr '2c:41:38:04:07:30'

config device
        option name 'lan2'
        option macaddr '44:8a:5b:a0:6e:e1'

config interface 'wan1'
        option proto 'dhcp'
        option device 'wan'
        option metric '10'
root@Normalnie:/etc/config# cat mwan3

config globals 'globals'
        option mmx_mask '0x3F00'

config rule 'default_rule_v4'
        option dest_ip '0.0.0.0/0'
        option use_policy 'balanced'
        option family 'ipv4'
        option proto 'all'
        option sticky '0'

config rule 'default_rule_v6'
        option dest_ip '::/0'
        option use_policy 'balanced'
        option family 'ipv6'
        option proto 'all'
        option sticky '0'

config interface 'wan1'
        option initial_state 'online'
        option family 'ipv4'
        option track_method 'ping'
        option reliability '1'
        option count '1'
        option size '56'
        option max_ttl '60'
        option check_quality '0'
        option timeout '4'
        option interval '10'
        option failure_interval '5'
        option recovery_interval '5'
        option down '5'
        option enabled '1'
        list track_ip '8.8.8.8'
        list track_ip '8.8.4.4'
        option up '3'

config interface 'wan2'
        option initial_state 'online'
        option family 'ipv4'
        option track_method 'ping'
        option reliability '1'
        option count '1'
        option size '56'
        option max_ttl '60'
        option check_quality '0'
        option timeout '4'
        option interval '10'
        option failure_interval '5'
        option recovery_interval '5'
        option down '5'
        option up '3'
        list track_ip '8.8.8.8'
        list track_ip '8.8.4.4'
        option enabled '1'

config interface 'wan3'
        option enabled '1'
        option initial_state 'online'
        option family 'ipv4'
        option track_method 'ping'
        option reliability '1'
        option count '1'
        option size '56'
        option max_ttl '60'
        option check_quality '0'
        option timeout '4'
        option interval '10'
        option failure_interval '5'
        option down '5'
        option up '3'
        option recovery_interval '5'
        list track_ip '8.8.8.8'
        list track_ip '8.8.4.4'

config member 'wan1_member'
        option interface 'wan1'

config member 'wan2_member'
        option interface 'wan2'

config member 'wan3_member'
        option interface 'wan3'

config policy 'balanced'
        list use_member 'wan1_member'
        list use_member 'wan2_member'
        list use_member 'wan3_member'
        option last_resort 'unreachable'

Strangely enough, load balancing works over WiFi but not over Ethernet.

I have a similar issue. My configuration file is the same one I used on OpenWrt 19.07, but after upgrading to OpenWrt 21.02, mwan3 does not take effect and traffic is not diverted.

Did you ever get anywhere with this? Seeing what sound like similar issues on 21.02. Running on x86 with plenty of grunt, DNS lookups feel very slow at times randomly, page loads very slow, and several devices (Google Homes, LG TV) timing out trying to hit their command&control.

I'm having the same issue on a Linksys WRT1900ACS. I THINK I set up the DSA switch shenanigans properly for 3 WANs, but it never actually load balances. I couldn't find anything useful in logs.

I'm wondering if it's a DSA issue since the mwan3 page in the docs hasn't been updated yet:

For now, I'm stuck on 19.07. :(

Admittedly, the documentation is written from a swconfig perspective, because DSA is newer and has had less user feedback.

It's best to report any bugs or issues at https://github.com/openwrt/packages for the attention of @feckert and @aaronjg.

Edit: I have added a clearer warning on the switch/VLAN section on the mwan3 wiki around DSA.

Alrighty. I hope to have some time around the holidays to make a 2nd attempt with 21.02.1. Part of the problem may be that with swconfig, I never fully understood the finer under-the-hood details of what I was doing, but I got it working. Thus, converting to DSA is also kind of hit and miss.

OTOH, I'm sure not so many people use mwan and therefore it's not exactly a showstopper for the vast majority of users.

So, I'll try to experiment again and post a bug/issue with config and more details. Thanks!

mwan3 failover works perfectly for me with the ipq4018 chipset under DSA with a DHCP and a wwan connection, but it took a bit of tinkering to get it working.

Similar issue here, running OpenWrt 21.02.2 r16495 on a native x86 box. mwan3 causes extremely slow connection startup (long TTFB) from time to time. I tried leaving just one WAN enabled and still saw similar symptoms. I don't know what the cause is. I will try with Wireshark later if I get time.

Same issue here: mwan3 creates a huge delay and doesn't actually balance anything.

Hi all,

I have a similar issue: mwan3 on 21.02 is not working properly. I noticed that sequence numbers are missing from the ping replies. I tried on both x86 and a Linksys WRT3200ACM, and the result is the same. So I fell back to 19.07, where it works great.

root@wrt32:~# ping google.com -I lan1
PING google.com (172.217.25.14): 56 data bytes
64 bytes from 172.217.25.14: seq=4 ttl=119 time=2.497 ms
64 bytes from 172.217.25.14: seq=5 ttl=119 time=2.726 ms
64 bytes from 172.217.25.14: seq=6 ttl=119 time=4.883 ms
^C
--- google.com ping statistics ---
7 packets transmitted, 3 packets received, 57% packet loss
round-trip min/avg/max = 2.497/3.368/4.883 ms
root@wrt32:~# ping google.com -I wan
PING google.com (172.217.25.14): 56 data bytes
64 bytes from 172.217.25.14: seq=4 ttl=59 time=2.609 ms
64 bytes from 172.217.25.14: seq=5 ttl=59 time=2.587 ms
64 bytes from 172.217.25.14: seq=6 ttl=59 time=2.694 ms
^C
--- google.com ping statistics ---
7 packets transmitted, 3 packets received, 57% packet loss
round-trip min/avg/max = 2.587/2.630/2.694 ms

I tried adding a load balancing rule with destination port 53 (the DNS port) that uses the default routing table. It mitigates things a bit: long DNS lookup times are somewhat improved. On a few occasions the connection time is still long, but I was not able to pinpoint the cause.
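
For anyone wanting to try the same thing, a rule like the one described above might look roughly like this in /etc/config/mwan3. This is only a sketch, not the poster's actual config. The rule names dns_udp and dns_tcp are made up, and it assumes the built-in 'default' policy (which hands matching traffic to the main routing table) is what was meant. As far as I know, mwan3 only honours port options when proto is tcp or udp, hence the two rules. Rules are matched in order, so they need to sit above default_rule_v4.

```
# Hypothetical rule names; place these above 'default_rule_v4'.
# 'default' is mwan3's built-in policy that uses the main routing table.
config rule 'dns_udp'
        option dest_port '53'
        option proto 'udp'
        option family 'ipv4'
        option use_policy 'default'

config rule 'dns_tcp'
        option dest_port '53'
        option proto 'tcp'
        option family 'ipv4'
        option use_policy 'default'
```

After editing, `mwan3 restart` should apply the change and `mwan3 status` should list the active rules.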

Another observation: sometimes an ICMP ping from a host on my LAN to an Internet host gets "destination unreachable", but if I ping the same Internet host from the router (OpenWrt) at that moment, the ping succeeds and the connection is stable. After pinging from the router, and after a short delay, the ping from the LAN host becomes normal again. I don't know whether this is related.

It's working great for me now. I finally updated to 21.02.2, and followed the updated DSA instructions here:

No real problems at all now, and no lag or anything.

Note that if you ping DNS servers for your WAN 'up/down detection', you'll need to create custom mwan3 load balancing rules to shuffle those ICMP requests to the specific WAN modem. I was silly and didn't realize this early on, so my down detection didn't work! So now I have 2 DNS for DHCP LAN puters, and 2 different DNS servers (at different providers) for each WAN ping - with rules for each. I generally use fastest DNS servers (Google or Cloudflare) for DHCP on the LAN for best performance.

So each DNS rule in luci should look something like:

DNS1 192.168.1.0/24 — 64.6.64.6 — icmp wan1_only

That's "LAN traffic TO DNS server 64.6.64.6 AT any port USING protocol=ICMP IS ROUTED TO wan1_only"
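
In /etc/config/mwan3 terms, that LuCI rule would correspond to something like the fragment below. Treat it as a sketch under assumptions: the rule name dns1_icmp is invented for illustration, and it assumes a member called wan1_member already exists for the wan1 interface (as in the config in the first post). The rule has to come before any catch-all default rule, since mwan3 evaluates rules top to bottom.

```
# Assumed: a member 'wan1_member' already exists for interface 'wan1'.
config policy 'wan1_only'
        list use_member 'wan1_member'

# Hypothetical rule name; place above the catch-all default rule.
config rule 'dns1_icmp'
        option src_ip '192.168.1.0/24'
        option dest_ip '64.6.64.6'
        option proto 'icmp'
        option family 'ipv4'
        option use_policy 'wan1_only'
```

A matching policy and rule pair would be needed for each additional WAN and its tracked address.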

It's zippy!!

I got around to adding DSA specific guidance on VLANs/additional WAN interfaces more recently, glad it was helpful!
