OpenWrt Forum Archive

Topic: Multi-WAN Load Balancing

The content of this topic has been archived between 29 Mar 2018 and 3 May 2018. Unfortunately there are posts – most likely complete pages – missing.

Health pings remain at 1 Hz...

However, I stumbled over a serious issue. As mentioned before, I run an asterisk server on the same device as the multiwan script. From time to time, asterisk forwards the public IP address of the failover link to the sip providers while registering, when the main link is still up... Some time later, it reverts back to the main link's IP address. Obviously, asterisk shouldn't know anything about the failover link when the main link is still up, as load balancing is not activated in my conf. If this happens during an active call, the call isn't dropped, but the callee can no longer hear me while I still hear him. I should point out that this only happens while registering at the sip provider.

This behavior is completely random, it happens quite frequently but I have not been able to find what triggers this behavior. However, disabling the multiwan script resolves the issue.

Greetings.

Hi friends,

Yesterday I purchased a D-Link DIR-825 rev B1 router that I immediately flashed with OpenWrt Backfire (10.03.1-rc3, r22796).
Each WAN port is connected to its ISP router with a static address.

It took me a few hours to configure the switch (/etc/config/network), so I'm sharing my work ;-)

One WAN is connected to the Internet port (free.fr)
One WAN is connected to switch port 4 (orange.fr)
One LAN is connected to switch port 1

/etc/config/network

config 'interface' 'loopback'
    option 'ifname' 'lo'
    option 'proto' 'static'
    option 'ipaddr' '127.0.0.1'
    option 'netmask' '255.0.0.0'

config 'interface' 'lan1'
    option 'ifname' 'eth0.1'
    option 'type' 'bridge'
    option 'proto' 'static'
    option 'ipaddr' '192.168.181.10'
    option 'netmask' '255.255.255.0'
    option 'defaultroute' '0'
    option 'peerdns' '0'

config 'interface' 'lan2'
    option 'ifname' 'eth0.2'
    option 'type' 'bridge'
    option 'proto' 'static'
    option 'ipaddr' '192.168.2.1'
    option 'netmask' '255.255.255.0'

config 'interface' 'lan3'
    option 'ifname' 'eth0.3'
    option 'type' 'bridge'
    option 'proto' 'static'
    option 'ipaddr' '192.168.3.1'
    option 'netmask' '255.255.255.0'

config 'interface' 'lan4'
    option 'ifname' 'eth0.4'
    option 'type' 'bridge'
    option 'proto' 'static'
    option 'ipaddr' '192.168.120.10'
    option 'netmask' '255.255.255.0'
    option 'gateway' '192.168.120.254'
    option 'defaultroute' '0'
    option 'peerdns' '0'
    option 'dns' '208.67.222.222 208.67.220.220'
    option 'macaddr' '00:24:01:E7:7F:F9'

config 'interface' 'wan'
    option 'ifname' 'eth1'
    option 'proto' 'static'
    option 'ipaddr' '192.168.100.10'
    option 'netmask' '255.255.255.0'
    option 'gateway' '192.168.100.254'
    option 'defaultroute' '0'
    option 'peerdns' '0'
    option 'dns' '208.67.222.222 208.67.220.220'
    option 'mtu' '1492'

config 'switch'
    option 'name' 'rtl8366s'
    option 'reset' '1'
    option 'enable_vlan' '1'

config 'switch_vlan'
    option 'device' 'rtl8366s'
    option 'vlan' '5'
    option 'pvid' '5'
    option 'ports' '5t'

config 'switch_vlan'
    option 'device' 'rtl8366s'
    option 'vlan' '1'
    option 'pvid' '1'
    option 'ports' '3 5t'

config 'switch_vlan'
    option 'device' 'rtl8366s'
    option 'vlan' '2'
    option 'pvid' '2'
    option 'ports' '2 5t'

config 'switch_vlan'
    option 'device' 'rtl8366s'
    option 'vlan' '3'
    option 'pvid' '3'
    option 'ports' '1 5t'

config 'switch_vlan'
    option 'device' 'rtl8366s'
    option 'vlan' '4'
    option 'pvid' '4'
    option 'ports' '0 5t'


/etc/config/multiwan

config 'multiwan' 'config'
    option 'resolv_conf' '/tmp/resolv.conf.auto'
    option 'lan_if' 'lan'
    option 'default_route' 'fastbalancer'

config 'interface' 'wan'
    option 'icmp_hosts' 'dns'
    option 'timeout' '3'
    option 'health_fail_retries' '3'
    option 'health_recovery_retries' '5'
    option 'weight' '1'
    option 'health_interval' '5'
    option 'failover_to' 'fastbalancer'
    option 'dns' '208.67.222.222'

config 'interface' 'lan4'
    option 'icmp_hosts' 'dns'
    option 'timeout' '3'
    option 'health_fail_retries' '3'
    option 'health_recovery_retries' '5'
    option 'weight' '1'
    option 'health_interval' '5'
    option 'failover_to' 'fastbalancer'
    option 'dns' '208.67.222.222'


config 'mwanfw'
    option 'dst' 'smtp.free.fr'
    option 'wanrule' 'wan'

config 'mwanfw'
    option 'dst' 'smtp.orange.fr'
    option 'wanrule' 'lan4'

(Last edited by mynetmemo on 25 Sep 2010, 10:56)

mynetmemo wrote:

Hi friends,

Yesterday I purchased a D-Link DIR-825 rev B1 router that I immediately flashed with OpenWrt Backfire (10.03.1-rc3, r22796).
Each WAN port is connected to its ISP router with a static address.

It took me a few hours to configure the switch (/etc/config/network), so I'm sharing my work ;-)

One WAN is connected to the Internet port (free.fr)
One WAN is connected to switch port 4 (orange.fr)
One LAN is connected to switch port 1
...

Hello

Wanted to let you know:
1) Don't forget to give WAN2 the same FIREWALL rules that are configured for WAN1.
2) In dnsmasq.conf I also added these lines; I'm not sure whether they are needed:

config dhcp wan2
        option interface        wan
        option ignore   1

3) Can't remember... if I do, I'll come back and edit this post :)

Good luck

huglester wrote:
mynetmemo wrote:

Hi friends,

Yesterday I purchased a D-Link DIR-825 rev B1 router that I immediately flashed with OpenWrt Backfire (10.03.1-rc3, r22796).
Each WAN port is connected to its ISP router with a static address.

It took me a few hours to configure the switch (/etc/config/network), so I'm sharing my work ;-)

One WAN is connected to the Internet port (free.fr)
One WAN is connected to switch port 4 (orange.fr)
One LAN is connected to switch port 1
...

Hello

Wanted to let you know:
1) Don't forget to give WAN2 the same FIREWALL rules that are configured for WAN1.

That's what I did ;-)

huglester wrote:

2) In dnsmasq.conf I also added these lines; I'm not sure whether they are needed:

config dhcp wan2
        option interface        wan
        option ignore   1

I don't think so because it's working like a charm...

huglester wrote:

3) Can't remember... if I do, I'll come back and edit this post :)

Good luck

Anyway, thanx for your comments ;-)

renoir wrote:

Health pings remain at 1 Hz...

However, I stumbled over a serious issue. As mentioned before, I run an asterisk server on the same device as the multiwan script. From time to time, asterisk forwards the public IP address of the failover link to the sip providers while registering, when the main link is still up... Some time later, it reverts back to the main link's IP address. Obviously, asterisk shouldn't know anything about the failover link when the main link is still up, as load balancing is not activated in my conf. If this happens during an active call, the call isn't dropped, but the callee can no longer hear me while I still hear him. I should point out that this only happens while registering at the sip provider.

This behavior is completely random, it happens quite frequently but I have not been able to find what triggers this behavior. However, disabling the multiwan script resolves the issue.

Greetings.

Hey renoir,

Another improved version of multiwan may solve your issues reported here:
1. performance on a WRT54GS v1.1: try setting health_monitor to serial
2. ping every second: if this version still does it, post your /etc/config/multiwan
3. asterisk wan switching: if it listens on a src port, use source-ports accordingly, see the sample VoIP mwanfw (works for my ATA)

I was asking Craig to review it before submitting a patch, but you're the perfect person to provide feedback and report all the bugs I introduce, right? :)

buildster wrote:
renoir wrote:

Health pings remain at 1 Hz...

However, I stumbled over a serious issue. As mentioned before, I run an asterisk server on the same device as the multiwan script. From time to time, asterisk forwards the public IP address of the failover link to the sip providers while registering, when the main link is still up... Some time later, it reverts back to the main link's IP address. Obviously, asterisk shouldn't know anything about the failover link when the main link is still up, as load balancing is not activated in my conf. If this happens during an active call, the call isn't dropped, but the callee can no longer hear me while I still hear him. I should point out that this only happens while registering at the sip provider.

This behavior is completely random, it happens quite frequently but I have not been able to find what triggers this behavior. However, disabling the multiwan script resolves the issue.

Greetings.

Hey renoir,

Another improved version of multiwan may solve your issues reported here:
1. performance on a WRT54GS v1.1: try setting health_monitor to serial
2. ping every second: if this version still does it, post your /etc/config/multiwan
3. asterisk wan switching: if it listens on a src port, use source-ports accordingly, see the sample VoIP mwanfw (works for my ATA)

I was asking Craig to review it before submitting a patch, but you're the perfect person to provide feedback and report all the bugs I introduce, right? :)

Taking a look at it right now. I have yet to update the LuCI menus for the last change; sorry, I've been very busy and detached as of late.

Thanks Buildster, I'll keep ya posted. :)

Thanks buildster,

I will have a look at it asap! It seems odd, though, that one would have to specify src-ports for asterisk to go through the main link, as there is already a default rule that sends all traffic through the main link. But of course I will try what you suggested :)

Greetings,
renoir.

SouthPawn wrote:

Taking a look at it right now. I have yet to update the LuCI menus for the last change; sorry, I've been very busy and detached as of late.

Thanks Buildster, I'll keep ya posted. :)

Hey Craig,

Thank you for looking at the corresponding changes in the LuCI menus! I've worked with uci only, as I don't have a testing unit with room for an HTTP server and the LuCI files. My WHR-HP-G54 has 56 KB left on its 4 MB flash. I really envy those with a monster router or two. :P

renoir wrote:

Thanks buildster,

I will have a look at it asap! It seems odd, though, that one would have to specify src-ports for asterisk to go through the main link, as there is already a default rule that sends all traffic through the main link. But of course I will try what you suggested :)

Greetings,
renoir.

I learned it the hard way with Skype and a SIP-based ATA in my triple-WAN setup. Multi-WAN load balancing loves to spread the network connections of a voice session across links and break it in some way. For my ATA, I have to ensure that both the SIP and RTP ports go through the same WAN. I look forward to your feedback...

Good luck.
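
A rough sketch of the kind of rule buildster describes, for anyone following along. It prints an mwanfw stanza that pins one LAN device's SIP and RTP source ports to a single WAN. The 'src', 'proto', 'ports' and 'wanrule' options appear in configs posted in this thread; the 'port_type' 'source-ports' option is an assumption based on buildster's "use source-ports" remark, and the address and port range in the usage comment are made up.

```shell
# Print an mwanfw stanza that sends one LAN host's VoIP ports out a single WAN.
# $1 = LAN source address, $2 = port list, $3 = WAN name from /etc/config/multiwan
# Assumption: 'port_type' 'source-ports' is how multiwan selects source-port matching.
voip_rule() {
    cat <<EOF
config 'mwanfw'
    option 'src' '$1'
    option 'proto' 'udp'
    option 'port_type' 'source-ports'
    option 'ports' '$2'
    option 'wanrule' '$3'
EOF
}

# Hypothetical use (address and RTP range are placeholders for your ATA's settings):
#   voip_rule 192.168.1.50 5060,10000:10100 wan >> /etc/config/multiwan
```

Keeping SIP and RTP in one rule means both always follow the same wanrule, which is the point of buildster's advice.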

Wow, this MULTIWAN script might just be what I am looking for!!!

Only I don't want to use it for any load balancing or failover at all. I want to use it for the following scenario:

I have a triple-play subscription over FttH! It supports 2 WAN connections:

* 1 PPPoE connection via q-tagged ethernet (VLAN 6), for the main internet connection (this is already running perfectly)
* 1 PPPoE connection via VLAN 7. Essentially, this one also has full internet connectivity, but is heavily QoS'ed. It only supports 512 kbit down and 512 kbit up, but with the absolute highest priority on the ISP's network layer (and probably also the fiber's).

The second WAN connection is meant to be used for VoIP. On my VoIP client (a Fritz!box FON WLAN 7270 with integrated DECT base), I can set which "ToS" to give all SIP- and RTP-related packets.

Is it possible to add functionality (or is it already configurable when doing it manually at the terminal) to use the second WAN connection only for packets with a specific ToS? :)
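
One way this could be done by hand, sketched under several assumptions: the ToS value is matched with the iptables tos match, the packets get a firewall mark, and an ip rule sends that mark to a separate routing table. The mark 0x40, table id 107 and device name pppoe-voip are hypothetical. The mark is chosen only so that it does not reuse the 0x1/0x10/0x20 marks visible in the "ip rule show" output posted in this thread. How such a rule interacts with multiwan's own mangle chains has not been tested here. The function only prints the commands so they can be reviewed first.

```shell
# Emit the commands that would steer packets carrying one ToS value out a second WAN.
# Nothing is executed here; review the output, then run it as root if it looks right.
# $1 = ToS value (e.g. 0xb8), $2 = fwmark, $3 = routing table id, $4 = outgoing device
tos_route_cmds() {
    echo "iptables -t mangle -A PREROUTING -m tos --tos $1 -j MARK --set-mark $2"
    echo "ip rule add fwmark $2 table $3"
    echo "ip route add default dev $4 table $3"
}

# Hypothetical use:
#   tos_route_cmds 0xb8 0x40 107 pppoe-voip
```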

Yesterday I compiled a new image with multiwan 1.0.18 included. However, running asterisk on the same device is still a little troublesome. The way multiwan announces DNS must have changed (given that the problem doesn't show with multiwan disabled). Let me explain: asterisk is one of the services running on the router and is started via an init.d script (start=90). Before 1.0.18, asterisk managed to resolve the SIP providers' host addresses without any problems. With 1.0.18, it no longer resolves the addresses at startup. When I do a manual sip reload right after the router is up, it does resolve the addresses properly. There is a workaround for this (setting dnsmgr.conf to do periodic resolves), but since this problem didn't exist before 1.0.18, it might be worth taking a look at. To my knowledge, no other services are affected. I tried to give multiwan some room to load properly before asterisk kicks in (sleep 30), but no luck.

buildster, I still have to try your proposed improvements; I've been quite busy lately. I'll keep you posted :)

Greetings.
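
For the startup problem described in the post above, a fixed "sleep 30" can be replaced by polling until name resolution actually works. This is only a sketch: the host name in the usage comment is a placeholder, and it assumes the nslookup on the router returns a non-zero exit status when a lookup fails.

```shell
# Retry a command once per second until it succeeds or the timeout expires.
# $1 = timeout in seconds, remaining arguments = command to try
wait_until() {
    remaining=$1
    shift
    until "$@" >/dev/null 2>&1; do
        remaining=$((remaining - 1))
        [ "$remaining" -le 0 ] && return 1
        sleep 1
    done
    return 0
}

# Hypothetical use near the top of the asterisk init script's start():
#   wait_until 60 nslookup sip.example.com || logger "DNS still not ready"
```

This would not explain why 1.0.18 behaves differently from earlier versions; it only avoids starting asterisk before DNS is usable.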

renoir wrote:

Yesterday I compiled a new image with multiwan 1.0.18 included. However, running asterisk on the same device is still a little troublesome. The way multiwan announces DNS must have changed (given that the problem doesn't show with multiwan disabled). Let me explain: asterisk is one of the services running on the router and is started via an init.d script (start=90). Before 1.0.18, asterisk managed to resolve the SIP providers' host addresses without any problems. With 1.0.18, it no longer resolves the addresses at startup. When I do a manual sip reload right after the router is up, it does resolve the addresses properly. There is a workaround for this (setting dnsmgr.conf to do periodic resolves), but since this problem didn't exist before 1.0.18, it might be worth taking a look at. To my knowledge, no other services are affected. I tried to give multiwan some room to load properly before asterisk kicks in (sleep 30), but no luck.

buildster, I still have to try your proposed improvements; I've been quite busy lately. I'll keep you posted :)

Greetings.

Hey renoir, which revision of the trunk did you compile?
When you have time, can you try each of the following to isolate the cause of your IP resolution problem?

1. disable multiwan: /etc/init.d/multiwan disable; reboot
2. downgrade to multiwan 1.0.17 (if there is space left, no re-flash of the image is needed. just opkg install...)
3. if the issue persists, flash back to your previous image, upgrade to multiwan 1.0.18, and then patch it to the new version in ticket 7996

Recently installed Backfire 10.03.1 r22752 onto a WRT54GS (32M/8M) and then MultiWAN 1.0.18 with LuCI app 1.0.16. It took a little while to get the three WAN links configured for my three 6M DSL lines, but I got it. P2P apps get the full bandwidth (measured up to 2080 KB/s with a torrent), which is what I expected from reading that this is load balancing across multiple connections, not bonding. What I didn't expect and cannot explain is why my speed tests on speedtest.net and others report ~11-12 Mb download and 800K upload. As far as I know these tests use a single transfer. How do I get the combined bandwidth of two of my connections? Why not all three? Does the multiwan script do some transfer magic? I'd like to get to the bottom of this, if only to learn why. One last question: when setting up the link weights, I saw the weight defaults to 10. If all connections are 10, as in 10:10:10, would that be the same as 1:1:1, or does the script round-robin 10 connections per link before moving to the next WAN link? Thanks
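
On the weight question, a small worked example. It assumes the weights end up as relative nexthop weights on a multipath default route, as in the "nexthop ... weight 10" route output posted elsewhere in this thread. Under that assumption only the ratio matters, so 10:10:10 and 1:1:1 give each link the same share. The function name is made up for illustration.

```shell
# Print each link's share of new connections, in percent, for a list of weights.
# Integer arithmetic, so shares are rounded down.
link_shares() {
    total=0
    for w in "$@"; do total=$((total + w)); done
    for w in "$@"; do echo $((100 * w / total)); done
}

# Compare:
#   link_shares 10 10 10
#   link_shares 1 1 1
```

Both calls print the same three numbers. How the kernel picks a link for each individual connection is a separate matter that this sketch does not model.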

Hi Blak6spdZ,
Regarding the first question, there is no practical way to bond disparate network connections into a single, aggregated interface. If you had full control of the other end of each ISP line, you might have a chance... The best you can do is what you are doing now: spreading your multiple IP connections across the three lines (as with BT). To bandwidth-test each line, you could run iptraf on the router console and soak all three lines with heavy downloads, or test them one at a time by adding a temporary route for the test box under "Multi-WAN Traffic Rules". That would force the test PC to use one specific WAN for each test.

tribhuvanji, I understand the way load balancing works. I stated earlier that I could get the aggregate throughput of all three of my lines when downloading a torrent file. What I cannot explain is why speedtest.net throughput tests always give me a speed as if two of my lines were bonded (roughly 11-12Mb download), even though these are separate DSL circuits with three different WAN gateways, so they are in no way ISP bonded.

Finally I got MultiWAN to work as I expected. Maybe this is obvious to others, or maybe I have misunderstood something, but when I was trying this in my lab I could never get a ping session to fail over while it was running. New sessions went fine, but not the existing ones that were already in the connection tracking table.

I modified MultiWAN to clear the conntrack table upon startup, and to remove all connections on a failed interface during the fail process.

I am running version 1.0.18 on Backfire 10.03. Hardware is a WRT54GL v1.1. I am not using the load balancing function, just one WAN connection at a time.

Maybe there is a reason why MultiWAN does not do this by default. Any thoughts?

Anders B wrote:

Finally I got MultiWAN to work as I expected. Maybe this is obvious to others, or maybe I have misunderstood something, but when I was trying this in my lab I could never get a ping session to fail over while it was running. New sessions went fine, but not the existing ones that were already in the connection tracking table.

I modified MultiWAN to clear the conntrack table upon startup, and to remove all connections on a failed interface during the fail process.

I am running version 1.0.18 on Backfire 10.03. Hardware is a WRT54GL v1.1. I am not using the load balancing function, just one WAN connection at a time.

Maybe there is a reason why MultiWAN does not do this by default. Any thoughts?

Hey Anders,

I see the multiwan script does "ip route flush cache" when failing over. Apparently, that's not enough. Can you post your changes to clear the conntrack table and to remove all connections on a failed interface?

Firstly, a big thanks for the MultiWAN script. It's exactly what I was looking for.

I'm running it on OpenWRT Backfire 10.03 on a WGT634U. The WGT is connected to two TG585v7's, which are both in bridge mode.
My first ISP is PlusNet, and the second is TalkTalk. Both ISPs give me approx 6MBit/s downstream and 512kbps upstream.
I'm using MultiWAN 1.0.18 and the Luci stuff is 1.0.16.

I've been able to get the MultiWAN configuration to work pretty well, but have come up against a couple of issues:

1) My main goal with load balancing was to improve my NNTP download throughput. To this end, I've set up my mwanfw entries as follows:

config 'mwanfw'
        option 'dst' 'secure.news.eu.easynews.com'
        option 'wanrule' 'fastbalancer'

config 'mwanfw'
        option 'dst' 'secure.news.us.easynews.com'
        option 'wanrule' 'fastbalancer'

I then use SABnzbd to download from primarily secure.news.eu.easynews.com. Sometimes, I am clearly getting the combined throughput, as my download rate reaches about 1.4MB/sec. However, most of the time, I can see that only one link is being used (checked with ifstat).

I have also been messing about with the standard QoS stuff, and have found that installing, and then removing this seems to have changed the MultiWAN behaviour.

For example, on first boot of the router, I see the following:

root@OpenWrt:/etc/config# iptables -L MultiWanRules -t mangle -v
Chain MultiWanRules (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 FW1MARK    icmp --  any    any     192.168.0.3          anywhere            mark match 0x0 
  643 45246 LoadBalancer  all  --  any    any     anywhere             anywhere            mark match 0x0

No matter what traffic passes through the router, this table doesn't change.

However, if I run /etc/init.d/multiwan restart, then the table becomes:

root@OpenWrt:/etc/config# iptables -L MultiWanRules -t mangle -v
Chain MultiWanRules (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 FW2MARK    tcp  --  any    any     192.168.1.0/24       173-13-171-209-sfba.hfc.comcastbusiness.net mark match 0x0 multiport dports 21 
    0     0 FW1MARK    icmp --  any    any     192.168.0.3          anywhere            mark match 0x0 
    0     0 FastBalancer  all  --  any    any     anywhere             199.89.233.72.static.reverse.ltdomains.com mark match 0x0 
    0     0 FastBalancer  all  --  any    any     anywhere             198.89.233.72.static.reverse.ltdomains.com mark match 0x0 
    0     0 FastBalancer  all  --  any    any     anywhere             197.89.233.72.static.reverse.ltdomains.com mark match 0x0 
    0     0 FastBalancer  all  --  any    any     anywhere             200.89.233.72.static.reverse.ltdomains.com mark match 0x0 
    0     0 FastBalancer  all  --  any    any     anywhere             secure.news.eu.easynews.com mark match 0x0 
    0     0 FastBalancer  all  --  any    any     anywhere             secure.news.dc1.easynews.com mark match 0x0 
   38  1889 LoadBalancer  all  --  any    any     anywhere             anywhere            mark match 0x0

which seems more in line with what is in my /etc/config/multiwan file:

config 'multiwan' 'config'
    option 'default_route' 'balancer'

config 'interface' 'wan'
    option 'weight' '10'
    option 'health_interval' '10'
    option 'icmp_hosts' 'dns'
    option 'timeout' '3'
    option 'health_fail_retries' '3'
    option 'health_recovery_retries' '5'
    option 'failover_to' 'wan1'
    option 'dns' 'auto'

config 'interface' 'wan1'
    option 'weight' '10'
    option 'health_interval' '10'
    option 'icmp_hosts' 'dns'
    option 'timeout' '3'
    option 'health_fail_retries' '3'
    option 'health_recovery_retries' '5'
    option 'failover_to' 'balancer'
    option 'dns' 'auto'

config 'mwanfw'
        option 'src' '192.168.1.0/24'
        option 'dst' 'ftp.netlab7.com'
        option 'proto' 'tcp'
        option 'ports' '21'
        option 'wanrule' 'wan1'

config 'mwanfw'
        option 'src' '192.168.0.3'
        option 'proto' 'icmp'
        option 'wanrule' 'wan'

config 'mwanfw'
        option 'dst' 'www.whatismyip.com'
        option 'wanrule' 'fastbalancer'
        
config 'mwanfw'
        option 'dst' 'secure.news.eu.easynews.com'
        option 'wanrule' 'fastbalancer'

config 'mwanfw'
        option 'dst' 'secure.news.us.easynews.com'
        option 'wanrule' 'fastbalancer'

Has the QoS stuff messed up my configuration, or is there another explanation for the difference in mangle tables from bootup, and after a multiwan restart?

All I really want to achieve is load balanced throughput for my NNTP and HTTP downloads, combined with some QoS to prioritise my Cisco VPN traffic and other web browsing activities.

If anyone can give me some pointers to where I'm going wrong, then I'd much appreciate it.

Thanks,

Andy.


For reference, here are my /etc/config/network and /etc/config/firewall files:

#### VLAN configuration 
config switch eth0
    option enable   1

config switch_vlan eth0_0
    option device   "eth0"
    option vlan     0
    option ports    "1 2 3 5"

config switch_vlan eth0_1
    option device   "eth0"
    option vlan     1
    option ports    "4 5"

config switch_vlan eth0_2
    option device   "eth0"
    option vlan     2
    option ports    "0 5"
    
#### Loopback configuration
config interface loopback
    option ifname    "lo"
    option proto    static
    option ipaddr    127.0.0.1
    option netmask    255.0.0.0


#### LAN configuration
config interface lan
    option type     bridge
    option ifname    "eth0.0"
    option proto    static
    option ipaddr    192.168.0.1
    option netmask    255.255.255.0


#### WAN configuration
config 'interface' 'wan'
    option 'ifname' 'eth0.1'
    option 'username' 'bagpuss'
    option 'password' 'letmein'
    option 'vpi' '38'
    option 'vci' '0'
    option 'mtu' '1500'
    option 'defaultroute' '0'
    option 'ppp_redial' 'demand'
    option 'proto' 'pppoe'
    
config 'interface' 'wan1'
        option 'ifname' 'eth0.2'
        option 'username' 'user@talktalk.net'
        option 'password' 'password'
        option 'vpi' '38'
        option 'vci' '0'
        option 'defaultroute' '0'
        option 'ppp_redial' 'demand'
        option 'proto' 'pppoe'
        option 'mtu' '1432'

#### Firewall configuration (/etc/config/firewall)
config 'defaults'
    option 'syn_flood' '1'
    option 'input' 'ACCEPT'
    option 'output' 'ACCEPT'
    option 'forward' 'REJECT'

config 'zone'
    option 'name' 'lan'
    option 'input' 'ACCEPT'
    option 'output' 'ACCEPT'
    option 'forward' 'REJECT'

config 'zone'
    option 'name' 'wan'
    option 'input' 'DROP'
    option 'output' 'ACCEPT'
    option 'forward' 'REJECT'
    option 'masq' '1'
    option 'mtu_fix' '1'
    option 'network' 'wan wan1'

config 'forwarding'
    option 'src' 'lan'
    option 'dest' 'wan'

config 'rule'
    option 'src' 'wan'
    option 'proto' 'udp'
    option 'dest_port' '68'
    option 'target' 'ACCEPT'

config 'rule'
    option 'src' 'wan'
    option 'proto' 'icmp'
    option 'icmp_type' 'echo-request'
    option 'target' 'DROP'

config 'include'
    option 'path' '/etc/firewall.user'

Just a quick follow up to my previous post. I've now found some additional differences between the MultiWAN configuration following a reboot versus running /etc/init.d/multiwan restart.

From a fresh boot, I see the following:

root@OpenWrt:~# ip rule show
0:    from all lookup local 
9:    from all fwmark 0x1 lookup LoadBalancer 
10:    from 84.93.23.101 lookup MWAN1 
11:    from all fwmark 0x10 lookup MWAN1 
20:    from 84.13.252.100 lookup MWAN2 
21:    from all fwmark 0x20 lookup MWAN2 
32766:    from all lookup main 
32767:    from all lookup default 
root@OpenWrt:~# ip route show table 10
root@OpenWrt:~# ip route show table 11
root@OpenWrt:~# ip route show table 20
root@OpenWrt:~# ip route show table 21

root@OpenWrt:~# cat /etc/iproute2/rt_tables 
#
# reserved values
#
255    local
254    main
253    default
0    unspec
#
# local
#
#1    inr.ruhep
#
170 LoadBalancer
171 MWAN1
172 MWAN2

root@OpenWrt:~# ip route show table 170
195.166.130.7 dev ppp0  proto kernel  scope link  src 84.93.23.101 
84.13.240.1 dev ppp1  proto kernel  scope link  src 84.13.252.100 
192.168.0.0/24 dev br-lan  proto kernel  scope link  src 192.168.0.1 
default via 195.166.130.7 dev ppp0  proto static 

root@OpenWrt:~# ip route show table 171
195.166.130.7 dev ppp0  proto kernel  scope link  src 84.93.23.101 
84.13.240.1 dev ppp1  proto kernel  scope link  src 84.13.252.100 
192.168.0.0/24 dev br-lan  proto kernel  scope link  src 192.168.0.1 
default via 195.166.130.7 dev ppp0  proto static  src 84.93.23.101 

root@OpenWrt:~# ip route show table 172
195.166.130.7 dev ppp0  proto kernel  scope link  src 84.93.23.101 
84.13.240.1 dev ppp1  proto kernel  scope link  src 84.13.252.100 
192.168.0.0/24 dev br-lan  proto kernel  scope link  src 192.168.0.1 
default via 84.13.240.1 dev ppp1  proto static  src 84.13.252.100

At this point, no load balancing appears to be happening. The traffic always appears to go out of the second WAN interface (wan1), as shown by ifstat.
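
A note on reading the output above: in "ip rule show", the number before the colon is the rule's priority, not a routing table id. That is why "ip route show table 10" and the like print nothing, while tables 170 to 172 (the ids listed in rt_tables) hold the routes. The two helpers below are a sketch for pairing each rule with the table it looks up; the function names are made up.

```shell
# Read "ip rule show" output on stdin and print "priority table" pairs.
rule_tables() {
    awk '{ sub(":", "", $1); for (i = 2; i < NF; i++) if ($i == "lookup") print $1, $(i + 1) }'
}

# Resolve a table name to its numeric id using an rt_tables file.
# $1 = table name, $2 = path to rt_tables (default /etc/iproute2/rt_tables)
table_id() {
    awk -v name="$1" '$1 !~ /^#/ && $2 == name { print $1 }' "${2:-/etc/iproute2/rt_tables}"
}

# Typical use on the router:
#   ip rule show | rule_tables
#   ip route show table "$(table_id MWAN1)"
```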

However, if I run /etc/init.d/multiwan restart I see the following difference:

root@OpenWrt:~# ip route show table 170
195.166.130.7 dev ppp0  proto kernel  scope link  src 84.93.23.101 
84.13.240.1 dev ppp1  proto kernel  scope link  src 84.13.252.100 
192.168.0.0/24 dev br-lan  proto kernel  scope link  src 192.168.0.1 
default  proto static 
    nexthop via 195.166.130.7  dev ppp0 weight 10
    nexthop via 84.13.240.1  dev ppp1 weight 10

Now when I'm surfing/downloading, the traffic appears to be round-robin balanced between the WAN connections.
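
The difference between the two table 170 listings is that the working one has a multipath default route with two nexthop entries. A quick check for that condition could look like the sketch below; the function name is made up, and table id 170 is taken from the rt_tables file shown above.

```shell
# Read "ip route show table <id>" output on stdin; succeed only if the default
# route is multipath, i.e. the output lists at least two "nexthop via" entries.
is_multipath() {
    [ "$(grep -c 'nexthop via')" -ge 2 ]
}

# Typical use on the router after boot:
#   ip route show table 170 | is_multipath || logger "LoadBalancer table is not multipath"
```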

Is there something I'm doing wrong? Does my config look correct?

Any help would be much appreciated.

Thanks,

Andy.

I've now upgraded to 10.03.1-RC3 with MultiWAN 1.0.18 plus ticket 7996 diffs.

This seems to have resolved the weirdness I was seeing with the nexthop stuff being missing from the ip route table, but I'm now faced with the following error:

Oct 19 20:44:44 OpenWrt user.notice multiwan: Performance load balancer(fastbalanacer) is unavailable due to current kernel limitations.

This seems to be because of a lack of iptable statistics, but I'm not sure how to fix this. Presumably, I'd need to build a new kernel, but I've not got the facilities to do this currently. Is there any other way of resolving this, other than downgrading to 10.03?

This is all kind of frustrating, really, as I've noticed that using fastbalancer seemed to work better for spreading the traffic to Easynews.

Andy.

Bagpuss wrote:

I've now upgraded to 10.03.1-RC3 with MultiWAN 1.0.18 plus ticket 7996 diffs.

This seems to have resolved the weirdness I was seeing with the nexthop stuff being missing from the ip route table, but I'm now faced with the following error:

Oct 19 20:44:44 OpenWrt user.notice multiwan: Performance load balancer(fastbalanacer) is unavailable due to current kernel limitations.

This seems to be because of a lack of iptable statistics, but I'm not sure how to fix this. Presumably, I'd need to build a new kernel, but I've not got the facilities to do this currently. Is there any other way of resolving this, other than downgrading to 10.03?

This is all kind of frustrating, really, as I've noticed that using fastbalancer seemed to work better for spreading the traffic to Easynews.

Andy.

Hey Andy, do you have package iptables-mod-ipopt installed? The statistic mod should be in it. See https://dev.openwrt.org/browser/branche … s/Makefile. If yes, can you post the output of lsmod?

By the way, Craig has committed ticket 7996 diffs. For your next upgrade, it may be easier to just install multiwan 1.0.19 from the trunk/its snapshot.
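
For checking this on the router, a small sketch. xt_statistic is the kernel's module name for the netfilter statistic match. Whether the multiwan script tests for exactly this module before enabling fastbalancer is an assumption. The function name is made up.

```shell
# Read "lsmod" output on stdin; succeed if the statistic match module is loaded.
has_statistic_module() {
    awk '$1 == "xt_statistic" { found = 1 } END { exit !found }'
}

# Typical use on the router:
#   opkg list-installed | grep iptables-mod-ipopt
#   lsmod | has_statistic_module && echo "statistic match available"
```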

buildster wrote:

Hey Andy, do you have package iptables-mod-ipopt installed? The statistic mod should be in it. See https://dev.openwrt.org/browser/branche … s/Makefile. If yes, can you post the output of lsmod?

By the way, Craig has committed ticket 7996 diffs. For your next upgrade, it may be easier to just install multiwan 1.0.19 from the trunk/its snapshot.

Hi buildster,

I've downgraded to 10.03 for now, as I wanted to get the system back up and running. There is obviously load balancing taking place with this configuration (using 1.0.18) but it doesn't seem to be consistent.

My default configuration is to simply have fastbalancer as the default route. When I do this, sometimes my connections to Easynews are balancing, and I see the combined throughput of both links. At other times, only one link is used. I'm keeping an eye on the links using ifstat -i ppp0,ppp1.

I'm making 10 simultaneous connections to Easynews, all going to secure.news.eu.easynews.com. My expectation is that multiwan would round robin balance these connections, so that 5 go through ppp0 and 5 through ppp1. I'm thinking that the challenge is for multiwan to decide if it's okay to split the traffic in this way. I'm guessing that for things like ssh, you'd want to bind to one particular interface, and then keep it that way. The same, I guess, for VPN.
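
On the 5/5 expectation: this thread only establishes that fastbalancer depends on the iptables statistic match. If it uses that match in random mode (an assumption, not confirmed here), each new connection is assigned by chance, so 10 connections would split evenly only on average. The sketch below shows how the per-rule probabilities of such a cascade are usually derived from the weights: each rule only sees the traffic the earlier rules did not take. The function name is made up.

```shell
# Print the per-rule probability for a cascade of random-match rules so that
# each link ends up with a share proportional to its weight.
# Rule i matches with weight_i divided by the sum of the weights not yet handled.
cascade_probabilities() {
    remaining=0
    for w in "$@"; do remaining=$((remaining + w)); done
    for w in "$@"; do
        awk -v w="$w" -v r="$remaining" 'BEGIN { printf "%.4f\n", w / r }'
        remaining=$((remaining - w))
    done
}

# Two equal links:   cascade_probabilities 10 10
# Three equal links: cascade_probabilities 1 1 1
```

The last rule always gets probability 1 because it takes whatever traffic is left.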

I'd be quite happy to try and install stuff from the trunk or snapshots. Can I simply go to: http://downloads.openwrt.org/snapshots/trunk/
and download the latest nightly build for my platform, or would it be better to run 10.03.1-rc3, and point the firmware at the packages from the latest snapshot?

If I install the latest snapshot firmware, will there be any issues with dependencies etc. in the packages when the next snapshot is built?

Sorry for so many questions, but I want to make sure I get this correct.

Thanks,

Andy.

Andy,

To get your system up and running, multiwan 1.0.18 should do the job... unless the changes in 1.0.19 accidentally made it better. I normally don't do that, but do the opposite :)

I haven't tested it myself, but multiwan 1.0.19 should run in place of 1.0.18 on OpenWrt 10.03 because it didn't introduce any new/hard dependency. If you'd like to test it, wget the 1.0.19 package to /tmp and opkg install it. You can easily opkg remove it if it doesn't work out.

buildster wrote:

Hey Anders,

I see the multiwan script does "ip route flush cache" when failing over. Apparently, that's not enough. Can you post your changes to clear the conntrack table and to remove all connections on a failed interface?

Hey,

Sorry for the delay. Here are my two changes, both in /usr/bin/multiwan

1. In the function iptables_init():

As the last line of the function I added "conntrack -F" to clear any pre-multiwan sessions going through the wrong internet connection. Maybe this is not needed.

2. In the function add() inside the function failover():

    add() {

# >> Anders: Define variable for failed interface ip. Maybe this is already available in some global variable?
        local clearip
# << Anders

        wan_fail_map=$(echo $wan_fail_map | sed -e "s/${1}\[${failchk}\]//g")
        wan_fail_map=$(echo $wan_fail_map${1}[x])
        wan_recovery_map=$(echo $wan_recovery_map | sed -e "s/${1}\[${recvrychk}\]//g")
        update_cache

        if [ "$existing_failover" == "2" ]; then
            if [ "$failover_to" != "balancer" -a "$failover_to" != "fastbalancer" -a "$failover_to" != "disable" -a "$failover_to_wanid" != "$wanid" ]; then
                iptables -I FW${wanid}MARK 2 -t mangle -j FW${failover_to_wanid}MARK
            elif [ "$failover_to" == "balancer" ]; then
                iptables -I FW${wanid}MARK 2 -t mangle -j LoadBalancer
            elif [ "$failover_to" == "fastbalancer" ]; then
                iptables -I FW${wanid}MARK 2 -t mangle -j FastBalancer
            fi
        fi
        mwnote "$1 has failed and is currently offline."

# >> Anders: Figure out the ip of the failed interface, then clear associated entries.
        clearip=$(query_config ipaddr $1)
        # Skip the delete if no address was found for the interface.
        if [ -n "$clearip" ]; then
            mwnote "Deleting conntrack entries for $clearip"
            conntrack -D -n "$clearip"
        fi
# << Anders

    }


Yeah, and also the package conntrack-tools needs to be installed.
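
For trying the same cleanup by hand outside the multiwan script, a standalone sketch. It uses the same "conntrack -D -n" call as the patch above (delete entries whose source-NAT address is the given IP) and returns an error when the conntrack binary is missing. The function name and the DRY_RUN switch are made up for illustration.

```shell
# Delete conntrack entries tied to a failed WAN address, if conntrack is installed.
# $1 = WAN IP address. Set DRY_RUN=1 to print the command instead of running it.
flush_wan_conntrack() {
    [ -n "$1" ] || return 1
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "conntrack -D -n $1"
        return 0
    fi
    command -v conntrack >/dev/null 2>&1 || return 1
    conntrack -D -n "$1"
}

# Hypothetical use, with the WAN address taken from the output earlier in this thread:
#   DRY_RUN=1; flush_wan_conntrack 84.93.23.101
```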

Hey Anders,

Thanks for posting your changes! I'll apply them to my setup.