OpenWrt Forum Archive

Topic: Dual-WAN Load Balancing

The content of this topic has been archived between 4 Apr 2018 and 1 May 2018. Unfortunately there are posts – most likely complete pages – missing.

I figured this is probably a better place for this,

I've been trying to work out a script that makes dual-wan load balancing and failover simple and straightforward, while also providing the ability to use qos-scripts and miniupnpd on the second wan (if desired).

This has been replaced by the Multi-WAN version,
https://forum.openwrt.org/viewtopic.php?id=23904

Let me know if there are any questions, or suggestions.

Thanks :)



Updated the script a little.

Made some changes: there are fewer errors if we do not find the gateways necessary for dual-wan initialization, and notification of a gateway detection failure now only happens the first time, not every minute as before.

/etc/dualwan_rules.user is now a file loaded for custom rules for dualwan applications.

dualwan_rules.user contains an example of enabling dualwan miniupnpd, instead of having it be an option in the config.

Failover timer was also updated so it's more precise now.

(Last edited by SouthPawn on 15 Mar 2010, 09:07)

Excellent...thanks for your efforts.

I look forward to testing this out.  Does this handle things like SSH properly?

One issue I seem to have overlooked is the udhcp client, and how it handles routes once it receives a new route. (If you're using static IPs with static gateways, this will not be an issue.)

It's clearing all other default routes before adding the new one, even ones statically set on other interfaces, which it shouldn't do.

/usr/share/udhcpc/default.script, line 57,

eval $(route -n | awk '
                         /^0.0.0.0\W{9}('$valid_gw')\W/ {next}
                         /^0.0.0.0/ {print "route del -net "$1" gw "$2";"}
                 ')

which should be:

eval $(route -n | grep "$interface" | awk '
                         /^0.0.0.0\W{9}('$valid_gw')\W/ {next}
                         /^0.0.0.0/ {print "route del -net "$1" gw "$2" dev  ";"}
                 ')

https://dev.openwrt.org/ticket/6514
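
A minimal sketch of the idea behind that change, for anyone who wants to see what would be removed before touching default.script. The helper name stale_default_routes is made up for illustration, and it assumes the usual `route -n` column layout (gateway in column 2, interface in column 8). It only prints the delete commands, it does not run them:

```shell
# Hypothetical helper, not part of default.script.
# Usage: route -n | stale_default_routes IFACE KEEP_GW
# Prints a delete command for every default route on IFACE whose
# gateway is not KEEP_GW. Routes on other interfaces are ignored.
stale_default_routes() {
    awk -v ifc="$1" -v keep="$2" '
        $1 == "0.0.0.0" && $8 == ifc && $2 != keep {
            print "route del -net " $1 " gw " $2 " dev " $8
        }'
}
```

For example, `route -n | stale_default_routes eth0.1 192.168.1.1` would list only the stale default routes on eth0.1 and leave the default route on the other WAN interface alone.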

As for SSH, I wouldn't expect any problems with it.

But definitely let me know about any issues you run into :)

(Last edited by SouthPawn on 17 Jan 2010, 21:21)

Also, another thing to note: when using this with DHCP, make sure you set the DNS servers to something that is accessible on either WAN link, such as OpenDNS.

I'm just starting to learn OpenWrt. I apologize in advance for my ignorance. I've got a Linksys WRT54G3g-St running the WhiteRussian version. Will these scripts work on it? Thanks

They won't run on WhiteRussian; it's a completely different firmware.

Hi SouthPawn,

I just upgraded from WhiteRussian, where I had manually configured dual-wan, to Kamikaze, and your module is a godsend, saving me a couple of hours of trying to figure out how to set up firewall rules again.

However, I do have a couple of issues:

- I tried using traffic rules to force traffic to and from one of my machines to go through the secondary, but that doesn't seem to be enforced. The firewall config does target GW2MARK for packets to and from that host, but I'm not sure iproute2 does anything with that

- when I run /etc/init.d/dualwan stop, the default routes are not restored, so connectivity is broken

- when I try to edit the agent file, my vi session gets killed periodically; why not use 'killall', which is safer, rather than the complex grep you're using in the script, which is pretty indiscriminate?

Thanks a bunch, I hope you will be able to improve the script so it can be included in the standard repository!

Pierre

Hey Pierre,

Thanks for the feedback!

I've made a few changes based on what you've told me.
(edited above, ftp://ftp.netlab7.com/dualwan_0.1c.ipk)

A. Hmm, the only real thing that comes to mind is that there's possibly an IP routing conflict. The script makes use of routing table 200. I probably should have made this something a bit more obscure, as it's possible that something else is attempting to make use of table 200.

Here's an example of a config entry I have in /etc/config/dualwan to force one of my lan ips to go to the secondary:

config 'dualwanfw'
        option 'src' '192.168.0.6'
        option 'wanrule' 'wan2'


B. The Dual-WAN Agent backs out cleanly now, meaning it'll restore everything to the way it was before it was started: bringing back both default routes, flushing the ip rules and flushing the wan2 routing table.

C. The reason for the complex grep/kill system is that I cannot seem to killall a shell script, but I changed what grep is looking for so that it won't kill your vi session, only the script as it's running.
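
To illustrate the kind of matching involved (this is a sketch, not the code in the package): a shell script shows up in `ps` as the interpreter followed by the script path, so the filter can require that the word before the path is a shell. The function name pids_of_script and the agent path in the usage line are made up, and the exact `ps` column layout on a given BusyBox build is an assumption:

```shell
# Hypothetical helper, not taken from the dualwan package.
# Usage: ps | pids_of_script /path/to/script
# Prints the PID (first column) of every process whose command line is
# "<shell> /path/to/script ...", where <shell> is sh or ash.
# An editor such as "vi /path/to/script" does not match.
pids_of_script() {
    awk -v script="$1" '{
        for (i = 2; i <= NF; i++)
            if ($i == script && $(i - 1) ~ /(^|\/)(sh|ash)$/) {
                print $1
                next
            }
    }'
}
```

Something like `kill $(ps | pids_of_script /usr/sbin/dualwan-agent)` would then only hit the running script, assuming that is where the agent lives on your box.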

Thanks again Pierre,
-Craig

(Last edited by SouthPawn on 1 Feb 2010, 01:44)

Hi Craig,

Thanks, your updated version does fix B, which makes it much more comfortable to test.

When I invoke the agent from the command-line, I get the following on the console:

Try `iptables -h' or 'iptables --help' for more information.
expr: syntax error
iptables v1.3.8: Couldn't load match `mark':File not found

Is this a sign something is wrong, or just normal output?

My config is

root@GMPiLP:~# cat /etc/config/dualwan

config 'dualwan' 'config'
        option 'fail_timer' '20'
        option 'enabled' '1'
        option 'balance_ratio' 'auto'
        option 'auto_failover' '1'
        option 'wan1_conf' 'orange'
        option 'wan2_conf' 'neuf'

config 'dualwanfw'
        option 'src' '192.168.2.109'
        option 'wanrule' 'wan2'

config 'dualwanfw'
        option 'dst' '192.168.2.109'
        option 'wanrule' 'wan2'

Ok, so the error you're getting is actually cosmetic, and I've updated the script again to remove it.

One thing I should have mentioned or detailed more is that the rules set in /etc/config/dualwan are route selection for outgoing traffic, from lan->wan or lan->wan2.

Configuring incoming traffic should be done in the firewall, as you would normally do for the primary wan.

This line:

config 'dualwanfw'
        option 'dst' '192.168.2.109'
        option 'wanrule' 'wan2'

Should actually be in /etc/config/firewall as:

config 'redirect'
        option 'src' 'neuf'
        option 'dest_ip' '192.168.2.109'

Let me know if that works. On a side note, I probably should have named the config variables lansrc and inetdst in the config file to clear up any confusion.

Thanks Pierre,

I do have a firewall rule for incoming connection redirection (port-forwarding); I had also added the DualWAN config for incoming just to check that wasn't the reason traffic from that host was using the wrong WAN sometimes.

config 'redirect'
        option 'src' 'wan'
        option '_name' '*edited*'
        option 'proto' 'tcp'
        option 'src_dport' '9997-9998'
        option 'dest_ip' '192.168.2.109'
        option 'dest_port' '9997-9998'

Anyway, on the topic of the routing table, I would have expected 'ip route list 200' to display the load-balanced table. Instead, here is what I have:

root@GMPiLP:~# ip route list
192.168.4.0/24 dev eth0.2  proto kernel  scope link  src 192.168.4.2 
192.168.2.0/24 dev br-lan  proto kernel  scope link  src 192.168.2.1 
192.168.1.0/24 dev eth0.1  proto kernel  scope link  src 192.168.1.2 
default via 192.168.4.1 dev eth0.2 
default via 192.168.1.1 dev eth0.1 
root@GMPiLP:~# ip route list 200
root@GMPiLP:~# /etc/init.d/dualwan start
root@GMPiLP:~# ip route list 200
root@GMPiLP:~# ip route list
192.168.4.0/24 dev eth0.2  proto kernel  scope link  src 192.168.4.2 
192.168.2.0/24 dev br-lan  proto kernel  scope link  src 192.168.2.1 
192.168.1.0/24 dev eth0.1  proto kernel  scope link  src 192.168.1.2 
default via 192.168.1.1 dev eth0.1

Pierre

Looks like you're missing the word 'table' in that command line; it should be 'ip route list table 200'.

Got it, thanks!

Suggestion: a status page (perhaps on the main settings page for DualWAN), indicating how both links are doing (in case one gets disabled by the periodic checks).

Pierre

It does send various information to the syslog, such as when it fails over, which wan it has failed over to, and how long it's going to wait until it retries.

It also logs successes and failures when attempting to initialize itself.

A status page may be something to look at in the future, but right now I want to mainly tackle functionality.

One thing to note is that the script and the rules only work with new connections, so if a connection already exists when the change is made, it may still be going out the default wan.

Is it still not sending that lan ip out the secondary wan connection? 

Thanks Pierre,
-Craig

Yes, I just tested thoroughly, all traffic is still going out the primary, even though I added the rule for the one host to only use the secondary.

I can see in the firewall rules that those packets are targeted at GW2MARK. Should there be a chain called GW2MARK?

Chain DualWan (4 references)
 pkts bytes target     prot opt in     out     source               destination         
1485K  813M CONNMARK   all  --  any    any     anywhere             anywhere            CONNMARK restore 
 723K   70M GW2MARK    tcp  --  any    any     nas                  anywhere            
 1441  156K GW2MARK    udp  --  any    any     nas                  anywhere            
    0     0 GW2MARK    icmp --  any    any     nas                  anywhere

Also, a question: in the config, what's the difference between Dynamic Balance and 50/50? When I watch traffic on the Realtime Network Traffic page, I can see that 99% of the traffic from regular hosts uses the primary. When I manually set up my previous box, traffic was more evenly balanced (given lots of different concurrent connections, like BitTorrent).

Thanks again for your assistance.

Pierre

Ok, I attempted to update this again, for a problem I think you may be running into.

I believe the connmarks may be getting "forgotten".

ftp://ftp.netlab7.com/dualwan_0.1d.ipk , also edited post above.

Let me know if this works any better.

Thanks for your help on this,
-Craig

(Last edited by SouthPawn on 1 Feb 2010, 23:51)

rockpilp wrote:

Also, a question: in the config, what's the difference between Dynamic Balance and 50/50? When I watch traffic on the Realtime Network Traffic page, I can see that 99% of the traffic from regular hosts uses the primary. When I manually set up my previous box, traffic was more evenly balanced (given lots of different concurrent connections, like BitTorrent).

The Dynamic Balance is continually pinging both gateways (the same process used for failover).

As latency increases on one wan link, it slowly moves traffic over to the second, then attempts to equalize when both wan links are showing similar latency again.

It's always attempting to go back to 50/50, but again it's based on the latency of the ping replies we get back from each gateway.
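
As a rough illustration of that behaviour only (not the code the package actually uses): the function below takes the current percentage of new connections going to wan2 plus the two ping latencies in whole milliseconds, and returns the next percentage. The name next_ratio, the 5-point step, the 10 ms "similar latency" window and the 10 to 90 clamp are all assumptions picked for the example:

```shell
# Hypothetical sketch, not taken from the dualwan package.
# Usage: next_ratio CURRENT_PCT WAN1_MS WAN2_MS
# CURRENT_PCT is the share of new connections sent to wan2 (0-100).
next_ratio() {
    cur=$1; lat1=$2; lat2=$3
    step=5      # assumed shift per check, in percentage points
    slack=10    # assumed latency difference (ms) still counted as similar
    diff=$((lat1 - lat2))
    if [ "$diff" -gt "$slack" ]; then
        # wan1 is slower: move some traffic toward wan2
        cur=$((cur + step))
    elif [ "$diff" -lt "-$slack" ]; then
        # wan2 is slower: move some traffic toward wan1
        cur=$((cur - step))
    elif [ "$cur" -gt $((50 + step)) ]; then
        # latencies are similar: drift back toward 50/50
        cur=$((cur - step))
    elif [ "$cur" -lt $((50 - step)) ]; then
        cur=$((cur + step))
    else
        cur=50
    fi
    # never starve either link completely
    if [ "$cur" -gt 90 ]; then cur=90; fi
    if [ "$cur" -lt 10 ]; then cur=10; fi
    echo "$cur"
}
```

Calling it once per ping cycle and feeding the result back in as CURRENT_PCT gives the slow shift and the return to 50/50 described above.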

Thanks for the update Craig.

I still don't have a definition for the GW2MARK chain in the Firewall status page, and traffic is still going through the wrong WAN. It's unfortunate that BusyBox's shell does not honor the -x parameter to make debugging easier. Is there another way I could generate an execution log for you?

Other comments:

Is it useful to create three separate firewall rules for TCP, UDP and ICMP when the DualWAN rule is set to All, rather than one firewall rule that does not specify a protocol?

The iptables cosmetic error in the log is still there with 0.1d. Not a big issue, but it definitely makes the logs less easy to read.

Hey Pierre,

Check something else too, see if you have a chain for Balancer.
After taking a second look at the error you sent the second time, it appears it's missing chains that the script normally looks for.

Almost as though the mangle table is getting flushed after initialization. (By some other script?)

Try running this from the shell and see if it fixes everything.

# Create GW2MARK first, since the Balancer rule jumps to it
iptables -t mangle -N GW2MARK
iptables -t mangle -A GW2MARK -m mark --mark 0x0 -j MARK --set-mark 0x200
iptables -t mangle -A GW2MARK -m mark --mark 0x200 -j CONNMARK --save-mark

iptables -t mangle -N Balancer
iptables -t mangle -F Balancer
iptables -t mangle -A Balancer -m mark --mark 0x0 -m statistic --mode random --probability 0.50 -j GW2MARK

The GW2MARK chain is necessary to tag packets with the interface the traffic should be going out on.
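
On the iproute2 side, a mark only has an effect if an ip rule sends marked packets to a routing table that has its own default route. A sketch of what that could look like, assuming mark 0x200 maps to table 200 (both numbers appear in this thread, but the exact rules the package installs are not shown here). The function names and the DRY_RUN switch are made up for the example, and the gateway and device are whatever your second WAN uses:

```shell
# Hypothetical sketch, not taken from the dualwan package.
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: wan2_policy_route GATEWAY DEVICE
wan2_policy_route() {
    gw=$1; dev=$2
    # Table 200 gets a default route through the second WAN
    run ip route add default via "$gw" dev "$dev" table 200
    # Packets carrying mark 0x200 are looked up in table 200
    run ip rule add fwmark 0x200 table 200
}
```

Running `DRY_RUN=1 wan2_policy_route 192.168.4.1 eth0.2` prints the two commands without changing anything, and `ip rule list` plus `ip route list table 200` show whether equivalent entries are already in place.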

I did change the rules creation so it no longer makes individual entries for everything.
It should be a little cleaner now, and I repackaged the above 0.1d.

I'm getting an error during install:

root@GMPiLP:~# opkg install dualwan_0.1e.ipk 
Multiple packages (dualwan and dualwan) providing same name marked HOLD or PREFER.  Using latest.
Upgrading dualwan on root from 0.1d to 0.1e...
Collected errors:
 * ERROR: Cannot satisfy the following dependencies for dualwan:
         *  iptables-mod-ipopt *

Manually installing iptables-mod-ipopt allowed me to install the latest version.

Now I'm getting this:

root@GMPiLP:~# /etc/init.d/dualwan start
iptables v1.3.8: Couldn't load match `statistic':File not found

Try `iptables -h' or 'iptables --help' for more information.

Despite the remaining error in the logs, traffic does seem to be routed to the right WAN, and I do have the GW2MARK chain. Thanks for your help in getting this working!

I'm glad to hear it's working!

As far as I can tell, the problem is entirely a version issue: I made it on 8.09.1 and have not tested it on older versions.
Specifically, the issue is the load balancing portion, which uses the statistic module for netfilter.
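
A quick way to check for the module before starting the script might look like the sketch below. It assumes the iptables extensions live in /usr/lib/iptables and are named libipt_NAME.so or libxt_NAME.so, which may differ between releases, and the function name has_match is made up:

```shell
# Hypothetical helper, not taken from the dualwan package.
# Usage: has_match NAME [DIR]
# Succeeds if an iptables extension library for NAME exists in DIR
# (default /usr/lib/iptables, an assumed location).
has_match() {
    name=$1
    dir=${2:-/usr/lib/iptables}
    [ -e "$dir/libipt_$name.so" ] || [ -e "$dir/libxt_$name.so" ]
}
```

For example, `has_match statistic || echo "statistic match not installed"`. The kernel side also needs the matching module, so this only covers the userspace half.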

Thanks Again,
-Craig

(Last edited by SouthPawn on 3 Feb 2010, 00:29)

I'm on KAMIKAZE (8.09.2, r18961). Should I have the necessary support?

Sorry, posts 26 to 25 are missing from our archive.