DHCP in VRRP cluster, how to configure properly?

Ahoy ahoy friends.
I have configured a VRRP setup using one Raspberry Pi 4 OpenWrt device as backup and my x86 device as master. It works well so far, but there are some issues with DHCP.
I have followed the following guide: https://openwrt.org/docs/guide-user/network/high-availability
I have a master and a backup router, and they take their roles as expected. Unfortunately the backup router still hands out DHCP leases to some clients, so I end up with leases on both the master router and the passive router. That's a problem because DNS hostname resolution doesn't work for clients that got their lease from the backup router instead of the master, so I can't ping them by hostname, nor can I see them on the "overview" tab. How can I prevent the backup router from assigning DHCP leases while it is in the passive role in keepalived?
My network and access to network devices is mainly DHCP and DNS based so that's a huge pain in the ass.
Thanks in advance!

don't mind me, I'm just notifying @risk, the person that wrote that article

in the example... gw/dns is the 'vip'... so you likely didn't follow the guide correctly?

i believe you are misunderstanding the concept and the guide likely mentions nothing about 'seeing leases redundantly'...

the dhcp is 'active redundancy', in which typically apart from reservations you will split your scope roughly in half across both dhcp servers...
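To make the "split your scope roughly in half" idea concrete, here is a rough sketch. It assumes the pool is defined by OpenWrt's `start` (offset) and `limit` (number of leases) options in the `lan` DHCP section; `split_pool` is a made-up helper name, and the script only prints the uci commands rather than running them.

```shell
#!/bin/sh
# Split one DHCP pool (OpenWrt 'start' offset and 'limit' count) into two halves.
# split_pool is a hypothetical helper, not part of OpenWrt.
split_pool() {
    start=$1
    limit=$2
    first_limit=$((limit / 2))
    second_start=$((start + first_limit))
    second_limit=$((limit - first_limit))
    # Output: start and limit for the first router, then for the second
    echo "$start $first_limit $second_start $second_limit"
}

# Example: a pool with start 100 and limit 150
set -- $(split_pool 100 150)
echo "master: uci set dhcp.lan.start='$1'; uci set dhcp.lan.limit='$2'"
echo "backup: uci set dhcp.lan.start='$3'; uci set dhcp.lan.limit='$4'"
```

With non-overlapping halves the two routers can't hand out the same address, but each one still only knows the hostnames of its own clients.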

Thanks a lot for your reply.
Well, the issue is that I can no longer reach a lot of my hosts by hostname.
An example: my FreeNAS server receives its DHCP lease from the backup router. When I try to reach it I can't, because the master router and its DNS don't know about the FreeNAS device: its lease was assigned by the backup router, so only the backup router's DNS knows its IP.

Well, DNS is reached through the virtual IP, but it only contains the information the master router has, not the backup router's.
These devices here received a DHCP lease from the backup router (that's the backup router's view).
So I can't reach these devices through DNS, because the virtual IP points to the master router, which doesn't have the DHCP --> hostname mapping for the devices listed below.

So access to my FreeNAS device only works occasionally, when it happens to receive a DHCP lease from the master router instead of the backup router.

I think a workaround would be to use the master router's DNS first, and then the backup router's DNS.

most people switch to fully featured dns/dhcp servers to implement synchronization...

plenty of guides on the net about this...

I don't know if dnsmasq can somehow be configured to sync leases in any way.

You could setup /etc/ethers with hosts you typically access using hostnames.


Yeah exactly, I think some kind of sync mechanism would solve this issue entirely.
Is there some alternative to dnsmasq with support for sync?

Otherwise the only good solution for high availability in this case is a dedicated machine for DNS and DHCP.

dnsmasq stores current leases in a text file, /tmp/dhcp.leases by default in OpenWrt. The path is a configuration option you can change from UCI or the LuCI web interface (Network -> DHCP and DNS -> Resolv and Hosts files -> Lease File).

This is what it looks like on my OpenWrt router VM

root@VM-router:~# cat /tmp/dhcp.leases
1633703346 00:1c:42:0f:b1:c7 192.168.222.244 hostname1 01:00:1c:42:0f:b1:c7
1633703352 c4:41:1e:68:97:62 192.168.222.243 hostname2 01:c4:41:1e:68:97:62
1633703161 c0:10:b1:2c:e4:e6 192.168.123.148 * 01:c0:10:b1:2c:e4:e6
1633703141 e8:f4:08:1f:9c:67 192.168.123.69 hostname3 01:e8:f4:08:1f:9c:67

The first number is a timestamp (seconds since Unix "beginning of time" date which is somewhere in 1970, so it should be consistent with another device if the clocks are set correctly), then there is mac address of the device, then IP, then hostname (I redacted the hostnames of my devices above), then it seems another mac address but I'm not sure of what that is.
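As a quick illustration of that layout, here is a small sketch that prints hostname, IP and MAC for each lease. As far as I know from the dnsmasq documentation, the fifth field is the DHCP client-id (for Ethernet clients often `01:` followed by the MAC), and `*` stands for "not supplied". `list_leases` is a made-up helper name.

```shell
#!/bin/sh
# Print hostname, IP and MAC for every lease in a dnsmasq lease file.
# list_leases is a hypothetical helper name.
list_leases() {
    # Fields: 1=expiry (epoch seconds) 2=MAC 3=IP 4=hostname ('*' if none) 5=client-id
    awk '{ name = ($4 == "*") ? "(no hostname)" : $4; print name, $3, $2 }' "$1"
}

# Example usage on the router: list_leases /tmp/dhcp.leases
```

Leases without a hostname (the `*` entries) are the ones that can't be resolved by name on either router.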

I think if you just set up a cron job that copies the file from the master to the slave with scp, the slave should know the leases when it's time to take over. I never tested this, but I think it should work.

Note that /tmp is a folder that lives in RAM so there is no problem with flash wearing out, you can sync this every 5 seconds, no worries

root@VM-router:~# df -h /tmp
Filesystem                Size      Used Available Use% Mounted on
tmpfs                   242.2M    360.0K    241.8M   0% /tmp

I think we have to develop some kind of script for that.
Unfortunately the slave will also distribute DHCP leases. So slave and master have to compare both their leases files and sync the missing information on their side. I don't know if such a thing is possible with rsync, or diff through scp or something.

Apparently dnsmasq supports a --dhcp-script=... flag - you can use it to send add/remove events into socat or similar, and have them reach the other side.

The contents of the leases file are the result of a bunch of add/remove operations from either host, applied to the previous leases file.

IIRC BusyBox has awk built in, so it should be possible to write a script that reads the leases file, applies the add/remove events, and writes out the new leases file.

This is pretty much all you need to have near realtime replication work.

ah so it's double-syncing.

rsync/scp aren't smart enough to sync file contents on their own, but a script can just copy the file over to another place in /tmp and then merge it with the existing lease file.

Then it's scp to copy the file to another location in /tmp
then cat file1 file2 to join them
then sort -u to remove duplicates

This is relatively simple and dumb, as it just merges the files every X time, and it assumes that dnsmasq will automatically drop the entries when their lease is up.
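One caveat with plain `sort -u`: it only drops lines that are identical in every field, so a client that renewed on both routers (same MAC, different timestamp) would show up twice. A possible variant, just a sketch and not tested on a real VRRP pair, keeps only the newest entry per MAC address. `merge_leases` is a made-up helper name.

```shell
#!/bin/sh
# Merge dnsmasq lease files, keeping only the newest entry per MAC address.
# merge_leases is a hypothetical helper name.
merge_leases() {
    # Sort by expiry time (field 1), newest first,
    # then keep only the first line seen for each MAC (field 2).
    cat "$@" | sort -k1,1nr | awk '!seen[$2]++'
}

# Example usage: merge_leases /tmp/dhcp.leases /tmp/dhcp_lease_temp > /tmp/dhcp_lease_new
```

The output is ordered newest first, which should not matter for a lease file but is worth knowing if you diff the result.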

You will need to do the following on both routers.

so first thing you import the public SSH key of the router 1 in router 2 (and the reverse) so they can scp to each other without writing the password
this to read the current public key https://openwrt.org/docs/guide-user/security/dropbear.public-key.auth#extras
and this to write the key https://openwrt.org/docs/guide-user/security/dropbear.public-key.auth#web_interface_instructions

then you copy the following script to /bin/dnsmasq-lease-sync.sh and edit the IP address (so it can point to the other router)

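#!/bin/sh

# Syncs the contents of the dnsmasq DHCP lease files

other_router=192.168.11.254

# Fetch the other router's leases; stop if the copy fails so a stale temp file is never merged
scp "root@$other_router:/tmp/dhcp.leases" /tmp/dhcp_lease_temp || exit 1

# Merge both lease files and drop identical lines
cat /tmp/dhcp.leases /tmp/dhcp_lease_temp | sort -u > /tmp/dhcp_lease_new

mv /tmp/dhcp_lease_new /tmp/dhcp.leases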

then you make it executable with

chmod u+x /bin/dnsmasq-lease-sync.sh

and then you add a crontab (scheduled task) so that this script is executed every minute. If you need faster than that you must run this as a service, and it's more stuff to add than this.


*/1 * * * *  /bin/dnsmasq-lease-sync.sh

afaik in OpenWrt that functionality is hooked to call this script https://github.com/openwrt/openwrt/blob/master/package/network/services/dnsmasq/files/dhcp-script.sh

that is redirecting all events to the hotplug subsystem https://openwrt.org/docs/guide-user/base-system/hotplug

so if you place a script in /etc/hotplug.d/dhcp it should be called every time there is a dhcp event and it will receive the information according to the dhcp-script above.
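A minimal sketch of what such a hotplug script could look like. It assumes the variable names (ACTION with the values add/update/remove, plus MACADDR, IPADDR and HOSTNAME) that the linked dhcp-script.sh appears to export, so check that script for your release. The file name and the `SYNC_CMD` override are made up for illustration.

```shell
#!/bin/sh
# Hypothetical /etc/hotplug.d/dhcp/90-lease-sync
# Assumption: dhcp-script.sh exports ACTION (add/update/remove),
# MACADDR, IPADDR and HOSTNAME before the hotplug scripts run.
# SYNC_CMD is an illustrative override so the logic can be tried off-router.
SYNC_CMD="${SYNC_CMD:-/bin/dnsmasq-lease-sync.sh}"

on_dhcp_event() {
    case "${ACTION:-}" in
        add|update|remove)
            # A lease changed on this router: run the sync script
            "$SYNC_CMD"
            ;;
    esac
}

on_dhcp_event
```

This only triggers on local lease events, so changes made on the other router are picked up only when this router sees an event of its own.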

I will try that out tomorrow. I think this will be important for quite a lot of people. In this case, I'd run OpenWrt as a VM on my server, and use a Raspberry Pi 4 or Banana Pi M2+ as a backup while my server is down for some reason.
So I don't need a dedicated x86 machine for OpenWrt anymore.

Hey friends, it works really well after adding the script to /etc/hotplug.d/dhcp!
I think it might be worth adding this to the wiki. Now the setup is working flawlessly!


Very good!

Now you can add a line with

/etc/hotplug.d/dhcp/dnsmasq-lease-sync.sh

in the /etc/sysupgrade.conf file so the script is saved when you do config backup and when you do a firmware upgrade
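If you script this, a small sketch like the following adds the line only when it isn't there yet, so running it twice doesn't create duplicates. `keep_on_upgrade` and the `CONF` override are made-up names for illustration.

```shell
#!/bin/sh
# Append a path to sysupgrade.conf unless that exact line already exists.
# keep_on_upgrade is a hypothetical helper; CONF defaults to /etc/sysupgrade.conf.
keep_on_upgrade() {
    conf="${CONF:-/etc/sysupgrade.conf}"
    # -q: quiet, -x: match the whole line, -F: treat the path as a fixed string
    grep -qxF "$1" "$conf" 2>/dev/null || echo "$1" >> "$conf"
}

# Example usage: keep_on_upgrade /etc/hotplug.d/dhcp/dnsmasq-lease-sync.sh
```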

EDIT: added the script to the wiki article about high availability.

Thanks a lot, but which option is preferable, the /etc/hotplug.d/dhcp one or the cron job one?
One more issue: the guide suggests a change to /etc/init.d/keepalived, but it doesn't explain well how to make it. I had to modify the file myself, and I keep wondering why /tmp/keepalived.conf gets created. Furthermore, its path should be added to sysupgrade.conf so it survives upgrades.


config global_defs
   option alt_config_file          "/etc/keepalived/keepalived.conf"

Unfortunately i don't know where to add that.

cronjob is better as it's done every minute regardless of events.

Keepalived package is integrated with UCI OpenWrt configuration management, so the "better" way to configure it is to edit only the /etc/config/keepalived and put all your configuration there.
This is an example of possible configuration options you can put in the /etc/config/keepalived file https://github.com/openwrt/packages/blob/master/net/keepalived/files/keepalived.config

The OpenWrt UCI system will create a config file in /tmp/keepalived.conf with your configuration options.

But the maintainer allows you to override the UCI configuration. By writing

config global_defs
   option alt_config_file          "/etc/keepalived/keepalived.conf"

in /etc/config/keepalived you are telling it to just make a link to your manually written config file in that place.
So the /tmp/keepalived.conf is now just a link to /etc/keepalived/keepalived.conf.

You can see this if you do a
ls -lah /tmp
in ssh/console

Everything that is created automatically in /tmp is re-created automatically when the device is restarted. /tmp is a RAM folder, not a disk folder.

/tmp/keepalived.conf is created again by UCI system when the service starts up

The only things you need to add to sysupgrade.conf are listed in the last step of the guide, https://openwrt.org/docs/guide-user/network/high-availability#sysupgrade_backup_add_dirs
which adds the whole folders where the config files are, plus the script I added.

/etc/keepalived/
/etc/conntrackd/
/bin/dnsmasq-lease-sync.sh

Thanks a lot. Unfortunately, the config global_defs section somehow had no effect for me, so keepalived started with a blank default config.
I've now set alt_config_file='/etc/keepalived/keepalived.conf' at the top of /etc/init.d/keepalived, and it worked.

EDIT: Saw your message just now, I'll give it a try!
Unfortunately, making these changes to /etc/config/keepalived doesn't have any effect. I still have to manually edit the /etc/init.d/keepalived file.

EDIT:
That's the same issue I have with keepalived.
The custom settings in /etc/config/keepalived that specify the custom path are not applied.
I tried with the OpenWrt 21.02 branch, maybe it's different in master.

One more suggestion for the wiki: the -i flag is missing from the scp command, which is needed to use the public key.
Thanks to @vgaetera I was able to solve this issue, using his code example.
This works now:

#!/bin/sh
# Syncs the contents of the dnsmasq DHCP lease files

SSH_USER="root"
SSH_HOST="172.20.32.2"
SSH_KEY="/etc/dropbear/dropbear_ed25519_host_key"

# Fetch the other router's leases using the key; stop if the copy fails
scp -i "${SSH_KEY}" "${SSH_USER}@${SSH_HOST}:/tmp/dhcp.leases" /tmp/dhcp_lease_temp || exit 1

# Merge both lease files and drop identical lines
cat /tmp/dhcp.leases /tmp/dhcp_lease_temp | sort -u > /tmp/dhcp_lease_new

mv /tmp/dhcp_lease_new /tmp/dhcp.leases

There is also an issue: somehow /tmp/dhcp.leases stays empty even when there are leases on the master router. At least when the backup router has no DHCP leases of its own, the file stays empty.
In some cases it also doesn't update.

EDIT:

I've found a new approach, and I think it's even better.

https://stijn.tintel.eu/blog/2017/09/28/building-redundant-router-setup-open-source-software-part-2