Issues with unbound and odhcp setup

I replaced the scripts with the ones from the pull request, and it seems to be working fine now. Nice work!

Do you know whether this fix will be available as an updated package for 19.07? Or only in the next release (20.x)?

1 Like

Hmm, it looks like I was too optimistic. Something is still wrong, because occasionally I'm still seeing cases where I'm unable to resolve any dhcp hostnames. Only the router itself and the host entries are present in the dns. So the same problem I had before. It just seems to happen less frequent now. If I restart unbound everything is fine again.

I'll try to reproduce it again with my wrapper script enabled.

1 Like

Thank you for the heads up! I am still on 18.06.8 and was thinking about to upgrade for some time.....

I used the following wrapper script to log the execution of the /usr/lib/unbound/odhcpd.sh script:

#!/bin/sh

TIMESTAMP=$(date "+%Y%m%dT%H%M%S").$$
LOGDIR=/mnt/data/unbound/$TIMESTAMP
LOGFILE=$LOGDIR/odhcp.log

mkdir -p $LOGDIR 
cp -a /var/lib/odhcpd/dhcp.leases $LOGDIR
cp -a /var/lib/unbound/dhcp* $LOGDIR

date > $LOGFILE
echo $TIMESTAMP >> $LOGFILE

nslookup barracuda.home.lan >> $LOGFILE

sh -x /usr/lib/unbound/odhcpd.sh "$@" >> $LOGFILE 2>&1

nslookup barracuda.home.lan >> $LOGFILE

At the time the problem appears, the script is executed twice (same timestamp, different pid):

20200901T151510.28823
20200901T151510.28828

The logfile of the first one (20200901T151510.28823):

Tue Sep  1 15:15:10 UTC 2020
20200901T151510.28823
Server:		127.0.0.1
Address:	127.0.0.1#53

Name:      barracuda.home.lan
Address 1: 192.168.1.10
Address 2: fd81:631b:716f:10:3eec:efff:fe01:a300
Address 3: 2a02:1810:4f03:f710:3eec:efff:fe01:a300
+ . /lib/functions.sh
+ N='
'
+ _C=0
+ NO_EXPORT=1
+ LOAD_STATE=1
+ LIST_SEP=' '
+ reset_cb
+ '[' -z  -a -f /lib/config/uci.sh ]
+ . /lib/config/uci.sh
+ CONFIG_APPEND=
+ . /usr/lib/unbound/defaults.sh
+ UB_LIBDIR=/usr/lib/unbound
+ UB_VARDIR=/var/lib/unbound
+ UB_PIDFILE=/var/run/unbound.pid
+ UB_TOTAL_CONF=/var/lib/unbound/unbound.conf
+ UB_CORE_CONF=/var/lib/unbound/server.conf.tmp
+ UB_HOST_CONF=/var/lib/unbound/host.conf.tmp
+ UB_DHCP_CONF=/var/lib/unbound/dhcp.conf
+ UB_ZONE_CONF=/var/lib/unbound/zone.conf.tmp
+ UB_CTRL_CONF=/var/lib/unbound/ctrl.conf.tmp
+ UB_SRVMASQ_CONF=/var/lib/unbound/dnsmasq_srv.conf.tmp
+ UB_EXTMASQ_CONF=/var/lib/unbound/dnsmasq_ext.conf.tmp
+ UB_SRV_CONF=/var/lib/unbound/unbound_srv.conf
+ UB_EXT_CONF=/var/lib/unbound/unbound_ext.conf
+ UB_TLS_KEY_FILE='TLS server UCI not implemented'
+ UB_TLS_PEM_FILE='TLS server UCI not implemented'
+ UB_TLS_FWD_FILE=/var/lib/unbound/ca-certificates.crt
+ UB_TLS_ETC_FILE=/etc/ssl/certs/ca-certificates.crt
+ UB_RKEY_FILE=/var/lib/unbound/root.key
+ UB_RHINT_FILE=/var/lib/unbound/root.hints
+ UB_TIME_FILE=/var/lib/unbound/hotplug.time
+ UB_SKIP_FILE=/var/lib/unbound/skip.time
+ UB_CTLKEY_FILE=/var/lib/unbound/unbound_control.key
+ UB_CTLPEM_FILE=/var/lib/unbound/unbound_control.pem
+ UB_SRVKEY_FILE=/var/lib/unbound/unbound_server.key
+ UB_SRVPEM_FILE=/var/lib/unbound/unbound_server.pem
+ date '+%s'
+ UB_XSER=26649555
+ UB_XSOA='7200 IN SOA localhost. nobody.invalid. 26649555 3600 1200 9600 300'
+ UB_XNS='7200 IN NS localhost.'
+ UB_XTXT='7200 IN TXT "comment=local intranet dns zone"'
+ UB_MTXT='7200 IN TXT "comment=masked internet dns zone"'
+ UB_LTXT='7200 IN TXT "comment=rfc6762 multicast dns zone"'
+ UB_ANCHOR=/usr/sbin/unbound-anchor
+ UB_CONTROL=/usr/sbin/unbound-control
+ UB_CONTROL_CFG='/usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf'
+ odhcpd_zonedata
+ uci_get 'unbound.@unbound[0].dhcp_link'
+ local 'PACKAGE=unbound.@unbound[0].dhcp_link'
+ local 'CONFIG='
+ local 'OPTION='
+ local 'DEFAULT='
+ local 'STATE='
+ /sbin/uci -q get 'unbound.@unbound[0].dhcp_link'
+ RET=0
+ '[' 0 -ne 0 ]
+ return 0
+ local 'dhcp_link=odhcpd'
+ uci_get 'unbound.@unbound[0].dhcp4_slaac6'
+ local 'PACKAGE=unbound.@unbound[0].dhcp4_slaac6'
+ local 'CONFIG='
+ local 'OPTION='
+ local 'DEFAULT='
+ local 'STATE='
+ /sbin/uci -q get 'unbound.@unbound[0].dhcp4_slaac6'
+ RET=0
+ '[' 0 -ne 0 ]
+ return 0
+ local 'dhcp4_slaac6=1'
+ uci_get 'unbound.@unbound[0].domain'
+ local 'PACKAGE=unbound.@unbound[0].domain'
+ local 'CONFIG='
+ local 'OPTION='
+ local 'DEFAULT='
+ local 'STATE='
+ /sbin/uci -q get 'unbound.@unbound[0].domain'
+ RET=0
+ '[' 0 -ne 0 ]
+ return 0
+ local 'dhcp_domain=home.lan'
+ uci_get 'dhcp.@odhcpd[0].leasefile'
+ local 'PACKAGE=dhcp.@odhcpd[0].leasefile'
+ local 'CONFIG='
+ local 'OPTION='
+ local 'DEFAULT='
+ local 'STATE='
+ /sbin/uci -q get 'dhcp.@odhcpd[0].leasefile'
+ RET=0
+ '[' 0 -ne 0 ]
+ return 0
+ local 'dhcp_origin=/var/lib/odhcpd/dhcp.leases'
+ '[' -f /var/lib/unbound/unbound.conf ]
+ '[' -f /var/lib/odhcpd/dhcp.leases ]
+ '[' odhcpd '=' odhcpd ]
+ '[' -n home.lan ]
+ local longconf dateconf dateoldf
+ local 'dns_ls_add=/var/lib/unbound/dhcp_dns.add'
+ local 'dns_ls_del=/var/lib/unbound/dhcp_dns.del'
+ local 'dns_ls_new=/var/lib/unbound/dhcp_dns.new'
+ local 'dns_ls_old=/var/lib/unbound/dhcp_dns.old'
+ local 'dhcp_ls_new=/var/lib/unbound/dhcp_lease.new'
+ '[' '!' -f /var/lib/unbound/dhcp.conf ]
+ '[' '!' -f /var/lib/unbound/dhcp_dns.old ]
+ date '+%s'
+ date -r /var/lib/unbound/dhcp.conf '+%s'
+ dateconf=2919
+ date '+%s'
+ date -r /var/lib/unbound/dhcp_dns.old '+%s'
+ dateoldf=2919
+ '[' 2919 -gt 300 ]
+ touch /var/lib/unbound/dhcp_dns.old
+ sort /var/lib/odhcpd/dhcp.leases
+ longconf=longtime
+ awk -v 'conffile=/var/lib/unbound/dhcp.conf' -v 'pipefile=/var/lib/unbound/dhcp_dns.new' -v 'domain=home.lan' -v 'bslaac=1' -v 'bisolt=0' -v 'bconf=1' -f /usr/lib/unbound/odhcpd.awk /var/lib/unbound/dhcp_lease.new
+ awk '{ print $1 }' /var/lib/unbound/dhcp_dns.old
+ sort
+ uniq
+ cp /var/lib/unbound/dhcp_dns.new /var/lib/unbound/dhcp_dns.add
cp: can't stat '/var/lib/unbound/dhcp_dns.new': No such file or directory
+ cp /var/lib/unbound/dhcp_dns.new /var/lib/unbound/dhcp_dns.old
+ cat /var/lib/unbound/dhcp_dns.del
+ /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas_remove
removed 71 datas
+ cat /var/lib/unbound/dhcp_dns.add
+ /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas
cat: can't open '/var/lib/unbound/dhcp_dns.add': No such file or directory
added 0 datas
+ rm -f /var/lib/unbound/dhcp_dns.new /var/lib/unbound/dhcp_dns.del /var/lib/unbound/dhcp_dns.add /var/lib/unbound/dhcp_lease.new
Server:		127.0.0.1
Address:	127.0.0.1#53

** server can't find barracuda.home.lan: NXDOMAIN
** server can't find barracuda.home.lan: NXDOMAIN

and the second one (20200901T151510.28828):

Tue Sep  1 15:15:10 UTC 2020
20200901T151510.28828
Server:		127.0.0.1
Address:	127.0.0.1#53

Name:      barracuda.home.lan
Address 1: 192.168.1.10
Address 2: fd81:631b:716f:10:3eec:efff:fe01:a300
Address 3: 2a02:1810:4f03:f710:3eec:efff:fe01:a300
+ . /lib/functions.sh
+ N='
'
+ _C=0
+ NO_EXPORT=1
+ LOAD_STATE=1
+ LIST_SEP=' '
+ reset_cb
+ '[' -z  -a -f /lib/config/uci.sh ]
+ . /lib/config/uci.sh
+ CONFIG_APPEND=
+ . /usr/lib/unbound/defaults.sh
+ UB_LIBDIR=/usr/lib/unbound
+ UB_VARDIR=/var/lib/unbound
+ UB_PIDFILE=/var/run/unbound.pid
+ UB_TOTAL_CONF=/var/lib/unbound/unbound.conf
+ UB_CORE_CONF=/var/lib/unbound/server.conf.tmp
+ UB_HOST_CONF=/var/lib/unbound/host.conf.tmp
+ UB_DHCP_CONF=/var/lib/unbound/dhcp.conf
+ UB_ZONE_CONF=/var/lib/unbound/zone.conf.tmp
+ UB_CTRL_CONF=/var/lib/unbound/ctrl.conf.tmp
+ UB_SRVMASQ_CONF=/var/lib/unbound/dnsmasq_srv.conf.tmp
+ UB_EXTMASQ_CONF=/var/lib/unbound/dnsmasq_ext.conf.tmp
+ UB_SRV_CONF=/var/lib/unbound/unbound_srv.conf
+ UB_EXT_CONF=/var/lib/unbound/unbound_ext.conf
+ UB_TLS_KEY_FILE='TLS server UCI not implemented'
+ UB_TLS_PEM_FILE='TLS server UCI not implemented'
+ UB_TLS_FWD_FILE=/var/lib/unbound/ca-certificates.crt
+ UB_TLS_ETC_FILE=/etc/ssl/certs/ca-certificates.crt
+ UB_RKEY_FILE=/var/lib/unbound/root.key
+ UB_RHINT_FILE=/var/lib/unbound/root.hints
+ UB_TIME_FILE=/var/lib/unbound/hotplug.time
+ UB_SKIP_FILE=/var/lib/unbound/skip.time
+ UB_CTLKEY_FILE=/var/lib/unbound/unbound_control.key
+ UB_CTLPEM_FILE=/var/lib/unbound/unbound_control.pem
+ UB_SRVKEY_FILE=/var/lib/unbound/unbound_server.key
+ UB_SRVPEM_FILE=/var/lib/unbound/unbound_server.pem
+ date '+%s'
+ UB_XSER=26649555
+ UB_XSOA='7200 IN SOA localhost. nobody.invalid. 26649555 3600 1200 9600 300'
+ UB_XNS='7200 IN NS localhost.'
+ UB_XTXT='7200 IN TXT "comment=local intranet dns zone"'
+ UB_MTXT='7200 IN TXT "comment=masked internet dns zone"'
+ UB_LTXT='7200 IN TXT "comment=rfc6762 multicast dns zone"'
+ UB_ANCHOR=/usr/sbin/unbound-anchor
+ UB_CONTROL=/usr/sbin/unbound-control
+ UB_CONTROL_CFG='/usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf'
+ odhcpd_zonedata
+ uci_get 'unbound.@unbound[0].dhcp_link'
+ local 'PACKAGE=unbound.@unbound[0].dhcp_link'
+ local 'CONFIG='
+ local 'OPTION='
+ local 'DEFAULT='
+ local 'STATE='
+ /sbin/uci -q get 'unbound.@unbound[0].dhcp_link'
+ RET=0
+ '[' 0 -ne 0 ]
+ return 0
+ local 'dhcp_link=odhcpd'
+ uci_get 'unbound.@unbound[0].dhcp4_slaac6'
+ local 'PACKAGE=unbound.@unbound[0].dhcp4_slaac6'
+ local 'CONFIG='
+ local 'OPTION='
+ local 'DEFAULT='
+ local 'STATE='
+ /sbin/uci -q get 'unbound.@unbound[0].dhcp4_slaac6'
+ RET=0
+ '[' 0 -ne 0 ]
+ return 0
+ local 'dhcp4_slaac6=1'
+ uci_get 'unbound.@unbound[0].domain'
+ local 'PACKAGE=unbound.@unbound[0].domain'
+ local 'CONFIG='
+ local 'OPTION='
+ local 'DEFAULT='
+ local 'STATE='
+ /sbin/uci -q get 'unbound.@unbound[0].domain'
+ RET=0
+ '[' 0 -ne 0 ]
+ return 0
+ local 'dhcp_domain=home.lan'
+ uci_get 'dhcp.@odhcpd[0].leasefile'
+ local 'PACKAGE=dhcp.@odhcpd[0].leasefile'
+ local 'CONFIG='
+ local 'OPTION='
+ local 'DEFAULT='
+ local 'STATE='
+ /sbin/uci -q get 'dhcp.@odhcpd[0].leasefile'
+ RET=0
+ '[' 0 -ne 0 ]
+ return 0
+ local 'dhcp_origin=/var/lib/odhcpd/dhcp.leases'
+ '[' -f /var/lib/unbound/unbound.conf ]
+ '[' -f /var/lib/odhcpd/dhcp.leases ]
+ '[' odhcpd '=' odhcpd ]
+ '[' -n home.lan ]
+ local longconf dateconf dateoldf
+ local 'dns_ls_add=/var/lib/unbound/dhcp_dns.add'
+ local 'dns_ls_del=/var/lib/unbound/dhcp_dns.del'
+ local 'dns_ls_new=/var/lib/unbound/dhcp_dns.new'
+ local 'dns_ls_old=/var/lib/unbound/dhcp_dns.old'
+ local 'dhcp_ls_new=/var/lib/unbound/dhcp_lease.new'
+ '[' '!' -f /var/lib/unbound/dhcp.conf ]
+ '[' '!' -f /var/lib/unbound/dhcp_dns.old ]
+ date '+%s'
+ date -r /var/lib/unbound/dhcp.conf '+%s'
+ dateconf=2919
+ date '+%s'
+ date -r /var/lib/unbound/dhcp_dns.old '+%s'
+ dateoldf=0
+ '[' 2919 -gt 300 ]
+ touch /var/lib/unbound/dhcp_dns.old
+ sort /var/lib/odhcpd/dhcp.leases
+ longconf=longtime
+ awk -v 'conffile=/var/lib/unbound/dhcp.conf' -v 'pipefile=/var/lib/unbound/dhcp_dns.new' -v 'domain=home.lan' -v 'bslaac=1' -v 'bisolt=0' -v 'bconf=1' -f /usr/lib/unbound/odhcpd.awk /var/lib/unbound/dhcp_lease.new
+ awk '{ print $1 }' /var/lib/unbound/dhcp_dns.old
+ sort
+ uniq
+ cp /var/lib/unbound/dhcp_dns.new /var/lib/unbound/dhcp_dns.add
cp: can't stat '/var/lib/unbound/dhcp_dns.new': No such file or directory
+ cp /var/lib/unbound/dhcp_dns.new /var/lib/unbound/dhcp_dns.old
cp: can't stat '/var/lib/unbound/dhcp_dns.new': No such file or directory
+ cat /var/lib/unbound/dhcp_dns.del
+ /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas_remove
removed 3 datas
+ cat /var/lib/unbound/dhcp_dns.add
+ /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas
cat: can't open '/var/lib/unbound/dhcp_dns.add': No such file or directory
added 0 datas
+ rm -f /var/lib/unbound/dhcp_dns.new /var/lib/unbound/dhcp_dns.del /var/lib/unbound/dhcp_dns.add /var/lib/unbound/dhcp_lease.new
Server:		127.0.0.1
Address:	127.0.0.1#53

** server can't find barracuda.home.lan: NXDOMAIN
** server can't find barracuda.home.lan: NXDOMAIN

As you can see at the start of the script the dns lookup works just fine, and afterwards it fails with NXDOMAIN. There are a couple of "No such file or directory" errors during the script execution. The contents of the lease file looks normal. I suspect the scripts are simply executing in parallel and overwriting each others file.

@EricLuehrsen If you need some more data, just tell me what you need. I can easily reproduce and detect the problem now!

1 Like

This is about the best script fix that can be done, if we assume odhcpd lease trigger isn't changed.

2 Likes

I'll give the fix a try in the next few days!

In the meantime, I also discovered something strange. I noticed that I have many unbound-control processes running on my router:

# ps | grep unbound-control
 2335 root      5988 S    /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas_remove
 2385 root      5988 S    /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas
 2428 root      5988 S    /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas
 2995 root      5988 S    /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas
 3839 root      5988 S    unbound-control -c /var/lib/unbound/unbound.conf stats_noreset
 3970 root      5988 S    unbound-control -c /var/lib/unbound/unbound.conf stats_noreset
 4090 root      5988 S    unbound-control -c /var/lib/unbound/unbound.conf stats_noreset
25060 root      5964 S    /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas_remove
25064 root      5964 S    /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas_remove
28771 root      5984 S    /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas_remove
29845 root      5984 S    /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas_remove
30351 root      5984 S    /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas
30681 root      5968 S    /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas_remove
30721 root      5968 S    /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas

And they seem to be stuck somehow. If I manually try to execute unbound-control, it also doesn't return:

# unbound-control list_local_zones

At this point dns resolving is also problematic:

$ dig aaaa google.es @192.168.1.1

; <<>> DiG 9.16.1-Ubuntu <<>> aaaa google.es @192.168.1.1
;; global options: +cmd
;; connection timed out; no servers could be reached

$ dig aaaa google.es @fd81:631b:716f:10::1

; <<>> DiG 9.16.1-Ubuntu <<>> aaaa google.es @fd81:631b:716f:10::1
;; global options: +cmd
;; connection timed out; no servers could be reached

When I kill all those processes, everything starts working again.

1 Like

I have been running with the fix for about two weeks now, and I haven't seen the problem anymore. Nice!

1 Like

Hello!

I have the same setup (unbound + odhcpd) and encountering kind of the same issues.

As for the Luci interface not displaying anything anymore, I've asked on Github and it turns out that it was "working by mistake" (see https://github.com/openwrt/luci/issues/4451). Basically, it now needs a "dnsmasq" section to be displayed in the UI. As requests seem to be sent as expected, I'll see if I can find a way to just hide the dnsmasq parts of the UI and still display the lease list.

@jefdriesen have you managed to get the static host entries into unbound?

I've managed to have the host entries put into unbound configuration by adding the following line in the unbound's dnsmasq script:

config_foreach create_host_record host

I'm most likely certain that this is not a proper solution, but it's working for me and it could help other people (you'll need to set add_extra_dns to a value > 0).

I wasn't having this issue with the fix for about a month now, but today it happened again after my ISP had some connectivity issues. I am guessing there are still some issues with this... :\

Edit: Was using the first fix - didn't realize there was a second one - will give that a try...

No, I simply rely on the dhcp leases now. It's not entirely perfect, but in practice it works most of the time. Only after a reboot of the router, or when the host isn't running, dns resolving fails. But now that the dhcp leases are working fine, those are mainly rare corner cases, and so far I haven't run into them yet.

As EricLuehrsen pointed out earlier in this thread, you could use unbound_srv.conf for this purpose (haven't tried that yet, but I assume it works fine), or adding a domain entry with add_extra_dns as I did first. But yes, it would indeed be nice if this information could be obtained directly from the static host entries, instead of having to duplicate that information elsewhere.