Issues with unbound and odhcp setup

Hi,

I have configured unbound and odhcp according to the documentation:

# cat /etc/config/unbound 
config unbound
	option dns64 '0'
	option edns_size '1280'
	option extended_stats '0'
	option hide_binddata '1'
	option localservice '1'
	option manual_conf '0'
	option num_threads '1'
	option protocol 'default'
	option rebind_localhost '0'
	option rebind_protection '1'
	option recursion 'default'
	option resource 'default'
	option root_age '9'
	option ttl_min '120'
	option verbosity '1'
	option enabled '1'
	option validator '1'
	option unbound_control '1'
	option listen_port '53'
	option dhcp4_slaac6 '1'
	option domain 'home.lan'
	option add_local_fqdn '1'
	option add_extra_dns '1'
	option domain_type 'static'
	option validator_ntp '1'
	option dhcp_link 'odhcpd'
	option add_wan_fqdn '1'
	list trigger_interface 'lan'
	list trigger_interface 'wan'
# cat /etc/config/dhcp 
config dnsmasq
	option domainneeded '1'
	option localise_queries '1'
	option rebind_protection '1'
	option rebind_localhost '1'
	option expandhosts '1'
	option authoritative '1'
	option readethers '1'
	option leasefile '/tmp/dhcp.leases'
	option resolvfile '/tmp/resolv.conf.auto'
	option localservice '1'
	option domain 'home.lan'
	option local '/home.lan/'
	option dnssec '1'
	option port '1053'

config odhcpd 'odhcpd'
	option maindhcp '1'
	option leasefile '/var/lib/odhcpd/dhcp.leases'
	option leasetrigger '/usr/lib/unbound/odhcpd.sh'
	option loglevel '4'

config dhcp 'wan'
	option interface 'wan'
	option ignore '1'

config dhcp 'lan'
	option interface 'lan'
	option start '100'
	option limit '150'
	option leasetime '12h'
	option dhcpv4 'server'
	option dhcpv6 'server'
	option ra 'server'
	option ra_management '1'
	list dhcp_option '6,192.168.1.1'
	option dns '192.168.1.1'
	option domain 'home.lan'

config dhcp 'guest'
	option interface 'guest'
	option start '100'
	option limit '150'
	option leasetime '1h'
	option dhcpv4 'server'
	option dhcpv6 'server'
	option ra 'server'
	option ra_management '1'
	list dhcp_option '6,192.168.2.1'
	option dns '192.168.2.1'
	option domain 'home.lan'

config host
	option name 'nas'
	option mac 'XX:XX:XX:XX:XX:XX'
	option ip '192.168.1.10'
	option dns '1'

One of the nice advantages over dnsmasq is that odhcp seems to have no problem with devices with multiple IP address on the same network (e.g. a laptop connected on both wired and wireless, or a server with multiple network cards). With dnsmasq, Luci shows a hostname for only one of the IP addresses. The others just shows up as a "-", which is annoying to identify which devices are connected. And unbound also nicely returns all IP addresses when doing DNS lookups. Great!

But I'm also having some issues:

When trying to uninstall (instead of just disabling dnsmasq), I can no longer configure static DHCP leases in Luci because the "DHCP and DNS" page is completely gone. But odhcp also uses the same settings. I can still configure static leases manually (editing /etc/config/dhcp or uci command-line), but that's less convenient.

Static DHCP leases are only resolved by unbound while the lease is active. With dnsmasq they seem to be available permanently, even if there is no active lease at the moment. Usually this isn't a problem, except in cases where the lease information is not up-to-date. For example if the router is rebooted after installing updates, some devices do not request a new IP address because they still have a valid lease. But that means their hostname does not resolve anymore until the release is renewed (up to 12h in my configuration).

There is the unbound add_extra_dns=1 option, but it doesn't seem to take into account uci host entries, only domain entries. But that means I have to enter all static hosts twice: a host entry for the static DHCP lease, and another identical domain entry for the DNS. That's not very practical, especially because the host entries already have the dns=1 option, to indicate they should be entered into dns.

The hostname of the router itself does not seem to resolve at all. I have configured 'router' as hostname, but a DNS query for router.home.lan fails with NXDOMAIN:

$ dig a router.home.lan

; <<>> DiG 9.11.5-P4-5.1ubuntu2.2-Ubuntu <<>> a router.home.lan
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 36181
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1280
;; QUESTION SECTION:
;router.home.lan.		IN	A

;; AUTHORITY SECTION:
home.lan.		7200	IN	SOA	localhost. nobody.invalid. 26528266 3600 1200 9600 300

;; Query time: 4 msec
;; SERVER: fd81:631b:716f:10::1#53(fd81:631b:716f:10::1)
;; WHEN: di jun 09 11:53:35 CEST 2020
;; MSG SIZE  rcvd: 103

With the DDNS name of the router I have a similar problem. A DNS query returns an empty response with no A record at all:

$ dig a ddns.domain.tld

; <<>> DiG 9.11.5-P4-5.1ubuntu2.2-Ubuntu <<>> a ddns.domain.tld
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16933
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1280
;; QUESTION SECTION:
;ddns.domain.tld.		IN	A

;; Query time: 32 msec
;; SERVER: fd81:631b:716f:10::1#53(fd81:631b:716f:10::1)
;; WHEN: di jun 09 11:46:32 CEST 2020
;; MSG SIZE  rcvd: 51

dnsmasq returns the correct result:

$ dig -p1053 a ddns.domain.tld

; <<>> DiG 9.11.5-P4-5.1ubuntu2.2-Ubuntu <<>> -p1053 a ddns.domain.tld
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29189
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;ddns.domain.tld.		IN	A

;; ANSWER SECTION:
ddns.domain.tld.	268	IN	A	A.B.C.D

;; Query time: 5 msec
;; SERVER: fd81:631b:716f:10::1#1053(fd81:631b:716f:10::1)
;; WHEN: di jun 09 11:50:35 CEST 2020
;; MSG SIZE  rcvd: 67

This is a major issue for me. It also cause trouble for updating the DDNS record. I assume that's because the ddns-script can't check the current IP address. The strange thing is that this problem doesn't happen all the time.

Jef

Resolving the DDNS hostname works after setting add_wan_fqdn=1.

Resolving the local hostname of the router works after setting add_local_fqdn=3. Strange that add_local_fqdn=1 isn't sufficient. All DHCP provided hostnames are automatically added as fqdn (the hostname alone doesn't even resolve), but setting add_local_fqdn=1 only adds the hostname of the router and not the fqdn. That's not a very consistent behavior.

BTW, is there a way to make unbound resolve plain DHCP hostnames too, like dnsmasq does?

It should already do that: ok, I think there might be a bug in the scripts somewhere but queries from a local client should be correctly answered...

... once you let unbound know that the server for home.lan is dnsmasq.
option dhcp_link 'dnsmasq'

EDIT: if you have intentionally removed dnsmasq here, then I shut up :slight_smile: and wait for the solution as well, out of curiosity

First, the options we are talking about here are "off standard" to make SOHO routers work for a local purpose.... Unbound follows IETF RFC for better global resolution and it doesn't really like to resolve TLD as a host A/AAAA. Unbound also isn't an authoritative server, so it is not "handling" the domain, but has static-cache for data (that it didn't lookup). dnsmasq has some IETF deviations that are convenient for local networks, it is pseudo authoritative, is happy about hostnames as TLD, and makes adding domain to host names an option. I submitted add_local_fqdn and add_wan_fqdn options to dnsmasq first to extend OpenWrt original behavior. I copied this implementation (once reviewed, abused, and accepted) to Unbound. add_local_fqdn 3 or 4 is most correct per global standards. add_wan_fqdn should be 0 and have DNS upstream. For consumer ISP it is convenient to override that mess. Dynamic DNS will give you A/AAAA or CNAME records, but the other direction PTR records are owned by your ISP.

The main reason I'm trying unbound is for the recursive dns resolver. Dnsmasq simply doesn't support that. Dnsmasq might be a bit better at handling a local network, because it was designed for that purpose, but so far the unbound integration seems to work fine for me. At least once I figured out how to configure it correctly (e.g. add_wan_fqdn=0 and add_local_fqdn=3).

The fact that unbound doesn't resolve plain hostnames, and doesn't automatically adds a domain name, is usually not a problem. I have configured the DHCP server to send a domain name, and that way the local resolver on each client will take care of adding the domain name to plain hostnames.

The main feature I'm missing is to automatically enter all static dhcp leases into unbound dns with add_extra_dns=1 option. Now I need two entries for each host:

config host
	option name 'nas'
	option mac 'XX:XX:XX:XX:XX:XX'
	option ip '192.168.1.10'
	option dns '1'

config domain
	option name 'nas.home.lan'
	option ip '192.168.1.10'

I'm aware of that. I mainly need the ddns name as an easy to remember alias for the dynamic IP address. The fact that a reverse dns lookup does not return my ddns hostname, but something controlled by my isp, is no problem for me.

Could the duplicate host and domain entry cause problems? I noticed that sometimes I can't resolve nas.home.lan anymore. This happens almost daily. Restarting unbound fixes the problem.

I'm not sure if this is related, but in the logs I'm seeing this error:

Tue Jun 16 18:22:38 2020 daemon.err unbound: [24054:0] error: error parsing local-data at 2 '.138.16.192.in-addr.arpa. 300 IN PTR windows-rjel7sp.home.lan': Empty label
Tue Jun 16 18:22:38 2020 daemon.err unbound: [24054:0] error: error parsing local-data at 35 'windows-rjel7sp.home.lan. 300 IN A 192.16.138': Conversion error, ip4 addr expected

And at startup:

Sun Jun 14 12:09:44 2020 daemon.warn unbound: [24054:0] warning: duplicate local-zone home.lan.
Sun Jun 14 12:09:44 2020 daemon.warn unbound: [24054:0] warning: duplicate local-zone home.lan.
Sun Jun 14 12:09:44 2020 daemon.warn unbound: [24054:0] warning: duplicate local-zone home.lan.

Okay. I am not sure the scripts for Unbound do or even can interpret those the same as they may be (depending on other options) by dnsmasq. That option was created to ease trial of Unbound with existing dnsmasq configuration, but wasn't tested on all possible means of using those dnsmasq clauses. dnsmasq has some magic where implies a match to the local domain or suffixes it accordingly.

As far as the actual errors, it is weird. It is like the IP address was mangled. You didn't provide the DHCP UCI clause that created it, so I cannot be sure how.

The main reason I added the domain entry in addition to the host entry is because after a reboot of the router, the dhcp table (/var/lib/odhcp/dhcp.leases) is cleared. That means the nas is unavailable (in dns) until the nas renews its dhcp lease. Since that's unacceptable for a critical service in my network, I wanted to add it permanently into the dns with the domain entry.

For now I removed the domain entry, to see if that is the source of the problem or not.

It looks indeed like a mangled IP address. It's a dynamic DHCP host, so it's not configured in uci. So this might be a bug in the /usr/lib/unbound/odhcp.sh script? Because that's how dynamic dhcp hosts are added unbound dns?

I'm also experiencing some other very annoying issue with unbound. Once in a while, I loose all network connectivity on my network. Nothing seems to work anymore, except for pure IP based communication (e.g. pinging an IP address or ssh'ing to an ip address still works). After restarting unbound on the router, everything starts working again. I'm not sure what is going wrong. Any suggestions for how to debug this?

1 Like

What openwrt versions you are running /tried?

I'm using the latest 19.07.3 release:

Model: PC Engines apu2
Architecture: AMD GX-412TC SOC
Firmware Version: OpenWrt 19.07.3 r11063-85e04e9f46 / LuCI openwrt-19.07 branch git-20.167.61968-87da00a
Kernel Version: 4.14.180

It's a new setup, so I haven't tried any other versions.

Yes, host entries come through when the DHCP server detects them. Don't use dhcp/domain if you need a jump start. You can add to unbound_srv.conf and its more like a zone file than dnsmasq UCI.
local-data: nas.home.lan. 300 IN CNAME windows-rjel7sp.home.lan.
local-data: windows-rjel7sp.home.lan. 300 IN A 192.168.1.3
(find other examples in /var/lib/unbound/dhcp.conf; LuCI has a display page for everything)

You may have a resource choke. unbound-daemon (light) doesn't use libevent or libpthread. It can only have so many ports open, and depending on time outs the OS can leave them hanging. You said, "...critical service in my network," so if this is a business or other loaded environment you should use unbound-daemon-heavy and use a medium or larger resource setting.

As far as the correct lease parsing, may I please have the relevant lease and comment lines from odhcpd lease file for your NAS? Maybe odhcpd intermittently generates something extra or different for static leases. It may be an easy tweak but I don't have an example myself.

1 Like

Thanks for the info. I'll give this a try.

I'm already using the unbound-daemon-heavy variant. The router has 4G of memory, and there still is plenty of free memory available. So I would be really surprised if this is a memory resource related problem. Anyway, the unbound memory resource setting was set to "default", and I've changed it to to "large" now.

If unbound is running out file descriptors, then I would expect to see some error in the logs?

Since the router has a quad core cpu, I also increased the unbound num_threads setting to 4 a while ago. (This was not yet in the settings I posted earlier.) It seems it's only using 2 threads.

Sure:

# cat /var/lib/odhcpd/dhcp.leases
...
192.168.1.10	nas.home.lan	nas
# br-lan 3cecef01a300 ipv4 nas -1 c0a8010a 32 192.168.1.10/32
...

Note that the parsing error was not for the NAS. The windows-rjel7sp host is just another PC on the network. It does not have static dhcp entry configured, so it gets a dynamic ip address. I also only saw that error once, and I haven't seen it again. So this was probably just a one time issue that's not related to the other problems. I just noticed it in the logs while investigating the other issues. But if I see it again, I'll report the dhcp.leases file.

It looks like unbound forgets about all dhcp entries somehow. It's not just my nas that doesn't resolve anymore, but every single host in my network. The only exceptions seems to be the router itself and the two switches (which have a static IP address, and have a host entry in luci):

# unbound-control list_local_zones | grep home.lan
home.lan. transparent

# unbound-control list_local_data | grep home.lan
home.lan.       7200    IN      SOA     localhost. nobody.invalid. 26557350 3600 1200 9600 300
home.lan.       7200    IN      NS      localhost.
home.lan.       7200    IN      TXT     "comment=local intranet dns zone"
router.home.lan.        300     IN      A       192.168.1.1
router.home.lan.        300     IN      A       192.168.2.1
switch-garage.home.lan. 300     IN      A       192.168.1.2
switch-living.home.lan. 300     IN      A       192.168.1.3
1.1.168.192.in-addr.arpa.       300     IN      PTR     router.home.lan.
2.1.168.192.in-addr.arpa.       300     IN      PTR     switch-garage.home.lan.
3.1.168.192.in-addr.arpa.       300     IN      PTR     switch-living.home.lan.
1.2.168.192.in-addr.arpa.       300     IN      PTR     router.home.lan.

The strangest part is that the problem appears and then disappears again. Any idea what could cause this?

1 Like

To collect some information at the time the problem appears or disappears, I ran the following script:

while true; do
	date
	nslookup nas.home.lan
	ls -la /var/lib/unbound/
	sleep 20
done

And this is what I got:

Tue Jun 30 11:02:40 UTC 2020
Server:         127.0.0.1
Address:        127.0.0.1#53

** server can't find nas.home.lan: NXDOMAIN
** server can't find nas.home.lan: NXDOMAIN
drwxr-xr-x    2 unbound  unbound        220 Jun 30 10:26 .
drwxr-xr-x    6 root     root           120 Jun 12 14:12 ..
-rw-r--r--    1 unbound  unbound    1545720 Jun 18 09:37 adb_list.overall
-rw-r--r--    1 unbound  unbound     198416 Jun  9 20:46 ca-certificates.crt
-rw-r--r--    1 unbound  unbound       8553 Jun 30 10:26 dhcp.conf
-rw-r--r--    1 root     root          8939 Jun 30 10:26 dhcp_dns.old
-rw-r--r--    1 unbound  unbound         25 Jun 11 16:04 hotplug.time
-rw-r--r--    1 unbound  unbound        758 Jun 30 09:58 root.key
-rw-r--r--    1 unbound  unbound       7375 Jun 29 14:30 unbound.conf
-rw-r--r--    1 unbound  unbound        525 Jun  9 20:46 unbound_ext.conf
-rw-r--r--    1 unbound  unbound        873 Jun 29 14:25 unbound_srv.conf

Tue Jun 30 11:03:00 UTC 2020
Server:         127.0.0.1
Address:        127.0.0.1#53

Name:      nas.home.lan
Address 1: 192.168.1.10
drwxr-xr-x    2 unbound  unbound        220 Jun 30 11:03 .
drwxr-xr-x    6 root     root           120 Jun 12 14:12 ..
-rw-r--r--    1 unbound  unbound    1545720 Jun 18 09:37 adb_list.overall
-rw-r--r--    1 unbound  unbound     198416 Jun  9 20:46 ca-certificates.crt
-rw-r--r--    1 unbound  unbound       8297 Jun 30 11:02 dhcp.conf
-rw-r--r--    1 root     root          8939 Jun 30 11:03 dhcp_dns.old
-rw-r--r--    1 unbound  unbound         25 Jun 11 16:04 hotplug.time
-rw-r--r--    1 unbound  unbound        758 Jun 30 09:58 root.key
-rw-r--r--    1 unbound  unbound       7375 Jun 29 14:30 unbound.conf
-rw-r--r--    1 unbound  unbound        525 Jun  9 20:46 unbound_ext.conf
-rw-r--r--    1 unbound  unbound        873 Jun 29 14:25 unbound_srv.conf

Look at the timestamps of the dhcp.conf and dhcp_dns.old files. They are updated just before nslookup returns an IP address again.

So I suspect some issue with the /usr/lib/unbound/odhcpd.sh script.

I have some additional info. To debug the problem, I made a small wrapper script to execute sh -x /usr/lib/unbound/odhcpd.sh and log the output to a unique file based on the timestamp and PID. That revealed something interesting: the script gets often called twice within the same second:

# cat odhcp.log.20200630T180112.19593
Tue Jun 30 18:01:12 UTC 2020
+ . /lib/functions.sh
+ N='
'
+ _C=0
+ NO_EXPORT=1
+ LOAD_STATE=1
+ LIST_SEP=' '
+ reset_cb
+ '[' -z  -a -f /lib/config/uci.sh ]
+ . /lib/config/uci.sh
+ CONFIG_APPEND=
+ . /usr/lib/unbound/defaults.sh
+ UB_LIBDIR=/usr/lib/unbound
+ UB_VARDIR=/var/lib/unbound
+ UB_PIDFILE=/var/run/unbound.pid
+ UB_TOTAL_CONF=/var/lib/unbound/unbound.conf
+ UB_CORE_CONF=/var/lib/unbound/server.conf.tmp
+ UB_HOST_CONF=/var/lib/unbound/host.conf.tmp
+ UB_DHCP_CONF=/var/lib/unbound/dhcp.conf
+ UB_ZONE_CONF=/var/lib/unbound/zone.conf.tmp
+ UB_CTRL_CONF=/var/lib/unbound/ctrl.conf.tmp
+ UB_SRVMASQ_CONF=/var/lib/unbound/dnsmasq_srv.conf.tmp
+ UB_EXTMASQ_CONF=/var/lib/unbound/dnsmasq_ext.conf.tmp
+ UB_SRV_CONF=/var/lib/unbound/unbound_srv.conf
+ UB_EXT_CONF=/var/lib/unbound/unbound_ext.conf
+ UB_TLS_KEY_FILE='TLS server UCI not implemented'
+ UB_TLS_PEM_FILE='TLS server UCI not implemented'
+ UB_TLS_FWD_FILE=/var/lib/unbound/ca-certificates.crt
+ UB_TLS_ETC_FILE=/etc/ssl/certs/ca-certificates.crt
+ UB_RKEY_FILE=/var/lib/unbound/root.key
+ UB_RHINT_FILE=/var/lib/unbound/root.hints
+ UB_TIME_FILE=/var/lib/unbound/hotplug.time
+ UB_SKIP_FILE=/var/lib/unbound/skip.time
+ UB_CTLKEY_FILE=/var/lib/unbound/unbound_control.key
+ UB_CTLPEM_FILE=/var/lib/unbound/unbound_control.pem
+ UB_SRVKEY_FILE=/var/lib/unbound/unbound_server.key
+ UB_SRVPEM_FILE=/var/lib/unbound/unbound_server.pem
+ date '+%s'
+ UB_XSER=26559001
+ UB_XSOA='7200 IN SOA localhost. nobody.invalid. 26559001 3600 1200 9600 300'
+ UB_XNS='7200 IN NS localhost.'
+ UB_XTXT='7200 IN TXT "comment=local intranet dns zone"'
+ UB_MTXT='7200 IN TXT "comment=masked internet dns zone"'
+ UB_LTXT='7200 IN TXT "comment=rfc6762 multicast dns zone"'
+ UB_ANCHOR=/usr/sbin/unbound-anchor
+ UB_CONTROL=/usr/sbin/unbound-control
+ UB_CONTROL_CFG='/usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf'
+ odhcpd_zonedata
+ uci_get 'unbound.@unbound[0].dhcp_link'
+ local 'PACKAGE=unbound.@unbound[0].dhcp_link'
+ local 'CONFIG='
+ local 'OPTION='
+ local 'DEFAULT='
+ local 'STATE='
+ /sbin/uci -q get 'unbound.@unbound[0].dhcp_link'
+ RET=0
+ '[' 0 -ne 0 ]
+ return 0
+ local 'dhcp_link=odhcpd'
+ uci_get 'unbound.@unbound[0].dhcp4_slaac6'
+ local 'PACKAGE=unbound.@unbound[0].dhcp4_slaac6'
+ local 'CONFIG='
+ local 'OPTION='
+ local 'DEFAULT='
+ local 'STATE='
+ /sbin/uci -q get 'unbound.@unbound[0].dhcp4_slaac6'
+ RET=0
+ '[' 0 -ne 0 ]
+ return 0
+ local 'dhcp4_slaac6=1'
+ uci_get 'unbound.@unbound[0].domain'
+ local 'PACKAGE=unbound.@unbound[0].domain'
+ local 'CONFIG='
+ local 'OPTION='
+ local 'DEFAULT='
+ local 'STATE='
+ /sbin/uci -q get 'unbound.@unbound[0].domain'
+ RET=0
+ '[' 0 -ne 0 ]
+ return 0
+ local 'dhcp_domain=home.lan'
+ uci_get 'dhcp.@odhcpd[0].leasefile'
+ local 'PACKAGE=dhcp.@odhcpd[0].leasefile'
+ local 'CONFIG='
+ local 'OPTION='
+ local 'DEFAULT='
+ local 'STATE='
+ /sbin/uci -q get 'dhcp.@odhcpd[0].leasefile'
+ RET=0
+ '[' 0 -ne 0 ]
+ return 0
+ local 'dhcp_origin=/var/lib/odhcpd/dhcp.leases'
+ '[' -f /var/lib/unbound/unbound.conf ]
+ '[' -f /var/lib/odhcpd/dhcp.leases ]
+ '[' odhcpd '=' odhcpd ]
+ '[' -n home.lan ]
+ local longconf dateconf
+ local 'dns_ls_add=/var/lib/unbound/dhcp_dns.add'
+ local 'dns_ls_del=/var/lib/unbound/dhcp_dns.del'
+ local 'dns_ls_new=/var/lib/unbound/dhcp_dns.new'
+ local 'dns_ls_old=/var/lib/unbound/dhcp_dns.old'
+ local 'dhcp_ls_new=/var/lib/unbound/dhcp_lease.new'
+ sort /var/lib/odhcpd/dhcp.leases
+ '[' '!' -f /var/lib/unbound/dhcp.conf ]
+ '[' '!' -f /var/lib/unbound/dhcp_dns.old ]
+ date '+%s'
+ date -r /var/lib/unbound/dhcp.conf '+%s'
+ dateconf=309
+ '[' 309 -gt 300 ]
+ longconf=longtime
+ awk -v 'conffile=/var/lib/unbound/dhcp.conf' -v 'pipefile=/var/lib/unbound/dhcp_dns.new' -v 'domain=home.lan' -v 'bslaac=1' -v 'bisolt=0' -v 'bconf=1' -f /usr/lib/unbound/odhcpd.awk /var/lib/unbound/dhcp_lease.new
+ awk '{ print $1 }' /var/lib/unbound/dhcp_dns.old
+ sort
+ uniq
+ cp /var/lib/unbound/dhcp_dns.new /var/lib/unbound/dhcp_dns.add
+ cp /var/lib/unbound/dhcp_dns.new /var/lib/unbound/dhcp_dns.old
+ '[' -f /var/lib/unbound/dhcp_dns.del ]
+ cat /var/lib/unbound/dhcp_dns.del
+ /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas_remove
removed 70 datas
+ '[' -f /var/lib/unbound/dhcp_dns.add ]
+ cat /var/lib/unbound/dhcp_dns.add
+ /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas
added 0 datas
+ rm -f /var/lib/unbound/dhcp_dns.new /var/lib/unbound/dhcp_dns.del /var/lib/unbound/dhcp_dns.add /var/lib/unbound/dhcp_lease.new
#cat odhcp.log.20200630T180112.19599
Tue Jun 30 18:01:12 UTC 2020
+ . /lib/functions.sh
+ N='
'
+ _C=0
+ NO_EXPORT=1
+ LOAD_STATE=1
+ LIST_SEP=' '
+ reset_cb
+ '[' -z  -a -f /lib/config/uci.sh ]
+ . /lib/config/uci.sh
+ CONFIG_APPEND=
+ . /usr/lib/unbound/defaults.sh
+ UB_LIBDIR=/usr/lib/unbound
+ UB_VARDIR=/var/lib/unbound
+ UB_PIDFILE=/var/run/unbound.pid
+ UB_TOTAL_CONF=/var/lib/unbound/unbound.conf
+ UB_CORE_CONF=/var/lib/unbound/server.conf.tmp
+ UB_HOST_CONF=/var/lib/unbound/host.conf.tmp
+ UB_DHCP_CONF=/var/lib/unbound/dhcp.conf
+ UB_ZONE_CONF=/var/lib/unbound/zone.conf.tmp
+ UB_CTRL_CONF=/var/lib/unbound/ctrl.conf.tmp
+ UB_SRVMASQ_CONF=/var/lib/unbound/dnsmasq_srv.conf.tmp
+ UB_EXTMASQ_CONF=/var/lib/unbound/dnsmasq_ext.conf.tmp
+ UB_SRV_CONF=/var/lib/unbound/unbound_srv.conf
+ UB_EXT_CONF=/var/lib/unbound/unbound_ext.conf
+ UB_TLS_KEY_FILE='TLS server UCI not implemented'
+ UB_TLS_PEM_FILE='TLS server UCI not implemented'
+ UB_TLS_FWD_FILE=/var/lib/unbound/ca-certificates.crt
+ UB_TLS_ETC_FILE=/etc/ssl/certs/ca-certificates.crt
+ UB_RKEY_FILE=/var/lib/unbound/root.key
+ UB_RHINT_FILE=/var/lib/unbound/root.hints
+ UB_TIME_FILE=/var/lib/unbound/hotplug.time
+ UB_SKIP_FILE=/var/lib/unbound/skip.time
+ UB_CTLKEY_FILE=/var/lib/unbound/unbound_control.key
+ UB_CTLPEM_FILE=/var/lib/unbound/unbound_control.pem
+ UB_SRVKEY_FILE=/var/lib/unbound/unbound_server.key
+ UB_SRVPEM_FILE=/var/lib/unbound/unbound_server.pem
+ date '+%s'
+ UB_XSER=26559001
+ UB_XSOA='7200 IN SOA localhost. nobody.invalid. 26559001 3600 1200 9600 300'
+ UB_XNS='7200 IN NS localhost.'
+ UB_XTXT='7200 IN TXT "comment=local intranet dns zone"'
+ UB_MTXT='7200 IN TXT "comment=masked internet dns zone"'
+ UB_LTXT='7200 IN TXT "comment=rfc6762 multicast dns zone"'
+ UB_ANCHOR=/usr/sbin/unbound-anchor
+ UB_CONTROL=/usr/sbin/unbound-control
+ UB_CONTROL_CFG='/usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf'
+ odhcpd_zonedata
+ uci_get 'unbound.@unbound[0].dhcp_link'
+ local 'PACKAGE=unbound.@unbound[0].dhcp_link'
+ local 'CONFIG='
+ local 'OPTION='
+ local 'DEFAULT='
+ local 'STATE='
+ /sbin/uci -q get 'unbound.@unbound[0].dhcp_link'
+ RET=0
+ '[' 0 -ne 0 ]
+ return 0
+ local 'dhcp_link=odhcpd'
+ uci_get 'unbound.@unbound[0].dhcp4_slaac6'
+ local 'PACKAGE=unbound.@unbound[0].dhcp4_slaac6'
+ local 'CONFIG='
+ local 'OPTION='
+ local 'DEFAULT='
+ local 'STATE='
+ /sbin/uci -q get 'unbound.@unbound[0].dhcp4_slaac6'
+ RET=0
+ '[' 0 -ne 0 ]
+ return 0
+ local 'dhcp4_slaac6=1'
+ uci_get 'unbound.@unbound[0].domain'
+ local 'PACKAGE=unbound.@unbound[0].domain'
+ local 'CONFIG='
+ local 'OPTION='
+ local 'DEFAULT='
+ local 'STATE='
+ /sbin/uci -q get 'unbound.@unbound[0].domain'
+ RET=0
+ '[' 0 -ne 0 ]
+ return 0
+ local 'dhcp_domain=home.lan'
+ uci_get 'dhcp.@odhcpd[0].leasefile'
+ local 'PACKAGE=dhcp.@odhcpd[0].leasefile'
+ local 'CONFIG='
+ local 'OPTION='
+ local 'DEFAULT='
+ local 'STATE='
+ /sbin/uci -q get 'dhcp.@odhcpd[0].leasefile'
+ RET=0
+ '[' 0 -ne 0 ]
+ return 0
+ local 'dhcp_origin=/var/lib/odhcpd/dhcp.leases'
+ '[' -f /var/lib/unbound/unbound.conf ]
+ '[' -f /var/lib/odhcpd/dhcp.leases ]
+ '[' odhcpd '=' odhcpd ]
+ '[' -n home.lan ]
+ local longconf dateconf
+ local 'dns_ls_add=/var/lib/unbound/dhcp_dns.add'
+ local 'dns_ls_del=/var/lib/unbound/dhcp_dns.del'
+ local 'dns_ls_new=/var/lib/unbound/dhcp_dns.new'
+ local 'dns_ls_old=/var/lib/unbound/dhcp_dns.old'
+ local 'dhcp_ls_new=/var/lib/unbound/dhcp_lease.new'
+ sort /var/lib/odhcpd/dhcp.leases
+ '[' '!' -f /var/lib/unbound/dhcp.conf ]
+ '[' '!' -f /var/lib/unbound/dhcp_dns.old ]
+ date '+%s'
+ date -r /var/lib/unbound/dhcp.conf '+%s'
+ dateconf=0
+ '[' 0 -gt 300 ]
+ longconf=increment
+ awk -v 'conffile=/var/lib/unbound/dhcp.conf' -v 'pipefile=/var/lib/unbound/dhcp_dns.new' -v 'domain=home.lan' -v 'bslaac=1' -v 'bisolt=0' -v 'bconf=0' -f /usr/lib/unbound/odhcpd.awk /var/lib/unbound/dhcp_lease.new
+ + sortuniq /var/lib/unbound/dhcp_dns.new -u /var/lib/unbound/dhcp_dns.old
 /var/lib/unbound/dhcp_dns.old
+ sort /var/lib/unbound/dhcp_dns.new /var/lib/unbound/dhcp_dns.old
+ uniq
+ '[' -f /var/lib/unbound/dhcp_dns.del ]
+ cat /var/lib/unbound/dhcp_dns.del
+ /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas_remove
removed 70 datas
+ '[' -f /var/lib/unbound/dhcp_dns.add ]
+ cat /var/lib/unbound/dhcp_dns.add
+ /usr/sbin/unbound-control -c /var/lib/unbound/unbound.conf local_datas
added 0 datas
+ rm -f /var/lib/unbound/dhcp_dns.new /var/lib/unbound/dhcp_dns.del /var/lib/unbound/dhcp_dns.add /var/lib/unbound/dhcp_lease.new

The unbound-control feedback indicates all dhcp-dns entries are being removed:

removed 70 datas
added 0 datas

Could this be some kind of race condition where the two scripts are executing in parallel, and writing over each others data files? I'm not really sure why the script is executed twice, but I suspect it's once for ipv4 and once for ipv6?

1 Like

:man_facepalming: odhcpd does launch the script for each DHCP lease of any kind. Yet odhcpd does not send any per lease arguments at the script such as the way dnsmasq does. A called script needs to read the lease file created by odhpcd. Because of this, it would be more appropriate for odhpcd to pend a script call with a flag, and call the script in 5 second windows if the pending flag is set. However considering the debate had about netifd and procd blasting interface reloads for IPv6 route updates, and the result that init scripts just had to ignore WAN6, I don't expect such a patch to odhcpd either. More clever script hacking I guess...

If there's anything I can do to help, let me know!

Should I create a ticket for this issue?

I am trying to see what works best to mitigate this.

1 Like