Router not receiving/responding to DHCP requests on LAN?

I upgraded a router at a remote location several hundred miles away to 18.06.1. Ballsy, I know, but it seemed to work.

The device is a ZyXEL NBG6817

Everything seems to work fine, and I even have WireGuard running... but the device isn't giving out IPv4 leases! I see devices on the LAN sending out DHCP DISCOVER packets, but nothing is logged on the router and no leases are ever handed out.

NOTE: IPv6 leases are being handed out fine!

Of course this results in half-broken connectivity, and I'm trying to debug it remotely while distant relatives are kinda annoyed. WTH? Any ideas? Settings are relatively default. Relevant part of /etc/config/network:


config interface 'lan'
	option type 'bridge'
	option ifname 'eth1.1'
	option proto 'static'
	option netmask '255.255.255.0'
	option ip6assign '60'
	option ipaddr '10.79.2.1'
	option broadcast '10.79.2.255'

/etc/config/dhcp

config dnsmasq
	option domainneeded '1'
	option localise_queries '1'
	option rebind_protection '1'
	option rebind_localhost '1'
	option local '/lan/'
	option domain 'lan'
	option expandhosts '1'
	option authoritative '1'
	option readethers '1'
	option leasefile '/tmp/dhcp.leases'
	option resolvfile '/tmp/resolv.conf.auto'
	option nonwildcard '1'
	option cachesize '1000'
	list server '1.1.1.1'
	list server '1.0.0.1'
	list server '2606:4700:4700::1111'
	list server '2606:4700:4700::1001'
	option localservice '0'

config dhcp 'lan'
	option interface 'lan'
	option start '100'
	option dhcpv6 'server'
	option ra 'server'
	option leasetime '2h'
	option ra_management '1'
	list dns 'fd8f:1240:f1e8::1'
	list domain 'lan'
	option force '1'
	option limit '200'

config dhcp 'wan'
	option interface 'wan'
	option ignore '1'

config odhcpd 'odhcpd'
	option maindhcp '0'
	option leasefile '/tmp/hosts/odhcpd'
	option leasetrigger '/usr/sbin/odhcpd-update'
	option loglevel '4'



Does the dnsmasq user exist (this shouldn't be an issue for the nbg6817, as support for it didn't exist before 17.01.0, but…)?

# grep dnsmasq /etc/passwd /etc/group 
/etc/passwd:dnsmasq:x:453:453:dnsmasq:/var/run/dnsmasq:/bin/false
/etc/group:dnsmasq:x:453:dnsmasq

Yep, definitely there

I upgraded this to the release version from a snapshot if that means anything.

I should probably do a tcpdump for the DHCP requests to see if the router even hears them, but for the moment I've set all the devices to static IPs, so they won't be requesting. Sigh.

That's not a solution though, because when I bring my own devices or visitors bring theirs, I want them to get DHCP!
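A minimal sketch of such a capture, assuming tcpdump is installed on the router and that the LAN bridge device is called br-lan (the usual name for a bridged 'lan' interface, but an assumption here). The function name capture_dhcp is made up for illustration.

```shell
# Capture DHCP traffic to see whether DISCOVER packets reach the router at all.
# Assumptions: tcpdump is installed; the bridge device is br-lan unless given as $1.
capture_dhcp() {
    iface="${1:-br-lan}"
    count="${2:-20}"
    # -n: no name resolution, -e: print MAC addresses, -c: stop after N packets
    tcpdump -i "$iface" -n -e -c "$count" 'udp and (port 67 or port 68)'
}
```

Run capture_dhcp on the router, then renew a lease on one client. If requests show up here while dnsmasq logs nothing, the packets reach the bridge and the problem is more likely in the firewall or in how dnsmasq binds to the interface.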

Any error messages for dnsmasq in the syslog (logread)?

Nothing that seems to help:

Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: started, version 2.80test3 cachesize 1000
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP no-DHCPv6 no-Lua TFTP no-conntrack no-ipset no-auth no-DNSSEC no-ID loop-detect inotify dumpfile
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq-dhcp[5731]: DHCP, IP range 10.79.2.100 -- 10.79.2.254, lease time 2h
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using local addresses only for domain test
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using local addresses only for domain onion
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using local addresses only for domain localhost
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using local addresses only for domain local
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using local addresses only for domain invalid
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using local addresses only for domain bind
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using nameserver 2606:4700:4700::1001#53
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using nameserver 2606:4700:4700::1111#53
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using nameserver 1.0.0.1#53
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using nameserver 1.1.1.1#53
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using local addresses only for domain lan
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: reading /tmp/resolv.conf.auto
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using local addresses only for domain test
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using local addresses only for domain onion
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using local addresses only for domain localhost
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using local addresses only for domain local
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using local addresses only for domain invalid
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using local addresses only for domain bind
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using nameserver 2606:4700:4700::1001#53
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using nameserver 2606:4700:4700::1111#53
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using nameserver 1.0.0.1#53
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using nameserver 1.1.1.1#53
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using local addresses only for domain lan
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using nameserver 1.1.1.1#53
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using nameserver 1.0.0.1#53
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using nameserver 2606:4700:4700::1111#53
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using nameserver 2606:4700:4700::1001#53
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using nameserver 2606:4700:4700::1111#53
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: using nameserver 2606:4700:4700::1001#53
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: read /etc/hosts - 4 addresses
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: read /tmp/hosts/odhcpd - 2 addresses
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq[5731]: read /tmp/hosts/dhcp.cfg01411c - 2 addresses
Fri Jan  4 16:44:31 2019 daemon.info dnsmasq-dhcp[5731]: read /etc/ethers - 0 addresses
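The log above contains only startup messages. One hedged way to check whether dnsmasq ever sees a DHCP transaction is to count the DHCP message types in the syslog. The helper name dhcp_events is hypothetical, and the pattern assumes dnsmasq's usual "DHCPDISCOVER(iface) mac" style of log line.

```shell
# Count DHCP message types found in syslog text read from stdin.
# On the router: logread | dhcp_events
dhcp_events() {
    grep -oE 'DHCP(DISCOVER|OFFER|REQUEST|ACK|NAK)' \
        | sort | uniq -c | awk '{ print $2 "=" $1 }'
}
```

Empty output means dnsmasq logged no DHCP exchange at all, which matches the symptom described in this thread.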

While I haven't really tested an 18.06.x release image on my nbg6817 (nor WireGuard at all; personally I'm using strongSwan), I've been successfully running recent snapshots from before and after 18.06 was branched off (I'm sticking to building new master images every 1-6 weeks, depending on what has changed in master).

If you can, I'd try to simplify your configs for testing (e.g. broadcast is implicit and shouldn't be necessary). Your dhcp/dnsmasq config is obviously more complex, but maybe you can reduce it to (almost) defaults for a short test. As long as you remember the router's IP, nothing should break when reverting to a vanilla config. The options most likely to introduce breakage are ra_management, list server and list dns (I don't seem to see this in the reference; also, no IPv4 alternative?).

That said, are the tunnel interface firewall rules allowing DHCP/DNS requests (as needed) through the tunnel, and are they in the correct order?

The wireguard is just for remote access command and control, so it's not a wireguard issue.

Yes I could probably revert it somewhat, but I can't revert it entirely or I'll lose remote access! I don't want to talk people through configuring everything... would take a full day probably.

permissions on var/db files

strace dnsmasq?

check client logs?

The start '100' with limit '200' is not right. Either reduce the limit to 154 or start from less than 54.
From the logs it looks like the DHCP pool is automatically cut off at .254, but you never know.
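The arithmetic behind this can be sketched as follows, assuming the pool covers start through start + limit - 1 within a /24 and that .254 is the last usable host. The function name pool_check is made up for illustration.

```shell
# Compute the last host octet of a DHCP pool in a /24 and flag overflow.
# Assumption: the pool is start .. start+limit-1, and .254 is the last usable host.
pool_check() {
    start="$1"
    limit="$2"
    last=$((start + limit - 1))
    if [ "$last" -gt 254 ]; then
        echo "pool .$start-.$last overflows /24 (max limit for this start: $((255 - start)))"
    else
        echo "pool .$start-.$last fits"
    fi
}

pool_check 100 200   # the values from the posted config
pool_check 100 154   # the suggested reduced limit
```

Under that assumption the posted values would run to .299, which is consistent with the log line showing the range being cut off at 10.79.2.254.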

Also post the firewall configuration for the LAN. By default it is ACCEPT for everything, but if you changed something, it could break the communication.

@dlakelan
Do you have any settings inside /etc/dnsmasq.conf?
I have some ipset rules inside that file, and each time I do a sysupgrade none of my devices get an IP via DHCP. I end up assigning a static IP to my laptop to get SSH to work. There are two ways to solve the problem:

  1. Rename dnsmasq.conf to dnsmasq1.conf, then run /etc/init.d/dnsmasq restart.
    or
  2. Install the dnsmasq-full package. I always use this option because I need ipset support; it's a temporary problem.
  • Installing dnsmasq-full may fix your problem, but first check the dnsmasq.conf file.
    To replace dnsmasq with dnsmasq-full, do the following:
opkg update
opkg remove dnsmasq ; opkg install dnsmasq-full

If it fails to download dnsmasq-full and its dependencies, download them manually from HERE, transfer them to your router, and install them using
opkg install /path/to/package.ipk
or
opkg install /tmp/*.ipk

I caught that limit 200 thing and fixed it last night; we will see if it helps. I haven't been able to test yet.

Also I did have something in /etc/dnsmasq.conf but moving that file didn't help. I may try installing the full version anyway.

You should restart dnsmasq after moving dnsmasq.conf.
Also, if you have any typo inside dnsmasq.conf, dnsmasq will fail silently!

I did restart and dnsmasq was running. The options there were only to limit the src ports for queries, so I don't think they did it but I will check more carefully. Hard to debug this one remotely.

Is the output of dnsmasq --test OK?
Is "Authoritative" checked?
Also check /etc/hosts.
The --log-dhcp option may help you debug DHCP.
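A rough sketch of the syntax check, assuming the init script writes its generated configs to /var/etc as dnsmasq.conf.* (that path is an assumption; pass a different directory as the first argument if yours differs). The function name check_dnsmasq_configs is made up for illustration.

```shell
# Syntax-check every generated dnsmasq config with "dnsmasq --test".
# Assumption: generated files are /var/etc/dnsmasq.conf.* unless a dir is given as $1.
check_dnsmasq_configs() {
    dir="${1:-/var/etc}"
    rc=0
    for conf in "$dir"/dnsmasq.conf.*; do
        [ -e "$conf" ] || continue      # glob matched nothing
        echo "checking $conf"
        dnsmasq --test --conf-file="$conf" || rc=1
    done
    return "$rc"
}
```

For the extra DHCP logging, a line containing just log-dhcp in /etc/dnsmasq.conf is the config-file form of --log-dhcp; restart dnsmasq afterwards and watch logread.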

Another culprit:

You have this enabled but I don't see any allowed or denied interfaces. So it will basically work only on Loopback. You can verify that quickly with "netstat -lnp | grep 67"

It's set up to ignore wan. I did look for a listener on port 67 and it was bound to 0.0.0.0/0, but I'll check again.
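A plain grep for 67 also matches PIDs and other ports that happen to contain 67. A slightly stricter check, sketched here with a hypothetical helper name dhcp_listeners, filters netstat output on the local-address column instead.

```shell
# Print local addresses that have a UDP listener on port 67.
# Reads netstat output from stdin; on the router: netstat -lnup | dhcp_listeners
dhcp_listeners() {
    awk '$1 ~ /^udp/ && $4 ~ /:67$/ { print $4 }'
}
```

No output would mean nothing is listening for DHCP at all; an address such as 0.0.0.0:67 shows that a listener exists, though not which interfaces dnsmasq will actually answer on.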

Is your dnsmasq compiled with DHCP like this: "compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP conntrack ipset auth DNSSEC no-ID loop-detect inotify dumpfile"?

It's just the standard release version from the current repo.

Also, all of this worked before the upgrade. I don't think I changed anything, except maybe I made that error about start and limit, which I will check today.

Then there should be something like this in the config

        list notinterface 'pppoe-wan'

but I did not find it.
