[SOLVED] DNS / DHCP problem after 19.07 upgrade

Folks:

I have a couple of Netgear R6100 and one Newifi D2 tied together in WDS to cover the far ends of the house. The gateway is named R6100GW and has static IP addresses defined using MAC addresses.

I have an LXC container for MySQL (named database.lan) and another LXC container for Apache (named webserver.lan). On the webserver, I have a few php sites (Drupal, Wordpress, Nextcloud) which are all resolved using names (nextcloud.lan, drupal.lan, and wordpress.lan).

R6100GW doles out IP addresses to the server (192.168.111.251) and to the containers, 192.168.111.149 (webserver) and 192.168.111.92 (database), and they work as well as they did before the upgrade.

I also have host entries (nextcloud.lan, wordpress.lan, drupal.lan), all of them pointing to 192.168.111.149, and this was working fine. After the upgrade, pinging one of them produces a message like:

ping: nextcloud: Temporary failure in name resolution

I completely erased the config and went in baby steps to the point where everything works with static addresses. As soon as I add host entries (Network >> Hostnames >> Add), DNS resolution starts to fail.

I am not very technical, so any help would be appreciated. Otherwise my option is to roll back to 18.x. I can do troubleshooting, list the contents of files, etc., if you need more info.

Many thanks.

Anil

When upgrading to another major release (like from 18 to 19) it is advised not to keep the settings, as things might break.

For your case, since you did a reset to default settings and you started from scratch, this should not be the case any more.
Post the output of the following here so we can see the current situation:

uci export network; uci export dhcp; \
ls -l  /etc/resolv.* /tmp/resolv.*; head -n -0 /etc/resolv.* /tmp/resolv.*

Thanks. Here is the output:

agarg@E7440:~$ rm -r .ssh
agarg@E7440:~$ ssh -l root 192.168.111.1
The authenticity of host '192.168.111.1 (192.168.111.1)' can't be established.
RSA key fingerprint is SHA256:fPxjChrtUMI/tImMYY1PQ+es5/ZlXMoeg8huiRD/aUM.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.111.1' (RSA) to the list of known hosts.
root@192.168.111.1's password: 


BusyBox v1.30.1 () built-in shell (ash)

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
 OpenWrt 19.07.1, r10911-c155900f66
 -----------------------------------------------------
root@R6100GW:~# uci export network; uci export dhcp; \
> ls -l  /etc/resolv.* /tmp/resolv.*; head -n -0 /etc/resolv.* /tmp/resolv.*
package network

config interface 'loopback'
	option ifname 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fd49:745c:1760::/48'

config interface 'lan'
	option type 'bridge'
	option ifname 'eth1.1'
	option proto 'static'
	option netmask '255.255.255.0'
	option ip6assign '60'
	option ipaddr '192.168.111.1'

config interface 'wan'
	option ifname 'eth0'
	option proto 'dhcp'

config switch
	option name 'switch0'
	option reset '1'
	option enable_vlan '1'

config switch_vlan
	option device 'switch0'
	option vlan '1'
	option ports '1 2 3 4 0t'

package dhcp

config dnsmasq
	option domainneeded '1'
	option localise_queries '1'
	option rebind_protection '1'
	option rebind_localhost '1'
	option local '/lan/'
	option domain 'lan'
	option expandhosts '1'
	option authoritative '1'
	option readethers '1'
	option leasefile '/tmp/dhcp.leases'
	option localservice '1'

config dhcp 'lan'
	option interface 'lan'
	option start '100'
	option limit '150'
	option leasetime '12h'

config dhcp 'wan'
	option interface 'wan'
	option ignore '1'

config odhcpd 'odhcpd'
	option maindhcp '0'
	option leasefile '/tmp/hosts/odhcpd'
	option leasetrigger '/usr/sbin/odhcpd-update'
	option loglevel '4'

config host
	option mac '8E:59:08:AC:4C:02'
	option name 'webserver'
	option dns '1'
	option ip '192.168.111.149'

config host
	option mac 'AA:57:03:D9:D8:16'
	option name 'database'
	option dns '1'
	option ip '192.168.111.92'

config host
	option mac '00:80:64:BF:DE:77'
	option name 'w7020'
	option dns '1'
	option ip '192.168.111.251'

lrwxrwxrwx    1 root     root            16 Jan 29 08:05 /etc/resolv.conf -> /tmp/resolv.conf
-rw-r--r--    1 root     root            32 Feb 25 04:30 /tmp/resolv.conf
-rw-r--r--    1 root     root            62 Feb 25 04:29 /tmp/resolv.conf.auto
==> /etc/resolv.conf <==
search lan
nameserver 127.0.0.1

==> /tmp/resolv.conf <==
search lan
nameserver 127.0.0.1

==> /tmp/resolv.conf.auto <==
# Interface wan
nameserver 192.168.11.254
search attlocal.net

The ping responses are below as well. When I ssh into the gateway WDS router, the hostnames resolve and ping fine from its shell. However, from the laptop connected to it via LAN or Wi-Fi, a host does not resolve if it has a static lease entry, but does resolve if its address is fully dynamic, as seen below:

root@R6100GW:~# ping webserver
PING webserver (192.168.111.149): 56 data bytes
64 bytes from 192.168.111.149: seq=0 ttl=64 time=0.897 ms
64 bytes from 192.168.111.149: seq=1 ttl=64 time=0.388 ms
^C
--- webserver ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.388/0.642/0.897 ms
root@R6100GW:~# ping database
PING database (192.168.111.92): 56 data bytes
64 bytes from 192.168.111.92: seq=0 ttl=64 time=0.545 ms
64 bytes from 192.168.111.92: seq=1 ttl=64 time=0.387 ms
^C
--- database ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.387/0.466/0.545 ms
root@R6100GW:~# ping w7020
PING w7020 (192.168.111.251): 56 data bytes
64 bytes from 192.168.111.251: seq=0 ttl=64 time=0.828 ms
^C
--- w7020 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.828/0.828/0.828 ms
root@R6100GW:~# exit
Connection to 192.168.111.1 closed.
agarg@E7440:~$ ping webserver
ping: webserver: Temporary failure in name resolution
agarg@E7440:~$ ping database
ping: database: Temporary failure in name resolution
agarg@E7440:~$ ping w7020
ping: w7020: Temporary failure in name resolution
agarg@E7440:~$ ping e7440
PING E7440 (127.0.1.1) 56(84) bytes of data.
64 bytes from E7440 (127.0.1.1): icmp_seq=1 ttl=64 time=0.053 ms
64 bytes from E7440 (127.0.1.1): icmp_seq=2 ttl=64 time=0.055 ms
^C
--- E7440 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1017ms
rtt min/avg/max/mdev = 0.053/0.054/0.055/0.001 ms
agarg@E7440:~$ 

The E7440 is in the subnet 192.168.111.0/24, right?
What is the content of /etc/resolv.conf of E7440?
You may advertise the available domains to the hosts with

        list dhcp_option '15,lan'
        list dhcp_option '119,lan'

under config dhcp 'lan'

Yes, my laptop (Dell E7440) gets its IP address from R6100GW, and I don't have a static entry for it there. The output from the laptop is as follows:

agarg@E7440:~$ uname -a
Linux E7440 5.3.0-40-lowlatency #32-Ubuntu SMP PREEMPT Fri Jan 31 21:48:22 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
agarg@E7440:~$ sudo cat /etc/resolv.conf
[sudo] password for agarg: 
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "resolvectl status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 127.0.0.53
options edns0
search lan
agarg@E7440:~$ 

Do you want me to add these two lines to the /etc/config/dhcp file on R6100GW? Something like:

config dhcp 'lan'
	option interface 'lan'
	option start '100'
	option limit '150'
	option leasetime '12h'
	list dhcp_option '15,lan'
	list dhcp_option '119,lan'

What do list dhcp_option '15,lan' and '119,lan' mean? Any pointers?

Thanks.

Are these the options listed at http://www.networksorcery.com/enp/protocol/bootp/options.htm?

15	1+	Domain Name.
119	Variable	DNS domain search list.

So is it a way to tell clients to use lan as the domain for FQDNs?
Thanks.

Yes, that's right. Option 15 announces the domain name, and option 119 lists the suffixes a client should append when it is given a bare hostname.
But you have it already in your resolv.conf, so that was not the issue.
Can you make sure the OpenWrt is used as resolver? From the E7440 do the following:

nslookup webserver.lan
nslookup webserver 192.168.111.1

Hi Trendy:

Oddly enough, it has started working just now without me doing anything for the last 12 hours or so...Very odd. Perhaps I did something and Alzheimer is making me think I did nothing!!
Thanks for your efforts. The nslookup output is below.

agarg@E7440:~$ nslookup webserver.lan
Server:		127.0.0.53
Address:	127.0.0.53#53

Non-authoritative answer:
Name:	webserver.lan
Address: 192.168.111.149

agarg@E7440:~$ nslookup webserver 192.168.111.1
Server:		192.168.111.1
Address:	192.168.111.1#53

Name:	webserver.lan
Address: 192.168.111.149

agarg@E7440:~$ 

Sorry to bug you, but the behavior is erratic. Just now webserver was responding to ping but google.com failed, and after some time it was OK. I will be very happy to help narrow this problem down, and I can provide data from all four routers in the WDS. See below:

agarg@E7440:~$ nslookup webserver 192.168.111.1
Server:		192.168.111.1
Address:	192.168.111.1#53

Name:	webserver.lan
Address: 192.168.111.149

agarg@E7440:~$ nslookup nextcloud 192.168.111.1
Server:		192.168.111.1
Address:	192.168.111.1#53

Name:	nextcloud.lan
Address: 192.168.111.149

agarg@E7440:~$ ping nextcloud.lan
ping: nextcloud.lan: Name or service not known
agarg@E7440:~$ ping database.lan
ping: database.lan: Name or service not known
agarg@E7440:~$ ping database
ping: database: Temporary failure in name resolution
agarg@E7440:~$ ssh -l root 192.168.111.9
root@192.168.111.9's password: 


BusyBox v1.30.1 () built-in shell (ash)

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
 OpenWrt 19.07.1, r10911-c155900f66
 -----------------------------------------------------
root@R6100-9:~# ping 192.168.111.149
PING 192.168.111.149 (192.168.111.149): 56 data bytes
64 bytes from 192.168.111.149: seq=0 ttl=64 time=3.924 ms
64 bytes from 192.168.111.149: seq=1 ttl=64 time=5.065 ms
^C
--- 192.168.111.149 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 3.924/4.494/5.065 ms
root@R6100-9:~# ping webserver
ping: bad address 'webserver'
root@R6100-9:~# ping webserver
ping: bad address 'webserver'
root@R6100-9:~# ping webserver
ping: bad address 'webserver'
root@R6100-9:~# exit
Connection to 192.168.111.9 closed.
agarg@E7440:~$ ssh -l root 192.168.111.1
The authenticity of host '192.168.111.1 (192.168.111.1)' can't be established.
RSA key fingerprint is SHA256:fPxjChrtUMI/tImMYY1PQ+es5/ZlXMoeg8huiRD/aUM.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.111.1' (RSA) to the list of known hosts.
root@192.168.111.1's password: 


BusyBox v1.30.1 () built-in shell (ash)

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
 OpenWrt 19.07.1, r10911-c155900f66
 -----------------------------------------------------
root@R6100GW:~# ping webserver
PING webserver (192.168.111.149): 56 data bytes
64 bytes from 192.168.111.149: seq=0 ttl=64 time=0.453 ms
64 bytes from 192.168.111.149: seq=1 ttl=64 time=0.413 ms
^C
--- webserver ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.413/0.433/0.453 ms
root@R6100GW:~# exit
Connection to 192.168.111.1 closed.
agarg@E7440:~$ ssh -l root 192.168.111.3
The authenticity of host '192.168.111.3 (192.168.111.3)' can't be established.
RSA key fingerprint is SHA256:XY2lToY+5XIZuOZu22Q9bMfazgbM4aucpIENQsEqZiM.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.111.3' (RSA) to the list of known hosts.
root@192.168.111.3's password: 


BusyBox v1.30.1 () built-in shell (ash)

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
 OpenWrt 19.07.1, r10911-c155900f66
 -----------------------------------------------------
root@wndr4700:~# ping webserver
ping: bad address 'webserver'
root@wndr4700:~# exit
Connection to 192.168.111.3 closed.
agarg@E7440:~$ ssh -l root 192.168.111.5
The authenticity of host '192.168.111.5 (192.168.111.5)' can't be established.
RSA key fingerprint is SHA256:QcR0uqWFMeQjOhKM3zFc38iWJsKMxXkMZfnNPx2b5x4.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.111.5' (RSA) to the list of known hosts.
root@192.168.111.5's password: 


BusyBox v1.28.4 () built-in shell (ash)

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
 OpenWrt 18.06.5, r7897-9d401013fc
 -----------------------------------------------------
root@Newifi-D2:~# ping webserver
ping: bad address 'webserver'
root@Newifi-D2:~# ping google.com
PING google.com (172.217.12.46): 56 data bytes
64 bytes from 172.217.12.46: seq=0 ttl=52 time=59.514 ms
^C
--- google.com ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 59.514/59.514/59.514 ms
root@Newifi-D2:~# exit
Connection to 192.168.111.5 closed.
agarg@E7440:~$ ping database
ping: database: Temporary failure in name resolution
agarg@E7440:~$ ping database
PING database.lan (192.168.111.92) 56(84) bytes of data.
64 bytes from database.lan (192.168.111.92): icmp_seq=1 ttl=64 time=2.67 ms
64 bytes from database.lan (192.168.111.92): icmp_seq=2 ttl=64 time=1.06 ms
64 bytes from database.lan (192.168.111.92): icmp_seq=3 ttl=64 time=1.68 ms
^C
--- database.lan ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 1.055/1.802/2.670/0.664 ms
agarg@E7440:~$ ping database
PING database.lan (192.168.111.92) 56(84) bytes of data.
64 bytes from database.lan (192.168.111.92): icmp_seq=1 ttl=64 time=2.14 ms
64 bytes from database.lan (192.168.111.92): icmp_seq=2 ttl=64 time=1.88 ms
^C
--- database.lan ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 1.880/2.007/2.135/0.127 ms
agarg@E7440:~$ ping google.com
ping: google.com: Temporary failure in name resolution
agarg@E7440:~$ ping database
PING database.lan (192.168.111.92) 56(84) bytes of data.
64 bytes from database.lan (192.168.111.92): icmp_seq=1 ttl=64 time=0.752 ms
64 bytes from database.lan (192.168.111.92): icmp_seq=2 ttl=64 time=1.51 ms
64 bytes from database.lan (192.168.111.92): icmp_seq=3 ttl=64 time=1.60 ms
64 bytes from database.lan (192.168.111.92): icmp_seq=4 ttl=64 time=1.54 ms
64 bytes from database.lan (192.168.111.92): icmp_seq=5 ttl=64 time=1.87 ms
64 bytes from database.lan (192.168.111.92): icmp_seq=6 ttl=64 time=2.16 ms
64 bytes from database.lan (192.168.111.92): icmp_seq=7 ttl=64 time=1.62 ms
64 bytes from database.lan (192.168.111.92): icmp_seq=8 ttl=64 time=1.82 ms
64 bytes from database.lan (192.168.111.92): icmp_seq=9 ttl=64 time=1.67 ms
^C
--- database.lan ping statistics ---
9 packets transmitted, 9 received, 0% packet loss, time 8011ms
rtt min/avg/max/mdev = 0.752/1.616/2.163/0.360 ms
agarg@E7440:~$ ping google.com
ping: google.com: Temporary failure in name resolution
agarg@E7440:~$ ping google.com
ping: connect: Network is unreachable
agarg@E7440:~$ nslookup google.com
Server:		127.0.0.53
Address:	127.0.0.53#53

Non-authoritative answer:
Name:	google.com
Address: 172.217.14.174
Name:	google.com
Address: 2607:f8b0:4000:806::200e

agarg@E7440:~$ ping google.com
PING google.com (172.217.14.174) 56(84) bytes of data.
64 bytes from 172.217.14.174: icmp_seq=1 ttl=52 time=261 ms
64 bytes from 172.217.14.174: icmp_seq=2 ttl=52 time=375 ms
^C
--- google.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 261.116/318.291/375.466/57.175 ms
agarg@E7440:~$ 

It seems to me that the resolver in your OpenWrt is working fine. Each time you query the name directly from 192.168.111.1 you always get the correct answer.
However, when you don't specify the nameserver to use and rely on what is configured on each host, you sometimes get a response and sometimes don't.
Correct me if I am wrong so far.
So I tend to believe that your hosts are configured with both the OpenWrt and some public nameservers, which are queried in some round robin way. So when the OpenWrt is queried, then you get a response. But when the public nameserver is queried you of course won't get an answer.
Run the following on E7440:
resolvectl status
It will show you which nameservers it is using.
Another way to verify is with tcpdump on the router:
tcpdump -i eth1.1 -nv udp port 53
Then start querying from your hosts without specifying the nameserver. If you see only the destination address 192.168.111.1, the query will be answered. If you see a public destination address, like 8.8.8.8, 1.1.1.1 or your ISP's resolver, then the local name won't be resolved.
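
If the resolvectl output is long, a small filter can make the check quicker to read. This is only a sketch: the function name foreign_resolvers is made up for illustration, the router address is the one from this thread, and it simply pulls every IPv4 address out of the text it is given, so IPv6 nameservers are not covered.

```shell
#!/bin/sh
# Router address from this thread; override by setting ROUTER beforehand.
ROUTER="${ROUTER:-192.168.111.1}"

# foreign_resolvers (hypothetical helper): read resolver status text on
# stdin and print each distinct IPv4 address that is NOT the router.
foreign_resolvers() {
    # Extract anything shaped like a dotted-quad IPv4 address.
    { grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' || true; } |
        sort -u |
        awk -v r="$ROUTER" '$0 != r'
}

# Typical use on the laptop:
#   resolvectl status | foreign_resolvers
```

If this prints nothing, every IPv4 address in the output is the router. Anything it does print is a nameserver (or other address) worth a closer look.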

Thanks Trendy. I will try these for academic purposes, as I think I found the problem. On one of the WDS clients on this network, I had missed turning off the DHCP server, and as a result the machines connected to its Wi-Fi may have exhibited this problem. I use a common SSID on different channels on different radios, so the erratic behavior likely came from there.

I much appreciate your help and guidance. I believe my problem was resolved as soon as I checked the "Ignore interface" box to disable DHCP for this interface.
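
For anyone landing here later: in /etc/config/dhcp on the WDS client (not the gateway), that checkbox should correspond to something like the fragment below. It is a sketch that assumes the DHCP section is still named 'lan' as in the default config. It mirrors the ignore option already shown for 'wan' in the export above.

```
config dhcp 'lan'
	option interface 'lan'
	option ignore '1'
```

After changing the file by hand, dnsmasq needs a restart (for example /etc/init.d/dnsmasq restart) for the setting to take effect.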

Cheers! And thanks again.

If the problem is solved, feel free to mark the topic accordingly.


Thank you so much, again!

