Multiple DNS queries, some rejected

firefexx · March 17, 2019, 12:54pm

Hi,

for DNS, I like to make use of https://1.1.1.1/ instead of my ISP's DNS. Therefore, I set option dns '1.1.1.1 1.0.0.1 2606:4700:4700::1111 2606:4700:4700::1001' in my /etc/config/network configuration within the config interface 'wan' section.

I imagined dnsmasq picks one of the configured DNS server addresses for its requests and switches if one is unavailable. But apparently, dnsmasq sends the queries to all four configured DNS servers simultaneously. I see four outgoing requests in tcpdump and the DNS servers answer the requests. Then, two or three of the slower responses get answered with ICMP unreachable errors.

Is it normal that dnsmasq queries all the servers simultaneously? It seems unnecessary. And is it normal that some of the responses are rejected afterwards? Is that because dnsmasq already knows the answer from the fastest response?

vgaetera · March 17, 2019, 1:10pm

Behavior changes since caching is enabled by default for OpenWrt.
Start troubleshooting from your LAN-client network configuration.

Better move DNSv6 servers to wan6 configuration.
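
A rough sketch of that split in /etc/config/network, assuming a wan6 section exists (the comment lines stand in for whatever other options each section already has):

```
config interface 'wan'
	# other options unchanged
	option peerdns '0'
	option dns '1.1.1.1 1.0.0.1'

config interface 'wan6'
	# other options unchanged; assumes this section exists
	option peerdns '0'
	option dns '2606:4700:4700::1111 2606:4700:4700::1001'
```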

firefexx · March 17, 2019, 1:41pm

Thanks a lot for your answer.

Good to know. I'm not sure how exactly I could troubleshoot this, since the config looks unsuspicious to my eyes. (But I don't have much experience with OpenWrt...) Here is the relevant config.

from /etc/config/network:

config interface 'lan'
	option type 'bridge'
	option ifname 'eth0'
	option proto 'static'
	option ipaddr '192.168.1.1'
	option netmask '255.255.255.0'
	option ip6assign '60'

config interface 'wan'
	option ifname 'eth0.7'
	option proto 'pppoe'
	option username 'xxxxxxxxxxxx'
	option password 'yyyyyyyyyyyyyyyyyyyy'
	option ipv6 'auto'
	option peerdns '0'
	option dns '1.1.1.1 1.0.0.1 2606:4700:4700::1111 2606:4700:4700::1001'

and from /etc/config/dhcp:

config dnsmasq
	option domainneeded '1'
	option boguspriv '1'
	option filterwin2k '0'
	option localise_queries '1'
	option rebind_protection '1'
	option rebind_localhost '1'
	option local '/lan/'
	option domain 'lan'
	option expandhosts '1'
	option nonegcache '0'
	option authoritative '1'
	option readethers '1'
	option leasefile '/tmp/dhcp.leases'
	option resolvfile '/tmp/resolv.conf.auto'
	option nonwildcard '1'
	option localservice '1'

The thing is, I have no wan6 configuration. After setting up my wan configuration, a new wan_6 configuration automatically appears in LuCI when connecting, but I cannot edit it. It seems to be managed by the wan configuration. Therefore, I deleted the unused wan6 interface that was present after flashing. Was this a bad idea?

vgaetera · March 17, 2019, 1:53pm

Check from your PC:

ipconfig /all

Then it should be fine.

firefexx · March 17, 2019, 2:03pm

That's a Windows command, I'm using Linux here. I guess you are interested in the configured name servers?!

$ cat /etc/resolv.conf 
# Generated by NetworkManager
search lan
nameserver 192.168.1.1
nameserver fdxxxxxxxxxxxxxxxxxx

It only contains the local IPs from my OpenWrt router.

vgaetera · March 17, 2019, 2:06pm

grep hosts /etc/nsswitch.conf

OpenWrt:

head -n -0 /tmp/etc/dnsmasq*
pgrep -f -a dnsmasq

fuller · March 17, 2019, 2:09pm

sadly yes

iirc it was discussed on the dnsmasq mailing list some time ago...

i put the following into /etc/firewall.user

# don't spam dns-servers with port-unreach
# (EXT_IF must be set to the WAN device first; type 3 matches every destination-unreachable code)
iptables -I output_rule -o $EXT_IF -p icmp --icmp-type 3 -j DROP

edit:
alternatively (from manpage)

-o, --strict-order

By default, dnsmasq will send queries to any of the upstream servers it knows about and tries to favour servers that are known to be up. Setting this flag forces dnsmasq to try each query with each server strictly in the order they appear in /etc/resolv.conf
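
On OpenWrt, dnsmasq flags are normally set through /etc/config/dhcp rather than passed on the command line. A minimal sketch, assuming the UCI option strictorder is the one that maps to --strict-order (as listed in the OpenWrt DHCP and DNS documentation); untested here:

```
config dnsmasq
	# other options unchanged
	option strictorder '1'
```

Restart dnsmasq afterwards (/etc/init.d/dnsmasq restart) so the generated config is rewritten.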

vgaetera · March 17, 2019, 2:16pm

It shouldn't result in described behavior:

That one matches better:

fuller · March 17, 2019, 2:20pm

true ..... probably not a big (enough) problem though

firefexx · March 17, 2019, 3:15pm

Thanks. Here is the requested output:

$ grep hosts /etc/nsswitch.conf
hosts:      files dns myhostname

$ head -n -0 /tmp/etc/dnsmasq*
# auto-generated config file from /etc/config/dhcp
conf-file=/etc/dnsmasq.conf
dhcp-authoritative
domain-needed
localise-queries
read-ethers
enable-ubus
expand-hosts
bind-dynamic
local-service
domain=lan
server=/lan/
dhcp-leasefile=/tmp/dhcp.leases
resolv-file=/tmp/resolv.conf.auto
stop-dns-rebind
rebind-localhost-ok
dhcp-broadcast=tag:needs-broadcast
addn-hosts=/tmp/hosts
conf-dir=/tmp/dnsmasq.d
user=dnsmasq
group=dnsmasq


dhcp-ignore-names=tag:dhcp_bogus_hostname
conf-file=/usr/share/dnsmasq/dhcpbogushostname.conf


bogus-priv
conf-file=/usr/share/dnsmasq/rfc6761.conf
dhcp-range=set:lan,192.168.1.100,192.168.1.249,255.255.255.0,12h
dhcp-range=set:guest,192.168.9.100,192.168.9.249,255.255.255.0,12h
no-dhcp-interface=pppoe-wan

$ pgrep -f -a dnsmasq
2248 /usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf.cfg01411c -k -x /var/run/dnsmasq/dnsmasq.cfg01411c.pid

vgaetera · March 17, 2019, 3:26pm

OpenWrt:

killall tcpdump; tcpdump -ni any port 53 &
nslookup example.org
nslookup example.org 127.0.0.1

grep -v -e ^# -e ^$ /etc/dnsmasq.conf

firefexx · March 17, 2019, 3:48pm

I restarted dnsmasq, then executed tcpdump and nslookup:

16:39:23.380006 IP 127.0.0.1.37295 > 127.0.0.1.53: 42821+ A? openwrt.org. (29)
16:39:23.380578 IP myIPv4.54617 > 1.1.1.1.53: 48096+ A? openwrt.org. (29)
16:39:23.380662 IP 127.0.0.1.37295 > 127.0.0.1.53: 53967+ AAAA? openwrt.org. (29)
16:39:23.380756 IP myIPv4.54617 > 1.0.0.1.53: 48096+ A? openwrt.org. (29)
16:39:23.381053 IP6 myIPv6.47224 > 2606:4700:4700::1111.53: 48096+ A? openwrt.org. (29)
16:39:23.381235 IP6 myIPv6.47224 > 2606:4700:4700::1001.53: 48096+ A? openwrt.org. (29)
16:39:23.381741 IP myIPv4.3384 > 1.1.1.1.53: 7859+ AAAA? openwrt.org. (29)
16:39:23.386314 IP6 2606:4700:4700::1111.53 > myIPv6.47224: 48096 1/0/0 A 139.59.209.225 (45)
16:39:23.386794 IP6 2606:4700:4700::1001.53 > myIPv6.47224: 48096 1/0/0 A 139.59.209.225 (45)
16:39:23.386964 IP 127.0.0.1.53 > 127.0.0.1.37295: 42821 1/0/0 A 139.59.209.225 (45)
16:39:23.388596 IP6 2606:4700:4700::1111.53 > myIPv6.54843: 7859 1/0/0 AAAA 2a03:b0c0:3:d0::1af1:1 (57)
16:39:23.388699 IP6 2606:4700:4700::1001.53 > myIPv6.54843: 7859 1/0/0 AAAA 2a03:b0c0:3:d0::1af1:1 (57)
16:39:23.389191 IP 127.0.0.1.53 > 127.0.0.1.37295: 53967 1/0/0 AAAA 2a03:b0c0:3:d0::1af1:1 (57)
16:39:23.393962 IP 1.0.0.1.53 > myIPv4.54617: 48096 1/0/0 A 139.59.209.225 (45)
16:39:23.394250 IP 1.1.1.1.53 > myIPv4.54617: 48096 1/0/0 A 139.59.209.225 (45)
16:39:23.394651 IP 1.1.1.1.53 > myIPv4.3384: 7859 1/0/0 AAAA 2a03:b0c0:3:d0::1af1:1 (57)
16:39:23.395016 IP 1.0.0.1.53 > myIPv4.3384: 7859 1/0/0 AAAA 2a03:b0c0:3:d0::1af1:1 (57)
16:39:28.122926 IP 127.0.0.1.52577 > 127.0.0.1.53: 18109+ A? openwrt.org. (29)
16:39:28.123067 IP 127.0.0.1.52577 > 127.0.0.1.53: 27838+ AAAA? openwrt.org. (29)
16:39:28.123374 IP 127.0.0.1.53 > 127.0.0.1.52577: 18109 1/0/0 A 139.59.209.225 (45)
16:39:28.123745 IP 127.0.0.1.53 > 127.0.0.1.52577: 27838 1/0/0 AAAA 2a03:b0c0:3:d0::1af1:1 (57)

The grep finds nothing.
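
To read a capture like this more easily, the upstream answers can be tallied per DNS transaction ID. The following is only a sketch: the function name count_upstream_answers is made up for illustration, and it assumes the one-line-per-packet text format that tcpdump -n printed above.

```shell
# Count how many upstream answers arrived for each DNS transaction ID.
# Reads `tcpdump -n` text output on stdin.
count_upstream_answers() {
    awk '
        # Answers come from source port 53; skip the local 127.0.0.1 replies.
        $3 ~ /\.53$/ && $3 !~ /^127\./ {
            id = $6
            sub(/[^0-9].*$/, "", id)   # drop any flag characters after the ID
            answers[id]++
        }
        END { for (id in answers) print id, answers[id] }
    ' | sort -n
}
```

Saved to a file (capture.txt is a hypothetical name), the capture above can be fed in with count_upstream_answers < capture.txt. For that capture, IDs 7859 and 48096 each show 4 answers, one from every configured server.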

firefexx · March 17, 2019, 3:49pm

I get the impression you are right. I found an answer covering this at https://unix.stackexchange.com/a/361423 along with the referenced Debian issue https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=580064#10

Maybe my description of the behavior was not accurate. I tested again, this time issuing multiple requests rapidly from my client. Only one or two lookups were sent to all servers as I stated originally. The others mainly go to the same IPv6 DNS server. This matches what the Debian issue describes as normal behavior.

So could it be that everything is indeed working as expected? Or is there still something you consider odd, @vgaetera ?

vgaetera · March 17, 2019, 4:08pm

Actually, I've managed to reproduce the issue.
Every new uncached request is forwarded to all the network resolvers.
This doesn't look normal, because it contradicts the description of the --all-servers option in the official documentation.
Maybe the documentation is outdated and the default behavior has changed...
Anyway, this is a minor issue, and AFAIK the Windows 10 resolver behaves similarly.

eduperez · March 17, 2019, 4:12pm

That is my experience, too. I just configured dnsmasq to query just one server, and the problem vanished.

firefexx · March 17, 2019, 5:12pm

Ok, great. Thanks @vgaetera for the support!

Maybe this option means that every request is forwarded to all servers. Since my tests seem to confirm what's stated in the Debian issue I linked, this happens often (every 30 secs or 50 requests) but not always. However...

For completeness: could you describe which config line in which file is needed for this? I think I will stay with the default behavior, but it is nice to know.

fuller · March 17, 2019, 6:41pm

tried this.
but sometimes lookups fail and if there are no other servers configured, the client will see the error (wife=unhappy)

vgaetera · March 17, 2019, 7:22pm

It looks like caching affects this behavior.
If you set cachesize=0, dnsmasq will use the single fastest resolver for some time.
But it still checks all the resolvers periodically, presumably looking for the fastest.
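
For reference, a sketch of how that setting would look in /etc/config/dhcp, assuming the UCI option name cachesize (the one the OpenWrt documentation lists for dnsmasq's cache size). Note that this disables DNS caching on the router entirely:

```
config dnsmasq
	# other options unchanged
	option cachesize '0'
```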

eduperez · March 17, 2019, 8:26pm

I added the line "option allservers '0'" to the main section of '/etc/config/dhcp'.

vgaetera · March 17, 2019, 9:01pm

Unfortunately it changes nothing because allservers=0 by default:
https://openwrt.org/docs/guide-user/base-system/dhcp_configuration
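
One way to check this on the router is to look at the generated dnsmasq config, since a flag only takes effect if it ends up there. A small sketch; the function name selection_flags is made up for illustration:

```shell
# Print the server-selection flags found in the given dnsmasq config file(s),
# or "none" if neither all-servers nor strict-order is set.
selection_flags() {
    grep -h -E '^(all-servers|strict-order)$' "$@" || echo "none"
}
```

On the router it could be called as selection_flags /tmp/etc/dnsmasq*. The generated config posted earlier in this thread contains neither line, which fits allservers being off by default.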