Hello, I am using DoH (https_dns_proxy, set up via the LuCI app) with dnsmasq. Since setting up DoH, I have noticed a nearly daily loss of DNS resolution on all clients. It happens anywhere from once every 2 days to several times a day, and often lasts for hours at a time. Most of the time a reboot fixes it. Restarting https_dns_proxy or dnsmasq doesn't solve the problem.
My router:
Model: TP-Link Archer A7 v5
Firmware: OpenWrt SNAPSHOT r11582-b3779e920e / LuCI Master git-19.327.83508-5e1253f (recently updated to see if that fixed the problem, but it didn't)
Config: DHCP for all LAN clients, IPv6 disabled everywhere possible, adblock enabled, UPnP disabled
https_dns_proxy config (set via luci app, using cloudflare and dns.sb):
https_dns_proxy.@https_dns_proxy[0]=https_dns_proxy
https_dns_proxy.@https_dns_proxy[0].listen_addr='127.0.0.1'
https_dns_proxy.@https_dns_proxy[0].listen_port='5053'
https_dns_proxy.@https_dns_proxy[0].user='nobody'
https_dns_proxy.@https_dns_proxy[0].group='nogroup'
https_dns_proxy.@https_dns_proxy[0].bootstrap_dns='1.1.1.1,1.0.0.1'
https_dns_proxy.@https_dns_proxy[0].url_prefix='https://cloudflare-dns.com/dns-query?ct=application/dns-json&'
https_dns_proxy.@https_dns_proxy[1]=https_dns_proxy
https_dns_proxy.@https_dns_proxy[1].listen_addr='127.0.0.1'
https_dns_proxy.@https_dns_proxy[1].listen_port='5054'
https_dns_proxy.@https_dns_proxy[1].user='nobody'
https_dns_proxy.@https_dns_proxy[1].group='nogroup'
https_dns_proxy.@https_dns_proxy[1].bootstrap_dns='185.222.222.222,185.184.222.222'
https_dns_proxy.@https_dns_proxy[1].url_prefix='https://doh.dns.sb/dns-query?'
/etc/config/dhcp:
config dnsmasq
	option domainneeded '1'
	option localise_queries '1'
	option rebind_protection '1'
	option rebind_localhost '1'
	option local '/lan/'
	option domain 'lan'
	option expandhosts '1'
	option authoritative '1'
	option readethers '1'
	option leasefile '/tmp/dhcp.leases'
	option nonwildcard '1'
	option localservice '1'
	option noresolv '1'
	list doh_backup_server '127.0.0.1#5053'
	list addnhosts '/tmp/adb_list.overall'
	option logqueries '1'
	option logdhcp '1'
	option logfacility '/tmp/dnsmasq.log'
	list server '127.0.0.1#5053'
	list server '127.0.0.1#5054'
When DNS resolution fails, I can still use nslookup on the router via ssh, since I don't have DoH configured for the router's own lookups. But nslookup on clients times out:
nslookup openwrt.org
;; connection timed out; no servers could be reached
Since resolution still works on the router, I can use nslookup there to find the IP of a site and then ping that IP successfully from the clients. Entering an IP address directly in a client's browser also loads pages. So in my view connectivity is fine and the issue is DNS related.
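To narrow down which piece stops answering, each local DoH proxy could be queried directly on the router during an outage, bypassing dnsmasq. The following is only a rough sketch: it assumes dig is installed on the router (it is not part of a default install), check_proxy is a made-up helper name, and the ports are the listen_port values from the config above.

```shell
#!/bin/sh
# Query one local https_dns_proxy instance directly, bypassing dnsmasq.
# Assumption: dig is available on the router.
# check_proxy is a hypothetical helper name used only for this sketch.
check_proxy() {
    port="$1"
    name="${2:-openwrt.org}"
    # +time/+tries keep a dead proxy from blocking the check for long.
    # Lines starting with ';' are dig comments (e.g. timeout messages),
    # so they are dropped before deciding whether an answer came back.
    answer=$(dig @127.0.0.1 -p "$port" "$name" A +short +time=2 +tries=1 2>/dev/null | grep -v '^;')
    if [ -n "$answer" ]; then
        echo "port $port: OK"
    else
        echo "port $port: NO ANSWER"
    fi
}
```

Running check_proxy 5053 and check_proxy 5054 while clients are failing should show whether one or both proxies have stopped answering, or whether they still answer and the problem sits between dnsmasq and the proxies.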
Example output of /tmp/dnsmasq.log when DNS resolution is WORKING:
Nov 30 19:39:14 dnsmasq[2125]: 1505 192.168.1.149/41641 query[A] openwrt.org from 192.168.1.149
Nov 30 19:39:14 dnsmasq[2125]: 1505 192.168.1.149/41641 forwarded openwrt.org to 127.0.0.1
Nov 30 19:39:14 dnsmasq[2125]: 1505 192.168.1.149/41641 forwarded openwrt.org to 127.0.0.1
Nov 30 19:39:14 dnsmasq[2125]: 1506 192.168.1.149/45237 query[AAAA] openwrt.org from 192.168.1.149
Nov 30 19:39:14 dnsmasq[2125]: 1506 192.168.1.149/45237 forwarded openwrt.org to 127.0.0.1
Nov 30 19:39:14 dnsmasq[2125]: 1505 192.168.1.149/41641 reply openwrt.org is 139.59.209.225
Nov 30 19:39:14 dnsmasq[2125]: 1506 192.168.1.149/45237 reply openwrt.org is 2a03:b0c0:3:d0::1af1:1
When it fails, I do not get the "reply" lines. I think I only see the first 3 lines: the initial query and the two forwards. (I would post an example, but the problem is not occurring at the moment.)
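To capture the failing case next time, the log could be scanned for queries that were forwarded but never got a "reply" line. This is a minimal sketch, assuming the log keeps the layout shown above (serial number in the 5th field, client/port in the 6th, action in the 7th); unanswered_queries is a made-up helper name.

```shell
#!/bin/sh
# List queries that dnsmasq forwarded upstream but never logged a "reply" for.
# Assumed line layout (taken from the log excerpt above):
#   <month> <day> <time> dnsmasq[pid]: <serial> <client>/<port> <action> <name> ...
# unanswered_queries is a hypothetical helper name used only for this sketch.
unanswered_queries() {
    awk '
        # Remember each forwarded query, keyed by serial number and client/port.
        $4 ~ /^dnsmasq\[/ && $7 == "forwarded" { pending[$5 " " $6] = $3 " " $8 }
        # A reply with the same key means the query was answered.
        $4 ~ /^dnsmasq\[/ && $7 == "reply"     { delete pending[$5 " " $6] }
        # Whatever is left was forwarded but never answered.
        END { for (k in pending) print k, pending[k] }
    ' "$1"
}
```

Calling unanswered_queries /tmp/dnsmasq.log prints one line per unanswered query (serial, client/port, time, name) in no particular order. The time of the first such line would mark when resolution stopped.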
Does anyone have any ideas?