DNS responds on wrong ports

I'm having sporadic problems with my internet connection. In an attempt to find the cause, I'm dumping LAN as well as WAN traffic using a port mirror on my switch. While browsing today I caught a slow DNS query from Firefox (11 seconds for DNS resolution).

When I looked at the dump I found something weird. The dump shows queries from my S10 to the router and from the router (with the public IP labelled "wan") to various DNS servers (local ISP, 1.1.1.1 and 8.8.8.8).

The first four lines show a working example of an A query. The rest is the failed attempt to resolve the AAAA record. The router (or, to be specific, probably dnsmasq) uses source port 49749 for all outgoing queries. All responses (regardless of which server sent them) arrive at port 60834. As a result the DNS servers receive "ICMP Port unreachable" and the S10 never receives any response.

What can cause such a port mismatch? It doesn't look right. I'm happy for any pointers, and I'll gladly provide more details from the dump.

It's my first time working with port mirrors but the dump looks alright, so I believe that this is not an instrumentation fault.

Model Raspberry Pi 4 Model B Rev 1.2
Architecture ARMv8 Processor rev 3
Target Platform bcm27xx/bcm2711
Firmware Version OpenWrt 22.03.2 r19803-9a599fee93 / LuCI openwrt-22.03 branch git-22.288.45147-96ec0cd

Wild guess: since the port is different on the return, your DNS packets are being hijacked and redirected, and the router ends up receiving an answer that was meant for the proxy DNS.
The ICMP port unreachable is a very legit answer from your router, given that it receives a reply to a port which was not open in the first place.

Since port is different on the return, your dns packets are hijacked, ...

This would have to happen on the ISP side, right? I want to make sure that I haven't messed anything up on my end before I blame my ISP :wink: Since I'm capturing pretty close to the edge on my end I don't see how I could have caused this. The only thing happening "behind the dump" is "untagging" the VLAN on the switch and then straight to the ONT.

Doesn't sound like it's possible. Can you make a quick diagram of the devices and also the output of

uci export network; uci export dhcp; ls -l  /etc/resolv.* /tmp/resolv.* /tmp/resolv.*/* ; head -n -0 /etc/resolv.* /tmp/resolv.* /tmp/resolv.*/*

Diagram (sort of...)

Switch (TL-SG105PE)
Port 1 & 2: WIFI APs (LAN untagged, PFERDE VLAN 8)
Port 3: Mirror
Port 4: Router (LAN untagged, WAN VLAN 2, PFERDE VLAN 8) Mirrored
Port 5: ONT (WAN untagged) 

Config

root@OpenWrt:~# uci export network; uci export dhcp; ls -l  /etc/resolv.* /tmp/resolv.* /tmp/resolv.*/* ; head -n -0 /etc/resolv.* /tmp/resolv.* /tmp/resolv.*/*
package network

config interface 'loopback'
	option device 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fd4a:ba87:0b21::/48'

config device
	option name 'br-lan'
	option type 'bridge'
	list ports 'eth0'

config interface 'lan'
	option device 'br-lan'
	option proto 'static'
	option ipaddr '192.168.1.1'
	option netmask '255.255.255.0'
	option ip6assign '60'
	list dns '8.8.8.8'
	list dns '1.1.1.1'

config device
	option name 'eth0'

config interface 'wan'
	option proto 'pppoe'
	option username '...'
	option password '...'
	option mtu '1492'
	option device 'eth0.2'
	option ipv6 '0'

config device
	option type '8021q'
	option ifname 'eth0'
	option vid '2'
	option name 'eth0.2'

config device
	option name 'pppoe-wan'
	option type 'tunnel'

config device
	option name 'pppoe-wan'
	option type 'tunnel'

config device
	option type '8021q'
	option ifname 'br-lan'
	option vid '8'
	option name 'pferde'
	option ipv6 '0'

config interface 'Pferde'
	option proto 'static'
	option device 'pferde'
	option ipaddr '192.168.3.1'
	option netmask '255.255.255.0'

package dhcp

config dnsmasq
	option domainneeded '1'
	option boguspriv '1'
	option filterwin2k '0'
	option localise_queries '1'
	option rebind_protection '1'
	option rebind_localhost '1'
	option local '/lan/'
	option domain 'lan'
	option expandhosts '1'
	option nonegcache '0'
	option authoritative '1'
	option readethers '1'
	option leasefile '/tmp/dhcp.leases'
	option resolvfile '/tmp/resolv.conf.d/resolv.conf.auto'
	option nonwildcard '1'
	option localservice '1'
	option ednspacket_max '1232'

config dhcp 'lan'
	option interface 'lan'
	option start '100'
	option limit '150'
	option leasetime '12h'
	option dhcpv4 'server'
	option dhcpv6 'server'
	option ra 'server'
	list ra_flags 'managed-config'
	list ra_flags 'other-config'

config dhcp 'wan'
	option interface 'wan'
	option ignore '1'

config odhcpd 'odhcpd'
	option maindhcp '0'
	option leasefile '/tmp/hosts/odhcpd'
	option leasetrigger '/usr/sbin/odhcpd-update'
	option loglevel '4'

config dhcp 'Pferde'
	option interface 'Pferde'
	option start '100'
	option limit '150'
	option leasetime '12h'

lrwxrwxrwx    1 root     root            16 Oct 14 22:44 /etc/resolv.conf -> /tmp/resolv.conf
-rw-r--r--    1 root     root            47 Jan  1 16:41 /tmp/resolv.conf
-rw-r--r--    1 root     root           121 Jan 16 14:53 /tmp/resolv.conf.d/resolv.conf.auto
-rw-r--r--    1 root     root            51 Jan 16 14:53 /tmp/resolv.conf.ppp

/tmp/resolv.conf.d:
-rw-r--r--    1 root     root           121 Jan 16 14:53 resolv.conf.auto
==> /etc/resolv.conf <==
search lan
nameserver 127.0.0.1
nameserver ::1

==> /tmp/resolv.conf <==
search lan
nameserver 127.0.0.1
nameserver ::1

==> /tmp/resolv.conf.d <==
head: /tmp/resolv.conf.d: I/O error

==> /tmp/resolv.conf.ppp <==
nameserver 217.24.194.66
nameserver 212.62.194.133

==> /tmp/resolv.conf.d/resolv.conf.auto <==
# Interface lan
nameserver 8.8.8.8
nameserver 1.1.1.1
# Interface wan
nameserver 217.24.194.66
nameserver 212.62.194.133

I don't see an issue here.
As a workaround, you can tell dnsmasq to filter out AAAA records, since you don't have IPv6.
Add filter-AAAA to /etc/dnsmasq.conf

Thanks @trendy for looking into it.

I don't really have evidence that this is specifically a problem with AAAA records. This is the only sample I have. I have days' worth of dumps, but I wouldn't know how to filter through them. Those "ICMP port unreachable" messages are pretty common when dnsmasq queries multiple servers for the same record (only the first response is accepted).
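One way to sift through the dumps would be to pair each DNS response with its query by server and transaction ID, and flag only the responses that arrive on a port the query was never sent from. Below is a minimal sketch in Python. The `Packet` record layout and the `find_port_mismatches` helper are made up for illustration, the address 203.0.113.5 is a placeholder for the "wan" IP, and it assumes the UDP/53 packets have already been exported from the capture into such records.

```python
from collections import namedtuple

# Hypothetical record layout: one row per UDP/53 packet exported from the dump.
Packet = namedtuple("Packet", "time src sport dst dport dns_id is_response")


def find_port_mismatches(packets):
    """Return (query, response) pairs where the response arrived on a port
    that no matching query was sent from."""
    pending = {}  # (server ip, client ip, dns id) -> queries seen so far
    mismatches = []
    for p in packets:
        if not p.is_response:
            pending.setdefault((p.dst, p.src, p.dns_id), []).append(p)
            continue
        queries = pending.get((p.src, p.dst, p.dns_id), [])
        if not queries:
            continue  # response without a recorded query, nothing to compare
        if all(q.sport != p.dport for q in queries):
            mismatches.append((queries[-1], p))
    return mismatches


# Placeholder data using the two ports seen in the dump.
packets = [
    Packet(0.00, "203.0.113.5", 49749, "8.8.8.8", 53, 0x1A2B, False),
    Packet(0.02, "8.8.8.8", 53, "203.0.113.5", 60834, 0x1A2B, True),
    Packet(0.03, "203.0.113.5", 49749, "1.1.1.1", 53, 0x1A2B, False),
    Packet(0.05, "1.1.1.1", 53, "203.0.113.5", 49749, 0x1A2B, True),
]

for query, response in find_port_mismatches(packets):
    print(f"{response.src}: query from port {query.sport}, "
          f"response to port {response.dport}")
```

Late duplicate answers that dnsmasq no longer waits for still arrive on the correct port, so this check would not flag them even though they also trigger "ICMP port unreachable".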

To top it all off, I believe that my connection issues are broader than just DNS. But with everything being TLS, it is really hard to find concrete evidence.

I've contacted my ISP. In the meantime, I'm always open to other ideas :wink:

Have you tried to use DoH with https-dns-proxy or DoT with stubby?

No, haven't tried it.

The problem happened again today. Turns out, the problem is not limited to DNS. I saw a QUIC connection that was initialized on port A and the response from the server was on port B. So at least UDP is problematic.

I disabled QUIC in my browser today and there seemed to be fewer disruptions.

I double-checked the checksums. They are all correct. So something between my router and the server must be rewriting those ports.
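The reasoning behind that: the UDP checksum covers the ports (along with a pseudo-header containing the IP addresses), so a port changed by corruption would leave a bad checksum, while a device that rewrites the port deliberately also recomputes it. A rough Python sketch of the IPv4 UDP checksum shows this. The helper names, the address 203.0.113.5 and the two-byte payload are placeholders, not values from the dump.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum, padding odd-length input with a zero byte."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp_checksum(src_ip, dst_ip, sport, dport, payload: bytes) -> int:
    """UDP checksum over the IPv4 pseudo-header, UDP header and payload."""
    length = 8 + len(payload)
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, 17, length))  # 17 = protocol UDP
    header = struct.pack("!HHHH", sport, dport, length, 0)
    checksum = ~ones_complement_sum(pseudo + header + payload) & 0xFFFF
    return checksum or 0xFFFF  # 0 on the wire means "no checksum"


payload = b"\x1a\x2b"  # placeholder payload
as_sent = udp_checksum("8.8.8.8", "203.0.113.5", 53, 49749, payload)
rewritten = udp_checksum("8.8.8.8", "203.0.113.5", 53, 60834, payload)
print(hex(as_sent), hex(rewritten), as_sent != rewritten)
```

Since the two values differ, a packet whose destination port changed in transit only validates if its checksum was updated as well.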

No word from my ISP yet.

My ISP is using a carrier-grade NAT. They suspect that the problem is there.
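To illustrate why a NAT is a plausible suspect, here is a toy model in Python of a port-translating NAT. It is purely illustrative and not a diagnosis: the `ToyNat` class, the address 100.64.0.7 and the starting port 20000 are made up, and a real carrier-grade NAT is far more involved. The point is only that replies are delivered according to the NAT's mapping table, so a wrong entry there sends a reply to a port the client never used.

```python
class ToyNat:
    """Toy port-translating NAT (hypothetical, for illustration only)."""

    def __init__(self, first_public_port=20000):
        self.next_port = first_public_port
        self.out = {}   # (inside ip, inside port) -> public port
        self.back = {}  # public port -> (inside ip, inside port)

    def outbound(self, inside_ip, inside_port):
        """Translate an outgoing packet and remember the mapping."""
        key = (inside_ip, inside_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.out[key]

    def inbound(self, public_port):
        """Translate a reply back; None if no mapping exists."""
        return self.back.get(public_port)


nat = ToyNat()
public_port = nat.outbound("100.64.0.7", 49749)
print(nat.inbound(public_port))  # healthy mapping: back to port 49749

# Simulate a faulty table entry: the reply is now delivered to another port.
nat.back[public_port] = ("100.64.0.7", 60834)
print(nat.inbound(public_port))
```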

Quite possible.