I have an issue where, if a decent number of UDP packets are sent from my LAN to my router, the WAN stops passing any UDP traffic (and thus DNS) for a few minutes.
A simple nmap scan in UDP mode against one target is enough to bring down Internet access, since DNS goes out. Likewise, if one client sends around 2,000 DNS requests over UDP, no client on the network can make DNS requests to any resolver (the router's, 1.1.1.1, 8.8.8.8, etc.). Running a video call can also trigger the outage at times.
This is all with clients connected via wired Ethernet.
Hardware:
Raspberry Pi 4B rev1.4 with 8GB memory
OpenWrt 23.05.2 r23630-842932a63d / LuCI openwrt-23.05 branch git-24.006.68745-9128656
(wifi adapter disabled)
AX88179 USB adapter for the WAN connection
Integrated NIC connects to a managed switch for the LAN side
I enabled system.log to disk. Nothing of note shows up in the logs when the outage occurs.
I tried connecting a desktop's integrated NIC directly to the fiber ONT, bypassing OpenWrt entirely. If I called nslookup 50,000 times all at once I could cause some UDP packet loss, but once I cancelled the while loop of nslookup calls it recovered very quickly.
Repeating the same test with the router in-place results in the UDP DNS packets failing for minutes even after the nslookup client stops.
I did do continuity and speed-benchmarks on the Ethernet wiring as well. All indicators there came back fine.
My next idea is to get a non-OpenWrt router and place it on the network, then repeat the nmap UDP and nslookup DNS tests to see whether it responds better.
I wonder if OpenWrt has some kind of low connection table limit that is being overrun?
Question: Any suggestions on additional logs or commands to analyze network tables on the Raspberry Pi with OpenWrt itself? Any tips on additional network logging I could enable?
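One way to check the connection-table theory is to compare the number of tracked connections against the limit while the test runs. The sketch below assumes the standard Linux netfilter entries under /proc/sys/net/netfilter are present on the router; conntrack_pct is just a helper name made up for this example.

```shell
#!/bin/sh
# Print conntrack table usage as a whole-number percentage.
# $1 = current entry count, $2 = table maximum.
conntrack_pct() {
    if [ "$2" -le 0 ]; then
        echo "invalid maximum: $2" >&2
        return 1
    fi
    echo $(( $1 * 100 / $2 ))
}

dir=/proc/sys/net/netfilter
if [ -r "$dir/nf_conntrack_count" ] && [ -r "$dir/nf_conntrack_max" ]; then
    count=$(cat "$dir/nf_conntrack_count")
    max=$(cat "$dir/nf_conntrack_max")
    echo "conntrack: $count / $max entries ($(conntrack_pct "$count" "$max")%)"
    # Seconds that idle UDP entries stay in the table before being evicted.
    cat "$dir/nf_conntrack_udp_timeout" "$dir/nf_conntrack_udp_timeout_stream" 2>/dev/null
fi

# The kernel normally logs "table full, dropping packet" when the limit is hit.
dmesg 2>/dev/null | grep -i 'nf_conntrack' || true
```

Running it once per second during the nslookup loop should show whether the count climbs toward the maximum right before DNS stops working.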
I've read reports that the choice of USB Ethernet for the second NIC matters when using the Raspberry Pi 4 as a router. In particular, RTL8153-based NICs work better than basically any other USB NIC. My hypothesis is that the AX88179 can't handle the UDP traffic. I can't say for sure as I'm using a Raspberry Pi CM4 with RTL8111, which is a PCIe NIC.
Some packet loss is normal and expected when transiting the global Internet. Protocols like TCP and QUIC do a really good job of hiding this from applications. The fact that it works much better when using your desktop's NIC supports the hypothesis that the USB NIC is the problem.
One way to test this without buying new hardware is to swap the LAN/WAN roles of the two NICs. That is, use the built-in NIC on the Raspberry Pi for the WAN connection and run the UDP test from the Raspberry Pi itself.
If I run this from a wired Ethernet machine on the LAN, then after about 2,824 UDP connections my router can no longer reach DNS over the WAN or ping 8.8.8.8 successfully. This lasts for about 5 minutes, then subsides:
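On OpenWrt the swap could look roughly like the following. This is only a sketch: it assumes the built-in NIC is eth0, the USB NIC is eth1, the first device section in /etc/config/network is the br-lan bridge, and the WAN interface is named wan. The function name is made up for this example, and it only prints the commands so they can be reviewed before anything is applied.

```shell
#!/bin/sh
# Print (do not run) the uci commands that would swap the WAN and LAN ports.
# $1 = NIC currently in the LAN bridge, $2 = NIC currently used for WAN.
print_swap_commands() {
    lan_nic=$1
    wan_nic=$2
    cat <<EOF
uci del_list network.@device[0].ports='$lan_nic'
uci add_list network.@device[0].ports='$wan_nic'
uci set network.wan.device='$lan_nic'
uci commit network
/etc/init.d/network restart
EOF
}

# Assumed names: eth0 is the built-in NIC, eth1 is the USB NIC.
print_swap_commands eth0 eth1
```

If a wan6 interface exists it would need the same device change. Applying this over an SSH session on the LAN side will drop the session, so having a keyboard and monitor on the Pi is the safer route.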
while true; do
    date
    for i in {1..50}; do
        cat top-1000-dns.txt | xargs -I{} -n 1 nslookup {} 1.1.1.1 &
    done
    wait
done
I did try placing a laptop with Linux on it in place of the WAN-side ONT. I ran iperf3 against it without issue, but the amount of traffic generated was likely smaller.
I put an R7800 in place of the Raspberry Pi. Stock OEM firmware R7800-V1.0.2.92.
Running my test script and an nmap UDP scan from a LAN node shows a few timeouts, but the connection seems more stable and recovers quickly when I stop the test queries.
Anyone have any tips for monitoring the number of UDP packets coming from a LAN client?
I think this started because some device on the LAN sent an increased number of DNS UDP packets causing my connection to intermittently go down for a minute.
This would have to be an amazingly excessive number of DNS packets. I doubt that this is a problem unless you have malware running an amplification attack.
Thanks for the tips. I installed iptraf-ng on the OpenWrt (R7800) router. It doesn't show me UDP packet totals per LAN IP, but it is handy as a real-time overall counter.
I'm also going to watch the LuCI Connections tab.
Because my R7800 has little storage space or CPU to spare, I'm going to set up a separate Linux node on the LAN to collect the number of UDP packets per LAN IP.
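For the per-IP counts, something along these lines might be enough. It is a rough sketch that assumes tcpdump is installed and that the collecting node can actually see the traffic (for example through a mirror port on the managed switch). count_udp_by_source is a made-up helper name; it expects the usual `tcpdump -n` text output with sources in address.port form.

```shell
#!/bin/sh
# Count packets per source address from `tcpdump -n` text output on stdin.
count_udp_by_source() {
    awk '
        $2 == "IP" || $2 == "IP6" {
            src = $3
            sub(/\.[0-9]+$/, "", src)   # strip the trailing ".port"
            n[src]++
        }
        END { for (s in n) print n[s], s }
    ' | sort -rn
}

# Example use (interface name is an assumption; adjust to the capture port):
#   tcpdump -n -l -i eth0 -c 5000 udp 2>/dev/null | count_udp_by_source
```

The output is one line per source, busiest first, so a client that suddenly floods DNS should stand out at the top.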
With an R7800 running a snapshot build (23.05-SNAPSHOT r23904-f12cf43029) with hardware offloading enabled, the active connections max out at 59392, but other clients still retain their connectivity.
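To see which clients are filling the table, the entries can be grouped by source address. This sketch assumes /proc/net/nf_conntrack is available on the router (the same data the LuCI Connections tab shows, as far as I know) and uses a made-up helper name, udp_entries_by_source.

```shell
#!/bin/sh
# Count UDP conntrack entries per original source address.
# Reads lines in /proc/net/nf_conntrack format on stdin.
udp_entries_by_source() {
    awk '
        $3 == "udp" {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^src=/) {      # first src= is the original direction
                    n[substr($i, 5)]++
                    break
                }
            }
        }
        END { for (s in n) print n[s], s }
    ' | sort -rn
}

# Example use on the router:
#   udp_entries_by_source < /proc/net/nf_conntrack | head
```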
I'm only using wired Ethernet and not the built-in wifi.
I plan on looking for other hardware that can handle gigabit NAT routing and at least 4x the number of connections. It appears that with ~57 devices on my home network and the increased usage, I need to plan for better routing hardware.