Losing UDP communication over WAN

I have an issue where, if a decent number of UDP packets are sent from my LAN to my router, my WAN no longer allows any UDP traffic (and thus DNS) to pass through for a few minutes.

A single nmap UDP scan against one target is enough to bring down the Internet connection, since DNS goes out. Or if one client sends, say, 2,000 DNS requests over UDP, then no client on the network can make DNS requests through any resolver (the router's, 1.1.1.1, 8.8.8.8, etc.). Running a video call can also trigger the outage at times.

This is all with clients connected via wired Ethernet.

Hardware:
Raspberry Pi 4B rev1.4 with 8GB memory
OpenWrt 23.05.2 r23630-842932a63d / LuCI openwrt-23.05 branch git-24.006.68745-9128656
(wifi adapter disabled)
AX88179 USB adapter for the WAN connection
Integrated NIC connects to a managed switch for the LAN side

I enabled system.log to disk. Nothing of note shows up in the logs when the outage occurs.

I tried connecting a desktop's integrated NIC directly to the fiber ONT, bypassing OpenWrt entirely. If I called nslookup 50,000 times all at once I could cause some UDP packet loss, but after cancelling the while loop of nslookup calls it recovered very quickly.

Repeating the same test with the router in-place results in the UDP DNS packets failing for minutes even after the nslookup client stops.

I also ran continuity tests and speed benchmarks on the Ethernet wiring. All indicators there came back fine.

My next idea is to get a non-OpenWrt router, place it on the network, and repeat the nmap UDP and nslookup DNS tests to see whether it responds any better.

I wonder whether OpenWrt has some kind of low connection table limit that is being overrun?

Question: Any suggestions on additional logs or commands to analyze network tables on the Raspberry Pi with OpenWrt itself? Any tips on additional network logging I could enable?
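One way to check the connection table idea is to compare the kernel's conntrack counters while the test runs. This is only a minimal sketch, assuming the nf_conntrack module is loaded so that the two sysctl files exist. The function name and the optional directory argument are made up for illustration and are not part of OpenWrt:

```shell
#!/bin/sh
# Print how full the connection tracking table is.
# The optional directory argument exists only so the function can be pointed
# at test data; by default it reads the live kernel values.
conntrack_usage() {
    dir="${1:-/proc/sys/net/netfilter}"
    count=$(cat "$dir/nf_conntrack_count") || return 1
    max=$(cat "$dir/nf_conntrack_max") || return 1
    # Integer percentage, so no bc or awk is needed on a small router.
    pct=$(( count * 100 / max ))
    echo "conntrack: $count / $max entries (${pct}% full)"
}
```

Calling it in a loop during the nslookup test, for example `while sleep 1; do conntrack_usage; done`, would show whether the count approaches the maximum right before UDP stops passing. As far as I know the kernel also logs "nf_conntrack: table full, dropping packet" when the limit is hit, so if that never appears in the log it would point away from this theory.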

I've read reports that the choice of USB Ethernet for the second NIC matters when using the Raspberry Pi 4 as a router. In particular, RTL8153-based NICs work better than basically any other USB NIC. My hypothesis is that the AX88179 can't handle the UDP traffic. I can't say for sure as I'm using a Raspberry Pi CM4 with RTL8111, which is a PCIe NIC.

Some packet loss is normal and expected when transiting the global Internet. Protocols like TCP and QUIC do a really good job of hiding this from applications. The fact that it works much better when using your desktop's NIC supports the hypothesis that the USB NIC is the problem.

One way to test this without buying new hardware is to swap the LAN/WAN roles of the two NICs. That is, use the built-in NIC on the Raspberry Pi for the WAN connection and do the UDP test from the Raspberry Pi itself.
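Before swapping anything, it may also be worth comparing the USB NIC's error and drop counters before and after a test run. A rough sketch, assuming the standard Linux sysfs statistics files are present; the function name is hypothetical, and `eth1` in the usage note is only a guess at what the AX88179 is called on your system:

```shell
#!/bin/sh
# Print the error/drop counters the kernel keeps for one network interface.
# The second argument is only there so the function can be tested against a
# fake directory; normally it reads /sys/class/net.
nic_error_counters() {
    iface="$1"
    base="${2:-/sys/class/net}"
    for name in rx_errors rx_dropped rx_fifo_errors tx_errors tx_dropped; do
        file="$base/$iface/statistics/$name"
        # Skip counters that are missing or unreadable.
        if [ -r "$file" ]; then
            echo "$iface $name $(cat "$file")"
        fi
    done
    return 0
}
```

Run `nic_error_counters eth1` once before the UDP test and once during the outage. Counters that jump on the USB adapter but stay flat on the built-in NIC would support the USB NIC hypothesis. If nothing moves, the problem is more likely somewhere else.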

If I run the script below from a wired Ethernet machine on the LAN, then after about 2,824 UDP connections my router can no longer reach DNS over the WAN or ping 8.8.8.8 successfully. This lasts for about 5 minutes, then subsides:

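# Requires bash (uses brace expansion).
while true; do
        date

        # Start 50 parallel workers; each one looks up every name in the list via 1.1.1.1.
        for i in {1..50}; do
                xargs -I{} nslookup {} 1.1.1.1 < top-1000-dns.txt &
        done

        # Wait for all workers to finish before starting the next round.
        wait
done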
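To see where a figure like the UDP connection count above comes from, and which LAN client owns the entries, the conntrack table can be summarized per source address. This is a sketch under the assumption that `/proc/net/nf_conntrack` is available on the router and uses the usual layout (protocol name in the third column, original source as the first `src=` field). The function name is made up:

```shell
#!/bin/sh
# Count tracked UDP connections per original source address, busiest first.
# Reads /proc/net/nf_conntrack unless another file is given (handy for testing).
udp_conntrack_by_source() {
    awk '$3 == "udp" {
        # The first src= field is the source of the original direction.
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^src=/) { count[substr($i, 5)]++; break }
        }
    }
    END { for (ip in count) print count[ip], ip }' "${1:-/proc/net/nf_conntrack}" | sort -rn
}
```

Running `udp_conntrack_by_source | head` on the router while the test script is active should put the test machine at the top of the list, with its entry count next to it.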

I did try placing a laptop running Linux in place of the WAN-side ONT. I ran iperf3 (client to server) without issue, but the amount of traffic generated was likely smaller.

PowerShell used for the test:

while ($true) {
    .\iperf3.exe --client 10.0.0.1 --time 5 -b 0 --bidir --udp
}

Retesting, I can still ping, but UDP packets going out the WAN fail for a while when I run my test script.

I will try swapping routers to see if it makes a difference and then go looking for another USB NIC.

I put an R7800 in place of the Raspberry Pi. Stock OEM firmware R7800-V1.0.2.92.

Running my test script and an nmap UDP scan from a LAN node shows a few timeouts, but the connection seems more stable and recovers quickly when I stop the test queries.

My next test will be to install the latest OpenWrt on the R7800 and see how it handles recovery.

With the nmap UDP scan running, the R7800 with OpenWrt reached 3350 UDP connections with no issue.

sudo nmap scanme3.nmap.org -p- -sU -v -T insane

Running my top-1000-dns.txt script peaks at 4203 connections. Other clients on the network see some noticeable slowness, but their connections keep working.

Anyone have any tips for monitoring the number of UDP packets coming from a LAN client?

I think this started because some device on the LAN sent an increased number of DNS UDP packets causing my connection to intermittently go down for a minute.

Use capture filters to ignore everything but UDP packets.

This would have to be an amazingly excessive number of DNS packets. I doubt that this is a problem unless you have malware running an amplification attack.

iptraf can also be used if you need the number of packets only and not the contents.
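As a rough illustration of the capture filter suggestion, tcpdump's text output can be piped through a small counter to get UDP packets per source address. This is only a sketch: it assumes tcpdump is installed, that it is run with `-n` and default timestamps so the source is the third column, and that the LAN interface is called `br-lan` (adjust to your setup). The function name is made up:

```shell
#!/bin/sh
# Read "tcpdump -n" text output on stdin and print UDP packets per source
# address, busiest source first.
count_udp_by_source() {
    awk '$2 == "IP" || $2 == "IP6" {
        src = $3
        sub(/\.[0-9]+$/, "", src)   # drop the trailing ".port"
        count[src]++
    }
    END { for (ip in count) print count[ip], ip }' | sort -rn
}
```

For example, `tcpdump -n -i br-lan -c 10000 udp 2>/dev/null | count_udp_by_source` captures 10,000 UDP packets and then prints the totals. The `udp` at the end is the capture filter, so nothing else is collected.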

Thanks for the tips. I installed iptraf-ng on the OpenWrt (R7800) router. It doesn't show me UDP packet totals per LAN IP, but it is handy as a real-time overall counter.

I am also going to watch the LuCI Connections tab.

Because my R7800 does not have much storage space or CPU, I'm going to set up a separate Linux node on the LAN to collect the number of UDP packets per LAN IP.

Will report back with my results.

There is a filter function to narrow down the results.

Since changing the hardware from the Raspberry Pi 4B to my R7800, my connection has remained stable under load.

Follow-up: the issue was noticeable whenever two people made a FaceTime call. A tcpdump capture showed the UDP traffic spiking before stabilizing.

I could also kill the network with a command like:

sudo nmap -sU -p- -T3 --defeat-icmp-ratelimit --min-rate 5000 --stats-every 5s scanme.nmap.org

With an R7800 running a snapshot build (23.05-SNAPSHOT r23904-f12cf43029) with hardware offloading enabled, the active connections max out at 59392, but other clients still retain their connectivity.

I'm only using wired Ethernet and not the built-in Wi-Fi.

I plan on looking for other hardware that can handle gigabit NAT routing and at least 4x the number of connections. It appears that with ~57 devices on my home network and the increased usage, I need to plan for better routing hardware.