Router crashing during RDP sessions

Hi there! I've got two Archer C7 v2s running the current version of OpenWRT in a mesh config. I've had a problem since v21 that I cannot figure out.

I have a Windows machine connected to the primary router's LAN. When I access that Windows machine over RDP using wireless clients, the primary router will sometimes close the connection with the Windows machine and then refuse to handle ANY further traffic from any client; essentially, the RDP session is breaking my house Internets.

When this happens, LuCI will usually (but not always!) stay responsive enough that I can at least connect to the router via HTTP and reboot it in LuCI to make my Internets work again. The only thing I've seen in the logs is some kind of dnsmasq failure but it doesn't seem directly correlated in terms of the time stamps. Also, I can't connect to anything (other than the routers, anyway) with direct IP address reference or ping anything successfully when this blows up, so this clearly is not purely DNS.

The crashes are most frequent with the combination of Remmina and a Linux box along with a relatively graphically active desktop (something like interface animations in a web browser, accidentally clicking on a YouTube video, or navigating around the interface of Inkscape or a DAW application on the Windows box). In some cases I'll have to reboot the router two or three times inside of half an hour, although it probably freezes up network traffic about once every 2-3 hours of RDPing on average. I have had less trouble when using Microsoft's RDP client for macOS and iPadOS than with Linux / Remmina, but I have seen the same router crashes happen there too.

Is there something additional I should be logging, or something I don't know to look at? I am technical but definitely a very basic OpenWRT user and a networking dunderhead just generally.
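For anyone wanting timestamps to line up against the router's log, here is a minimal sketch of a probe that could run on a wireless client. It separates "router unreachable", "router up but not forwarding", and "only DNS failing", which is the distinction described above. The addresses, the hostname, the log file name, and the function names are all placeholder assumptions, not anything from my actual setup.

```python
import socket
import time
from datetime import datetime

# Placeholder targets (assumptions); adjust to your own network.
ROUTER = ("192.168.1.1", 80)   # LuCI over HTTP on the primary router
PUBLIC_IP = ("1.1.1.1", 53)    # a public host reached by raw IP, so no DNS is involved
DNS_NAME = "openwrt.org"       # a name to resolve; a local resolver cache may mask failures


def tcp_ok(addr, timeout=3.0):
    """Return True if a TCP connection to addr succeeds within timeout."""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:
        return False


def dns_ok(name):
    """Return True if the system resolver can resolve name."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def classify(router_up, ip_up, dns_up):
    """Turn the three probe results into a coarse state label."""
    if not router_up:
        return "router-unreachable"
    if not ip_up:
        return "forwarding-down"
    if not dns_up:
        return "dns-only"
    return "ok"


def format_line(when, router_up, ip_up, dns_up):
    """Build one log line with a second-resolution timestamp."""
    state = classify(router_up, ip_up, dns_up)
    stamp = when.isoformat(timespec="seconds")
    return f"{stamp} router={router_up} ip={ip_up} dns={dns_up} state={state}"


def main(interval=10.0, path="net-probe.log"):
    """Probe forever, appending one line per round; stop with Ctrl-C."""
    while True:
        line = format_line(
            datetime.now(), tcp_ok(ROUTER), tcp_ok(PUBLIC_IP), dns_ok(DNS_NAME)
        )
        with open(path, "a") as log:
            log.write(line + "\n")
        time.sleep(interval)
```

Calling `main()` starts the loop. A run of `forwarding-down` lines while LuCI still answers would match the symptom above, and the first such timestamp is the one to compare against the router's system log.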

Sounds like dnsmasq is getting overloaded, but I am no expert and just guessing. I read here that some other people are having problems with dnsmasq, and it seems the solution for them was running a snapshot build of OpenWRT. You could also try restarting the dnsmasq service to see if that fixes the issue temporarily. You could use the firmware builder page and add LuCI to the package list to get an easy-to-use snapshot to try out. The snapshots I have been using are very stable on my less established hardware (Netgear WAX206), so I think they should work fine for you.

Does this happen over both Wi-Fi and LAN?
Any chance it's overheating?

The only machine I usually connect in my house via LAN is the Windows machine I'm remoting into. Overheating is possible, but I haven't checked thermal conditions. It seems to be only very specific situations involving that Windows box, though: heavy network draw or numerous devices don't cause it to freak, it's only RDPing into that machine. I have noticed that the "silent crash" can occur when I'm also doing large transfers into the Windows machine while I am RDPing / viewing less graphically involved traffic, so in general it seems to have something to do with the throughput on the LAN, but it could also be something from the Windows machine itself that is causing the lockups.
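To check the throughput hunch rather than guess, something like the following could log the data rate seen by the Linux client doing the RDPing. It is only a sketch: it reads the standard Linux `/proc/net/dev` counters, and the interface name `wlan0` is an assumption that will differ per machine.

```python
import time


def parse_net_dev(text):
    """Map interface name -> (rx_bytes, tx_bytes) from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def rate_mbit(before, after, seconds):
    """Convert two (rx, tx) byte counters into (rx, tx) megabits per second."""
    rx = (after[0] - before[0]) * 8 / seconds / 1e6
    tx = (after[1] - before[1]) * 8 / seconds / 1e6
    return rx, tx


def sample(iface="wlan0", interval=1.0):
    """Measure the current rate on iface over one interval (Linux only)."""
    with open("/proc/net/dev") as f:
        first = parse_net_dev(f.read())[iface]
    time.sleep(interval)
    with open("/proc/net/dev") as f:
        second = parse_net_dev(f.read())[iface]
    return rate_mbit(first, second, interval)
```

Printing `sample()` with a timestamp every few seconds during an RDP session would show whether the lockups follow a burst of traffic or happen at modest rates too.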

I've also had this problem with two completely different Archer C7s in the "main router" role, I should add. I haven't tried anything with more CPU oomph than the C7s.

I'll check out the thread and probably try the snapshot. The problem has been persistent from 21.x to 22.03+, but maybe something is behaving better in recent commits? Thank you!

...Oh, and I have tried restarting (or killing / restarting) dnsmasq from ssh before when this happens, but no dice. This again makes me think that dnsmasq isn't really the issue.

If you are connected wirelessly on 2.4 GHz, then you need to disable LDPC.

Thanks sammo, I will try this out!

...although reading through the quoted thread, the issue isn't spontaneous unexplained death over wireless, it is instant death of the entire network caused specifically by and only in the midst of higher-bandwidth traffic to / from a specific client (edit: while connected to another client over RDP - it's only those circumstances that cause the issue).

I had an Archer C7 v2 that would freeze wireless upon high wired throughput. And it required a reboot to get it operational again.

Try swapping them (make backups, then cross-restore); at least you can rule out hardware afterward.

Hrm. Yeah, I have had two C7 v2s in the same role in the network now with the same outcomes. It sounds like it might be something specific to the hardware. What did you replace your C7 with? I went with a pile of these since they were budget friendly and maybe it's a good idea to try another candidate.

Another C7 v2, still having issues once in a while, but not as bad.

So after bigsmile mentioned the same issues with their C7 v2 (or v2s), I decided to try another OpenWRT router entirely as the "main router" with LAN-connected machines etc., and after perusing the hardware rec forums, ultimately went with a Netgear WAX206.

With the WAX206 in place, so far, I can senselessly beat on the same Windows machine with the same LAN connection in the same situations and applications that would nearly always cause a wifi crash over RDP with one of my C7 v2s in that role, and it seems totally solid with the Netgear: no crashes wirelessly RDPing in over Linux or macOS so far. The WAX206 is running a snapshot build of OpenWRT, so it's impossible to know if the problem is truly the C7 v2 hardware in the main-router role or if it's some fix in OpenWRT in a more recent build of 22.03.

I will report back after a few weeks (and also after a chance to try beating up on RDPing into the same machine over one of the mesh APs, which are still C7 v2s), but mostly I wanted to update this in case anyone else is experiencing similar symptoms in future: it might be worth trying a completely different router. The WAX206 is def more expensive than the C7, but it's hardly a billion dollars, and it has a lot more RAM and flash breathing room plus the MediaTek chips that seem to enjoy somewhat better support generally under OpenWRT.

99.99999% sure it’s because of the ct-firmware. Replace it with the non-ct firmware and I’ll bet things will automagically work.