Bridge device stops passing traffic on ea8300s (ipq40xx) running 21.02.0

I'm running 21.02.0 on three routers: one Linksys ea8500 and two ea8300 "satellites". The ea8300s use one of their 5GHz radios for backhaul via WDS to the ea8500 and their other 5GHz radio to service clients. (The ea8500 also services ordinary clients on its single 5GHz radio.)

Since upgrading to 21.02.0 a few weeks ago, I've had an issue where packets randomly stop flowing between the satellites (and their connected clients) and the ea8500. This happens to both satellites about once every couple of days.

After some troubleshooting I found that restarting br-lan on the satellites resolves the issue and gets packets flowing again until the next failure. (Forcing reconnection of the WDS client, either by restarting the radio on the satellites or by de-authenticating them from the WDS AP, also resolves the issue.) I set up a job on each of them to reboot every morning, but this doesn't seem to hold off the problem: as often as not, packets aren't flowing by the time I wake up an hour or so later.
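For anyone who wants to automate the restart instead of rebooting on a timer, here is a minimal sketch of a "restart only after N failed checks" loop. It is an illustration, not a tested fix: the gateway address `192.168.1.1`, the recovery command `/etc/init.d/network restart`, and the names `ping_ok`, `LinkMonitor` and `run_forever` are all assumptions, and it presumes Python is installed on the satellite.

```python
import subprocess
import time


def ping_ok(host, count=3, timeout_s=2):
    """Return True if ping exits with status 0 (at least one reply arrived)."""
    result = subprocess.run(
        ["ping", "-c", str(count), "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


class LinkMonitor:
    """Call recover() once the probe has failed max_failures times in a row."""

    def __init__(self, probe, recover, max_failures=3):
        self.probe = probe
        self.recover = recover
        self.max_failures = max_failures
        self.failures = 0

    def tick(self):
        """Run one check; return True if recovery was triggered."""
        if self.probe():
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.recover()
            self.failures = 0
            return True
        return False


def run_forever(host="192.168.1.1", interval_s=60):
    """Hypothetical main loop; host and recovery command are assumptions."""
    monitor = LinkMonitor(
        probe=lambda: ping_ok(host),
        recover=lambda: subprocess.run(["/etc/init.d/network", "restart"]),
    )
    while True:
        monitor.tick()
        time.sleep(interval_s)
```

Requiring several consecutive failures keeps a single lost ping from bouncing the network. Calling `run_forever()` starts the loop; swap the recovery command for whatever restart you found works on your device.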

I'm pretty sure I haven't had any real issues with the ea8500 since the upgrade. This leads me to wonder if it's related to the SoC, since, as I recall, the ipq40xx used in the ea8300 has (other) issues with its switch.

Is anyone else seeing similar issues? Can anyone suggest a solution?

Thanks.

There are some known issues with WDS links and 21.02.0 and 21.02.0_rc4. See the link below.

You could try 21.02.0_rc3 or the latest master builds. WDS worked well with rc3, and there should be fixes in master by now.

https://forum.openwrt.org/c/general/network-and-wireless-configuration/12

https://bugs.openwrt.org/index.php?do=details&task_id=3961&order=dateopened&sort=desc

Thanks for the added information!

I tried rolling back to rc3 but it seems to be even worse for me. I'm pretty gun-shy about running snapshots after I ended up with a functional but un-upgradable router due to a glitch with a snapshot build.

I've rolled back to 19.08 which had been quite stable for me.

I have 3 x EA8300, all of which were on 19.7.4 with a BATMANadv+HaasMesh custom build. That setup has been too problematic (too hard, for me, to add additional VLANs and networks), and with only 3 nodes, 2 of them now wired, I don't really need the mesh functionality any more. So I reflashed one with stock 21.02.0, set up as a wired dumb AP. It was very flaky: it repeatedly stopped routing packets after a few hours, did not respond to pings, and LuCI was unresponsive. Only a reboot would revive it.

I was starting to think this EA8300 had a subtle hardware fault, but figured I'd try stock 19.07.8, again with a simple wired dumb-AP config, and for the last day or so it's been pretty solid.

No idea how to progress this issue; I'm not deep into the OpenWrt world...

I also experience freezes of the switch. The switch still receives packets, but they are not lifted to the host/interface. You can see that when doing diffs like I did here.
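To illustrate the idea of diffing counters, here is a rough sketch that compares two snapshots of `/proc/net/dev` and lists interfaces whose receive counter did not move. This is an assumption about how one might look for the stall, not the diff referred to above (that one is not shown, and switch-level counters are not exposed in this file). The function names are made up for the example.

```python
def parse_net_dev(text):
    """Parse /proc/net/dev content into {iface: (rx_packets, tx_packets)}."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, rest = line.split(":", 1)
        fields = rest.split()
        if len(fields) < 10:
            continue
        # Field 1 is received packets, field 9 is transmitted packets.
        counters[name.strip()] = (int(fields[1]), int(fields[9]))
    return counters


def rx_deltas(before, after):
    """Received-packet increase per interface present in both snapshots."""
    return {
        name: after[name][0] - before[name][0]
        for name in after
        if name in before
    }


def stalled_interfaces(deltas):
    """Interfaces whose receive counter did not change between snapshots."""
    return sorted(name for name, delta in deltas.items() if delta == 0)


def snapshot(path="/proc/net/dev"):
    """Read and parse the live counters."""
    with open(path) as handle:
        return parse_net_dev(handle.read())
```

Take one `snapshot()`, wait a few seconds while traffic should be flowing, take another, and pass both to `rx_deltas`. An interface that stays at zero while others keep counting is a hint about where packets stop being handed up.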

I hoped DSA would fix the issue, but it did not help:

I guess it is related to some kind of offloading.

As a workaround I wrote naywatch:

It triggers the watchdog if the switch freezes, by utilizing IPv6 multicast ping. However, be careful when doing sysupgrades. Always make sure procd has taken control of the watchdog back again!
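The detection idea can be sketched roughly as follows: ping the all-nodes multicast address (`ff02::1`) on the bridge and check whether anyone other than the device itself answers. This is not naywatch's code, just a small illustration of the principle. The function names and the `own_addresses` parameter are assumptions, and the parsing assumes the usual `64 bytes from <addr>: ...` reply lines.

```python
import re

# Matches reply lines such as
#   "64 bytes from fe80::1%br-lan: icmp_seq=1 ttl=64 time=0.05 ms"
#   "64 bytes from fe80::2: seq=0 ttl=64 time=0.4 ms"
# and captures the address without the "%interface" zone suffix.
_REPLY = re.compile(r"bytes from (\S+?)(?:%\S+)?:\s")


def responders(ping_output):
    """Return the set of source addresses seen in ping reply lines."""
    return {match.group(1).lower() for match in _REPLY.finditer(ping_output)}


def neighbours_alive(ping_output, own_addresses=()):
    """True if some address other than our own answered the multicast ping."""
    own = {address.lower() for address in own_addresses}
    return bool(responders(ping_output) - own)
```

A caller would feed this the captured output of something like `ping -6 -c 2 ff02::1%br-lan` and stop servicing the watchdog when `neighbours_alive` keeps returning False. Filtering out the device's own link-local address matters because the local host can answer its own multicast ping.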

We use some 7530 ipq40xx device with multiple Ubiquiti antennas in transparent bridge mode (WDS). This device is also crashing! :open_mouth: