Help debugging Connection loss

Hi, I have trouble debugging my network. I have a "complex" setup in the sense that I have multiple VLANs (WAN side, guest network...), a Netgear GS724T, fallback 4G internet through a Proxmox container, etc.

I now get connection loss several times a day, with the following messages in the kernel and system logs. I would like to run tcpdump on the router when this happens, but I cannot access it. In fact, no traffic at all passes through the OpenWrt router for approximately 4-5 minutes (quite consistently).

I don't know how best to post my setup here, but described in words it looks like this:

Router

OpenWRT 22.03.2 r19803-9a599fee93 on a WRT3200ACM
Vlans/interfaces:

  • 100 Guest network on br-lan. This interface is a copy of the LAN interface with equal firewall rules, its own DHCP server and subnet as you would expect
  • 700 WWAN, also on br-lan. This is configured as a DHCP client receiving its IP from a virtual machine that provides 4G internet. It is set up as a copy of the regular WAN interface with equal firewall rules
  • LAN (untagged) as default on br-lan. Nothing special here
  • WAN is also nothing special, running on its own port "wan"

Netgear switch

All settings besides VLAN (inc TAG and PVID) are default

  • VLAN1, the untagged default goes on ports 2-24
  • VLAN 50 goes to the fiber converter (same as WAN on the router). It is untagged on port 1 and tagged on ports 2-24. It is also the only VLAN with a PVID configured: port 1 has PVID 50
  • VLAN 100 is tagged on ports 2-24
  • VLAN 700 is tagged on just a few ports
    I have removed all LAG(G) configurations

Proxmox

Running a 4G router VM with three interfaces

  • The Untagged VLAN 1 (LAN on OpenWRT) where it can communicate with the rest of the network
  • VLAN 700 giving the router an ip, internet etc.
  • USB Modem

Subnets

VLAN 1 192.168.1.0/24
VLAN 50 External IPs
VLAN 100 192.168.2.0/24
VLAN 700 10.0.0.0/24

syslog

Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8 mmwave-office
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8 mmwave-office
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8 mmwave-office
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.238 68:d7:9a:16:6b:36
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.1.238 68:d7:9a:16:6b:36 BiArea
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8 mmwave-office
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan) 68:d7:9a:16:6b:36
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan) 192.168.1.238 68:d7:9a:16:6b:36
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8 mmwave-office
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8 mmwave-office
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8 mmwave-office
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.238 68:d7:9a:16:6b:36
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.1.238 68:d7:9a:16:6b:36 BiArea
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8 mmwave-office
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8 mmwave-office
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.238 68:d7:9a:16:6b:36
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.1.238 68:d7:9a:16:6b:36 BiArea
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8 mmwave-office
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8 mmwave-office
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.238 68:d7:9a:16:6b:36
Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.1.238 68:d7:9a:16:6b:36 BiArea
Sun Jun 23 11:49:37 2024 kern.warn kernel: [312130.437410] net_ratelimit: 3864 callbacks suppressed
Sun Jun 23 11:49:37 2024 kern.warn kernel: [312130.437414] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
Sun Jun 23 11:49:37 2024 kern.warn kernel: [312130.452759] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
Sun Jun 23 11:49:37 2024 kern.warn kernel: [312130.463014] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
Sun Jun 23 11:49:37 2024 kern.warn kernel: [312130.473500] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
Sun Jun 23 11:49:37 2024 kern.warn kernel: [312130.483989] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
Sun Jun 23 11:49:37 2024 kern.warn kernel: [312130.494465] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
Sun Jun 23 11:49:37 2024 kern.warn kernel: [312130.504948] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
Sun Jun 23 11:49:37 2024 kern.warn kernel: [312130.515421] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
Sun Jun 23 11:49:37 2024 kern.warn kernel: [312130.525996] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
Sun Jun 23 11:49:37 2024 kern.warn kernel: [312130.536481] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)

kernel log


[312125.352262] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
[312130.437410] net_ratelimit: 3864 callbacks suppressed
[312130.437414] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
[312130.452759] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
[312130.463014] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
[312130.473500] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
[312130.483989] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
[312130.494465] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
[312130.504948] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
[312130.515421] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
[312130.525996] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)
[312130.536481] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)

That's a switching loop and a packet storm. You need to enable STP.

Hi @brada4, thanks for the reply.

Switching loops are of course not a good thing. I had previously checked that STP was enabled on the Netgear 24-port switch, but I see now that it is off on the router. I'm enabling it now for the br-lan interface, which is used by all VLANs on the router.
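For anyone wanting to confirm the change took effect, here is a minimal sketch that reads the bridge's STP state from sysfs. It assumes Python is installed on the router (it is not part of a default OpenWrt image) and that the bridge is named br-lan as in my setup; the function name and the `sysfs_root` parameter are just illustrative.

```python
from pathlib import Path

# Values the Linux bridge driver exposes in <bridge>/bridge/stp_state.
STP_STATES = {0: "disabled", 1: "kernel STP", 2: "userspace STP"}


def bridge_stp_state(bridge="br-lan", sysfs_root="/sys/class/net"):
    """Return a readable STP state for a bridge, or None if it cannot be read.

    `bridge` and `sysfs_root` are illustrative defaults; adjust to your system.
    """
    path = Path(sysfs_root) / bridge / "bridge" / "stp_state"
    try:
        raw = path.read_text().strip()
        value = int(raw)
    except (OSError, ValueError):
        # Missing interface, not a bridge, or unexpected file content.
        return None
    return STP_STATES.get(value, f"unknown ({raw})")


if __name__ == "__main__":
    print("br-lan STP:", bridge_stp_state())
```

The same file can of course be read directly with cat on the router; the script is only convenient if you want to check several bridges or log the state over time.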

Since you comment on my setup being prone to switching loops, do you have any good advice on how I can better set up my environment?
I have a few limitations, mainly that the Proxmox host only has one NIC. I had hoped that using tagged VLANs (802.1Q) would keep traffic "clean". I have separate virtual interfaces for all VLANs on both my OpenWrt router and the related VMs on the Proxmox server.
I would gladly take your advice.

I can also note that there are two devices constantly asking my DHCP server for a new lease. I have now disconnected these devices, one being a Unifi WAP and the other an mmwave sensor (both can be seen in the previously attached logs). I have multiple Unifi WAPs, and this is the only one showing up this way in the logs.
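To find such chatty clients without reading the log by eye, a rough sketch like the following counts DHCPREQUEST and DHCPDISCOVER messages per MAC address. It assumes the dnsmasq log format shown in my syslog excerpt; the function name and the sample list are only for illustration.

```python
import re
from collections import Counter

# Matches dnsmasq lines such as:
#   "... dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8"
DHCP_RE = re.compile(
    r"dnsmasq-dhcp\[\d+\]: (?P<type>DHCP[A-Z]+)\((?P<iface>[^)]+)\).*?"
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})",
    re.IGNORECASE,
)


def count_dhcp_requests(lines):
    """Count client-originated DHCP messages (REQUEST/DISCOVER) per MAC."""
    counts = Counter()
    for line in lines:
        m = DHCP_RE.search(line)
        if m and m.group("type").upper() in ("DHCPREQUEST", "DHCPDISCOVER"):
            counts[m.group("mac").lower()] += 1
    return counts


# A few lines copied from the syslog excerpt above.
SAMPLE = [
    "Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8",
    "Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8 mmwave-office",
    "Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.238 68:d7:9a:16:6b:36",
    "Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPACK(br-lan) 192.168.1.238 68:d7:9a:16:6b:36 BiArea",
    "Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPDISCOVER(br-lan) 68:d7:9a:16:6b:36",
    "Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPOFFER(br-lan) 192.168.1.238 68:d7:9a:16:6b:36",
    "Sun Jun 23 11:49:37 2024 daemon.info dnsmasq-dhcp[1]: DHCPREQUEST(br-lan) 192.168.1.123 7c:9e:bd:3d:b6:b8",
]

if __name__ == "__main__":
    for mac, n in count_dhcp_requests(SAMPLE).most_common():
        print(mac, n)
```

Note that during a loop the same request can be received many times, so a high count may be a symptom of the storm rather than a misbehaving client.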

Explore LACP, aka bonding. STP is still useful but does not KO the whole thing.

As far as I understand LACP, it is just dynamic LAGG, which would help with bandwidth but not resolve any switching loops?
I had intentionally disabled all LAGG configurations on the Netgear switch, as I thought this was not something I wanted. I had mentioned it but misspelled it in my first post; corrected now.

Don't loop cables?

I have double-checked this and it's not the case. It could easily cause these issues, for sure!

It is certainly a forwarding loop that overloads segments in your network (see the number of suppressed messages in the net_ratelimit line).
You cannot outwit that by boss talking.
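To put a rough number on that rate, here is a small sketch that parses the kernel log and divides each "callbacks suppressed" count by the time since the previous logged line. This is only an approximation of the rate-limit window, and the function name and sample list are illustrative.

```python
import re

# "[312130.437410] message text" -> timestamp in seconds, message
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(.*)")
SUPPRESSED_RE = re.compile(r"net_ratelimit: (\d+) callbacks suppressed")


def suppressed_rates(lines):
    """Yield (timestamp, suppressed_count, approx_messages_per_second).

    The elapsed time is measured from the previous logged line, which is
    only an estimate of how long messages were being suppressed.
    """
    prev_ts = None
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        ts, msg = float(m.group(1)), m.group(2)
        s = SUPPRESSED_RE.search(msg)
        if s and prev_ts is not None and ts > prev_ts:
            count = int(s.group(1))
            yield ts, count, count / (ts - prev_ts)
        prev_ts = ts


# Lines copied from the kernel log posted above.
SAMPLE = [
    "[312125.352262] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)",
    "[312130.437410] net_ratelimit: 3864 callbacks suppressed",
    "[312130.437414] br-lan: received packet on lan2 with own address as source address (addr:24:f5:a2:2d:81:70, vlan:0)",
]

if __name__ == "__main__":
    for ts, count, rate in suppressed_rates(SAMPLE):
        print(f"{ts:.3f}: {count} suppressed, about {rate:.0f} messages/s")
```

For the posted excerpt this works out to several hundred suppressed warnings per second, each one a frame that came back to the bridge carrying its own MAC as the source.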