Tagged VLANs not working on WRT1200AC with ver 23.05.5 (DSA)

I have weird behavior to report after a lot of testing. Tagged VLAN operation on a LAN-side port of the Linksys WRT1200AC appears to be buggy under OpenWrt 23.05.5. The bad behavior is subtle, however: it only occurs when packets arriving at the WAN interface are destined for endpoints served via the tagged VLAN interface. In particular, it is possible to reach the router itself (DHCP to get IP addresses, and even the HTTP web interface), but in those cases the packets originate within the router.

In my usage scenario, VLAN 1 is for my main private LAN, and VLAN 3 is for a guest LAN (GLAN) interface.

What I see in the failure conditions are messages from web browsers (tried Firefox under Linux and Safari under iOS) indicating that web pages cannot be opened because secure connections could not be established.

Now, if I configure a port on the WRT1200AC as untagged for VLAN 3, I am then able to connect to it and browse with a wired LAN connection, and I can also connect it to a dumb AP and use the WiFi connection with no problem.

For reasons of brevity, I won't go into gory details here, but I also have two Aruba managed switches that normally attach to two of the LAN ports on the router. I conducted some tests with those attached to the tagged VLAN port of the WRT1200AC and had the same issues. Ditto for a WNDR4300 router configured (with OpenWrt) as a "fairly dumb" AP, which had VLAN tagging enabled so that the WiFi radio could serve both my private LAN and the guest LAN.

I have a backup router -- a Netgear WNDR4700 running OpenWrt 21 that is configured to be a drop-in replacement. When I put in the WNDR4700, all of the problems operating over the tagged VLAN interface disappear and operation is flawless. Thus I conclude that the trouble lives in the code running on the WRT1200AC. I suspect the same troubles plague the WRT1900AC and WRT3200AC (i.e. all mvebu devices).

Not sure who, if anyone, is maintaining the mvebu code these days, but if you'll contact me I'll be happy to collect any further diagnostic info you need and also test patches. I continue to use the WRT1200AC because it outperforms the WNDR4700.

You should never use any VLAN below 10, and especially not 1, with any kind of managed switch. Start at 10 or, even better, at 100, e.g. VLAN 100 main, VLAN 300 guest, etc.

Let's take a look at your configuration from the WRT1200AC.

Please connect to your OpenWrt device using ssh and copy the output of the following commands and post it here using the "Preformatted text </> " button:
Remember to redact passwords, MAC addresses and any public IP addresses you may have:

ubus call system board
cat /etc/config/network
cat /etc/config/wireless
cat /etc/config/dhcp
cat /etc/config/firewall
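
For anyone collecting this output, a small filter can handle the most common redactions before posting. This is only a sketch: the function name redact is made up for illustration, it masks just MAC addresses and 'option key' / 'option password' values, and public IP addresses still have to be removed by hand.

```sh
#!/bin/sh
# Hypothetical helper: mask MAC addresses and key/password values
# in UCI config text read from stdin.
redact() {
    sed -E \
        -e 's/([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}/xx:xx:xx:xx:xx:xx/g' \
        -e "s/(option (key|password) )'[^']*'/\1'REDACTED'/"
}

# Example use on the router (not run here):
#   for f in network wireless dhcp firewall; do redact < "/etc/config/$f"; done
```

It is still worth reading through the result once, since other secrets (VPN credentials, hostnames) are not covered.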

This is nonsensical. There is no reason that one cannot freely use the VLAN IDs below 10. Some people suggest avoiding VLAN 1 since it is the default for many managed switches and other hardware, but this is also not a problem as long as the configurations account for this default and/or the switches are configured otherwise. I would say that the "starting" VLAN ID or the use of 100, 300 etc can be a personal opinion and style, but it has no impact on performance/security/operational factors.
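
For reference, 802.1Q carries the VLAN ID in a 12-bit field with 0 and 4095 reserved, so any ID from 1 to 4094 is valid on the wire, low numbers included. A minimal sketch of that range check (the function name is_valid_vid is hypothetical):

```sh
#!/bin/sh
# Succeeds if $1 is a usable 802.1Q VLAN ID (1..4094).
is_valid_vid() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric input
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 4094 ]
}

# Example: is_valid_vid 3 && echo "VLAN 3 is fine"
```

Whether a particular switch treats VLAN 1 specially is a separate, vendor-specific question that this check does not address.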

ubus board output ---

{
	"kernel": "5.15.167",
	"hostname": "XXXX",
	"system": "ARMv7 Processor rev 1 (v7l)",
	"model": "Linksys WRT1200AC",
	"board_name": "linksys,wrt1200ac",
	"rootfs_type": "squashfs",
	"release": {
		"distribution": "OpenWrt",
		"version": "23.05.5",
		"revision": "r24106-10cc5fcd00",
		"target": "mvebu/cortexa9",
		"description": "OpenWrt 23.05.5 r24106-10cc5fcd00"
	}
}

/etc/config/network ---
config interface 'loopback'
	option device 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'

config device
	option name 'br-lan'
	option type 'bridge'
	list ports 'lan1'
	list ports 'lan2'
	list ports 'lan3'
	list ports 'lan4'
	list ports 'tap0'
	option ipv6 '0'

config interface 'lan'
	option device 'br-lan.1'
	option proto 'static'
	option ipaddr '192.168.10.1'
	option netmask '255.255.255.0'

config interface 'glan'
	option proto 'static'
	option device 'br-lan.3'
	option ipaddr '192.168.2.1'
	option netmask '255.255.255.0'
	
config device
	option name 'wan'
	option macaddr 'xx:xx:xx:xx:xx:xx'
	option ipv6 '0'

config interface 'wan'
	option device 'wan'
	option proto 'dhcp'

config bridge-vlan
	option device 'br-lan'
	option vlan '1'
	list ports 'lan1'
	list ports 'lan2'
	list ports 'lan4'
	list ports 'tap0'

config bridge-vlan
	option device 'br-lan'
	option vlan '3'
	list ports 'lan1:t'
	list ports 'lan3'
	list ports 'lan4:t'

config device
	option name 'tun0'
	option ipv6 '0'

config device
	option name 'tap0'
	option ipv6 '0'

config interface 'bvpn'
	option proto 'none'
	option device 'tap0'

config interface 'rvpn'
	option proto 'none'
	option device 'tun0'

/etc/config/wireless ---

config wifi-device 'radio0'
	option type 'mac80211'
	option path 'soc/soc:pcie/pci0000:00/0000:00:01.0/0000:01:00.0'
	option channel '36'
	option band '5g'
	option htmode 'VHT80'
	option country 'US'
	option cell_density '0'
	option disabled '1'

config wifi-device 'radio1'
	option type 'mac80211'
	option path 'soc/soc:pcie/pci0000:00/0000:00:02.0/0000:02:00.0'
	option channel '1'
	option band '2g'
	option htmode 'HT20'
	option country 'US'
	option cell_density '0'
	option disabled '1'

config wifi-iface 'default_radio0'
	option device 'radio0'
	option network 'lan'
	option mode 'ap'
	option ssid 'xxxx-xx'
	option encryption 'psk2+ccmp'
	option macaddr 'xx:xx:xx:xx:xx:xx'
	option dtim_period '3'
	option key 'xxxxxxxxxxxxxxx'
	option ieee80211r '1'
	option mobility_domain 'XXXX'
	option ft_over_ds '0'
	option ft_psk_generate_local '1'
	option disabled '1'

config wifi-iface 'default_radio1'
	option device 'radio1'
	option network 'lan'
	option mode 'ap'
	option ssid 'xxxx-xx'
	option encryption 'psk2+ccmp'
	option macaddr 'xx:xx:xx:xx:xx:xx'	
	option dtim_period '3'
	option key 'xxxxxxxxxxxxxxx'	
	option ieee80211r '1'
	option mobility_domain 'XXXX'
	option ft_over_ds '0'
	option ft_psk_generate_local '1'
	option disabled '1'

config wifi-iface 'wifinet2'
	option device 'radio0'
	option mode 'ap'
	option ssid 'xxxxx'
	option encryption 'psk2+ccmp'
	option dtim_period '3'
	option key 'xxxxxxxx'
	option ieee80211r '1'
	option mobility_domain 'XXXX'
	option ft_over_ds '0'
	option ft_psk_generate_local '1'
	option network 'glan'
	option isolate '1'
	option disabled '1'

config wifi-iface 'wifinet3'
	option device 'radio1'
	option mode 'ap'
	option ssid 'xxxx'
	option encryption 'psk2+ccmp'
	option dtim_period '3'
	option key 'xxxxxxxx'
	option ieee80211r '1'
	option mobility_domain 'XXXX'
	option ft_over_ds '0'
	option ft_psk_generate_local '1'
	option network 'glan'
	option isolate '1'
	option disabled '1'

/etc/config/dhcp ---

config dnsmasq
	option domainneeded '1'
	option rebind_protection '1'
	option rebind_localhost '1'
	option readethers '1'
	option leasefile '/tmp/dhcp.leases'
	option nonwildcard '0'
	option quietdhcp '1'
	list server '127.0.0.1#1053'
	option noresolv '1'
	list notinterface 'wan'
	list interface 'br-lan.1, br-lan.3'
	option localservice '0'

config dhcp 'lan'
	option interface 'lan'
	option netmask '255.255.255.0'
	option limit '16'
	option leasetime '15m'
	option start '201'
	list dhcp_option '46,8'
	list dhcp_option '44,192.168.10.25'
	list dhcp_option '3,192.168.10.1'

config dhcp 'glan'
	option interface 'glan'
	option netmask '255.255.255.0'
	option limit '20'
	option leasetime '1h'
	option start '101'
	list dhcp_option '3,192.168.2.1'

config dhcp 'wan'
	option interface 'wan'
	option ignore '1'
	list ra_flags 'none'

config odhcpd 'odhcpd'
	option maindhcp '0'
	option leasefile '/tmp/hosts/odhcpd'
	option leasetrigger '/usr/sbin/odhcpd-update'
	option loglevel '4'

/etc/config/firewall ---

config defaults
	option output 'ACCEPT'
	option forward 'REJECT'
	option disable_ipv6 '1'
	option drop_invalid '1'
	option input 'DROP'
	option synflood_protect '1'
	option flow_offloading '1'
	option flow_offloading_hw '1'

config zone
	option name 'wan'
	option output 'ACCEPT'
	option forward 'REJECT'
	option masq '1'
	option mtu_fix '1'
	option family 'ipv4'
	option input 'DROP'
	list network 'wan'

config zone
	option name 'lan'
	option input 'ACCEPT'
	option output 'ACCEPT'
	option forward 'ACCEPT'
	list network 'lan'
	list network 'bvpn'
	option family 'ipv4'

config zone
	option name 'glan'
	list network 'glan'
	option output 'ACCEPT'
	option forward 'ACCEPT'
	option input 'ACCEPT'
	option family 'ipv4'

config zone
	option name 'vpn'
	option input 'ACCEPT'
	option output 'ACCEPT'
	option forward 'ACCEPT'
	option family 'ipv4'
	list network 'rvpn'

config forwarding
	option dest 'vpn'
	option src 'lan'

config forwarding
	option dest 'wan'
	option src 'lan'

config forwarding
	option dest 'lan'
	option src 'vpn'

config forwarding
	option dest 'wan'
	option src 'vpn'

config forwarding
	option dest 'wan'
	option src 'glan'

config nat
	list proto 'tcp'
	list proto 'udp'
	option name 'Masquerade VPN to WAN'
	option src_ip '192.168.11.0/24'
	option target 'MASQUERADE'
	option src 'wan'

config rule
	option name 'Allow-DHCP-Renew'
	option src 'wan'
	option proto 'udp'
	option dest_port '68'
	option target 'ACCEPT'
	option family 'ipv4'

config include
	option path '/etc/firewall.user'

Thanks for the reply. I'm familiar enough with VLAN configuration to know that the default management VLAN is 1, and on OpenWrt it's not unusual for the WAN interfaces to get VLAN 2. Haven't heard of anybody putting clamps on anything starting at 3 and up, however. In my case, I'm keeping it simple by having my private LAN and management LAN combined. The guest LAN is kept separated, and (at least with the WNDR4700) the whole collection of devices works well. The only "loose" area is that I don't wall off access to the router or APs from the guests, but I also don't have black hat hacker guests. :slight_smile:

I recommend explicitly indicating the untagged + PVID status by adding :u* to those ports for each bridge VLAN. It will look like this:

config bridge-vlan
	option device 'br-lan'
	option vlan '1'
	list ports 'lan1:u*'
	list ports 'lan2:u*'
	list ports 'lan4:u*'
	list ports 'tap0:u*'

config bridge-vlan
	option device 'br-lan'
	option vlan '3'
	list ports 'lan1:t'
	list ports 'lan3:u*'
	list ports 'lan4:t'
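
As a reading aid for the suffixes used above (t for tagged, u for untagged, * for PVID), here is a small sketch that decodes one port entry. The function name decode_port is hypothetical, and how netifd treats an entry with no suffix at all is an assumption here, so that case is simply reported as "implicit".

```sh
#!/bin/sh
# Decode a bridge-vlan port entry such as "lan1:t" or "lan3:u*".
# Prints: <port> <tagged|untagged> <pvid|no-pvid|implicit>
decode_port() {
    entry=$1
    port=${entry%%:*}
    flags=
    case $entry in
        *:*) flags=${entry#*:} ;;
    esac
    mode=untagged              # assumption: no 't' flag means untagged egress
    pvid=no-pvid
    case $flags in *t*) mode=tagged ;; esac
    case $flags in *\**) pvid=pvid ;; esac
    [ -z "$flags" ] && pvid=implicit   # no suffix: PVID left to the defaults
    echo "$port $mode $pvid"
}

# Example: decode_port 'lan3:u*'
```

Spelling the flags out in the config removes the need to guess what the "implicit" case resolves to.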

Meanwhile, there are issues in your dhcp file.

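Delete the last 3 lines of the dnsmasq section, which include the notinterface and interface entries.

Remove the netmask option from both the 'lan' and 'glan' DHCP server sections.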

Reboot and test again. If that doesn't resolve the issue, I believe that the problem is DHCP related because of the changes you've made to the DNS handling. But we'll cross that bridge when we get there.

Wait a minute. If I didn't make this clear the first time, I'll do so now: When I connect either a wired device (a laptop computer) or a dumb AP to the lan3 (VLAN 3 untagged) port, everything works perfectly. And on VLAN 1 everything works perfectly. There is nothing to investigate re: DNS. I use Unbound in series with dnsmasq as an upstream resolver and it works fine. I did not include a lot of details about this since DNS resolution is never a problem on untagged LAN interfaces. Let's not turn this into a wild goose chase.

Sure, but I'm making a general recommendation here.

The interface and notinterface entries shown in the config file are not necessary and more often than not cause problems. Same with the netmask in the DHCP server sections.

So while yes, the untagged stuff is working, let's make sure that these two things are cleaned up.

Made your suggested changes. The behavior of the router is unchanged. Any traffic using a tagged VLAN 3 interface doesn't arrive from the WAN, but if I connect directly to the untagged VLAN 3 interface (lan3), everything is fine. Also, your suggestions for removing directives to dnsmasq did not result in any changes to the /tmp/etc/dnsmasq.conf.cfg01411c file. I have seen problems with dnsmasq not getting the correct interfaces in the past, however, which is why I have kept those directives until now.
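
A quick way to see which interface bindings actually ended up in the generated dnsmasq config is to grep for them. This is a sketch under assumptions: the function name show_dnsmasq_ifaces is made up, and it assumes the UCI interface and notinterface lists are written out as dnsmasq's interface= and except-interface= options.

```sh
#!/bin/sh
# Print the interface binding lines from one or more dnsmasq config files.
show_dnsmasq_ifaces() {
    grep -E '^(interface|except-interface)=' "$@"
}

# Example on the router (the file suffix varies per system):
#   show_dnsmasq_ifaces /tmp/etc/dnsmasq.conf.*
```

Running it before and after a config change, followed by a dnsmasq restart, shows whether the change reached the generated file.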

I took a look at the OpenWrt issues on Github, and there are at least a couple that seem to be reporting a similar issue. I think I should probably file a report there.

Hi

The following links discuss issues with the mvebu switch:

The mvebu switch was broken in 22.03, and not quite fixed in 23.05.

Upgrade to kernel 6.1 or higher (aka 24.10).

I stopped using 22.03, but I was using a trunked VLAN and didn't notice the switch working as a hub. With 23.05, there was weird behavior with VLANs, but it worked with multiple bridges. I ran the snapshot with the testing kernel 6.1, and it was working okay. Currently on 24.10.1, no issues.
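
To check whether a given system meets the "kernel 6.1 or higher" suggestion, the release string can be compared numerically. A minimal sketch (the function name kernel_at_least is hypothetical, and only the major and minor numbers are compared):

```sh
#!/bin/sh
# Usage: kernel_at_least <major> <minor> <release-string>
# Succeeds if the release (e.g. "5.15.167") is at least major.minor.
kernel_at_least() {
    want_major=$1 want_minor=$2 rel=$3
    major=${rel%%.*}
    rest=${rel#*.}
    minor=${rest%%.*}
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

# Example on the router:
#   kernel_at_least 6 1 "$(uname -r)" && echo "kernel is new enough"
```

With the 5.15.167 kernel from the board output above, the check fails, which matches the advice to upgrade.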


I was going to suggest this, but was beaten by @94121-usr ...

I think this is sound advice. Try the latest because it isn't terribly likely that 23.05 will be fixed if there is a bug there (presumably there could be one or two more maintenance releases before it is EOL'd, but you'll want to move up to 24.10 anyway).

Okay, since there's traction on kernel 6.1 and version 24.10, I'll upgrade to that first and see if the issue persists. Question for you: What methods did you use to detect "hub" operation in version 23.05?

I have not seen any indication of "leakage" between my two VLANs, but if there are precise tools to detect or test for that, I'd like to know about them, because aside from my own use (which uses tagged and untagged VLAN definitions on the LAN ports to managed switches), I am also assisting relatives with a small business who use the same routers... they do not need tagged VLAN support at the wired LAN ports, but they do rely upon having different VLANs available in untagged form at the wired LAN ports. But they sure do not need the mvebu switch acting like a hub!

Hi

The switch, acting like a hub, was fixed with 23.05. But some issues/weirdness remains.

With 23.05, I cannot recall most of the issues; I didn't test anything because they were not reproducible, just random weirdness. The only consistent issue was that Android devices would get SLAAC addresses from other VLAN subnets. It's normal to get multiple IPv6 addresses, but not from firewalled subnets. IPv6 was leaking through VLAN filtering.

With 24.10, everything works.

Okay... I can report that my Linux laptop (an Asus X200CA) has only a 10/100 wired LAN port, but I did not notice any speed degradation in other parts of the network when it was attached. If there's a better way to test for "hub" behavior, I'm all ears. The router did report the port's speed as 100 Mbps (in the bridge device VLAN filtering dialog), but all other interfaces (except tap0, a bridged VPN interface that is presently disabled) showed as operating at 1000 Mbps.

I believe one experienced user in Brazil had found a way to detect hub operation when using IPv6, but I don't use that and don't have any inclination to mess with it on a production router.
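
One generic way to look for hub-like flooding, not something confirmed in this thread, is to capture on a wired client and watch for unicast frames that are neither sent by nor addressed to that client. The sketch below only builds the capture filter; the function name stray_unicast_filter and the example MAC address are made up.

```sh
#!/bin/sh
# Build a pcap filter that matches unicast frames not involving our own MAC.
# "ether multicast" also covers broadcast, so both are excluded.
stray_unicast_filter() {
    echo "not ether host $1 and not ether multicast"
}

# Example on a wired client (replace interface and MAC with your own):
#   tcpdump -eni eth0 "$(stray_unicast_filter 00:11:22:33:44:55)"
```

A switch normally floods a few frames for destinations it has not learned yet, so the occasional hit is expected. A sustained stream of another host's unicast traffic, for example while that host runs a large download, would point to hub behavior.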

The issue was about the packet behaviour of the switch acting like a hub, as described in the issue and in the commit that disabled the target.

It took a while to upgrade (I always start from the stock firmware in the alternate partition and overwrite the OpenWrt partition, then I install and configure 20+ packages), but I now have version 24.10.1 (with kernel 6.6.86), and the issues with traffic on tagged VLAN interfaces have disappeared, with no new issues appearing. So I'd say this issue can be closed and marked as solved. Thanks to you and the other forum participants here for the assistance.
