VLAN, Cannot access LuCI (but ssh fine) [Solved]

hi @arrmo

Actually, bumped into one more oddity - ssh works great, but scp fails. Huh? LOL!

Did you ever try to determine the MTU?
I mean pinging with sizes of 1490, 1496, and 1500 bytes.
If ping, ssh, and DHCP work but LuCI won't, maybe it is a DSA/VLAN (4-byte header) issue?
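
For anyone wanting to try that test, here is a rough sketch of how the probe sizes could be worked out. It assumes a Linux PC with the iputils `ping` (the `-M do` flag forbids fragmentation; BusyBox `ping` on the router may not have it), and the target address is only a placeholder taken from the default LAN config, not something confirmed for the VLAN in question. The idea is that `-s` sets the ICMP payload, so the 20-byte IPv4 header and 8-byte ICMP header have to be subtracted from the packet size you want on the wire.

```shell
#!/bin/sh
# ICMP echo payload size that yields an IPv4 packet of exactly $1 bytes:
# subtract the 20-byte IPv4 header (no options) and the 8-byte ICMP header.
icmp_payload_for_packet() {
    echo $(( $1 - 20 - 8 ))
}

TARGET=192.168.1.1   # assumption: placeholder, replace with the router's address on the VLAN

# Print, rather than run, the probes so the sizes can be checked first.
# 1500 = full Ethernet MTU, 1496 = room for one 4-byte VLAN tag, 1492 = room for two.
for size in 1500 1496 1492; do
    echo "ping -M do -c 3 -s $(icmp_payload_for_packet "$size") $TARGET"
done
```

If the 1500-byte probe gets no reply while the smaller ones do, that would point toward the 4-byte tag not fitting somewhere along the path, which would also fit the pattern of small packets (ssh, DHCP) working while bulk transfers (scp, LuCI pages) stall.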

Really appreciate the comments / thoughts - thanks! Like I mentioned above, I'm going to reset back to defaults, walk through the setup and try to see where things break :grinning_face_with_smiling_eyes:

To try to clarify ... VLAN on the LAN side is working great (with the device not having DSA, if I understand the delta correctly). The issue is that I seem to lose WAN in the process somehow => that's my core / key issue (on the non-DSA router) ... no uplink anymore.

On the DSA router (again, if I have them right), that one I can't get to scp or LuCI, but ssh and DHCP work (over VLAN). Very odd. But getting back to that one, after figuring out why WAN is broken (above, non-DSA case).

Hopefully that's all clear. And again - thanks!

Start with the LAN from the default configuration -- verify that it works properly. Then create one new VLAN and configure it against another port on the built-in switch (as an access port). Keep it simple. If the LAN works but the second network doesn't, post your configuration details and we'll help figure out what might be wrong.


That part it did - before at least ... LOL. That's why I plan to reset, start clean / fresh.

Sorry, just to understand ... do you mean, only add VLAN to a single port, don't mess with the others? And ... "access" port - you mean to check with the VLAN modified port, and one that is not part of the VLAN, correct?

Thanks!

An access port is a port that has only one network assigned -- untagged (no tagged networks).

What I am suggesting, in plain terms, is to start with the default configuration. In that setup, a device with 5 built-in ports would typically have a WAN and then 4 ports dedicated to the LAN. Test this first to make sure everything works as expected. Then, add another network/VLAN (call it LAN2), but assign it to say port 2 as untagged (and turn off VLAN 1 on port 2). Now you'd have port 1 = LAN, port 2 = LAN2. You can easily physically switch your computer's connection between ports 1 and 2 and see if the network functionality is operating as expected.


Makes sense, will give it a try (once back home again, not good when work gets in the way of fun :stuck_out_tongue_winking_eye:).

Thanks!

OK, to store / capture a baseline ... here is what I have currently, before reset. It's very odd. I like your idea of having an access port, so what I did ...


Port 4, connected directly to a PC NIC, no VLAN on.
Port 2, to managed switch, tagged traffic on VLAN 250. That switch is also connected to a PC NIC, with VLAN 250 enabled.

What seems right :laughing: ... the PC NIC on VLAN 250 gets an IP from the correct subnet (assigned to that VLAN). But what is odd,

  1. The directly connected NIC gets an IP from my "other" DHCP server (on the network). It has to be untagged traffic, coming in on Port 4 and back out on Port 2 (to the managed switch). I say that because,
  2. Really odd, and why I have no upstream (WAN) ... I see this in the kernel log. Huh?!?!
# dmesg | grep -i eth1
[   24.831576] ess_edma c080000.edma eth1: Link is Down
[   27.929472] ess_edma c080000.edma eth1: Link is Up - 1Gbps/Full - flow control off

It's not down! There is a connection there, even WAN LED on. Very odd. OK, time to factory reset :stuck_out_tongue_winking_eye:
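
Side note on reading that log: the two lines show eth1 going down and then coming back up about three seconds later, so the last state the kernel reported is up. Below is a small sketch for pulling the most recent link state out of the kernel log. The function name `last_link_state` is made up for illustration, and it assumes the driver logs in the same "Link is ..." format as the lines quoted above.

```shell
#!/bin/sh
# Print the most recent "Link is <state>" word the kernel logged for interface $1.
# Reads kernel log lines on stdin; prints nothing if the interface never appears.
last_link_state() {
    sed -n "s/.* $1: Link is \([A-Za-z]*\).*/\1/p" | tail -n 1
}

# Demo with the two lines quoted above.
# On the router itself the equivalent would be: dmesg | last_link_state eth1
last_link_state eth1 <<'EOF'
[   24.831576] ess_edma c080000.edma eth1: Link is Down
[   27.929472] ess_edma c080000.edma eth1: Link is Up - 1Gbps/Full - flow control off
EOF
```

If this reports the link as up while upstream pings still fail, the problem is more likely in how frames are switched or tagged than in the physical link.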

Take everything out of the equation except the WAN, the router in question, and one computer. Set the computer to get an address via DHCP on an untagged network (i.e. clear any tagged network settings on that computer).

Make sure that you can get connectivity after doing the reset to defaults. Then add in the 2nd network as I described (assign port 2 as an access port for the second network). Connect the PC to port 2 and see what happens.

Agreed! So, after a clean reset - all is OK and working. I can ping upstream, WAN IP looks right, as does the direct connection from my (untagged) PC NIC to Port 4 (getting 192.168.1.x from the router). VLAN (untouched) looks like this,

So then, I removed LAN1, untagged => off. Still all good.

Next up - add VLAN 250, all ports for that VLAN (CPU, LAN1-4 set to off) ... boom! I can no longer ping upstream. The routing table looks unchanged, but it's not working now. Delete that (unused) VLAN ... upstream ... ping is OK again.

But! Then I changed the VLAN ID ... upstream is OK, only as long as that second VLAN ID is 2 - no other value works. Hmm ... but then I can add more, if they are sequentially added => it's only happy if then I add 3, then 4. But then, if I go back and delete 3 ... leaving 1, 2, 4. It's OK. Sounds like a bug to me, no?

But that aside, if I now go back and keep only 1 and 2, with all ports off on VLAN 2, upstream ping is OK. Like this,

But, as soon as I turn on (even untagged) LAN1 ... upstream ping is down (config like below). Set LAN1 back to off (VLAN 2), and upstream is OK. Seems like something is broken, no?

Thanks!

This may be your issue. If so, this is a hardware limitation, not a bug in openwrt.

And actually, it may only allow vlan ids 1-15

Yes, agreed - no argument. But it breaks if I turn things on (as above), only using ID's 1 and 2. Nothing higher than 2 :smiley:

Thanks!

Assuming that you are using the EA3500 on a snapshot (from earlier in the thread), try installing an official stable release of OpenWrt (19.07.8).

Sometimes snapshots can have random things broken.

Actually, swapping between two routers - as they each seem to have issues with VLAN, just different. The debugging above (recent) is on an RT-AC58U. But in any case, did as you mentioned, stepped back to v19.07.8. Exactly the same issue. As soon as anything is tagged or untagged on VLAN 2 (vs. off), then the uplink breaks :frowning_face:. Dang it!

Thanks,

Stick with one router until we get you settled here -- otherwise we could end up battling different issues if the switch chips operate substantially differently.

Please post your /etc/config/network file.

I just configured one of my spare devices with a VLAN config similar to what you're trying to do. I'm using a Linksys E3000 (really old) and 19.07.7 (haven't bothered to update it since it was sitting in a closet). The VLANs work perfectly. Here is my network config file - you can use this as an example.

config interface 'loopback'
	option ifname 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fd11:d771:abe3::/48'

config interface 'wan'
	option ifname 'eth0.1'
	option proto 'dhcp'

config interface 'wan6'
	option ifname 'eth0.1'
	option proto 'dhcpv6'

config interface 'lan'
	option type 'bridge'
	option ifname 'eth0.2'
	option proto 'static'
	option ipaddr '192.168.1.1'
	option netmask '255.255.255.0'
	option ip6assign '60'

config switch
	option name 'switch0'
	option reset '1'
	option enable_vlan '1'

config switch_vlan
	option device 'switch0'
	option vlan '1'
	option ports '0 8t'

config switch_vlan
	option device 'switch0'
	option vlan '2'
	option ports '1 2 4 8t'

config switch_vlan
	option device 'switch0'
	option vlan '200'
	option ports '3 8t'

config interface 'VLAN200'
	option ifname 'eth0.200'
	option proto 'static'
	option ipaddr '192.168.200.1'
	option netmask '255.255.255.0'

EDIT: I just updated to 19.07.8 and it is still working as expected.


I just had another thought...

What is upstream of your OpenWrt router? Do you have a router (or a modem+router combo) connected between your ISP and your OpenWrt device? If you do, it is critical that you do not set up a subnet/VLAN on the OpenWrt router that is in the same address space as the upstream device. So for example, if your upstream router uses 192.168.2.0/24, do not use that range on your OpenWrt router or you will cause the routing to break.
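
A quick way to sanity-check this is to compare the upstream range against each subnet planned on the OpenWrt side. The following is only a sketch: the helper names `ip_to_int` and `cidr_overlap` are made up here, the upstream range is the example from this post, and the local ranges are the ones appearing in the sample config earlier in the thread. It also assumes the shell does 64-bit arithmetic, which is typical but not guaranteed everywhere.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an integer (assumes 64-bit shell arithmetic).
ip_to_int() {
    echo "$1" | {
        IFS=. read -r a b c d
        echo $(( (a << 24) + (b << 16) + (c << 8) + d ))
    }
}

# Succeed (exit 0) when two CIDR blocks share any addresses.
# Two blocks overlap exactly when they agree on the bits of the shorter prefix.
cidr_overlap() {
    net1=$(ip_to_int "${1%/*}"); len1=${1#*/}
    net2=$(ip_to_int "${2%/*}"); len2=${2#*/}
    len=$len1
    if [ "$len2" -lt "$len" ]; then len=$len2; fi
    shift_bits=$(( 32 - len ))
    [ $(( net1 >> shift_bits )) -eq $(( net2 >> shift_bits )) ]
}

UPSTREAM=192.168.2.0/24   # example upstream range from this post
for local_net in 192.168.1.0/24 192.168.2.0/24 192.168.200.0/24; do
    if cidr_overlap "$UPSTREAM" "$local_net"; then
        echo "$local_net overlaps $UPSTREAM: pick another range"
    else
        echo "$local_net is fine"
    fi
done
```

Only the second range collides with the example upstream network, so that is the one that would need to move.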

Your config is a canonical example of splitting ports/VLANs mentioned above.
Meanwhile the OP is trying to disable the CPU port and use mixed tagging.
I'm afraid that the router's built-in switch may not support this kind of setup.
Probably the hardware implementation supports only limited config scenarios.

Yeah, you're right about that... but my thinking is that hopefully we can get a basic VLAN configuration working for the OP in general. Unless there is a hardware limitation, it seems that there is something wrong with the way that the OP is configuring the VLANs. My E3000 (from ~2010; I keep this as my little dev box to demonstrate exactly these types of things) supports VLANs without issue.

I don't understand why there would be any reason to turn off the VLAN on the CPU port, unless the device was purely acting as a managed switch and not a router (wrt the VLAN(s) in question). If working in that context, though, I can't see why turning off the CPU for a VLAN would cause the other network(s) to fail to function properly.

Anyway, in the example below, I set up a trunk port on physical port 4 (port 1 in the config) with VLAN 2 (LAN) untagged and VLAN 200 tagged. This also worked exactly as expected on the E3000. I could have made VLAN 2 tagged, too, so that the trunk would be only using tagged networks, but I just wanted to prove that the trunk worked as per the 802.1q spec on this device (and it does). I even turned off the CPU on VLAN 200 and VLAN 2 continued to function properly (obviously VLAN 200 became merely switched with no functional connection to the routing layers of the system). I also proved that VLAN 200 worked for switching with port 2 untagged, port 4 tagged, and the CPU off.

It is entirely possible that the newer hardware that the OP is using was cost-reduced (relative to my really old device) such that they went with a switch that was severely limited for VLANs, but from what I can tell, the chip in the EA3500 is supposed to be VLAN aware.

config switch_vlan
	option device 'switch0'
	option vlan '1'
	option ports '0 8t'

config switch_vlan
	option device 'switch0'
	option vlan '2'
	option ports '1 2 4 8t'

config switch_vlan
	option device 'switch0'
	option vlan '200'
	option ports '1t 3 8t'
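
To make the port notation above easier to read, here is a small sketch that classifies a port from its VLAN memberships. A bare number means untagged and a trailing `t` means tagged. The function name `port_role` and the "vlan-id port-list" input format are made up for illustration. The data is copied from the three `switch_vlan` sections above, and "access" follows the definition given earlier in the thread (exactly one untagged network, no tagged ones).

```shell
#!/bin/sh
# Classify switch port $1 from "vlan-id port-list" lines on stdin.
# Port lists use the notation from the config: "3" = untagged, "8t" = tagged.
port_role() {
    port=$1
    untagged=0
    tagged=0
    while read -r vlan ports; do
        for p in $ports; do
            case $p in
                "${port}t") tagged=$((tagged + 1)) ;;
                "$port")    untagged=$((untagged + 1)) ;;
            esac
        done
    done
    if [ "$tagged" -eq 0 ] && [ "$untagged" -eq 1 ]; then
        echo access
    elif [ $((tagged + untagged)) -eq 0 ]; then
        echo unused
    else
        echo trunk
    fi
}

# VLAN membership copied from the E3000 trunk example above.
vlans='1 0 8t
2 1 2 4 8t
200 1t 3 8t'

for port in 1 3 8; do
    echo "port $port: $(echo "$vlans" | port_role "$port")"
done
```

With that data, port 1 comes out as a trunk (VLAN 2 untagged plus VLAN 200 tagged), port 3 as an access port for VLAN 200, and port 8 (the CPU) as tagged on every VLAN.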

Sorry for the confusion above - I was trying two routers initially, as I was having (different) issues with each :laughing:. But all of the recent info is on one, the RT-AC58U (though I can use the EA3500 if you prefer for now, just yell if you want me to). To the questions above ... yes, I am avoiding routing loops - only two ports are connected (to demonstrate the issue I am having) ... WAN => to my switch / uplink, and LAN4, directly to my PC NIC. Nothing else connected.

So then, running with v19.07.8, I have two cases (as above), both with LAN1 off (just the way I last had it)

  1. VLAN 1 = untagged, all ports (CPU, LAN2-4), VLAN 2 = off, all ports. This case works fine, can ping upstream.

config interface 'loopback'
	option ifname 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fd77:b9d1:c221::/48'

config interface 'lan'
	option type 'bridge'
	option ifname 'eth0'
	option proto 'static'
	option ipaddr '192.168.1.1'
	option netmask '255.255.255.0'
	option ip6assign '60'

config device 'lan_eth0_dev'
	option name 'eth0'
	option macaddr '...4c:a8'

config interface 'wan'
	option ifname 'eth1'
	option proto 'dhcp'

config device 'wan_eth1_dev'
	option name 'eth1'
	option macaddr '...4c:ac'

config interface 'wan6'
	option ifname 'eth1'
	option proto 'dhcpv6'

config switch
	option name 'switch0'
	option reset '1'
	option enable_vlan '1'

config switch_vlan
	option device 'switch0'
	option vlan '1'
	option vid '1'
	option ports '0 3 2 1'

config switch_vlan
	option device 'switch0'
	option vlan '2'
	option vid '2'
  2. VLAN 1 = the same as case 1, no change. VLAN 2 = just turn on LAN1 ... tagged (or untagged, same result). No cable even connected to it. But then ... upstream fails, cannot even ping.

config interface 'loopback'
	option ifname 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fd77:b9d1:c221::/48'

config interface 'lan'
	option type 'bridge'
	option ifname 'eth0'
	option proto 'static'
	option ipaddr '192.168.1.1'
	option netmask '255.255.255.0'
	option ip6assign '60'

config device 'lan_eth0_dev'
	option name 'eth0'
	option macaddr '...4c:a8'

config interface 'wan'
	option ifname 'eth1'
	option proto 'dhcp'

config device 'wan_eth1_dev'
	option name 'eth1'
	option macaddr '...4c:ac'

config interface 'wan6'
	option ifname 'eth1'
	option proto 'dhcpv6'

config switch
	option name 'switch0'
	option reset '1'
	option enable_vlan '1'

config switch_vlan
	option device 'switch0'
	option vlan '1'
	option vid '1'
	option ports '0 3 2 1'

config switch_vlan
	option device 'switch0'
	option vlan '2'
	option vid '2'
	option ports '4t'

I did compare the files, the only difference seems to be that single tagged port. Thoughts?

Thanks!