[Solved] Xiaomi mi ac2100 VLAN BUG?

EDIT: Fixed error, mikrotik<>openwrt cable should have untagged 192.168.1.0/24 and tagged vlan id 20 192.168.20.0/24
Something is very wrong with VLANs on the Xiaomi AC2100, or it's PEBCAK.

+---------------+                             +-------------------+                            +----------------------------------------+
|               |                             |                   |                            |                                        |
|               | untagged 192.168.1.0/24     |   mikrotik        | untagged 192.168.10.0/24   |              openwrt                   |
|    router     +---------------------------->|                   +--------------------------->|                                        |
|               |  vlan 10 192.168.10.0/24    |   vlan switch     |  vlan 20 192.168.20.0/24   |              vlan switch               |
|               |  vlan 20 192.168.20.0/24    |                   |                            |              + dumb ap                 |
+---------------+                             +-+-----------------+                            +-+-----------------------+--------------+
                                                |                                                |                       |
                                                |                                                |                       |
                                                |                                                |                       |
                                                |untagged 192.168.1.0/24                         |untagged               |untagged
                                                | vlan 10 192.168.10.0/24                        |192.168.10.0/24        |192.168.20.0/24
                                                | vlan 20 192.168.20.0/24                        |                       |
                                                |                                                |                       |
                                                |                                                |                       |
                                                v                                                v                       v
                                              +-------------------+                           +------------------+     +-----------------+
                                              |                   |                           |                  |     |                 |
                                              |     desktop       |                           |       nas        |     |   notebook      |
                                              |                   |                           |                  |     |                 |
                                              |     vlans         |                           |                  |     |                 |
                                              |                   |                           |                  |     |                 |
                                              +-------------------+                           +------------------+     +-----------------+

Everything works fine except openwrt vlan setup.

I have several networks (DHCP, routing and firewalling all happen on the router). The main two are:
192.168.10.0 - trusted hosts
192.168.20.0 - less trusted hosts

There are interfaces on openwrt:
lan1@eth0 - cable from mikrotik, there are untagged (192.168.10.0) and vlan10 (192.168.20.0) traffic
lan2@eth0 - cable to nas, only untagged (192.168.10.0)
lan3@eth0 - cable to notebook, only untagged (192.168.20.0)
wan - only untagged (192.168.20.0)
wifi ap which will be linked to "openwrt interface" of bridge br-lan10 (see below)

I don't know what the lanX interfaces have in common with eth0, or what eth0 actually is. Maybe some sort of switch?

I've created new interface with vlan 20 on lan1.
Then created two bridges:

  • br-lan10 with ports lan1, lan2
  • br-lan20 with ports lan1.20, lan3, wan

Then i created "openwrt interfaces" for those bridges with static ipv4 addresses. Because otherwise openwrt would not create bridges, thank you very much.

At this moment my network 192.168.10.0 works fine - all ports and wifi work as they should.

I can't say the same about 192.168.20.0:

  • i can ping from (router, desktop) to openwrt
  • i can NOT ping from (router, desktop) to notebook

After many hours I found a half-working solution: enable VLAN filtering on br-lan20 and add an arbitrary VLAN ID (why? I tried several different ones, it doesn't matter which) with U|* for every port in this bridge (lan1.20, lan3, wan).
Now:

  • i can NOT ping from (router, desktop) to openwrt
  • i can ping from (router, desktop) to notebook

Another workaround - change "openwrt interface" device from br-lan20 to br-lan20.20, now openwrt is also pingable.

But there is no way to create wifi ap on this bridge. When trying to link wifi ap to "openwrt interface" owrt_br_lan20 - no traffic goes through wifi. When trying to link wifi ap to something like owrt_br_lan20.20 luci says "Expecting: valid UCI identifier".

Also cpu on openwrt ~50% used when testing desktop<>notebook connection with iperf3 (one direction, 500Mbit/s)

So:

  • openwrt
  • shitty xiaomi hardware
  • pebcak
    ?

almost forgot, here is latest half-working config (vlan ids for br-lan20 ports lan1.20, lan3, wan set to 99)


config interface 'loopback'
	option device 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option packet_steering '1'
	option ula_prefix 'fd68:ff74:a7d0::/48'

config device
	option name 'lan1'
	option ipv6 '0'

config device
	option name 'lan2'
	option ipv6 '0'

config device
	option name 'lan3'
	option ipv6 '0'

config device
	option name 'wan'
	option ipv6 '0'

config device
	option type 'bridge'
	option name 'br-lan10'
	option bridge_empty '1'
	option stp '1'
	option ipv6 '0'
	list ports 'lan1'
	list ports 'lan2'

config interface 'owrt_br_lan10'
	option proto 'static'
	option device 'br-lan10'
	option ipaddr '192.168.10.249'
	option netmask '255.255.255.0'
	option gateway '192.168.10.1'
	list dns '192.168.10.1'
	option delegate '0'

config device
	option type '8021q'
	option ifname 'lan1'
	option vid '20'
	option name 'lan1.20'
	option ipv6 '0'

config device
	option type 'bridge'
	option name 'br-lan20'
	list ports 'lan1.20'
	list ports 'lan3'
	list ports 'wan'
	option bridge_empty '1'
	option stp '1'
	option ipv6 '0'

config interface 'owrt_br_lan20'
	option proto 'static'
	option device 'br-lan20.99'
	option ipaddr '192.168.20.111'
	option netmask '255.255.255.0'
	option gateway '192.168.20.1'
	option defaultroute '0'
	option delegate '0'

config bridge-vlan
	option device 'br-lan20'
	option vlan '99'
	list ports 'lan1.20:u*'
	list ports 'lan3:u*'
	list ports 'wan:u*'

There are lots of problems with this config. Most likely no bugs related to VLANs here, just incorrect configuration methods.

Reset your OpenWrt device to defaults and post the network config here... we'll guide you through adding the VLANs correctly.

Also, please revisit your description and diagram -- you have inconsistencies about what you're calling VLAN 10, VLAN 20, and "untagged".

No, i've checked diagram, everything is correct.

There are lots of problems with this config

For example?

See factory config at the end.

Again, i need:

  • one bridge with lan1, lan2 ports and wifi ap
  • another bridge with lan1.20, lan3, wan ports and another wifi ap
  • no ipv6
  • no dhcp, routing, nothing.

Factory config:


config interface 'loopback'
	option device 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fde9:54b5:7e19::/48'
	option packet_steering '1'

config device
	option name 'br-lan'
	option type 'bridge'
	list ports 'lan1'
	list ports 'lan2'
	list ports 'lan3'

config interface 'lan'
	option device 'br-lan'
	option proto 'static'
	option ipaddr '192.168.1.1'
	option netmask '255.255.255.0'
	option ip6assign '60'

config interface 'wan'
	option device 'wan'
	option proto 'dhcp'

config interface 'wan6'
	option device 'wan'
	option proto 'dhcpv6'

Are you certain? For example, I see the router to 'tik connection has 3 VLANs:

  • untagged 192.168.1.0/24
  • VLAN 10 (tagged) 192.168.10.0/24
  • VLAN 20 (tagged) 192.168.20.0/24.

Then, from the 'tik to the OpenWrt device, there are two:

  • untagged 192.168.10.0/24 <-- this subnet was tagged VLAN 10 previously, it's fine for this to be untagged here if you want.
  • tagged VLAN 10 192.168.20.0/24 <--- this subnet was VLAN 20. How did it suddenly become VLAN 10?

You have the same ports defined in multiple bridges, you've mixed dotted notation with DSA syntax in invalid ways, incomplete definitions of what is tagged vs untagged on each port, etc. There's a lot wrong, but now that it is reset, we can build up from there.
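To make the first two points concrete, here is one reading of the earlier config, trimmed to the relevant options (an illustration of the pattern, not a complete diagnosis): physical port lan1 is a member of br-lan10, a dotted 802.1q device stacked on that same port is a member of br-lan20, and br-lan20 additionally gets a bridge-vlan section.

```uci
# lan1 itself is a member of the first bridge ...
config device
	option type 'bridge'
	option name 'br-lan10'
	list ports 'lan1'
	list ports 'lan2'

# ... while a dotted 802.1q device on the same physical port
config device
	option type '8021q'
	option ifname 'lan1'
	option vid '20'
	option name 'lan1.20'

# ... is a member of a second bridge
config device
	option type 'bridge'
	option name 'br-lan20'
	list ports 'lan1.20'
	list ports 'lan3'
	list ports 'wan'

# and that second bridge also has VLAN filtering layered on top
config bridge-vlan
	option device 'br-lan20'
	option vlan '99'
	list ports 'lan1.20:u*'
	list ports 'lan3:u*'
	list ports 'wan:u*'
```

The approach used from here on is a single bridge that holds every port, with one bridge-vlan section per VLAN.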

Before going further...

  • What subnet and address do you want the OpenWrt router to use for its management?

Yes, that is correct.

this subnet was VLAN 20. How did it suddenly become VLAN 10?

By creating vlan interfaces and creating bridges with different interfaces on mikrotik switch.
Yes again that is working configuration. In fact i previously had an old mikrotik instead of current openwrt xiaomi ac2100 and it worked fine.

You have the same ports defined in multiple bridges

I would like to see an example. Don't get me wrong, i didn't sleep this night and maybe missing something.

you've mixed dotted notation with DSA syntax in invalid ways, incomplete definitions of what is tagged vs untagged on each port

I used luci. Quite a few times I deleted almost everything in the "network" tab. I chose the devices that luci offered.

What subnet and address

Lets say 192.168.10.249/24 for br-lan10 and 192.168.20.111/24 for br-lan20

Ok... if this is actually working, sure. I would say that it is bad practice to change the VLAN ID of a network... it becomes confusing and the configuration (as an entire system) may become fragile as a result. but we'll work with what you're saying you have there.

Best practice is to have an address on only the network that is used for management in a dumb AP/switch configuration. So for example, if you have a trusted lan, a guest network, and another one for IoT, you'd set your trusted lan as the management network and the other two would not hold addresses. This protects the device itself from potential threats on the other networks, and simplifies the overall network topology and configuration.

For now, I'll use the .10.249 address. I'm working with the following goals:

  • port lan1 is the uplink trunk with an untagged network (192.168.10.0/24) and tagged VLAN 10 (192.168.20.0/24)
  • port lan2 is untagged for the subnet 192.168.10.0/24 (VLAN 10 upstream at the router, untagged at the uplink to this router)
  • port lan3 and wan will be untagged VLAN 10.

First, edit br-lan to contain the wan port so it looks like this:

config device
	option name 'br-lan'
	option type 'bridge'
	list ports 'lan1'
	list ports 'lan2'
	list ports 'lan3'
	list ports 'wan'

Remove the option device 'wan' from both of the wan interfaces.
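With that change applied to the factory config shown earlier, the two stanzas would be left looking roughly like this (a sketch based on that factory config, nothing else is altered):

```uci
config interface 'wan'
	option proto 'dhcp'

config interface 'wan6'
	option proto 'dhcpv6'
```

The wan port itself is now just another member of br-lan.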

create a bridge VLAN for the management network (which really should be VLAN 10, but we'll call it vlan 1):

config bridge-vlan
	option device 'br-lan'
	option vlan '1'
	list ports 'lan1:u*'
	list ports 'lan2:u*'

and now one for vlan 10 (aka VLAN 20 -- do you see why this is not good practice??):

config bridge-vlan
	option device 'br-lan'
	option vlan '10'
	list ports 'lan1:t'
	list ports 'lan3:u*'
	list ports 'wan:u*'

now, edit the lan network interface to use br-lan.1 and reflect the correct address:

config interface 'lan'
	option device 'br-lan.1'
	option proto 'static'
	option ipaddr '192.168.10.249'
	option netmask '255.255.255.0'

and create a new network interface that is unmanaged for vlan10 (aka vlan20)

config interface 'vlan10-aka-vlan20'
	option device 'br-lan.10'
	option proto 'none'

Now, you must disable the DHCP server on the OpenWrt for the lan network (disable it explicitly in the config file using the ignore interface option).
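For reference, a minimal sketch of that change in /etc/config/dhcp, assuming the DHCP section is still named 'lan' as on a default install. The section name and any other options already in your file are assumptions here; the ignore line is the only point.

```uci
config dhcp 'lan'
	option interface 'lan'
	# do not serve DHCP on this interface; the upstream router does that
	option ignore '1'
```

Restart dnsmasq (/etc/init.d/dnsmasq restart) or reboot for it to take effect.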

Associate the wireless networks as desired and you're good to go.
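A hedged sketch of what that association might look like in /etc/config/wireless. The section names, radio names, SSIDs and keys below are placeholders and not taken from this thread; the relevant part is that option network names one of the network interfaces defined in /etc/config/network.

```uci
# AP for the trusted network (192.168.10.0/24)
config wifi-iface 'wifinet_trusted'
	option device 'radio0'
	option mode 'ap'
	option ssid 'trusted-example'
	option encryption 'psk2'
	option key 'change-me-please'
	option network 'lan'

# AP for the less trusted network (192.168.20.0/24);
# 'vlan20' stands for whatever the unmanaged interface is called
config wifi-iface 'wifinet_less_trusted'
	option device 'radio1'
	option mode 'ap'
	option ssid 'less-trusted-example'
	option encryption 'psk2'
	option key 'change-me-too'
	option network 'vlan20'
```

Network interface names are best kept to letters, digits and underscores so they can be referenced cleanly from here.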


EDIT: Per conversation later in the thread, the OP really needed to use VLAN 20 tagged. Also, the dashes in the network interface name may have been problematic. Therefore, we make the following changes:

config bridge-vlan
	option device 'br-lan'
	option vlan '20'
	list ports 'lan1:t'
	list ports 'lan3:u*'
	list ports 'wan:u*'

config interface 'vlan20'
	option device 'br-lan.20'
	option proto 'none'

Optionally, the VLAN 1 assignment could be changed to VLAN 10 just for consistency with the upstream, although it is not required since VLAN 10 is untagged on the trunk:

config bridge-vlan
	option device 'br-lan'
	option vlan '10'
	list ports 'lan1:u*'
	list ports 'lan2:u*'

config interface 'lan'
	option device 'br-lan.10'
	option proto 'static'
	option ipaddr '192.168.10.249'
	option netmask '255.255.255.0'

Edited network file and scp -O network root@192.168.1.1:/etc/config
Then ssh root@192.168.1.1 reboot now

Did not disable dhcp. Just pulled cord from lan1. So there is notebook in lan2 and nothing else.

Manually added ip to notebook's interface
ip addr add 192.168.10.233/24 dev ethernet

Can't ping 192.168.10.249. Wireshark shows only notebook's outgoing traffic.

Something wrong with this config:

# cat /etc/config/network

config interface 'loopback'
	option device 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fde9:54b5:7e19::/48'
	option packet_steering '1'

config device
	option name 'br-lan'
	option type 'bridge'
	list ports 'lan1'
	list ports 'lan2'
	list ports 'lan3'
	list ports 'wan'

config interface 'lan'
	option device 'br-lan'
	option proto 'static'
	option ipaddr '192.168.10.249'
	option netmask '255.255.255.0'

config bridge-vlan
	option device 'br-lan'
	option vlan '1'
	list ports 'lan1:u*'
	list ports 'lan2:u*'

config bridge-vlan
	option device 'br-lan'
	option vlan '10'
	list ports 'lan1:t'
	list ports 'lan3:u*'
	list ports 'wan:u*'

config interface 'vlan10-aka-vlan20'
	option device 'br-lan.10'
	option proto 'none'


Are you sure that the netbook’s address was added properly? Why not just disable the dhcp server on the openwrt side and plug in lan1. Then, set the netbook to get an address via dhcp and see if it does.

netbook’s address was added properly?

yes.

reset.
disable firewall, dnsmasq, odhcpd
copied that network config again.
reboot

now with cable from mikrotik plugged into lan1 and notebook in lan2.
bridge (at least for the 192.168.10.0/24 network) works, but openwrt is invisible (cant ping its addr 192.168.10.249)

I see why. You forgot to change the device of the lan network interface to use br-lan.1

Got it somewhat working.


config interface 'loopback'
	option device 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fde9:54b5:7e19::/48'
	option packet_steering '1'

config device
	option name 'br-lan'
	option type 'bridge'
	list ports 'lan1'
	list ports 'lan2'
	list ports 'lan3'
	list ports 'wan'

config interface 'lan'
	option device 'br-lan.1'                      # fixed
	option proto 'static'
	option ipaddr '192.168.10.249'
	option netmask '255.255.255.0'

config bridge-vlan
	option device 'br-lan'
	option vlan '1'
	list ports 'lan1:u*'
	list ports 'lan2:u*'

config bridge-vlan
	option device 'br-lan'
	option vlan '10'                  # shouldnt it be "20" ?
	list ports 'lan1:t'
	list ports 'lan3:u*'
	list ports 'wan:u*'

config interface 'vlan10-aka-vlan20'
	option device 'br-lan.10'         # and here also "20" ?
	option proto 'none'

bridge for lan1, lan2 is working - can get ip addr from dhcp from my router (not openwrt).

bridge for lan1-with-tagged-id-20, lan3, wan - NOT working. No ip from dhcp.

no default route - can't download packages. after manually adding default route "ip route add default via 192.168.10.1" still can't download - no resolver probably. this is not important at the moment.

anyway, luci presented me with nice red error when tried to go to "Network > Interfaces"

RPCError
RPC call to uci/get failed with ubus code 9: Unspecified error
  at ClassConstructor.handleCallReply (http://192.168.10.249/luci-static/resources/rpc.js?v=git-23.306.39416-c86c256:15:3)

This doesn't really surprise me... I called attention to the fact that, in your diagram, your VLAN IDs for the 192.168.20.0/24 network suddenly change through the 'Tik switch.

From the router to the switch, 192.168.20.0/24 was shown as tagged VLAN 20,

and then for some reason from the 'Tik switch to the OpenWrt device it was shown as tagged VLAN 10.

You said that the 192.168.20.0/24 network's "shift" from VLAN 20 to VLAN 10 was both intentional and a working configuration through some bridges or whatever in the 'Tik.

If that's true, it's certainly confusing, and it is bad practice to change VLAN IDs.

But... I followed your diagram and your confirmation.

Now, you're asking whether those values should be 20.

And my answer would be, normally YES! But it does depend on your 'Tik switch, which I suspect is either misconfigured or actually using VLAN 20 (as per my confusion and questions above). Check your switch configuration, and if it is actually using tagged VLAN 20 (which is what I would expect and advise), change those values to 20.

As for the missing default route: yes... this is expected. You didn't mention any need for the device to be able to reach the internet itself...

You did not need to add a default route. Instead, we add a gateway and dns to the lan network interface. From there, OpenWrt will handle the route automatically. You might want to delete that route -- let's take a look at what you added.

The lan interface should look like this (note the addition of the gateway and DNS -- the gateway must be the actual router address, please adapt if yours is different; the DNS can be the gateway address in most situations, or you can use a public DNS server):

config interface 'lan'
	option device 'br-lan.1'
	option proto 'static'
	option ipaddr '192.168.10.249'
	option netmask '255.255.255.0'
	option gateway '192.168.10.254'
	list dns '192.168.10.254'

Yes. mikrotik<>openwrt cable should have untagged 192.168.1.0/24 and tagged vlan id 20 192.168.20.0/24, fixed in diagram in the first post. I'm sorry. I really need some attention.

That is why you mentioned inconsistencies.

So i've changed lower part of the config to be vlan id 20.

192.168.10.0/24 net is working fine,
but
192.168.20.0/24 net is NOT working. The notebook is not getting an IP from DHCP (when connected to lan3 or wan).

Latest /etc/config/network
config interface 'loopback'
	option device 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fde9:54b5:7e19::/48'
	option packet_steering '1'

config device
	option name 'br-lan'
	option type 'bridge'
	list ports 'lan1'
	list ports 'lan2'
	list ports 'lan3'
	list ports 'wan'

config interface 'lan'
	option device 'br-lan.1'
	option proto 'static'
	option ipaddr '192.168.10.249'
	option netmask '255.255.255.0'
	option gateway '192.168.10.1'
	list dns '192.168.10.1'

config bridge-vlan
	option device 'br-lan'
	option vlan '1'
	list ports 'lan1:u*'
	list ports 'lan2:u*'

config bridge-vlan
	option device 'br-lan'
	option vlan '20'
	list ports 'lan1:t'
	list ports 'lan3:u*'
	list ports 'wan:u*'

config interface 'vlan10-aka-vlan20'
	option device 'br-lan.20'
	option proto 'none'

Verify your switch configuration -- make sure that VLAN 20 is a tagged member of the physical port that is connected to the OpenWrt device.

Also, verify that the connection between the 'Tik switch and the OpenWrt device is made via port lan1.

Verify your switch configuration

yes it is vlan id 20 (physical port eth22 - this is where mikrotik<>openwrt cable connected)

add bridge=bridge20 ingress-filtering=no interface=listlan20
add interface=eth22 name=eth22vlan20 vlan-id=20
add interface=eth22vlan20 list=listlan20

'Tik switch and the OpenWrt device is made via port lan1

Can confirm.

Do you have another device (other than the netbook) that you can try?

Got an ancient pc. Booted linux mint live.
Again
port lan2 on openwrt - 192.168.10.0/24 as it should be
and
port wan or lan3 on openwrt - nothing, but there should be 192.168.20.0/24

ok... let's do this...

just as a very temporary test, disconnect the 'tik switch from the router, and directly connect the router to the OpenWrt lan1 port.

Then run the same test on VLAN 20.

The point of this test is to verify that the switch isn't causing the problem (if it still doesn't work, we'll still have a few followup tests).

connected router<>openwrt (lan1) with cable.

got 192.168.1.0/24 when connected to (lan2 on openwrt) with notebook. this is correct - this net is untagged on router's internal-facing port.

nothing when connecting notebook to lan3