Out of curiosity, I tried an 802.11s mesh with batman-adv in my network. Yes, I know that it's not a good solution with only two wireless devices; still, the goal was to learn, and it has been achieved, minus one question.
The target setup was to pass two separate VLANs (VID 1 and VID 4) through a wireless mesh. Mesh + batman-adv = new bat0 interface, which is documented to be VLAN-aware.
I already have a VLAN-aware bridge as br-lan; I have created two VLANs as follows, and it works this way:
config interface 'bat0'
	option proto 'batadv'
	option routing_algo 'BATMAN_V'
	option aggregated_ogms '1'
	option bonding '1'
	option bridge_loop_avoidance '1'
	option gw_mode 'off'
	option hop_penalty '30'
	option defaultroute '0'

# This is referenced from /etc/config/wireless
config interface 'nwi0'
	option proto 'batadv_hardif'
	option master 'bat0'
	option mtu '2304'
	option defaultroute '0'

config device
	option name 'br-lan'
	option type 'bridge'
	list ports 'bat0.1'
	list ports 'bat0.4'
	list ports 'lan1'
	list ports 'lan2'
	list ports 'lan3'
	list ports 'lan4'
	list ports 'lan5'

config interface 'lan'
	option device 'br-lan.1'
	option proto 'static'
	option ipaddr '192.168.10.1'
	option netmask '255.255.255.0'
	option ip6assign '64'
	list ip6class 'local'
	list ip6class 'wan_6'

config interface 'rus'
	option proto 'static'
	option device 'br-lan.4'
	option ipaddr '192.168.13.1'
	option netmask '255.255.255.0'
	option defaultroute '0'
	option delegate '0'
	option ip4table '4'

config bridge-vlan
	option device 'br-lan'
	option vlan '1'
	list ports 'bat0.1:u*'
	list ports 'lan1'
	list ports 'lan2'
	list ports 'lan3'
	list ports 'lan4'

config bridge-vlan
	option device 'br-lan'
	option vlan '4'
	list ports 'bat0.4:u*'
	list ports 'lan5'

# "unmanaged" doesn't bring the interfaces up, so "static"
config interface 'bat01'
	option proto 'static'
	option device 'bat0.1'

config interface 'bat04'
	option proto 'static'
	option device 'bat0.4'
...plus the same setup, but with the 192.168.x.2 addresses, on the other side. Ignore the option ip4table '4' on the rus interface; it only affects routing, while this post is 100% about Layer 2.
With the above setup, I can ping 192.168.10.1 and 192.168.13.1 from the other side, and the packets go through the expected VLANs.
The question is: why do I have to create individual VLANs on the bat0 interface and list them as untagged bridge ports, thus defeating the point of a VLAN-aware bridge?
In other words, why doesn't it work if I delete the "bat01" and "bat04" interfaces, and replace the "bridge-vlan" sections as follows?
# no "bat01" and "bat04" interfaces
config device
	option name 'br-lan'
	option type 'bridge'
	list ports 'bat0'
	list ports 'lan1'
	list ports 'lan2'
	list ports 'lan3'
	list ports 'lan4'
	list ports 'lan5'

config bridge-vlan
	option device 'br-lan'
	option vlan '1'
	list ports 'bat0:t'
	list ports 'lan1'
	list ports 'lan2'
	list ports 'lan3'
	list ports 'lan4'

config bridge-vlan
	option device 'br-lan'
	option vlan '4'
	list ports 'bat0:t'
	list ports 'lan5'
The kernel then starts spewing messages like this:
[16819.958117] batman_adv: bat0: adding TT local entry 94:83:c4:a7:ab:c2 to non-existent VLAN 4
Searching for this message leads to the batman-adv FAQ, which says:
batman-adv since 2014.0.0 is 802.1Q VLAN-aware. It is only able to forward VLAN frames when it knows about the VLAN. This can either be done by creating a 802.1Q VLAN device with the correct VID on top of the batadv (bat0) device:
ip link add link bat0 name bat0.23 type vlan id 23
Or in case of a VLAN-aware bridge, it is better to add the VLANs as required to the specific ports:
bridge vlan add vid 23 dev bat0
Well, the first method of letting batman-adv know about the VLANs definitely works, and is implemented in the first (working) UCI configuration snippet.
I believe the second (non-working) UCI configuration snippet to be equivalent to the bridge vlan add vid XX dev bat0 method, which the FAQ suggests should also work, and should be preferred.
I also tried removing bat0 from the bridge entirely, adding it manually using the ip link set dev bat0 master br-lan command, and running the suggested bridge command manually to add the VLAN. This results in no traffic and many kernel messages about the non-existent VLAN.
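For reference, the manual test boils down to roughly the sequence below. This is only a sketch: the `run` helper, the `attach_bat0` function name, and the `DRY_RUN` variable are illustrative additions (so that the commands can be printed without touching a live bridge), and the VID list is the one from my configuration.

```shell
#!/bin/sh
# Sketch of the manual test. "run", "attach_bat0" and DRY_RUN are
# illustrative helpers, not part of OpenWrt or batman-adv.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"      # only print the command
    else
        "$@"           # execute it (needs root and existing bat0/br-lan devices)
    fi
}

attach_bat0() {
    # Add bat0 to the VLAN-aware bridge by hand
    run ip link set dev bat0 master br-lan
    # Add both VLANs as tagged members on the bat0 port
    for vid in 1 4; do
        run bridge vlan add vid "$vid" dev bat0
    done
}
```

Setting `DRY_RUN=1` and then calling `attach_bat0` prints the three commands; without it, they are executed for real.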
In the (non-working) manual test, the output of bridge vlan is identical to that from the second (i.e., non-working) UCI-based config:
root@gl-inet-main:~# bridge vlan
port              vlan-id
lan2              1 PVID Egress Untagged
lan3              1 PVID Egress Untagged
lan4              1 PVID Egress Untagged
lan5              4 PVID Egress Untagged
lan1              1 PVID Egress Untagged
br-lan            1
                  4
phy0-ap0          1 PVID Egress Untagged
phy1-ap0          1 PVID Egress Untagged
phy1-ap1          1 PVID Egress Untagged
bat0              1
                  4
phy0-ap1          4 PVID Egress Untagged
For comparison, this is the output in the working case:
root@gl-inet-main:~# bridge vlan
port              vlan-id
lan2              1 PVID Egress Untagged
lan3              1 PVID Egress Untagged
lan4              1 PVID Egress Untagged
lan5              4 PVID Egress Untagged
lan1              1 PVID Egress Untagged
br-lan            1
                  4
phy0-ap0          1 PVID Egress Untagged
phy1-ap0          1 PVID Egress Untagged
phy1-ap1          1 PVID Egress Untagged
phy0-ap1          4 PVID Egress Untagged
bat0.4            4 PVID Egress Untagged
bat0.1            1 PVID Egress Untagged
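To compare the two states mechanically rather than by eye, something like the following could flatten the bridge vlan output into one sorted "port vid flags" line per membership, which is then easy to diff. It is a rough sketch: `port_vlans` is a made-up name, and it assumes the layout shown in this post, where extra VLANs of a port appear on indented continuation lines.

```shell
#!/bin/sh
# port_vlans (hypothetical helper): read `bridge vlan` output on stdin and
# print one sorted "port vid [flags]" line per VLAN membership.
port_vlans() {
    awk '
        # skip blank lines and the "port vlan-id" header
        NF == 0 || ($1 == "port" && $2 ~ /^vlan/) { next }
        # indented line: another VLAN of the previously seen port
        /^[ \t]/  { start = 1 }
        # non-indented line: first field is the port name
        !/^[ \t]/ { port = $1; start = 2 }
        {
            if (start > NF) next          # port listed without any VLAN
            line = port
            for (i = start; i <= NF; i++) line = line " " $i
            print line
        }
    ' | LC_ALL=C sort
}
```

For example, `bridge vlan | port_vlans > working.txt` in one state and `bridge vlan | port_vlans > broken.txt` in the other, followed by `diff working.txt broken.txt`, shows only the memberships that differ.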
Is this a case of outdated upstream batman-adv documentation, or a bug in OpenWrt?