Using batman-adv on devices without DSA implemented

Hi again,

I'm trying to set up batman-adv on my network, as in Marc's video on the topic. ( https://www.youtube.com/watch?v=t4A0kfg2olo&t)

On my network, however, there are 3 devices that do not support the DSA changes introduced in OpenWrt 21. They still have the old Switch menu and require you to configure VLANs there.

I'm a bit uncertain on how to proceed. In Marc's tutorial, I would go forward with creating Bridge Devices, using the VLAN tags inside the Batman Device - (In Marc's case, this would be specifying the Bridge port as bat0.x)

But without DSA changes, I'm concerned that this wouldn't work. When a packet egresses from the port, I'm assuming the VLAN specified in the Switch menu would overwrite the packet tags from the Batman system, causing obvious problems.

If anyone can suggest a solution, I would be very grateful. Should I link the Batman Device somehow to a Switch VLAN? Should this VLAN specify the ports I want to use as tagged or untagged?

On a swconfig device, create a separate bridge for each VLAN and attach the Ethernet side using the syntax eth0.N, or attach a tagged BATMAN link using bat0.N. To avoid confusion, the VLAN number should be the same for everything in a bridge (it is also good to name the bridge itself after the VLAN number), but it doesn't have to be. Note that devices of type bridge are purely software in a swconfig-based build. They act like VLANs insofar as each bridge is a separate virtual network path, but they don't contain VLAN numbers.

So for Ethernet hardware switching to occur, you also have to have defined in the swconfig page (Network-Switch or switch-vlans) a hardware VLAN that switches eth0.N to one or more physical ports and defines whether it will be tagged or untagged on the physical port. The switch cannot rewrite VLAN numbers, so N is the same on the CPU port and any tagged physical ports.

Some hardware has two CPU ports; in that case eth1.N may be the link to the switch. All switch-vlans must be tagged on the switch's CPU port.

(In DSA, bridges are usually set up to contain VLAN numbers as that is how VLAN numbers are conveyed to the physical hardware. A bridge in a DSA kernel may also control hardware.)
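
For comparison, here is a rough sketch of how a VLAN might be declared on a DSA build, where the VLAN ID lives in the bridge itself. The port names lan1 and lan2, the VLAN number 16 and the tagged/untagged choices are assumptions for illustration only; the real port names depend on the board.

```
config device
	option name 'br-lan'
	option type 'bridge'
	list ports 'lan1'
	list ports 'lan2'

# VLAN 16 on the bridge: lan1 carries it tagged,
# lan2 is an untagged access port with 16 as its PVID
config bridge-vlan
	option device 'br-lan'
	option vlan '16'
	list ports 'lan1:t'
	list ports 'lan2:u*'

# The logical interface attaches to the VLAN on the bridge
config interface 'vlan16'
	option device 'br-lan.16'
	option proto 'none'
```

On a swconfig build the same intent is split in two: a switch_vlan section for the hardware and a plain software bridge containing eth0.16.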


Here is a minimal config (the relevant parts) running on my TP-Link Archer C7v5 (non-DSA) with 21.02

batman-adv device and interface section

config device
    option  name            'bat0'
    option  macaddr         '02:00:01:00:00:01'
    # Set if you need / like to have a "static" address

config interface            'bat0'
    option  proto           'batadv'
    option  routing_algo    'BATMAN_IV'

config interface            'bat0_hardif_mesh0'
    option  proto           'batadv_hardif'
    option  master          'bat0'
    option  mtu             '2304'

config interface            'bat0_hardif_mesh1'
    option  proto           'batadv_hardif'
    option  master          'bat0'
    option  mtu             '2304'

Example of a vlan device and interface

config switch_vlan
    option  device          'switch0'
    option  ports           '2t 0t'
    option  vlan            '16'

config device
    option  name            'br-vlan16'
    option  type            'bridge'
    list    ports           'eth0.16'
    list    ports           'bat0.16'

config interface            'vlan16'
    option  device          'br-vlan16'
    ...

wireless

config wifi-iface 'mesh0'
    option  disabled    '0'
    option  device      'radio0'
    option  ifname      'mesh0'
    option  macaddr     '02:00:01:02:00:01'
    option  network     'bat0_hardif_mesh0'
    option  mode        'mesh'
    option  mesh_fwding '0'
    option  mesh_id     '<your mesh ID>'
    option  encryption  'psk2+ccmp'
    option  key         '...'

config wifi-iface 'mesh1'
    option  disabled    '0'
    option  device      'radio1'
    option  ifname      'mesh1'
    option  macaddr     '02:00:01:03:00:01'
    option  network     'bat0_hardif_mesh1'
    option  mode        'mesh'
    option  mesh_fwding '0'
    option  mesh_id     '<your mesh ID>'
    option  encryption  'psk2+ccmp'
    option  key         '...'

Thanks for both replies. Good to have some explanation on how exactly bridge devices work, too.

Just to confirm I am understanding, in simple terms-
In my case, the best practice would be to create a Switch VLAN- with the same number tag as the bat0.N tag I am using?

For example - When I create my br-lan device, I should add in the Bridge Ports dropdown menu both bat0.3 and Switch Vlan eth0.3 and/or eth1.3.

Then this would allow me to continue with the regular steps in setting up batman-adv, as it would be in any device with DSA implemented?

I assume... The bat0 device and its interface, and the hardif interfaces don't get the VLAN assigned directly.

Your switch_vlan configures only that the hardware switch chip knows about the VLAN ID.
Then your bridge device (it could be br-lan and/or br-guest, or whatever floats your boat) defines the bridge and includes/enslaves the VLAN sub-interfaces, like eth0.${VID} and bat0.${VID}. If you have a router with more than one "proper" network port (NIC), you may have both eth0 and eth1 available. It depends on your hardware. Most devices, however, only have a single switch with something like 5 physical ports that are separated only by VLAN ID. That's why you often see eth0.1 for br-lan and eth0.2 for wan.

I'm sorry, but I cannot offer any help on what to click or set in LuCI. I have never used it to configure my devices. I also can't guarantee what happens when you set something directly in the config files, or how that is reflected in LuCI. Maybe @mk24 could jump in?

So I've only succeeded in a lot of frustration in the meantime. I attempted to test this out on my main router and my main dumb AP (The dumb AP uses the old switch menu, my main router does not)

Now I've somehow gotten myself into a situation that feels utterly bizarre.
My PC that I am configuring on is connected to a port specified in br-lan, and so is the port leading to my unmanaged switch, and the dumb AP beyond. Also specified on the bridge is BATDEVICE.3 (Yes, a silly name to have to type, but I didn't realise I would be typing it, and decided to roll with it)

This is connected to my LAN interface.

On the dumb AP, my config is pretty much parallel, but instead of the ethernet devices being specified, it is the Switch VLANS and BATDEVICE.3 .

I have lost access to LuCI on the AP with this config. If I enable bridge VLAN filtering on the br-lan device of my main router and set the port leading to the AP as tagged on VLAN 3 (my old, and new, LAN VLAN) and the port leading to my PC as untagged, I can access LuCI on the dumb AP, but only for the 90 seconds of rollback on my main router. I cannot connect to LuCI on the main router at all with bridge VLAN filtering enabled.

In any case, there is no internet connection over WiFi on the dumb AP with either setting. It may be because I'm tired and making foolish mistakes, but I'm at a loss of how to go forward with configuring this tonight.

Well- I have access to the AP by changing my lan interface back to br-lan.3, but batman seems to definitely not be working.

I'm not sure what you're trying to set up. Please post both /etc/config/network files. Are both of these routers non-DSA or only one of them?

Do not try to set up bridged parallel paths by Ethernet and by mesh. That is a loop, and the network will crash as it DoS's itself.

Configuring admin interfaces on both routers so you can log in directly by wifi will make configuring Ethernet easier since you won't lose access if it is wrong.
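
A minimal sketch of such a wifi admin interface, assuming a free radio0 and an unused subnet. The interface name admin, the SSID, and the 192.168.77.0/24 address are made up for illustration and are not taken from either router's config.

```
# /etc/config/network
config interface 'admin'
	option proto 'static'
	option ipaddr '192.168.77.1'
	option netmask '255.255.255.0'

# /etc/config/wireless
config wifi-iface 'admin_ap'
	option device 'radio0'
	option mode 'ap'
	option ssid 'router-admin'
	option network 'admin'
	option encryption 'psk2'
	option key '...'
```

Unless a DHCP pool is also configured for this network, the client needs a static address in the same subnet. If the firewall is running on the device, the admin network also has to be in a zone that accepts input.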


Also, if you want to transport tagged frames via an unmanaged switch, I assume this will not work properly.
If you can run a cable directly, use that and transport all VLANs as tagged. And normally people are advised to stay away from VLAN 1 for reasons...

If you lose access, you can most of the time still reach a device on its IPv6 link-local address. Plug your (hopefully) Linux computer into a switch port and ping ff02::1%<the name of your ethernet device>

You get multiple answers and one of them you can use to ssh to the router or access point...

Don't give up. First time I tried to get my setup running it was a journey of like 3 weeks or something. In the end you will have learned something. Promised. :smile:


Don't worry, not giving up yet! New day begins.

My main router is an x86 device - Running DSA.
The dumb AP is a Netgear router which has the old Switch menu.

Here are the two network files you requested:

Main router/Internet Gateway

cat /etc/config/network

config interface 'loopback'
	option device 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fd90:6a68:a309::/48'

config interface 'lan'
	option proto 'static'
	option netmask '255.255.255.0'
	option ip6assign '60'
	option ipaddr '192.168.2.1'
	list dns '1.1.1.1'
	list dns '1.0.0.1'
	option defaultroute '0'
	option device 'br-lan.3'

config interface 'UntrustedLAN'
	option proto 'static'
	option ipaddr '10.1.1.1'
	option netmask '255.255.255.0'
	list dns '1.1.1.1'
	list dns '1.0.0.1'
	option defaultroute '0'
	option device 'br-untrustedlan.9'

config interface 'SecureIOT'
	option proto 'static'
	option ipaddr '192.168.99.1'
	option netmask '255.255.255.0'
	option defaultroute '0'

config interface 'PublicIOT'
	option proto 'static'
	option ipaddr '10.99.99.1'
	option netmask '255.255.255.0'
	option defaultroute '0'

config device
	option type 'bridge'
	option name 'br-lan'
	list ports 'BATDEVICE.3'
	list ports 'eth0'
	list ports 'eth1'

config interface 'WAN'
	option proto 'dhcp'
	option device 'br-wan.10'
	option peerdns '0'
	list dns '1.1.1.1'
	list dns '1.0.0.1'

config device
	option type 'bridge'
	option name 'br-wan'
	list ports 'eth2'
	option mtu '1500'

config bridge-vlan
	option device 'br-wan'
	option vlan '10'
	list ports 'eth2:t'

config interface 'BATDEVICE'
	option proto 'batadv'
	option routing_algo 'BATMAN_V'
	option bridge_loop_avoidance '1'
	option gw_mode 'server'
	option hop_penalty '30'
	option defaultroute '0'

config interface 'BATMESH'
	option proto 'batadv_hardif'
	option master 'BATDEVICE'
	option defaultroute '0'

config device
	option type 'bridge'
	option name 'br-mgmt'
	list ports 'eth1'

config bridge-vlan
	option device 'br-mgmt'
	option vlan '77'
	list ports 'eth1:t'

config interface 'BATWIRE'
	option proto 'batadv_hardif'
	option device 'br-mgmt.77'
	option master 'BATDEVICE'
	option defaultroute '0'

config device
	option type 'bridge'
	option name 'br-untrustedlan'
	list ports 'BATDEVICE.9'
	list ports 'eth1'

config bridge-vlan
	option device 'br-lan'
	list ports 'BATDEVICE.3:t'
	list ports 'eth0:u*'
	list ports 'eth1:t'
	option vlan '3'

config bridge-vlan
	option device 'br-untrustedlan'
	option vlan '9'
	list ports 'BATDEVICE.9:t'
	list ports 'eth1:t'

The Dumb AP

cat /etc/config/network

config interface 'loopback'
	option device 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fd26:df77:5d6e::/48'
	option packet_steering '1'

config device
	option name 'br-lan'
	option type 'bridge'
	option stp '1'
	list ports 'eth0.3'
	list ports 'eth1.3'
	list ports 'BATDEVICE.3'

config interface 'lan'
	option proto 'dhcp'
	option device 'br-lan'

config switch
	option name 'switch0'
	option reset '1'
	option enable_vlan '1'

config switch_vlan
	option device 'switch0'
	option vlan '4'
	option vid '3'
	option ports '0t 6t 4t 5t'

config switch_vlan
	option device 'switch0'
	option vlan '5'
	option vid '33'
	option ports '5t'

config switch_vlan
	option device 'switch0'
	option vlan '6'
	option vid '9'
	option ports '0t 6t 3 2 1 5t'

config switch_vlan
	option device 'switch0'
	option vlan '7'
	option vid '99'
	option ports '5t'

config device
	option type 'bridge'
	option name 'br-untrustedlan'
	option stp '1'
	list ports 'BATDEVICE.9'
	list ports 'eth0.9'
	list ports 'eth1.9'

config interface 'UNTRUSTEDLAN'
	option proto 'none'
	option device 'br-untrustedlan'

config interface 'BATDEVICE'
	option proto 'batadv'
	option routing_algo 'BATMAN_V'
	option bridge_loop_avoidance '1'
	option gw_mode 'off'
	option hop_penalty '30'

config interface 'BATMESH'
	option proto 'batadv_hardif'
	option master 'BATDEVICE'

config switch_vlan
	option device 'switch0'
	option vlan '8'
	option vid '77'
	option description 'BATWIRE'
	option ports '0t 6t 5t'

config device
	option type 'bridge'
	option name 'br-mgmt'
	list ports 'eth0.77'

config interface 'BATWIRE'
	option proto 'batadv_hardif'
	option device 'br-mgmt'
	option master 'BATDEVICE'

config bridge-vlan
	option device 'br-untrustedlan'
	option vlan '9'
	list ports 'BATDEVICE.9:t'
	list ports 'eth0.9:t'
	list ports 'eth1.9:t'

I may have forgotten to make it clear. The dumb AP and the main router in question are parts of the network connected by ETHERNET, not Wifi Mesh.

I think not explaining this part obviously might have caused confusion as to what I am trying to do. I am trying to link the two through the BATWIRE batman interfaces, and the br-mgmt VLAN 77 tags.

I'm not following everything you're trying to do there; in particular, I don't have experience with BATMAN over wire. I thought that it was for managing redundant connections that may exist by two wired paths, or wired and wireless, etc. If you just need to transport a VLAN over a single wire that is always connected, use an ordinary VLAN; there is no need to involve BATMAN.

On x86, there is no switch, so no DSA. Since the physical ports are only connected to the CPU and have no hardware shortcut between each other, the only way to link traffic between ports is via software bridges.

Create a separate bridge for each network then attach the x86 ports directly using the notation eth0.3 eth1.3 etc. Don't have bridge-vlans. Each tag number must be in only one network. If you want an untagged ("access") port, use eth0 with no VLAN number and have it only in one network. Do not try to mix tagged and untagged on the same port. Under this scheme it is possible to have the CPU rewrite VLAN numbers as packets move lan-lan through a software bridge, but that is not a good practice.
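
As a rough sketch of that scheme on the x86 router, assuming eth0 is the untagged access port for the PC and eth1 is the tagged trunk toward the AP. The port roles are guesses based on the posted config; the addresses are copied from it.

```
# LAN: untagged on eth0, tagged as VLAN 3 on eth1
config device
	option name 'br-lan'
	option type 'bridge'
	list ports 'eth0'
	list ports 'eth1.3'

config interface 'lan'
	option device 'br-lan'
	option proto 'static'
	option ipaddr '192.168.2.1'
	option netmask '255.255.255.0'

# Untrusted LAN: tagged as VLAN 9 on eth1 only
config device
	option name 'br-untrustedlan'
	option type 'bridge'
	list ports 'eth1.9'

config interface 'UntrustedLAN'
	option device 'br-untrustedlan'
	option proto 'static'
	option ipaddr '10.1.1.1'
	option netmask '255.255.255.0'
```

There are no bridge-vlan sections here, and each interface attaches to the bridge itself rather than to br-lan.3 or br-untrustedlan.9.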

Now on to the swconfig and a major mistake I see there.
On many ath79 devices, there are two CPU ports which are usually ports 0 and 6 on the switch. This is so that a lan->wan router can be built without bottlenecking the incoming and outgoing packets through a single CPU port. In an AP it would be better to use only one CPU port for everything and detach the other one.

Some of your configuration places switch ports 0 and 6 in the same hardware VLAN as well as having the corresponding tagged eth0 and eth1 in the same software bridge. This creates a network loop which is very bad.
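
One way to break that loop might look like the sketch below, which keeps only one CPU port in the hardware VLAN and only the matching eth0.3 in the software bridge. It assumes switch port 0 is the CPU port wired to eth0 on this particular Netgear, which should be checked against the device before use.

```
config switch_vlan
	option device 'switch0'
	option vlan '4'
	option vid '3'
	# CPU port 6 removed; only CPU port 0 stays a member
	option ports '0t 4t 5t'

config device
	option name 'br-lan'
	option type 'bridge'
	# eth1.3 removed to match the switch_vlan above
	list ports 'eth0.3'
	list ports 'BATDEVICE.3'
```

The same change would apply to every other switch_vlan that lists both 0t and 6t.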


That's not quite correct.
Jow pointed out just recently that the (UCI) config is the same for x86 with multiple NICs, DSA or not. I have just set up my x86 box with 4 Intel NICs that way.
In addition, VLAN-aware bridges have been a thing since 2015, or even earlier. Again, on my new x86 box I've set it up this way: a single bridge, multiple physical interfaces, and a bunch of VLANs.

You can, however, just configure multiple bridges, one for each VLAN sub-interface, but there is no real gain from it. A single VLAN-aware bridge removes a lot of complexity, IMHO.

The config is not finished because I got distracted, but to illustrate my point:

root@OpenWrt:~# ip -br l
lo               UNKNOWN        00:00:00:00:00:00 <LOOPBACK,UP,LOWER_UP>
eth0             DOWN           20:7c:14:a2:b0:84 <NO-CARRIER,BROADCAST,MULTICAST,UP>
eth1             DOWN           20:7c:14:a2:b0:85 <BROADCAST,MULTICAST>
eth2             DOWN           20:7c:14:a2:b0:86 <NO-CARRIER,BROADCAST,MULTICAST,UP>
eth3             DOWN           20:7c:14:a2:b0:87 <NO-CARRIER,BROADCAST,MULTICAST,UP>
br-vlan          UP             20:7c:14:a2:b0:86 <BROADCAST,MULTICAST,UP,LOWER_UP>
br-vlan.16@br-vlan UP             02:00:01:01:00:10 <BROADCAST,MULTICAST,UP,LOWER_UP>
br-vlan.17@br-vlan UP             02:00:01:01:00:11 <BROADCAST,MULTICAST,UP,LOWER_UP>
br-vlan.4094@br-vlan UP             02:00:01:01:0f:fe <BROADCAST,MULTICAST,UP,LOWER_UP>
bat0             UNKNOWN        02:00:01:00:00:01 <BROADCAST,MULTICAST,UP,LOWER_UP>
eth0.7@eth0      LOWERLAYERDOWN 90:9a:4a:1a:f6:01 <NO-CARRIER,BROADCAST,MULTICAST,UP>
root@OpenWrt:~#
root@OpenWrt:~# bridge vlan
port              vlan-id
eth2              16
                  17
eth3              16
                  17
br-vlan           16
                  17
                  4094
bat0              16
                  17
root@OpenWrt:~# bridge link
4: eth2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 master br-vlan state disabled priority 32 cost 100
5: eth3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 master br-vlan state disabled priority 32 cost 100
10: bat0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br-vlan state forwarding priority 32 cost 100

root@OpenWrt:~# cat /etc/board.json
{
        "model": {
                "id": "intel-corporation-qhsw02",
                "name": "INTEL Corporation QHSW02"
        },
...
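
A network config that would produce roughly the `bridge vlan` output above might look like the following. This is a reconstruction, not the poster's actual file: the ports are assumed to be tagged because the output shows no PVID or Egress Untagged flags, and VLAN 4094 (present only on br-vlan itself) is left out.

```
config device
	option name 'br-vlan'
	option type 'bridge'
	list ports 'eth2'
	list ports 'eth3'
	list ports 'bat0'

config bridge-vlan
	option device 'br-vlan'
	option vlan '16'
	list ports 'eth2:t'
	list ports 'eth3:t'
	list ports 'bat0:t'

config bridge-vlan
	option device 'br-vlan'
	option vlan '17'
	list ports 'eth2:t'
	list ports 'eth3:t'
	list ports 'bat0:t'

# Logical interfaces attach to the per-VLAN devices
config interface 'vlan16'
	option device 'br-vlan.16'
	option proto 'none'

config interface 'vlan17'
	option device 'br-vlan.17'
	option proto 'none'
```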

Now on to the swconfig and a major mistake I see there.
On many ath79 devices, there are two CPU ports which are usually ports 0 and 6 on the switch. This is so that a lan->wan router can be built without bottlenecking the incoming and outgoing packets through a single CPU port. In an AP it would be better to use only one CPU port for everything and detach the other one.

Some of your configuration places switch ports 0 and 6 in the same hardware VLAN as well as having the corresponding tagged eth0 and eth1 in the same software bridge. This creates a network loop which is very bad.

Would there be any corresponding reduction in performance to changing this?

A layer 2 network loop causes packets to go repeatedly in circles at maximum bandwidth, which will almost or completely block the network from working.

@mk24, could you explain what the CPU port has to do with it? Or could you give a short (theoretical) example of what you need to set up to get a loop "in software"? I'm not sure I got the point... (Did you describe the behavior of a device with at least 2 separate interfaces? Because even such a device could have just a single-core CPU...)

Regarding batman-adv: as far as I know, the loop avoidance is enabled by default.
And regarding Linux bridges: if used with batman-adv, STP (Spanning Tree Protocol) should/has to be disabled because the two do not work very well together. But you (@Eric12) could, if you haven't done so already, set up a simple network with one or more bridges and interfaces, yadda yadda, enable STP, and then use wires to create a loop. With STP one link will get blocked, and without STP your device will just get really hot because packets will circulate endlessly and the CPU will just produce warmth.
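
Applied to the dumb AP's br-lan, that advice might look like the sketch below. It only illustrates where the two options live; whether STP really must be off in this setup is the claim made above, not something verified here.

```
config device
	option name 'br-lan'
	option type 'bridge'
	# STP off on a bridge that contains a batman-adv VLAN port
	option stp '0'
	list ports 'eth0.3'
	list ports 'BATDEVICE.3'

config interface 'BATDEVICE'
	option proto 'batadv'
	option routing_algo 'BATMAN_V'
	# batman-adv's own loop avoidance, set explicitly
	option bridge_loop_avoidance '1'
```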

The OP has in swconfig a VLAN with vid 9 containing ports 0t and 6t. This means that eth0.9 will be hardware switched to eth1.9. There's no reason to do that in practice. It gets really bad because br-untrustedlan (with its bridge-vlan 9) also makes a software bridge between eth0.9 and eth1.9, so the same two ports are joined twice. I'm not sure if the overall STP on the bridge works here, but STP is not meant to save you from a fully improper configuration.


Thank you for assisting with this, it's much appreciated, and for finding those issues with my config.
I've decided to use a regular mesh setup with GRETAP over ipv4 instead as a simpler solution, but I will make another thread to discuss that.


Late to the party. This showed up when I was searching for something else... but with Batman you never have to touch any VLAN at all. Wired or wireless, it doesn't matter. Everything goes through the Batman device, which uses its own VLAN-esque tunneling that you are supposed to bridge to the various interfaces (lan, guest, admin, iot, etc.).

The only thing is that if you're on a non dsa switch, then you need to make a new untagged vlan just for the port you're using for wired Batman, and to attach that vlan to the Batman interface. On DSA you just remove the lan port from the usual bridge (typically br-lan) and attach it to the Batman interface.
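
On a swconfig device, that could be sketched as follows. The VLAN number 77, physical port 4, CPU port 0 on eth0, and the names bat0 and batwire are assumptions for illustration.

```
# Dedicated hardware VLAN: untagged on the wired mesh port,
# tagged toward the CPU
config switch_vlan
	option device 'switch0'
	option vlan '77'
	option ports '0t 4'

# Hand the resulting sub-interface to batman-adv
config interface 'batwire'
	option proto 'batadv_hardif'
	option master 'bat0'
	option device 'eth0.77'
```

On DSA the switch_vlan section would be dropped, and option device would name the port directly (for example lan4) once that port has been removed from br-lan.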

Of course you can do it the "normal" way with traditional VLANs too, but then the wired network won't be a part of the mesh. But to be perfectly honest, the end user experience is going to be more or less the same.

Would it make sense to run batman-adv over a wired tagged port?

Some background: I have a MoCA network, to which I want to connect the router, dumb APs, and batman-adv-unaware clients. I thought it makes sense to use untagged VLAN for the clients and a tagged VLAN for batman-adv (which is carrying the rest of the VLANs).