No LAN communication with disabled wireless batman mesh

Hi all,

Background Story
I'd like to ask this question primarily for general understanding.
I have a functioning batman-adv mesh network which transports a LAN and a guest subnet to one of two batman gateways (internet upstream). One day (the wind direction must have changed or something like that) the wifi link used for the batman communication started constantly trying to reconnect, or something similar; I am not really sure. This caused high CPU usage and bad routing performance to the internet.

Actual Problem
Since I was not really using this gateway with the rest of the mesh at the moment, I thought I could simply disable the corresponding wireless interface in LuCI's Wireless tab. That worked as expected and everything was fine for the moment. Later I noticed that I could not ping any other devices in the LAN subnet apart from the actual gateway router.
But these still work fine:

  • internet access
  • VPN connection and pinging (not in the LAN bridge)
  • services on router
  • and in general everything apart from the mentioned issue

My Guess
Batman-adv is a layer 2 routing / switching technology and might get confused if the bridged mesh interface is not available?
Apart from all of that: I am aware that this is not the intended way of using OpenWrt and meshing. I am just really curious what causes this behavior.

config interface 'lan'
        option type 'bridge'
        option proto 'static'
        option stp '1'
        option netmask '255.255.255.0'
        option ifname 'bat0.1 eth0.1 if-lan'
        option delegate '0'

config interface 'guest'
        option type 'bridge'
        option ifname 'if-guest bat0.2'
        option proto 'static'
        option netmask '255.255.255.0'

config interface 'lan_mesh'
        option ifname 'if-lan-mesh'
        option master 'bat0'
        option mtu '2304'
        option proto 'batadv_hardif'
                               
config interface 'bat0'
        option proto 'batadv'
        option routing_algo 'BATMAN_V'
        option gw_mode 'server'
        option orig_interval '5000'

config wifi-iface 'lan'
        option device 'radio0'
        option mode 'ap'
        option encryption 'psk2+ccmp'
        option network 'lan'
        option ifname 'if-lan'
        option disabled '0'

config wifi-iface 'wifinet0'
        option ifname 'if-lan-mesh'
        option device 'radio0'
        option mode 'mesh'
        option mesh_id 'lan-bridge'
        option mesh_fwding '0'
        option encryption 'psk2+ccmp'
        option network 'lan-mesh'
        option hidden '1'
        option disabled '1'

Which OpenWrt version and which version of batman-adv?

opkg list-installed | fgrep bat

It looks like there's a lot of config missing for current batman-adv.

(batman-adv shouldn't get "confused" if it's not connected to a local bridge on one or more of the nodes)

Hi Jeff,

On my Fritzbox 3370 I am using the following software versions:

  • OpenWrt SNAPSHOT, r11452-6ffd8a8f92
  • batctl-default - 2019.4-0
  • kmod-batman-adv - 4.19.81+2019.4-0

Please correct me if I am wrong. I assume that every option which is not present in config interface 'bat0' will get a default value, like the ones that were set in the dedicated batman-adv file in /etc/config/ in older versions?
At least from that configuration point of view I have been happy with this setup so far.

I would assume that defaults are applied if not specified explicitly.

My notes show that I had problems with BATMAN_V and that I reverted to BATMAN_IV for my own setup. That may be worth a poke, though I doubt it is the issue.

Looking through things, it looks like the wireless may be "confused"

config wifi-iface 'wifinet0'
        option ifname 'if-lan-mesh'
        option device 'radio0'
        option mode 'mesh'
        option mesh_id 'lan-bridge'
        option mesh_fwding '0'
        option encryption 'psk2+ccmp'
        option network 'lan-mesh'

This, at least as I understand it, tells netifd to attach the wireless interface to the "lan-mesh" network when it comes up. However, your batadv_hardif section is named "lan_mesh" (underscore, not hyphen) and sets an ifname of "if-lan-mesh", which seems to conflict with the ifname of the wireless interface. I remember it taking a long time to follow through the old wiki page that used the same string for everything. Maybe I haven't followed it through properly?
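A rough way to spot this kind of naming mismatch is sketched below. It is only a sketch under assumptions: `check_wifi_networks` is a hypothetical helper, not something shipped with OpenWrt, and it assumes that each value of `option network` in the wireless config is meant to name a `config interface` section in the network config. It does a plain text comparison with awk and does not ask netifd or `uci` anything.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenWrt): print every `option network`
# value from a wireless config that has no `config interface` section of
# the same name in a network config.
# Usage: check_wifi_networks <network-config> <wireless-config>
check_wifi_networks() {
    awk -v q="'" '
        BEGIN { strip = "[" q "\"]" }   # matches single and double quotes

        # First file (network config): collect the interface section names.
        FNR == NR {
            if ($1 == "config" && $2 == "interface") {
                name = $3
                gsub(strip, "", name)
                known[name] = 1
            }
            next
        }

        # Second file (wireless config): check each referenced network name.
        $1 == "option" && $2 == "network" {
            for (i = 3; i <= NF; i++) {
                name = $i
                gsub(strip, "", name)
                if (!(name in known)) print "unmatched: " name
            }
        }
    ' "$1" "$2"
}
```

On a router this would be called as `check_wifi_networks /etc/config/network /etc/config/wireless`. Fed the configuration posted at the top of the thread, it should report `lan-mesh`, because the only similarly named section is `lan_mesh`. As far as I know, UCI section names may only contain letters, digits and underscores, so a hyphenated name could not match a section in any case.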

Here are the segments that are running for me:

config wifi-iface 'mesh0'
        option device 'radio5pci'
        option ifname 'mesh0'
        option network 'nwi_mesh0'
        option mode 'mesh'
        option mesh_id '<redacted>'
        option mesh_fwding '0'
        option encryption 'psk2+ccmp'
        option key '<redacted>'

config interface 'nwi_mesh0'
        option proto 'batadv_hardif'
        option master 'bat0'
        option mtu '1500'

For reference, not because it is necessarily "right", here's my current, running config for the batadv interface

config interface 'bat0'
        option proto 'batadv'
        option routing_algo 'BATMAN_IV'
        option aggregated_ogms 1
        option ap_isolation 0
        option bonding 0
        option fragmentation 1
        #option gw_bandwidth '10000/2000'
        option gw_mode 'off'
        #option gw_sel_class 20
        option log_level 0
        option orig_interval 1000
        option bridge_loop_avoidance 1
        option distributed_arp_table 1
        option multicast_mode 1
        option network_coding 0
        option hop_penalty 30
        option isolation_mark '0x00000000/0x00000000'

Okay.
From what you are saying, I will read up a bit and try out BATMAN_V vs. BATMAN_IV in batman-adv.

Regarding the naming of interfaces and networks, you are correct that I never completely understood how they are related. I was thinking that the "network" option of the wireless interface is related to the "ifname" of the virtual network interface. Therefore those two are the same, and I deliberately set everything else differently, hoping to get errors if it wasn't correct (for the purpose of learning and avoiding ambiguity). This never happened, so I assumed it to be alright. But trying will not hurt either.

Thanks already for your input.