Batman (in production with post 19.07 snapshot) not working under 21.02

So. I removed wpad-basic-wolfssl and installed wpa-supplicant-mesh-openssl. It helped:

root@rpc149:~# logread | grep -i wpa
Fri Jun 25 10:12:59 2021 daemon.notice wpa_supplicant[1477]: Successfully initialized wpa_supplicant
Fri Jun 25 10:13:53 2021 daemon.notice wpa_supplicant[1477]: wlan0: leaving mesh
Fri Jun 25 10:13:54 2021 daemon.err wpa_supplicant[1477]: wlan0: mesh leave error=-134
Fri Jun 25 10:13:56 2021 daemon.notice wpa_supplicant[1477]: wlan0: interface state UNINITIALIZED->ENABLED
Fri Jun 25 10:13:56 2021 daemon.notice wpa_supplicant[1477]: wlan0: AP-ENABLED
Fri Jun 25 10:13:56 2021 daemon.notice wpa_supplicant[1477]: wlan0: joining mesh meshD
Fri Jun 25 10:13:56 2021 daemon.notice wpa_supplicant[1477]: wlan0: CTRL-EVENT-CONNECTED - Connection to 00:00:00:00:00:00 completed [id=0 id_str=]
Fri Jun 25 10:13:56 2021 daemon.notice wpa_supplicant[1477]: wlan0: MESH-GROUP-STARTED ssid="meshD" id=0

...but Batman still wasn't working. "batctl o" still produced no output at all. And what's this about "wlan0", an identifier which appears nowhere in my configuration??

Experimentally, I changed "config wifi-iface 'mesh0'" to "config wifi-iface 'wlan0'". No help, so I put it back the way it was. Then I changed "option network 'nwi_mesh0'" in that same stanza to "option network 'wlan0'". Accordingly, in /etc/config/network, I also changed "config interface 'nwi_mesh0'" to "config interface 'wlan0'". Now, at last:

root@rpc149:~# batctl o
[B.A.T.M.A.N. adv 2021.1-openwrt-2, MainIF/MAC: wlan0/36:9b:9b:5e:27:95 (bat0/8a:0e:dd:c4:2c:9f BATMAN_IV)]
   Originator        last-seen (#/255) Nexthop           [outgoingIF]

So batctl now thinks there's something there, but I'm still not joining the mesh. Why? I see nothing untoward in the log.

Following your advice, mk24, I replaced wpa-supplicant-mesh-openssl with wpad-mesh-wolfssl. No change; still not joining the mesh. But at least "batctl o" has some output.

Is it not weird that I have a wlan0 whether I want it or not?

There's no need to force names onto radio interfaces, just let it propagate the other way with option network.

The wpad series of packages include both hostapd and wpa-supplicant capability, so you don't need to install wpa-supplicant separately.

Run iwinfo wlan0 assoclist to see if there is a radio link to at least one other node. If there is, you can conclude that the radio and wpad are configured properly, and it should be sending packets to the BATMAN layer.
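A minimal sketch of that check as a script, assuming the mesh interface is named wlan0 as in the logs above. The helper name count_mesh_peers is made up for illustration. It counts the "Station" header lines that "iw dev wlan0 station dump" prints, one per peer.

```shell
#!/bin/sh
# count_mesh_peers (hypothetical helper): reads `iw dev <if> station dump`
# output on stdin and prints how many peers are listed.
# Each peer's entry begins with a line of the form "Station <mac> (on <if>)".
count_mesh_peers() {
    # grep -c prints 0 but exits non-zero when nothing matches; mask that
    # so the helper is safe to use under `set -e`.
    grep -c '^Station ' || true
}
```

On a node, "iw dev wlan0 station dump | count_mesh_peers" printing 0 would suggest the problem is at the radio/wpad level rather than in BATMAN. If it prints 1 or more, "batctl n" is the next thing to look at.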

But BATMAN used to be very strict about linking only with other nodes running the same version of BATMAN: any node running a different version was invisible to it. Since BATMAN is part of the kernel, getting all your nodes onto the same version means running similar kernels. I think there are some ways to override that now.
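To compare versions across nodes, one option is to pull the version string out of the header line that "batctl o" prints (the same header shown earlier in this thread). This is only a sketch: the helper name is invented, and it assumes the header format is the same on every node. It says nothing about which version differences actually prevent linking.

```shell
#!/bin/sh
# batadv_version_from_header (hypothetical helper): reads "batctl o" output
# on stdin and prints the batman-adv version found in its header line, e.g.
# "[B.A.T.M.A.N. adv 2021.1-openwrt-2, MainIF/MAC: ...]".
batadv_version_from_header() {
    # Capture everything between "adv " and the first comma.
    sed -n 's/^\[B\.A\.T\.M\.A\.N\. adv \([^,]*\),.*/\1/p' | head -n 1
}
```

Running "batctl o | batadv_version_from_header" on each node and comparing the results shows quickly whether the 19.07 and 21.02 nodes report different versions.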

Aha! I will take appropriate measures and see what happens then.

iwinfo wlan0 assoclist outputs nothing on the 21.02 node. However, radio0's LED is lit and blinks every now and then. On the working batman nodes, radio0's LED blinks pretty constantly and, provided at least one other node is online and a member of the same mesh, iwinfo wlan0 assoclist has output that looks reasonable.

Even if only to eliminate the possibility of malfunctioning hardware, I need to set up at least one more 21.02 node. You can expect a report!

In your files you're using channel 100, a DFS channel. Mesh operation on DFS is an uncertain thing, since the mesh expects to be on a defined channel and, if radar is detected, there's no defined way to move the whole mesh to another channel. So I'd try a non-DFS channel.

Good call, but yeah, I know about that. In fact I have written a cron job that detects when that has happened and reboots in order to revert to the correct channel. (I haven't successfully drunk the Freifunk kool-aid, but they offer a similar watchdog.) Radar interruptions happen to at least some nodes a couple of times per month, usually in the wee small hours. We're in a near-urban environment. Interestingly, local police radars interrupt channels 100 and 116, but not 132.
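For anyone wanting something similar, here is a rough sketch of what such a watchdog could look like. It is not the actual cron job mentioned above. The function names are invented, and it assumes the radio is radio0, the interface is wlan0, and the configured channel is a fixed number rather than 'auto'.

```shell
#!/bin/sh
# Hypothetical sketch of a channel watchdog; not the author's actual script.

# current_channel: reads `iw dev <if> info` output on stdin and prints the
# channel number from its "channel N (F MHz), ..." line.
current_channel() {
    awk '$1 == "channel" { print $2; exit }'
}

# channel_drifted WANT HAVE: succeeds (exit 0) only when the channel in use
# is known and differs from the configured one.
channel_drifted() {
    [ -n "$2" ] && [ "$1" != "$2" ]
}

# Intended use from cron on the node (left commented out so that sourcing
# this file has no side effects):
#   want=$(uci get wireless.radio0.channel)
#   have=$(iw dev wlan0 info | current_channel)
#   channel_drifted "$want" "$have" && reboot
```

Treating an empty channel as "no drift" is deliberate: if the interface is momentarily down, the sketch does nothing rather than reboot in a loop.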

But for these tests I'm using channel 149 -- it's not a DFS channel.

I have another A7 coming in four days. Evidently my test bed needs to have four nodes lest one of them is ill-behaved, as I currently suspect, and interferes with the others. With only 3 test nodes, I can't tell which one is the culprit.

Mystery solved. It works.

My error was that I failed to say
option country 'US'
in
config wifi-device 'radio0'

Whether channel 149 is usable depends on the regulatory domain (regdom), a fact that I knew once but had forgotten.
Many thanks for all your help, mk24!!!
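A small sketch for spotting the same mistake on other nodes: it lists the wifi-device sections that have no country option, working from "uci show wireless" output. The helper name is made up, and the fix commands in the comments assume the radio section is called radio0 and the country is US, as in this thread.

```shell
#!/bin/sh
# radios_without_country (hypothetical helper): reads `uci show wireless`
# output on stdin and prints each wifi-device section that has no
# "country" option set.
radios_without_country() {
    awk -F'[.=]' '
        NF == 3 && $3 == "wifi-device" { dev[$2] = 1 }   # wireless.radio0=wifi-device
        $3 == "country"                { has[$2] = 1 }   # wireless.radio0.country=...
        END { for (d in dev) if (!(d in has)) print d }
    ' | sort
}

# The fix described above, as commands to run on the node:
#   uci set wireless.radio0.country='US'
#   uci commit wireless
#   wifi reload
```

Usage on a node would be "uci show wireless | radios_without_country". Afterwards, "iw reg get" shows which regulatory domain is actually in effect.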

For the record, and so I can mark this topic "solved" (but see here for a remaining issue having to do with the specification of mac addresses within a mesh network), here are the /etc/config/network and /etc/config/wireless files that actually worked (modulo the mac address problem).

/etc/config/network
(This one is for the router that plays the "server" role in its mesh, which means that the mesh interface is in the LAN.)

config interface 'loopback'
	option device 'lo'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'
	option proto 'static'

config globals 'globals'
	option ula_prefix 'fdf9:f652:f605::/48'

config interface 'lan'
	option delegate '0'
	option device 'br-lan'
	option ipaddr '192.168.150.1'
	option mtu '1312'
	option netmask '255.255.255.0'
	option proto 'static'
	option stp '0'

config device
	option macaddr '26:9b:9b:5e:27:96'
	option name 'br-lan'
	list ports 'bat0.1'
	list ports 'eth0.1'
	option type 'bridge'

config device
	option macaddr '66:9b:9b:5e:27:96'
	option name 'br-wan'
	list ports 'eth0.2'
	option type 'bridge'

config interface 'wan'
	option delegate '0'
	option device 'br-wan'
	option dns '192.168.4.1'
	option gateway '192.168.4.1'
	option ipaddr '192.168.4.150'
	option mtu '1312'
	option netmask '255.255.255.0'
	option proto 'static'
	option stp '0'

config switch
	option enable_vlan '1'
	option name 'switch0'
	option reset '1'

config switch_vlan
	option device 'switch0'
	option ports '2 3 4 5 0t'
	option vlan '1'

config switch_vlan
	option device 'switch0'
	option ports '1 0t'
	option vlan '2'

config interface 'bat0'
	option aggregated_ogms '1'
	option ap_isolation '0'
	option bonding '0'
	option bridge_loop_avoidance '1'
	option distributed_arp_table '1'
	option fragmentation '1'
	option gw_mode 'server'
	option hop_penalty '30'
	option isolation_mark '0x00000000/0x00000000'
	option log_level '0'
	option multicast_fanout '16'
	option multicast_mode '1'
	option network_coding '0'
	option orig_interval '1000'
	option proto 'batadv'
	option routing_algo 'BATMAN_IV'

config interface 'wlan0'
	option master 'bat0'
	option mtu '1500'
	option proto 'batadv_hardif'

/etc/config/wireless

config wifi-device 'radio0'
	option channel '36'
	option country 'US'
	option disabled '0'
	option htmode 'VHT80'
	option hwmode '11a'
	option path 'pci0000:00/0000:00:00.0'
	#option txpower '23'
	option type 'mac80211'

config wifi-device 'radio1'
	option channel '11'
	option country 'US'
	option disabled '1'
	option htmode 'HT20'
	option hwmode '11g'
	option path 'platform/ahb/18100000.wmac'
	option txpower '24'
	option type 'mac80211'

config wifi-iface 'default_radio1'
	option device 'radio1'
	option encryption 'psk2'
	option key 'XXXX'
	option macaddr '56:9b:9b:5e:27:96'
	option mode 'ap'
	option network 'lan'
	option ssid 'rpc150.rosepark.us'

config wifi-iface 'mesh0'
	option device 'radio0'
	option encryption 'psk2+ccmp'
	option key 'XXXX'
	#option macaddr '36:9b:9b:5e:27:96'
	option mesh_fwding '0'
	option mesh_id 'meshD'
	option mode 'mesh'
	option network 'wlan0'

I can no longer edit the previous solution, alas, so I'll just mention here that removing the "option macaddr" from the "config interface wlan0" stanza made a world of difference in both reliability and performance. Moreover, it is no longer necessary to specify "option mtu '1312'" in the "config interface 'lan'" and "config interface 'wan'" stanzas, and "option mtu '1560'" can be specified in the "config interface 'wlan0'" stanza (as recommended in a log warning before I did that). None of those things used to work; now they do. I don't think I ever got reasonable performance from my mesh networks before, but they are performing well now.
So it is not a good idea to attempt to control the MAC address of a mesh interface. I didn't know that. I've updated that discussion here.
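For reference, the MTU part of that change could be expressed as uci commands roughly like this. It is a sketch only: the helper name is invented, the section names (lan, wan, wlan0) are the ones from the config posted above, and it prints the commands rather than running them so they can be reviewed first. It does not touch any macaddr option.

```shell
#!/bin/sh
# mesh_mtu_commands (hypothetical helper): prints the uci commands for the
# MTU change described above instead of executing them.
mesh_mtu_commands() {
    cat <<'EOF'
uci delete network.lan.mtu
uci delete network.wan.mtu
uci set network.wlan0.mtu='1560'
uci commit network
/etc/init.d/network restart
EOF
}
```

After checking the output of "mesh_mtu_commands", piping it to sh ("mesh_mtu_commands | sh") would apply it on the node.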
