I assume you mean "Secure Mesh", rather than "Sure Mesh".
You do not give any details about your mesh, e.g. the number of nodes, geographic spread, hardware in use, backhaul channel, etc.
A large community mesh with segments of cabled backhaul benefits from Batman and its layer 3 routing functionality, but for an "all radio" mesh, even at a large outdoor venue, an 802.11s mesh is all that is needed. As you say you want a "simple mesh link", I wonder if you need Batman at all.
The errors you are getting are typical of inadequate signal strength and possibly high noise. How far apart are the nodes? Are you using 5GHz?
Try moving the nodes closer together.
A mesh network requires a fixed channel number. You have not shown that part of the wireless config.
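A quick way to check whether a radio is pinned to a fixed channel is sketched below. This is only a sketch: the helper name is_fixed_channel is made up for illustration, and radio0 in the usage line is an assumed UCI section name that may not match your config.

```shell
#!/bin/sh
# Sketch: decide whether a channel value is a fixed number.
# is_fixed_channel is a hypothetical helper, not part of OpenWrt.

is_fixed_channel() {
    case "$1" in
        # Empty, "auto", or anything containing a non-digit is not fixed.
        ''|auto|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}
```

On a node this could be used as: is_fixed_channel "$(uci -q get wireless.radio0.channel)" || echo "channel is not fixed"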
You have not set the RSSI threshold (a value of zero means "try to connect regardless of signal strength"). Try setting it to -70. As far as I am aware, this does not get set from the config file; instead you have to set it with the iw utility once the mesh interface has come up. I'm not sure, but you can test it using iw.
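As a rough sketch of setting the threshold with iw once the mesh interface is up: the interface name mesh0 in the usage below is an assumption (check yours with "iw dev"), and set_mesh_rssi with its DRY_RUN switch is a made-up helper for illustration. The value is a runtime parameter, so it most likely has to be applied again after a reboot or wifi restart.

```shell
#!/bin/sh
# Sketch: apply a mesh RSSI threshold via iw after the mesh interface is up.
# set_mesh_rssi and DRY_RUN are hypothetical names used for illustration.

set_mesh_rssi() {
    iface="$1"        # mesh interface name, e.g. mesh0 (assumed)
    threshold="$2"    # threshold in dBm, e.g. -70
    cmd="iw dev $iface set mesh_param mesh_rssi_threshold=$threshold"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only print the command so it can be checked before running it.
        echo "$cmd"
    else
        $cmd
    fi
}
```

Usage on a node: set_mesh_rssi mesh0 -70, then "iw dev mesh0 get mesh_param mesh_rssi_threshold" to read the value back.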
I am using 4 nodes in total; the hardware is TP-Link Archer C20 v4 and my backhaul channel is 36.
I was using a simple 802.11s mesh, but since I wanted to try BATMAN out I started using BATMAN.
Signal strength is not the issue. I get a good -56 dBm signal between the nodes, and yes, I am using 5 GHz.
It could be noise, but I am not so sure. My neighbor uses the same channel, but changing the channel didn't make much of a difference, so maybe not.
I placed the nodes side by side and I am still having the same issue.
I'm currently experiencing exactly the same MESH_AUTH_FAILURE followed by MESH_AUTH_BLOCKED using wpad-mesh-wolfssl on OpenWrt 21.02.0-rc.1. But it only seems to be an issue when one mesh partner reboots (we have a maintenance reboot during the night once a week). The MESH_AUTH_FAILURE messages then appear, although the mesh worked perfectly before that reboot.
Something is very strange. For two weeks in a row the problem appeared during the nightly reboot. If I now issue the reboot command on one of the nodes, or reboot via the web UI, I cannot reproduce the problem.
My devices: TP-Link Archer C7v2 (mesh AP 1), TP-Link Archer C7v5 (mesh AP 2) using ath10k-non-ct drivers for 5 GHz mesh.
Same problem here:
Aug 24 14:02:35 WifiAP-01 wpa_supplicant[14237]: wlan0: MESH-SAE-AUTH-BLOCKED addr=68:ff:7b:0e:32:98 duration=300
When this happens, I'll have to wait 5 minutes "for nothing" and then the mesh starts to work again. I only have two AP devices doing mesh with each other.
I have a regular mesh using C7 v2's running on 2.4 GHz and I occasionally get that message:
I suspect it's due to interference in my case (busy neighborhood). The nodes it happened to are only 20 ft apart with a -30 dBm signal.
I also observed it when I tried to connect an AR750S to the C7 mesh on 2.4 GHz; it then became unresponsive for a period of time (i.e. 5 minutes, same log message).
The C7 (and the AR750S) are running 19.07.7 with ath10k non-CT drivers (for 5 GHz, I know) and wpad-mesh-openssl.
After a planned reboot of one AP at 04:00, my mesh consisting of two APs failed again.
Log:
Aug 31 04:01:35 WifiAP-01 wpa_supplicant[1424]: wlan0: MESH-SAE-AUTH-FAILURE addr=68:ff:7b:0e:xx:xx
Aug 31 04:01:51 WifiAP-01 wpa_supplicant[1424]: wlan0: MESH-SAE-AUTH-FAILURE addr=68:ff:7b:0e:xx:xx
Aug 31 04:02:07 WifiAP-01 wpa_supplicant[1424]: wlan0: MESH-SAE-AUTH-FAILURE addr=68:ff:7b:0e:xx:xx
Aug 31 04:02:22 WifiAP-01 wpa_supplicant[1424]: wlan0: MESH-SAE-AUTH-FAILURE addr=68:ff:7b:0e:xx:xx
Aug 31 04:02:22 WifiAP-01 wpa_supplicant[1424]: wlan0: MESH-SAE-AUTH-BLOCKED addr=68:ff:7b:0e:xx:xx duration=300
Aug 31 04:06:43 WifiAP-01 wpa_supplicant[1424]: wlan0: MESH-SAE-AUTH-FAILURE addr=68:ff:7b:0e:xx:xx
Aug 31 04:06:53 WifiAP-01 wpa_supplicant[1424]: wlan0: MESH-SAE-AUTH-FAILURE addr=68:ff:7b:0e:xx:xx
Aug 31 04:07:07 WifiAP-01 wpa_supplicant[1424]: wlan0: MESH-SAE-AUTH-FAILURE addr=68:ff:7b:0e:xx:xx
Aug 31 04:07:17 WifiAP-01 wpa_supplicant[1424]: wlan0: MESH-SAE-AUTH-FAILURE addr=68:ff:7b:0e:xx:xx
Aug 31 04:07:17 WifiAP-01 wpa_supplicant[1424]: wlan0: MESH-SAE-AUTH-BLOCKED addr=68:ff:7b:0e:xx:xx duration=300
Aug 31 04:11:43 WifiAP-01 wpa_supplicant[1424]: wlan0: MESH-SAE-AUTH-FAILURE addr=68:ff:7b:0e:xx:xx
Aug 31 04:11:55 WifiAP-01 wpa_supplicant[1424]: wlan0: MESH-SAE-AUTH-FAILURE addr=68:ff:7b:0e:xx:xx
Aug 31 04:12:14 WifiAP-01 wpa_supplicant[1424]: wlan0: MESH-SAE-AUTH-FAILURE addr=68:ff:7b:0e:xx:xx
Aug 31 04:12:33 WifiAP-01 wpa_supplicant[1424]: wlan0: MESH-SAE-AUTH-FAILURE addr=68:ff:7b:0e:xx:xx
Aug 31 04:12:33 WifiAP-01 wpa_supplicant[1424]: wlan0: MESH-SAE-AUTH-BLOCKED addr=68:ff:7b:0e:xx:xx duration=300
Aug 31 04:16:44 WifiAP-01 wpa_supplicant[1424]: wlan0: MESH-SAE-AUTH-FAILURE addr=68:ff:7b:0e:xx:xx
Aug 31 04:16:55 WifiAP-01 wpa_supplicant[1424]: wlan0: MESH-SAE-AUTH-FAILURE addr=68:ff:7b:0e:xx:xx
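The pattern in the log above (four failures, then a 300 second block, repeated) can be pulled out of a longer log with a small awk filter. This is only a sketch: summarize_sae is a made-up helper name, and it assumes the log lines look like the ones quoted here.

```shell
#!/bin/sh
# Sketch: count MESH-SAE-AUTH-FAILURE lines leading up to each
# MESH-SAE-AUTH-BLOCKED line. Reads syslog text on stdin.
# summarize_sae is a hypothetical helper name.

summarize_sae() {
    awk '
        /MESH-SAE-AUTH-FAILURE/ { failures++ }
        /MESH-SAE-AUTH-BLOCKED/ {
            dur = "?"
            # Pick the value out of the duration=NNN field, if present.
            for (i = 1; i <= NF; i++)
                if ($i ~ /^duration=/) { dur = $i; sub(/^duration=/, "", dur) }
            printf "blocked after %d failures (duration=%ss)\n", failures, dur
            failures = 0
        }
        END {
            if (failures > 0)
                printf "failures since last block: %d\n", failures
        }
    '
}
```

On a node it could be fed from the system log: logread | summarize_sae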
After all the troubleshooting, and with no other concurrent issues left, I'm sure this is a BUG in OpenWrt 21.02.0-rcX or its packages.
Also tried "/etc/init.d/wpad restart" and the same error appears.
Also tried "/etc/init.d/network restart" and the same error appears.
Aug 31 09:11:29 HST-WifiAP-01 netifd: Interface 'nwi_mesh0' is now up
Aug 31 09:11:30 WifiAP-01 wpa_supplicant[26069]: wlan0: MESH-SAE-AUTH-FAILURE addr=68:ff:7b:0e:xx:xx
Aug 31 09:11:41 WifiAP-01 wpa_supplicant[26069]: wlan0: MESH-SAE-AUTH-FAILURE addr=68:ff:7b:0e:xx:xx
MANUAL WORKAROUND FOUND: "DISABLE, WAIT FOR 5+ MINUTES, RE-ENABLE MESH"
Aug 31 09:26:03 Disabled the mesh SSID interface manually via LuCI.
(Waited 6 minutes, i.e. more than 5 minutes, then re-enabled the mesh SSID interface via LuCI)
Aug 31 09:34:11 HST-WifiAP-01 hostapd: wlan0-1: AP-ENABLED
Aug 31 09:34:18 WifiAP-01 netifd: Interface 'nwi_mesh0' is now up
Aug 31 09:34:19 WifiAP-01 wpa_supplicant[7331]: wlan0: new peer notification for 68:ff:7b:0e:xx:xx
Aug 31 09:34:19 WifiAP-01 wpa_supplicant[7331]: wlan0: new peer notification for 68:ff:7b:0e:xx:xx
Aug 31 09:34:19 WifiAP-01 wpa_supplicant[7331]: wlan0: new peer notification for 68:ff:7b:0e:xx:xx
Aug 31 09:34:19 WifiAP-01 wpa_supplicant[7331]: wlan0: new peer notification for 68:ff:7b:0e:xx:xx
Aug 31 09:34:19 WifiAP-01 wpa_supplicant[7331]: wlan0: new peer notification for 68:ff:7b:0e:xx:xx
Aug 31 09:34:19 WifiAP-01 wpa_supplicant[7331]: wlan0: mesh plink with 68:ff:7b:0e:xx:xx established
Aug 31 09:34:19 WifiAP-01 wpa_supplicant[7331]: wlan0: MESH-PEER-CONNECTED 68:ff:7b:0e:xx:xx
The mesh link came up immediately and was usable.
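For anyone who wants to script the same disable, wait, re-enable cycle instead of clicking through LuCI, a rough sketch follows. It is untested against this bug and rests on assumptions: the wifi-iface section name passed in (mesh_sta in the usage line) is hypothetical, and mesh_cycle, run and DRY_RUN are made-up names. The 360 second default mirrors the 6 minute wait above, which is longer than the duration=300 shown in the log.

```shell
#!/bin/sh
# Sketch: disable the mesh wifi-iface, wait longer than the 300 s block,
# then re-enable it. Section name and helper names are hypothetical.

run() {
    # With DRY_RUN=1 only print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

mesh_cycle() {
    section="$1"         # UCI wifi-iface section of the mesh (assumed name)
    wait_s="${2:-360}"   # seconds to keep the mesh interface disabled

    run uci set "wireless.$section.disabled=1"
    run uci commit wireless
    run wifi reload

    run sleep "$wait_s"

    run uci set "wireless.$section.disabled=0"
    run uci commit wireless
    run wifi reload
}
```

Usage: DRY_RUN=1 mesh_cycle mesh_sta to preview the commands, then mesh_cycle mesh_sta to run them.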
@devs @hnyman: does this help you identify the issue?
The issue occurred again after the weekly reboot of the "backhaul node A".
They are connected like this: LAN ==> node A ==> MESH ==> node B.
Node B had the BLOCKED SAE failure and wasn't able to re-establish the mesh with node A after node A did its planned reboot.
I left the situation as it was and didn't interfere with any node. After 30 hours, our monitoring system reported "node B reachable again", so it managed to re-establish the mesh by itself. It SHOULD have done this much more quickly.
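To spot this state on the node itself rather than waiting for external monitoring, one option is to count established mesh peer links. A minimal sketch, assuming that "iw dev wlan0 station dump" prints a "mesh plink:" line with the value ESTAB for each established peer (as far as I know it does); count_estab_peers is a made-up helper name.

```shell
#!/bin/sh
# Sketch: count mesh peers whose peer link is established.
# Reads the output of `iw dev <iface> station dump` on stdin.
# count_estab_peers is a hypothetical helper name.

count_estab_peers() {
    # grep -c prints the number of matching lines (0 if none match).
    grep -c 'mesh plink:[[:space:]]*ESTAB'
}
```

Usage on a node: iw dev wlan0 station dump | count_estab_peers. A result of 0 on a node that should have a peer would indicate the mesh link is down.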
node A uptime:
10:46:37 up 1 day, 6:46, load average: 0.42, 0.34, 0.24
node B uptime:
10:46:56 up 1 day, 21:50, load average: 0.15, 0.10, 0.09
Note: the planned reboot was 1 day 6:46 hours ago.