Intermittent ping dropouts on BATMAN mesh?

I've very recently set up an OpenWrt BATMAN mesh using 6 x ASUS Lyra AC2200 nodes + 2 x Linksys EA8300, all running OpenWrt 24.10.1.

I've occasionally noticed when browsing the internet, that web pages stop loading for several seconds, before normal behaviour returns.

So, I set up 'Uptime Kuma' on a Raspberry Pi to ping various devices every 60secs. It seems that every 5-15 minutes the ping time on a number of devices, including some of the OpenWrt BATMAN mesh nodes, suddenly shoots up from 0-5ms to around 250ms (it always seems to peak at around 250ms) before dropping back to 0-5ms.

Would anyone have any idea why I'm getting this constant but intermittent drop-out on my BATMAN mesh?


In the System Log on at least one of my Mesh Access Points, I get a lot of disassociated/authenticated/associated messages every 4-6mins. Is that normal for a BATMAN mesh? Is this just the Access Point 'checking in' every so often?

There doesn't seem to be anything else significant in the System Log.

Thu Jun 26 04:26:30 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 IEEE 802.11: disassociated
Thu Jun 26 04:26:30 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 IEEE 802.11: authenticated
Thu Jun 26 04:26:30 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 IEEE 802.11: associated (aid 1)
Thu Jun 26 04:26:30 2025 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 00:c0:00:05:35:d7 auth_alg=open
Thu Jun 26 04:26:30 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 RADIUS: starting accounting session AE37CA0BE8DCCC5B
Thu Jun 26 04:26:30 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 WPA: pairwise key handshake completed (RSN)
Thu Jun 26 04:26:30 2025 daemon.notice hostapd: phy1-ap0: EAPOL-4WAY-HS-COMPLETED 00:c0:00:05:35:d7
Thu Jun 26 04:31:32 2025 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED 00:c0:00:05:35:d7
Thu Jun 26 04:31:32 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 IEEE 802.11: disassociated
Thu Jun 26 04:31:32 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 IEEE 802.11: authenticated
Thu Jun 26 04:31:32 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 IEEE 802.11: associated (aid 1)
Thu Jun 26 04:31:32 2025 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 00:c0:00:05:35:d7 auth_alg=open
Thu Jun 26 04:31:32 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 RADIUS: starting accounting session 3150353A04AD0074
Thu Jun 26 04:31:32 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 WPA: pairwise key handshake completed (RSN)
Thu Jun 26 04:31:32 2025 daemon.notice hostapd: phy1-ap0: EAPOL-4WAY-HS-COMPLETED 00:c0:00:05:35:d7
Thu Jun 26 04:36:33 2025 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED 00:c0:00:05:35:d7
Thu Jun 26 04:36:33 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 IEEE 802.11: disassociated
Thu Jun 26 04:36:34 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 IEEE 802.11: authenticated
Thu Jun 26 04:36:34 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 IEEE 802.11: associated (aid 1)
Thu Jun 26 04:36:34 2025 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 00:c0:00:05:35:d7 auth_alg=open
Thu Jun 26 04:36:34 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 RADIUS: starting accounting session D39A9FBB3A4957B7
Thu Jun 26 04:36:34 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 WPA: pairwise key handshake completed (RSN)
Thu Jun 26 04:36:34 2025 daemon.notice hostapd: phy1-ap0: EAPOL-4WAY-HS-COMPLETED 00:c0:00:05:35:d7
Thu Jun 26 04:41:35 2025 daemon.notice hostapd: phy1-ap0: AP-STA-DISCONNECTED 00:c0:00:05:35:d7
Thu Jun 26 04:41:35 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 IEEE 802.11: disassociated
Thu Jun 26 04:41:36 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 IEEE 802.11: authenticated
Thu Jun 26 04:41:36 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 IEEE 802.11: associated (aid 1)
Thu Jun 26 04:41:36 2025 daemon.notice hostapd: phy1-ap0: AP-STA-CONNECTED 00:c0:00:05:35:d7 auth_alg=open
Thu Jun 26 04:41:36 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 RADIUS: starting accounting session 4C0FA84976A08C28
Thu Jun 26 04:41:36 2025 daemon.info hostapd: phy1-ap0: STA 00:c0:00:05:35:d7 WPA: pairwise key handshake completed (RSN)
Thu Jun 26 04:41:36 2025 daemon.notice hostapd: phy1-ap0: EAPOL-4WAY-HS-COMPLETED 00:c0:00:05:35:d7
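
To put a number on how regular that cycle is, the gaps between successive 'disassociated' lines can be pulled out of the log. This is only a rough sketch: the function name disassoc_intervals is made up for illustration, it assumes the log layout shown above (time in the 4th field, station MAC in the 10th), and it does not cope with a run that crosses midnight.

```shell
# Print the gap, in seconds, between successive "disassociated" events per station.
# Assumes hostapd syslog lines laid out as in the excerpt above.
disassoc_intervals() {
    awk '
    /IEEE 802.11: disassociated/ {
        split($4, t, ":")                       # $4 is HH:MM:SS
        now = t[1] * 3600 + t[2] * 60 + t[3]    # seconds since midnight
        mac = $10                               # station MAC address
        if (mac in last)
            printf "%s %ds\n", mac, now - last[mac]
        last[mac] = now
    }'
}

# Possible use on the access point:
#   logread | disassoc_intervals
```

For the excerpt above this gives gaps of 302, 301 and 302 seconds, so the station is cycling almost exactly every five minutes.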


I did some Googling and there doesn't seem to be any clear solution for my problem with BATMAN and intermittent ping dropouts. But there is some suggestion that the MTU for BATMAN needs to be increased to at least 1532 to allow for some extra 'header' data that the BATMAN protocol uses.
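
As a quick sanity check on that arithmetic: 1532 is the usual 1500-byte payload plus 32 bytes of extra header. A minimal sketch of the comparison follows; the function name check_hardif_mtu and the BATADV_OVERHEAD variable are just my own labels, and the 32 is simply 1532 - 1500 from the suggestion above.

```shell
# Extra bytes assumed for the BATMAN header (1532 - 1500, per the suggestion above).
BATADV_OVERHEAD=32

# $1 = MTU of the backhaul (hard) interface
# $2 = payload MTU wanted on top of the mesh (defaults to 1500)
check_hardif_mtu() {
    hardif_mtu=$1
    payload=${2:-1500}
    needed=$((payload + BATADV_OVERHEAD))
    if [ "$hardif_mtu" -ge "$needed" ]; then
        echo "ok: $hardif_mtu >= $needed"
    else
        echo "short by $((needed - hardif_mtu)) bytes: $hardif_mtu < $needed"
    fi
}
```

The live value for an interface can be read from /sys/class/net/<interface>/mtu, so something like check_hardif_mtu "$(cat /sys/class/net/<interface>/mtu)" would report whether a given backhaul interface has the headroom.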

So the first thing I wanted to understand is where do I set this MTU value? Is it on the 'bat0' interface (there's no edit box in LuCI for changing MTU on bat0) or is it on the 'batmesh' interface (ie. the wireless 'backhaul' between Access Points)?

When I look at /etc/config/network on my OpenWRT router, it gives:

config interface 'bat0'
        option proto 'batadv'
        option routing_algo 'BATMAN_V'
        option bridge_loop_avoidance '1'
        option gw_mode 'server'
        option hop_penalty '30'

config interface 'batmesh'
        option proto 'batadv_hardif'
        option master 'bat0'

and as best I could tell from Google, setting the MTU to 1532 is for the 'batmesh' (ie. wireless backhaul). When I looked in LuCI, there is an "Override MTU" edit box on the 'batmesh' interface (see attached image). So I changed the 'Override MTU' to 1532 on one of my Access Points and saved/applied. At that point my Access Point went offline and I couldn't connect to it. After 90secs without me being able to log back in, it seems LuCI reverted my change and I was able to reconnect (and the default 1500 MTU was showing again).

Q. Am I changing the correct MTU value? (on the 'batmesh' wireless backhaul interface?)
Q. Should I set this MTU to 1532 for all BATMAN Access Points & router?
Q. Why, when I entered the value in the 'Override MTU' in LuCI, did it take my Access Point offline and revert my changes after 90secs? (There's nothing obvious in the System Log.)

I did find this post which says, while using BATMAN_V protocol, "any attempt to set the MTU completely bricks them and requires a system reset":

I think "brick" might be an extreme description, but this could be why I couldn't change the MTU?

I am using BATMAN_V as it seems it offers significant benefits over BATMAN_IV, but I may just try switching all my Access Points & Router to IV as I can't find a way to 'fix' this intermittent ping drop-out (seems to happen every 5-7 minutes on the Access Points).

So, a bit more Googling, tells me that two key bits of information needed to help diagnose issues, are the contents of /etc/config/network & /etc/config/wireless:

network:

config interface 'loopback'
        option device 'lo'
        option proto 'static'
        option ipaddr '127.0.0.1'
        option netmask '255.0.0.0'

config globals 'globals'
        option ula_prefix 'fd38:7802:2160::/48'
        option packet_steering '1'

config device
        option name 'br-lan'
        option type 'bridge'
        list ports 'bat0'
        list ports 'lan'
        list ports 'wan'

config interface 'lan'
        option device 'br-lan'
        option proto 'static'
        option ipaddr '192.168.1.3'
        option netmask '255.255.255.0'
        option ip6assign '60'
        option gateway '192.168.1.1'
        option delegate '0'
        list dns '192.168.1.212'
        list dns_search 'lan'

config interface 'bat0'
        option proto 'batadv'
        option routing_algo 'BATMAN_V'
        option bridge_loop_avoidance '1'
        option gw_mode 'client'
        option hop_penalty '30'

config interface 'batmesh'
        option proto 'batadv_hardif'
        option master 'bat0'

wireless:

config wifi-device 'radio0'
        option type 'mac80211'
        option path 'soc/40000000.pci/pci0000:00/0000:00:00.0/0000:01:00.0'
        option band '5g'
        option channel '36'
        option htmode 'VHT80'
        option cell_density '0'
        option country 'US'

config wifi-iface 'default_radio0'
        option device 'radio0'
        option network 'batmesh'
        option mode 'mesh'
        option encryption 'sae'
        option mesh_id 'Constellation-mesh'
        option mesh_fwding '0'
        option mesh_rssi_threshold '0'
        option key 'mymeshkey'
        option disassoc_low_ack '0'

config wifi-device 'radio1'
        option type 'mac80211'
        option path 'platform/soc/a000000.wifi'
        option band '2g'
        option channel 'auto'
        option htmode 'HT20'
        option cell_density '0'
        option country 'US'

config wifi-iface 'default_radio1'
        option device 'radio1'
        option network 'lan'
        option mode 'ap'
        option ssid 'Constellation'
        option encryption 'psk-mixed'
        option key 'mykey'
        option ieee80211r '1'
        option ft_over_ds '0'
        option ft_psk_generate_local '1'

config wifi-device 'radio2'
        option type 'mac80211'
        option path 'platform/soc/a800000.wifi'
        option band '5g'
        option channel 'auto'
        option htmode 'VHT80'
        option cell_density '0'
        option country 'US'

config wifi-iface 'default_radio2'
        option device 'radio2'
        option network 'lan'
        option mode 'ap'
        option ssid 'Constellation'
        option encryption 'sae-mixed'
        option key 'mykey'
        option ieee80211r '1'
        option ft_over_ds '0'
        option ocv '0'

Is anyone able to suggest what the problem is??

I don't think this has anything to do with your problem (it could, but I have seen this, or at least something very similar, many times).
Here, it was a user device (indicated by STA 00:c0:00:05:35:d7). Many mobiles will, when idle, go into a power save mode and turn off the radio, causing the disconnect.
All mobile devices have a built-in "portal detection" function and periodically send a "canary test" and have to power up the radio again to do this. Typical test intervals are 5 (Android), 6 (Apple) or 10 (Linux/Microsoft laptops) minutes.

If the test passes (got an Internet connection), the device turns off the radio again.
If the test fails (no Internet), it displays a "limited connectivity" type of warning and, if one is present, triggers a captive portal.

Ping dropouts however, imply the mesh backhaul is stuttering in some way....

I think you're right - the disassociate/authenticate/associate thing has nothing to do with my problem.

I've been fiddling with some of the parameters on the BATMAN_V algorithm for my bat0 interface, but this doesn't seem to have had any noticeable effect.

Why do you think the 'batmesh' wireless backhaul is the issue? There don't seem to be many parameters to fiddle with on the 'batmesh' interface??

Because it is the only thing left.

No, there are not many on the 'batmesh' interface itself, but there are MANY 802.11s parameters that can be adjusted outside of the Batman configs.

I am not deeply into Batman, but it does use 802.11s mesh for its backhaul and I have gone into the nitty-gritty details of 802.11s while developing the mesh11sd package.

Batman does not make use of the 802.11s HWMP protocol built into the kernel, in fact it attempts to turn it off by setting option mesh_fwding '0'. Unfortunately this does not turn HWMP off completely.

I'm guessing, but here is a possible scenario:
Depending on signal strengths, node locations and many other factors, HWMP might change the mesh backhaul path at layer 2, pulling the rug out from under Batman's own protocol. Batman then takes a second or two to "reroute" its traffic, resulting in the dropouts you are seeing.

Can you do anything to test my hypothesis? Yes, maybe.

First, let's see what Batman sets up in terms of mesh parameters.
Run iwinfo to find out the interface name of the mesh interface.
Let's say it is mesh0

Substituting the actual interface name, show the output of:

iw dev mesh0 mesh_param dump

Also, to see the current backhaul status, show:
iw dev mesh0 mpath dump

And to see what mesh nodes are connected to the node we are working with:
iw dev mesh0 station dump

Okay, here's a lot of output - hopefully the forum can collapse it into a scrollable 'code' box so it's legible:

root@Linksys1-main:~# batctl n
[B.A.T.M.A.N. adv 2024.3-openwrt-4, MainIF/MAC: phy2-mesh0/ea:9f:80:a5:77:d6 (bat0/4e:1d:05:ef:3b:cc BATMAN_V)]
         Neighbor   last-seen      speed           IF
            Lyra3    0.270s (      257.6) [phy2-mesh0]
         Linksys2    0.300s (      134.5) [phy2-mesh0]
            Lyra2    0.390s (      132.6) [phy2-mesh0]
            Lyra7    0.040s (       97.4) [phy2-mesh0]
            Lyra6    0.520s (      173.0) [phy2-mesh0]
            Lyra4    0.400s (       65.0) [phy2-mesh0]
            Lyra5    0.400s (      218.0) [phy2-mesh0]

root@Linksys1-main:~# iwinfo
phy0-ap0  ESSID: "Constellation"
          Access Point: EA:9F:80:A5:77:D7
          Mode: Master  Channel: 144 (5.720 GHz)  HT Mode: VHT80
          Center Channel 1: 138 2: unknown
          Tx-Power: 24 dBm  Link Quality: unknown/70
          Signal: unknown  Noise: -101 dBm
          Bit Rate: unknown
          Encryption: mixed WPA2/WPA3 PSK/SAE (CCMP)
          Type: nl80211  HW Mode(s): 802.11ac/n
          Hardware: 168C:0056 0000:0000 [Qualcomm Atheros QCA9886]
          TX power offset: none
          Frequency offset: none
          Supports VAPs: yes  PHY name: phy0

phy1-ap0  ESSID: "Constellation"
          Access Point: EA:9F:80:A5:77:D5
          Mode: Master  Channel: 6 (2.437 GHz)  HT Mode: HT20
          Center Channel 1: 6 2: unknown
          Tx-Power: 30 dBm  Link Quality: 54/70
          Signal: -56 dBm  Noise: -97 dBm
          Bit Rate: 58.4 MBit/s
          Encryption: mixed WPA/WPA2 PSK (CCMP)
          Type: nl80211  HW Mode(s): 802.11b/g/n
          Hardware: embedded [Qualcomm Atheros IPQ4019]
          TX power offset: none
          Frequency offset: none
          Supports VAPs: yes  PHY name: phy1

phy2-mesh0 ESSID: "Constellation-mesh"
          Access Point: EA:9F:80:A5:77:D6
          Mode: Mesh Point  Channel: 36 (5.180 GHz)  HT Mode: VHT80
          Center Channel 1: 42 2: unknown
          Tx-Power: 23 dBm  Link Quality: 50/70
          Signal: -60 dBm  Noise: -101 dBm
          Bit Rate: 459.6 MBit/s
          Encryption: none
          Type: nl80211  HW Mode(s): 802.11ac/n
          Hardware: embedded [Qualcomm Atheros IPQ4019]
          TX power offset: none
          Frequency offset: none
          Supports VAPs: yes  PHY name: phy2

root@Linksys1-main:~# iw dev phy2-mesh0 mesh_param dump
mesh_retry_timeout = 100 milliseconds
mesh_confirm_timeout = 100 milliseconds
mesh_holding_timeout = 100 milliseconds
mesh_max_peer_links = 99
mesh_max_retries = 3
mesh_ttl = 31
mesh_element_ttl = 31
mesh_auto_open_plinks = 0
mesh_hwmp_max_preq_retries = 4
mesh_path_refresh_time = 1000 milliseconds
mesh_min_discovery_timeout = 100 milliseconds
mesh_hwmp_active_path_timeout = 5000 TUs
mesh_hwmp_preq_min_interval = 10 TUs
mesh_hwmp_net_diameter_traversal_time = 50 TUs
mesh_hwmp_rootmode = 0
mesh_hwmp_rann_interval = 5000 TUs
mesh_gate_announcements = 0
mesh_fwding = 0
mesh_sync_offset_max_neighor = 50
mesh_rssi_threshold = 0 dBm
mesh_hwmp_active_path_to_root_timeout = 6000 TUs
mesh_hwmp_root_interval = 5000 TUs
mesh_hwmp_confirmation_interval = 2000 TUs
mesh_power_mode = active
mesh_awake_window = 10 TUs
mesh_plink_timeout = 0 seconds
mesh_connected_to_gate = 0
mesh_nolearn = 0
mesh_connected_to_as = 0

root@Linksys1-main:~# iw dev phy2-mesh0 mpath dump
DEST ADDR         NEXT HOP          IFACE       SN      METRIC  QLEN    EXPTIME DTIM    DRET    FLAGS   HOP_COUNT       PATH_CHANGE
10:7b:44:ce:06:3c 10:7b:44:ce:06:3c phy2-mesh0  462     11      0       2790    100     0       0x5     1       3
ea:9f:80:a5:84:8a ea:9f:80:a5:84:8a phy2-mesh0  109424  21      0       2960    100     0       0x15    1       5
10:7b:44:d5:12:54 10:7b:44:d5:12:54 phy2-mesh0  107592  23      0       2860    100     0       0x15    1       4
10:7b:44:ce:05:d4 10:7b:44:ce:05:d4 phy2-mesh0  930     29      0       2870    200     1       0x5     1       26
10:7b:44:d5:0e:d4 10:7b:44:d5:0e:d4 phy2-mesh0  104854  16      0       2860    200     1       0x15    1       21
10:7b:44:ce:05:84 10:7b:44:ce:05:84 phy2-mesh0  107668  43      0       2870    100     0       0x5     1       52
10:7b:44:d5:12:6c 10:7b:44:d5:12:6c phy2-mesh0  109059  13      0       2870    100     0       0x15    1       24

root@Linksys1-main:~# iw dev phy2-mesh0 station dump
Station 10:7b:44:ce:05:84 (on phy2-mesh0)
        inactive time:  50 ms
        rx bytes:       10049036
        rx packets:     69089
        tx bytes:       1413660
        tx packets:     5870
        tx retries:     1
        tx failed:      86
        rx drop misc:   111
        signal:         -82 [-92, -93, -85, -85] dBm
        signal avg:     -77 [-83, -87, -84, -84] dBm
        Toffset:        169622085016 us
        tx bitrate:     195.0 MBit/s VHT-MCS 4 80MHz short GI VHT-NSS 1
        tx duration:    168825 us
        rx bitrate:     45.0 MBit/s VHT-MCS 2 40MHz short GI VHT-NSS 1
        rx duration:    0 us
        airtime weight: 256
        mesh llid:      0
        mesh plid:      0
        mesh plink:     ESTAB
        mesh airtime link metric: 43
        mesh connected to gate: no
        mesh connected to auth server:  no
        mesh local PS mode:     ACTIVE
        mesh peer PS mode:      ACTIVE
        mesh non-peer PS mode:  ACTIVE
        authorized:     yes
        authenticated:  yes
        associated:     yes
        preamble:       long
        WMM/WME:        yes
        MFP:            yes
        TDLS peer:      no
        DTIM period:    2
        beacon interval:100
        connected time: 1497 seconds
        associated at [boottime]:       42.439s
        associated at:  1751100804555 ms
        current time:   1751102297240 ms
Station 10:7b:44:d5:12:6c (on phy2-mesh0)
        inactive time:  30 ms
        rx bytes:       15813471
        rx packets:     91140
        tx bytes:       3577669
        tx packets:     13431
        tx retries:     6
        tx failed:      198
        rx drop misc:   57
        signal:         -53 [-54, -58, -85, -85] dBm
        signal avg:     -54 [-55, -60, -85, -85] dBm
        Toffset:        169594309872 us
        tx bitrate:     650.0 MBit/s VHT-MCS 7 80MHz short GI VHT-NSS 2
        tx duration:    263417 us
        rx bitrate:     520.0 MBit/s VHT-MCS 5 80MHz short GI VHT-NSS 2
        rx duration:    0 us
        airtime weight: 256
        mesh llid:      0
        mesh plid:      0
        mesh plink:     ESTAB
        mesh airtime link metric: 13
        mesh connected to gate: no
        mesh connected to auth server:  no
        mesh local PS mode:     ACTIVE
        mesh peer PS mode:      ACTIVE
        mesh non-peer PS mode:  ACTIVE
        authorized:     yes
        authenticated:  yes
        associated:     yes
        preamble:       long
        WMM/WME:        yes
        MFP:            yes
        TDLS peer:      no
        DTIM period:    2
        beacon interval:100
        connected time: 1494 seconds
        associated at [boottime]:       41.694s
        associated at:  1751100803810 ms
        current time:   1751102297241 ms
Station 10:7b:44:d5:0e:d4 (on phy2-mesh0)
        inactive time:  10 ms
        rx bytes:       10985853
        rx packets:     73278
        tx bytes:       2123635
        tx packets:     10455
        tx retries:     8
        tx failed:      60
        rx drop misc:   61
        signal:         -58 [-58, -69, -85, -85] dBm
        signal avg:     -59 [-59, -72, -84, -84] dBm
        Toffset:        169594904363 us
        tx bitrate:     520.0 MBit/s VHT-MCS 5 80MHz short GI VHT-NSS 2
        tx duration:    204362 us
        rx bitrate:     292.6 MBit/s VHT-MCS 6 80MHz short GI VHT-NSS 1
        rx duration:    0 us
        airtime weight: 256
        mesh llid:      0
        mesh plid:      0
        mesh plink:     ESTAB
        mesh airtime link metric: 16
        mesh connected to gate: no
        mesh connected to auth server:  no
        mesh local PS mode:     ACTIVE
        mesh peer PS mode:      ACTIVE
        mesh non-peer PS mode:  ACTIVE
        authorized:     yes
        authenticated:  yes
        associated:     yes
        preamble:       long
        WMM/WME:        yes
        MFP:            yes
        TDLS peer:      no
        DTIM period:    2
        beacon interval:100
        connected time: 1492 seconds
        associated at [boottime]:       43.843s
        associated at:  1751100805958 ms
        current time:   1751102297241 ms
Station 10:7b:44:ce:05:d4 (on phy2-mesh0)
        inactive time:  40 ms
        rx bytes:       10218846
        rx packets:     69805
        tx bytes:       1446729
        tx packets:     6032
        tx retries:     1
        tx failed:      61
        rx drop misc:   62
        signal:         -82 [-92, -93, -85, -85] dBm
        signal avg:     -68 [-71, -87, -84, -84] dBm
        Toffset:        14184406 us
        tx bitrate:     292.6 MBit/s VHT-MCS 6 80MHz short GI VHT-NSS 1
        tx duration:    146723 us
        rx bitrate:     175.5 MBit/s VHT-MCS 4 80MHz VHT-NSS 1
        rx duration:    0 us
        airtime weight: 256
        mesh llid:      0
        mesh plid:      0
        mesh plink:     ESTAB
        mesh airtime link metric: 29
        mesh connected to gate: no
        mesh connected to auth server:  no
        mesh local PS mode:     ACTIVE
        mesh peer PS mode:      ACTIVE
        mesh non-peer PS mode:  ACTIVE
        authorized:     yes
        authenticated:  yes
        associated:     yes
        preamble:       long
        WMM/WME:        yes
        MFP:            yes
        TDLS peer:      no
        DTIM period:    2
        beacon interval:100
        connected time: 1491 seconds
        associated at [boottime]:       45.048s
        associated at:  1751100807163 ms
        current time:   1751102297242 ms
Station 10:7b:44:d5:12:54 (on phy2-mesh0)
        inactive time:  90 ms
        rx bytes:       11802784
        rx packets:     77022
        tx bytes:       9632219
        tx packets:     18140
        tx retries:     7
        tx failed:      119
        rx drop misc:   61
        signal:         -65 [-65, -92, -85, -85] dBm
        signal avg:     -62 [-65, -73, -84, -84] dBm
        Toffset:        169619966923 us
        tx bitrate:     390.0 MBit/s VHT-MCS 4 80MHz short GI VHT-NSS 2
        tx duration:    512052 us
        rx bitrate:     260.0 MBit/s VHT-MCS 5 80MHz short GI VHT-NSS 1
        rx duration:    0 us
        airtime weight: 256
        mesh llid:      0
        mesh plid:      0
        mesh plink:     ESTAB
        mesh airtime link metric: 21
        mesh connected to gate: no
        mesh connected to auth server:  no
        mesh local PS mode:     ACTIVE
        mesh peer PS mode:      ACTIVE
        mesh non-peer PS mode:  ACTIVE
        authorized:     yes
        authenticated:  yes
        associated:     yes
        preamble:       long
        WMM/WME:        yes
        MFP:            yes
        TDLS peer:      no
        DTIM period:    2
        beacon interval:100
        connected time: 1491 seconds
        associated at [boottime]:       45.428s
        associated at:  1751100807544 ms
        current time:   1751102297243 ms
Station ea:9f:80:a5:84:8a (on phy2-mesh0)
        inactive time:  0 ms
        rx bytes:       15219675
        rx packets:     84009
        tx bytes:       9301237
        tx packets:     25071
        tx retries:     24
        tx failed:      0
        rx drop misc:   44
        signal:         -64 [-67, -67, -85, -85] dBm
        signal avg:     -69 [-72, -81, -85, -85] dBm
        Toffset:        168114671895 us
        tx bitrate:     390.0 MBit/s VHT-MCS 4 80MHz short GI VHT-NSS 2
        tx duration:    616315 us
        rx bitrate:     292.6 MBit/s VHT-MCS 6 80MHz short GI VHT-NSS 1
        rx duration:    0 us
        airtime weight: 256
        mesh llid:      0
        mesh plid:      0
        mesh plink:     ESTAB
        mesh airtime link metric: 21
        mesh connected to gate: no
        mesh connected to auth server:  no
        mesh local PS mode:     ACTIVE
        mesh peer PS mode:      ACTIVE
        mesh non-peer PS mode:  ACTIVE
        authorized:     yes
        authenticated:  yes
        associated:     yes
        preamble:       long
        WMM/WME:        yes
        MFP:            yes
        TDLS peer:      no
        DTIM period:    2
        beacon interval:100
        connected time: 1490 seconds
        associated at [boottime]:       46.236s
        associated at:  1751100808352 ms
        current time:   1751102297244 ms
Station 10:7b:44:ce:06:3c (on phy2-mesh0)
        inactive time:  30 ms
        rx bytes:       5119691
        rx packets:     31715
        tx bytes:       791279
        tx packets:     3599
        tx retries:     1
        tx failed:      42
        rx drop misc:   74
        signal:         -54 [-55, -59, -85, -85] dBm
        signal avg:     -54 [-55, -59, -84, -84] dBm
        Toffset:        18446744072841553601 us
        tx bitrate:     780.0 MBit/s VHT-MCS 8 80MHz short GI VHT-NSS 2
        tx duration:    66042 us
        rx bitrate:     390.0 MBit/s VHT-MCS 8 80MHz short GI VHT-NSS 1
        rx duration:    0 us
        airtime weight: 256
        mesh llid:      0
        mesh plid:      0
        mesh plink:     ESTAB
        mesh airtime link metric: 14
        mesh connected to gate: no
        mesh connected to auth server:  no
        mesh local PS mode:     ACTIVE
        mesh peer PS mode:      ACTIVE
        mesh non-peer PS mode:  ACTIVE
        authorized:     yes
        authenticated:  yes
        associated:     yes
        preamble:       long
        WMM/WME:        yes
        MFP:            yes
        TDLS peer:      no
        DTIM period:    2
        beacon interval:100
        connected time: 627 seconds
        associated at [boottime]:       910.821s
        associated at:  1751101672936 ms
        current time:   1751102297244 ms
root@Linksys1-main:~#

Incidentally, this was the online 'tutorial' I was following to set up BATMAN:

benkay86/openwrt-batman-tutorial: Configure 802.11s mesh with batman-adv routing on OpenWRT through the LuCI web interface.

and it's only after you suggested that my 'batmesh' backhaul might be the issue that I noticed he shows an 'Override MTU' of 1536 - so I decided to try again and change the MTU on my 'batmesh' interface. I've now set it to 1532 on the main router & all Access Points (I had to go around all the Access Points with my laptop & network cable), but so far it appears to have made no difference :frowning:

Also, when you say "Run iwinfo to find out the interface name of the mesh interface" I assume this was 'phy2-mesh0'? All the various postings I've been reading refer 'wildly' to the mesh interface, and I'm never clear if this means the 'mesh backhaul' ('batmesh' in my case) or the 'bat0' interface (which I'm not entirely clear on what it is exactly).

When I get these 'ping dropouts' they are ALWAYS around 250ms - anything less is just a glitch, anything more is usually a disconnect.

The attached image shows the 'ping dropouts' I was getting with MTU=1500 (on 'batmesh'), then I rebooted after changing the MTU=1532 - as you can see, before & after are pretty much the same behaviour with the 250ms 'ping dropouts'

I've also set up the 'Uptime Kuma' app on a different mesh Access Point, and it also shows the same 250ms 'ping dropouts' occurring, but NOT 'synchronised' at the same time and on the same Access Points. That did make me wonder if the problem was nothing to do with the mesh network, but I can't see how, given that two separate 'Uptime Kumas' tracking pings over the mesh from different Access Points show the same type of behaviour?

Yes, phy2-mesh0 is the 802.11s mesh interface, ie the one iwinfo shows as Mode: Mesh Point

See if you can follow my logic:

Batman does its own thing (batman-adv routes at layer 2, carrying traffic in its own frames on top of the 802.11s link) to generate a point to point backhaul connecting the nodes together.
But 802.11s/HWMP generates a multi-point to multi-point network.

In the mpath dump output, a healthy Batman backhaul should only have 1 hop links.
But an HWMP backhaul can have multiple hop links (remember this bit for later!)

The mesh_param dump shows all the 802.11s mesh parameters are at default, meaning HWMP is still active in its simplest mode ( mesh_hwmp_rootmode = 0 ).

Your station dump output shows that all the other nodes are connected to this one, as you would expect from the mpath dump output ie "1" hop.

This means you do not actually have a "mesh" style backhaul, instead it is a "star" type. This is not a problem per se, but .....

I have no idea of your actual physical layout/positioning of the nodes, but looking at the station dump output,

Station 10:7b:44:ce:05:84 

is struggling with a very low signal (instantaneous -82dBm, worst case -92dBm), whilst trying to maintain its connection.

The 802.11s HWMP sees this and may well change its connection to one of the nodes closer to it. Batman's own routing, sitting on top of the 802.11s link, will try to track this, causing a short dropout, and the resulting 2 hop HWMP route will then have greater latency.

Because the HWMP is in its simplest mode, it will, sooner or later, signal permitting, change back to a direct connection (I told you to remember, Batman should only have 1 hop layer 2 links). In turn, this causes another dropout with Batman tracking the change, eventually going back to the poor signal, but lower latency direct path and your pings go back to "normal".

Does this make sense to you ?
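
One way to look for evidence of this is to watch the mpath table for entries that stop being direct. The following is a rough sketch only: multi_hop_paths is a made-up helper name, and it assumes the column layout of the mpath dump shown earlier (destination in column 1, next hop in column 2, HOP_COUNT in column 11, PATH_CHANGE in column 12).

```shell
# Read "iw dev <iface> mpath dump" output on stdin and print only the paths
# that are not direct, i.e. next hop differs from destination or hop count > 1.
multi_hop_paths() {
    awk '
    $1 ~ /^[0-9a-f][0-9a-f]:/ && ($1 != $2 || $11 + 0 > 1) {
        printf "%s via %s hops=%s changes=%s\n", $1, $2, $11, $12
    }'
}

# Possible use, polling once a second on the node in question:
#   while true; do date; iw dev phy2-mesh0 mpath dump | multi_hop_paths; sleep 1; done
```

If nothing is ever printed while the ping spikes occur, HWMP re-pathing is probably not the culprit. If entries appear around the time of a spike, that would support the hypothesis.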

A possible fix:
There are lots of things we can do, but the easiest is the following -
On your "main" node, change the mesh_rssi_threshold.
The default setting of 0 (zero) means attempt to connect regardless of signal strength.

Let's try forcing that distant node to connect to a node closer to it, without HWMP pulling the rug.

Currently:
option mesh_rssi_threshold '0'

Change this to:
option mesh_rssi_threshold '-65'

Make sure you have committed (saved) the change and reboot.

See what happens. You should probably set the same mesh_rssi_threshold on every node.

If this works, great.
If not, we will need a bigger hammer :hammer_and_wrench:
ie start setting some mesh parameters (that cannot be set in a uci config).


Okay, so I think in your description you're saying that HWMP maintains a true 'mesh' network where every node has a link to every other node that it can communicate with.

Whereas BATMAN bins all but the best link, so only maintains a working link to its nearest (or loudest) neighbour. But presumably if you stretch your 'star' arrangement to breaking point, then it scrambles about and re-tries to find a new neighbour, which might end up playing 'piggy-in-the-middle' and acting as a relay from the outlying Access Point back to the main Router?

And I think you're then saying HWMP is screwing up BATMAN's nicely thought-out plan & switching to its 'better plan' of having 2-hops, which BATMAN then decides to go along with. And then at some point, it all switches back to the nicely thought-out 1-hop plan.

Q. So why has HWMP got anything to do with this if installing BATMAN does option mesh_fwding '0' to kill it off? Or does BATMAN still rely on HWMP at a lower layer to do some of the hard work?

Q. I'm pinging all my mesh nodes every 60s. The ping-dropouts seem to last at least 60s according to the 'Uptime Kuma' charts, but perhaps the time slice on the charts just isn't small enough to see the real length of time. How long does it take BATMAN to re-jig its mesh layout? (I kind of imagined 1-3sec perhaps?)
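
A 60s polling interval cannot show whether a spike lasts one second or fifty, so pinging once a second and keeping only the slow replies should give a better picture. A minimal sketch, assuming ping output that contains a time=<ms> token on each reply line; slow_pings is a made-up helper name and the 100ms default threshold is arbitrary.

```shell
# Read ping output on stdin and print only replies slower than the threshold.
# $1 = threshold in milliseconds (defaults to 100)
slow_pings() {
    threshold_ms=${1:-100}
    awk -v limit="$threshold_ms" '
    {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^time=/) {
                rtt = substr($i, 6) + 0     # strip the leading "time="
                if (rtt > limit) print
            }
        }
    }'
}

# Possible use from the Raspberry Pi (assumes the iputils ping, where -D
# prefixes each line with a timestamp; other ping builds may lack -D):
#   ping -D -i 1 192.168.1.3 | slow_pings 100
```

Counting how many consecutive one-second replies are slow gives the real length of each dropout, which the 60s chart cannot.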

Took me a while to find where in LuCI the option mesh_rssi_threshold '0' setting is, but I've now found it & changed it to -65 on the main Router node (a Linksys EA8300). I've also confirmed it's written to /etc/config/wireless & output as -65 in iw dev phy2-mesh0 mesh_param dump

I'll log in to the other Access Points and set them all to -65. I prefer to do everything via LuCI as I just know at some point, I'll end up having to do all this a 2nd time & it's best to know my way around in LuCI.

The particular Access Point you highlighted, 10:7b:44:ce:05:84, is an ASUS Lyra MAP AC2200 in my dining room, and certainly it's not got the best download speeds, but it's not actually the furthest away AP.

However, few of my Access Points are presently in their 'final' locations - there was no point doing that until I had everything working.

Okay, so setting option mesh_rssi_threshold '-65' doesn't actually seem to have had any effect - see attached Uptime Kuma screenshots.

Although I did have to go power-off/on some of the Access Points, so presumably they re-jigged their configuration & I lost connection to them after setting the RSSI threshold.


No, HWMP has a layer 2 link to every other node via as many other nodes as necessary to give the best path (where "best" is calculated by an algorithm based on numerous factors including latency, number of hops, signal quality etc).

No, Batman has a layer 3 link to every other node via as many other nodes as necessary to give the best path.

BUT, layer 3 sits on top of layer 2, so you can get into a situation where HWMP and Batman are fighting each other.

No, Batman has no idea what is going on at layer 2, it just assumes it is like an old-fashioned ethernet hub. Whereas HWMP is acting more like an array of managed ethernet switches (sort of).

Setting option mesh_fwding '0' is not to "kill off" HWMP. It is to set the node in a mode where it will not forward user data to nodes it is not directly connected to. It still participates in the layer 2 HWMP mac-routing.

I don't know, like I said, I am not deeply familiar with Batman. A quick search showed that with a topology change it can be greater than 60 seconds, but that was a quick search....

Changes to mesh_rssi_threshold only take effect when a node tries to join the mesh. If already connected at -80dBm for example, it will remain connected.

I'm not sure if luci restarts the interface to kick off all connections. I always reboot everything if doing this manually and retrospectively.
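
To check whether any peer links weaker than the threshold are still hanging around, something like the sketch below could be run on each node. It assumes the usual `iw dev <ifname> station dump` layout (a `Station <mac>` line followed by a `signal:` line for each peer); the function name `weak_peers` is made up for illustration.

```shell
# Print "MAC signal" for each mesh peer weaker than the threshold (dBm).
# Reads the output of `iw dev <ifname> station dump` on stdin.
weak_peers() {
    awk -v thr="${1:--65}" '
        $1 == "Station" { mac = $2 }                       # remember current peer
        $1 == "signal:" && $2 + 0 < thr + 0 { print mac, $2 }'
}

# Usage (phy2-mesh0 is the mesh interface name quoted earlier in this thread):
#   iw dev phy2-mesh0 station dump | weak_peers -65
```

If that prints nothing after a reboot, every remaining peer link is at or above the threshold.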

It is not the actual distance that matters, it is the signal strength (which is of course affected by distance, but that is far from the whole story).

Of course, this might not fix the problem. But there are also other things to try if it does not.


I wanted to satisfy myself that this issue wasn't something odd with 'Uptime Kuma', so I installed 'Statping' as a Docker Container and added my Access Points.

Although I've no real idea how to set up Statping properly, it does appear to show the same ping dropouts happening (example images show Access Point Lyra7 192.168.1.7).

Always the ping shows ~250ms.

Perhaps there's nothing wrong & this is just some kind of 'side effect'?

The effect I seem to notice (which is what led me to set up Uptime Kuma in the first place) is that when surfing the web on phone/laptop, suddenly for 5-20 secs the browser activity just stops - there's no response at all (apart from loading/busy) & then suddenly activity returns to normal. And it seems to happen randomly, on different Access Points at different times.


Is there a way to build a non-BATMAN mesh with just 802.11s ?

I'm not entirely sure what advantages BATMAN gives?

Is there an online idiot-proof guide to building an OpenWRT 802.11s mesh (without the need for BATMAN)? I'm technically literate, but no networking expert & have only about 2 weeks' experience of OpenWRT.

All my routers/access-points are tri-band & I'd like to stick with the 5GHz Ch.36 (as in BATMAN) as the wireless backhaul.

Yes, you can create a wireless mesh network without using BATMAN.

For me, the biggest advantage of using BATMAN is that you can have a wireless mesh network that supports multiple VLANs. Personally, that means I can have individual wireless networks for my lan, iot, guests, etc. that all operate across the wireless mesh network.

As for "idiot-proof" guides, I can't say I'm aware of any :slight_smile:

For some friends of mine that aren't the most technical, they were able to successfully configure their own 802.11s mesh network following this video from onemarcfifty:

Also, I believe the OpenWrt user guide is decent and worth a look:

Lastly, the mesh11sd project looks like a possible solution for people wanting to create their mesh network without a ton of configuration or effort. I can't personally speak to this, but most of the things I've read are positive:

Hope this isn't just a repeat of things you've already seen. Good luck :grinning_face:

P.S. onemarcfifty also has another video for implementing BATMAN that could be useful to you if you end up having to use it after all:


Thank you for the links & videos.

I had previously looked at all of them and they were indeed very helpful. The one 'tutorial' that I found most useful though, was this one:

It provides a detailed & comprehensive step-by-step guide with pictures, to setting up a BATMAN mesh.

Unfortunately, for various reasons, my setup seemed plagued by issues, including OpenWRT spontaneously losing all of its configuration details & subsequently proving irrecoverable.

I believe the OneMarcFifty video on setting up an OpenWRT 802.11s mesh is very old now & ultimately results in a mesh network that's unstable with nodes dropping off here & there.

I did try following the Mesh11sd walkthrough, but found this difficult to navigate & achieve an end result that I could then manage.

So in the end, I've abandoned OpenWRT & BATMAN, as I just couldn't get a workable, stable & manageable wireless network built.