802.11s mesh with 3x RT3200s broken

I have 3x RT3200s set up in an 802.11s mesh. For months it was 100% stable, but recently one or the other of my two extension nodes becomes inaccessible, even though the other stations still list the inaccessible node as connected. Rebooting my main router restores connectivity, but eventually one of the two extension nodes becomes inaccessible again. By inaccessible I mean that I can neither ping the affected node nor reach its web interface.

I believe this issue has surfaced as a result of a change reflected in a fairly recent snapshot.

The receive and send packet counts seem to increase during the period of inaccessibility, which I don't understand. Compare the counters of the first station (presently inaccessible) across these two dumps:

root@OpenWrt:~# iw dev wlan1 station dump |grep -i packets -A 2 -B 2
        inactive time:  50 ms
        rx bytes:       11694224
        rx packets:     98735
        tx bytes:       115203
        tx packets:     921
        tx retries:     0
        tx failed:      0
--
        inactive time:  10 ms
        rx bytes:       39163475
        rx packets:     184596
        tx bytes:       107933315
        tx packets:     100696
        tx retries:     0
        tx failed:      0
root@OpenWrt:~# iw dev wlan1 station dump |grep -i packets -A 2 -B 2
        inactive time:  90 ms
        rx bytes:       11698067
        rx packets:     98769
        tx bytes:       115203
        tx packets:     921
        tx retries:     0
        tx failed:      0
--
        inactive time:  0 ms
        rx bytes:       39168194
        rx packets:     184638
        tx bytes:       107934713
        tx packets:     100703
        tx retries:     0
        tx failed:      0
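
A minimal sketch for making such before/after comparisons easier to read: it condenses a station dump to one line per station. The function name `station_counters` is made up for illustration, and it assumes the full dump is piped in (each station block begins with a `Station <MAC>` line, which the grep above filters out).

```shell
#!/bin/sh
# Hypothetical helper: condense `iw dev <if> station dump` (read from stdin)
# into "MAC rx_packets tx_packets" lines, one per station.
station_counters() {
    awk '
        $1 == "Station"                { mac = $2 }
        $1 == "rx" && $2 == "packets:" { rx[mac] = $3 }
        $1 == "tx" && $2 == "packets:" { print mac, rx[mac], $3 }
    '
}

# Usage on a node (wlan1 is the mesh interface in this thread):
#   iw dev wlan1 station dump | station_counters
```

Running it twice a few seconds apart shows at a glance which counter of which peer is moving.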

Any suggestions for how to diagnose / resolve this issue?

I can view the log of an affected node by rebooting the main node, which restores connectivity to the previously affected node, but then I do not know what to look for.
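
As a starting point for reading that log, here is a rough sketch that narrows it down to lines likely to concern the mesh link. The function name `mesh_log_filter` and the keyword list are assumptions made for illustration, not an official list of relevant messages; it simply keeps lines mentioning the mesh interface, the word "mesh", or wpa_supplicant.

```shell
#!/bin/sh
# Hypothetical helper: keep only log lines (read from stdin) that mention the
# mesh interface, the word "mesh", or wpa_supplicant (case-insensitive).
# The keyword list is a guess at what is relevant, not a definitive filter.
mesh_log_filter() {
    iface="${1:-wlan1}"
    grep -iE "${iface}|mesh|wpa_supplicant"
}

# Usage on the affected node, once it is reachable again:
#   logread | mesh_log_filter wlan1
```

Because only the main node is rebooted, the affected node's log buffer should still cover the time of the outage, provided it has not wrapped around in the meantime.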

Hi, @Lynx,

I had an issue with my 802.11s mesh a few days ago too. Can you run this command and post the output here?

iw dev wlan1 mpath dump

I saw that new routes were being added with NEXTHOP set to 00:00:00:00:00:00, which is incorrect, and eventually my devices were losing connectivity. So let's find out whether you have the same problem.
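
A small sketch for spotting that symptom without reading the table by eye. The function name `null_nexthops` is hypothetical; it assumes the destination is in the first column and the next hop in the second, as in the usual `mpath dump` layout.

```shell
#!/bin/sh
# Hypothetical helper: read `iw dev <if> mpath dump` on stdin and print the
# destination of every path whose next hop is the all-zero MAC address.
null_nexthops() {
    awk 'tolower($2) == "00:00:00:00:00:00" { print $1 }'
}

# Usage on a node:
#   iw dev wlan1 mpath dump | null_nexthops
```

No output means no path with an all-zero next hop was found at that moment.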

Dear @amteza,

DEST ADDR         NEXT HOP          IFACE       SN      METRIC  QLEN    EXPTIME   DTIM     DRET    FLAGS   HOP_COUNT       PATH_CHANGE
e8:XX:XX:XX:X5:c9 e8:XX:XX:XX:X5:c9 wlan1       13355   1389    0       0         100      0       0x4     1       11
e8:XX:XX:XX:X3:80 e8:XX:XX:XX:X3:80 wlan1       9563    1389    0       3410      100      0       0x5     1       26

So the two other nodes are listed there by their 5 GHz WiFi MAC addresses, and each entry has the node's 5 GHz MAC repeated twice. I think each of the other nodes shows the same for its respective two peers. Does this look right?

@smileys29 may have the same issue that I am experiencing with mesh at the moment.

Yup, that looks good. Your problem looks different from what I was experiencing. Sorry for wasting your time.

Not at all. This is progress. Any other ideas anyone?

What dictates which node is the master node in the mesh, i.e. the one to which the other two nodes connect? How can I force one particular node to be the master node? It looks like the node that breaks is the one to which the other two connect.

Master node? If you are using HWMP there is no master node. Each node discovers and tracks the other nodes and adds them to its routing table. Nodes connect to each other by choosing the "best radio path". That's why they are marked as mesh stations (mesh STAs) or ad-hoc stations.

Interesting. I am going by what my Android phone reports using 'WiFi analyzer'. It shows 3x 5 GHz access points (one for each node) and a hidden SSID. The hidden SSID has the MAC address of one of the nodes. I assumed that this is what the other two nodes connect to?

Something is wrong there. Check this snapshot from WiFi Explorer:

Do you see the two hidden networks? One per node? I only have 2 mesh stations.

I only see the one hidden node for one of my mesh nodes. Will try another WiFi monitoring tool.

And just in case this is of any use to you:

Yeah, any idea why the same MAC shows up twice in the path? Wouldn't you expect a path from one MAC to another MAC? I'm trying to find another monitoring tool to see whether I can see the other hidden SSIDs. Right now I only see the one. Maybe this is a problem.

Yeah, nah, I don't know why. I was confused at first too, and checked online to see whether it was like that for others, and it was. If I check LuCI, I see this downstairs:


And this upstairs:

Which are correct.

Ah! Got it, it's a direct path. From the mpath wiki:

Next-hop MAC address

The next hop on the way to the destination. Direct paths have the same MAC address for the destination and next-hop.

https://open80211s.org/trac/wiki/HOWTO
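
The rule quoted above can be checked mechanically. The following is only a sketch under the assumption that the first two columns of the dump are the destination and next-hop MAC addresses; the function name `mpath_kinds` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read `iw dev <if> mpath dump` on stdin and label each
# path as "direct" (destination equals next hop) or "via <next hop>".
# Lines whose first two fields do not look like MAC addresses are skipped.
mpath_kinds() {
    awk '
        $1 ~ /^..:..:..:..:..:..$/ && $2 ~ /^..:..:..:..:..:..$/ {
            print $1, ($1 == $2 ? "direct" : "via " $2)
        }
    '
}

# Usage on a node:
#   iw dev wlan1 mpath dump | mpath_kinds
```

With three nodes that can all hear each other, every path would be expected to come out as direct, which matches the dump posted earlier in the thread.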

With WifiInfoView:

I see two hidden SSIDs, but not three.

I don't know. Is this on a stable release, or is it the current OpenWrt version that is giving you headaches?

It's the latest snapshot for the RT3200.

Actually, now I see all three hidden SSIDs in WifiInfoView. So that's reassuring.

The weird thing is that when a node becomes inaccessible, the other nodes still list it as connected, and even its transferred byte counts keep increasing.

Even now, while writing these messages, my 2nd extension node has become inaccessible. It seems to become inaccessible following a period of inactivity, so perhaps it is a power-saving issue or something like that? Here are five consecutive dumps:

root@OpenWrt:~# iw dev wlan1 station dump |grep -i bytes
        rx bytes:       24222245
        tx bytes:       147301
        rx bytes:       626904121
        tx bytes:       372754833
root@OpenWrt:~# iw dev wlan1 station dump |grep -i bytes
        rx bytes:       24224452
        tx bytes:       147301
        rx bytes:       626906328
        tx bytes:       372754833
root@OpenWrt:~# iw dev wlan1 station dump |grep -i bytes
        rx bytes:       24225989
        tx bytes:       147301
        rx bytes:       626908137
        tx bytes:       372754919
root@OpenWrt:~# iw dev wlan1 station dump |grep -i bytes
        rx bytes:       24372908
        tx bytes:       147479
        rx bytes:       627143495
        tx bytes:       372823476
root@OpenWrt:~# iw dev wlan1 station dump |grep -i bytes
        rx bytes:       24379516
        tx bytes:       147538
        rx bytes:       627150666
        tx bytes:       372823844

So 'tx bytes' of the first (inaccessible) node increases only very slowly. Does that help?
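
A rough sketch for flagging this pattern automatically from two saved dumps. The function name `one_way_stations` is hypothetical, and it needs the full station dump (including the `Station <MAC>` lines that `grep -i bytes` drops). It is only a heuristic: over a short interval a healthy but idle peer can also show an unchanged tx counter.

```shell
#!/bin/sh
# Hypothetical helper: given two saved station dumps (older file first), print
# the stations whose "rx bytes" grew while "tx bytes" did not change at all.
one_way_stations() {
    awk '
        FNR == 1        { pass++ }
        $1 == "Station" { mac = $2 }
        $1 == "rx" && $2 == "bytes:" {
            if (pass == 1) rx1[mac] = $3 + 0; else rx2[mac] = $3 + 0
        }
        $1 == "tx" && $2 == "bytes:" {
            if (pass == 1) tx1[mac] = $3 + 0; else tx2[mac] = $3 + 0
        }
        END {
            for (m in tx2)
                if (m in tx1 && rx2[m] > rx1[m] && tx2[m] == tx1[m])
                    print m
        }
    ' "$1" "$2"
}

# Usage on a node:
#   iw dev wlan1 station dump > /tmp/sta1; sleep 30
#   iw dev wlan1 station dump > /tmp/sta2
#   one_way_stations /tmp/sta1 /tmp/sta2
```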

Gotcha! On all your nodes, try adding the last two lines below, skip_inactivity_poll and disassoc_low_ack, to the mesh interface section in /etc/config/wireless:

config wifi-iface 'wifinet0'
	option device 'radio1'
	option mode 'mesh'
	option mesh_id 'mesh-net'
	option mesh_fwding '1'
	option mesh_rssi_threshold '0'
	option network 'lan'
	option encryption 'sae'
	option macaddr '00:0C:43:26:60:B0'
	option key 'xxxxx.yyyyy'
	option skip_inactivity_poll '1'
	option disassoc_low_ack '0'
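
If you would rather not edit the file by hand, the same two options can be set with uci. This is a sketch only: the helper name `mesh_option_cmds` is made up, and it assumes your mesh section is named `wifinet0` as in the config above (check with `uci show wireless`). It prints the commands so they can be reviewed before anything is changed.

```shell
#!/bin/sh
# Hypothetical helper: print the uci commands that add the two options to a
# wifi-iface section, so they can be reviewed before being run.
# Assumption: the section name passed in matches your own config.
mesh_option_cmds() {
    section="$1"
    echo "uci set wireless.${section}.skip_inactivity_poll='1'"
    echo "uci set wireless.${section}.disassoc_low_ack='0'"
    echo "uci commit wireless"
    echo "wifi reload"
}

# Review:  mesh_option_cmds wifinet0
# Apply:   mesh_option_cmds wifinet0 | sh
```

Repeat on each node, since the suggestion above is to set the options on all of them.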