BTW - I tried the 'hairpin' fix but it had no effect.
In a nutshell,
If [C] pings [A]
[C] issues an ARP broadcast :"Who has IP_A"
The ARP arrives on wlan0 of [B].
[B] does not reply (as it is not [A])
BUT !!!! [B] does NOT forward the ARP back out of wlan0, so the ARP request never reaches [A].
[A] never replies, so the ping fails.
This is a pretty fundamental issue.
My assumption is that [B] should consult its mesh topology, and if it determines that there are MeshPoints that did not get the ARP broadcast, but to which [B] does have a direct path, then [B] should resend the ARP request out of the same wireless interface it arrived on.
This is a show stopper for me .. and means 802.11s on OpenWrt is quite broken.
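A quick way to see what [B] is doing with relayed frames is to print its forwarding flag, its peers and its mesh path table. This is only a sketch: the function name `mesh_diag` is made up for illustration, and `wlan0` is assumed to be the mesh interface as in the description above.

```shell
#!/bin/sh
# Sketch: print the mesh state on a node that should be relaying frames.
# mesh_diag is a hypothetical helper name; wlan0 is an assumed default.
mesh_diag() {
    local dev="${1:-wlan0}"
    # 1 means the node forwards frames for other mesh points, 0 means it does not
    echo "mesh_fwding: $(iw dev "$dev" get mesh_param mesh_fwding)"
    echo "--- peers ---"
    iw dev "$dev" station dump
    echo "--- paths ---"
    iw dev "$dev" mpath dump
}
```

Running `mesh_diag wlan0` on [B] while [C] pings [A], together with `tcpdump -i wlan0 -n arp` on [A], should show whether the ARP request is relayed at all.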
Hello! I was planning on setting up 802.11s Mesh at home using OpenWrt in the near future. This report gives me pause. My current mesh setup has wired LAN to WAN connectivity on all mesh nodes. I'm hoping this bug wouldn't impact an identical setup with OpenWrt. I suspect each node would be a portal node in this case, so the wired connection on each node would be the default way of accessing the LAN. Let me know if you have any insight.
As far as I can see this is not a bug as such - more a config issue.
It all works on first power-up, but as in the OP's example, this can happen if one link goes out of range. Rebooting all the mesh nodes fixes it.
The issue is that the uci config for the following is not applied, yet these are essential for autonomous layer 2 "routing":
These have to be set using the iw utility. My workaround is a background process that checks and sets these three parameters at a fixed interval, as they are reset to their defaults by a network restart or by, e.g., running the wifi command.
Setting the RSSI threshold is important. Set it to a value that still gives acceptable performance: -80 is usually good, -90 is too weak, and so on.
mesh_gate_announcements should be set on at least one node. It does generate some traffic, which would add up with hundreds of nodes, but with just tens of nodes I enable it on all of them as it helps to get rapid convergence.
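As a rough sketch of such a workaround: the function below simply sets the parameters with iw without checking them first. It assumes the three parameters are mesh_fwding, mesh_gate_announcements and mesh_rssi_threshold (the ones named elsewhere in this thread), that the mesh interface is `wlan0`, and the name `apply_mesh_params` is only illustrative.

```shell
#!/bin/sh
# Sketch: (re)apply the mesh parameters that get reset by a network restart.
# apply_mesh_params is a hypothetical name; wlan0 and -80 are assumed defaults.
apply_mesh_params() {
    local dev="${1:-wlan0}"
    local rssi="${2:--80}"
    iw dev "$dev" set mesh_param mesh_fwding=1
    iw dev "$dev" set mesh_param mesh_gate_announcements=1
    iw dev "$dev" set mesh_param mesh_rssi_threshold="$rssi"
}
```

A background loop such as `while true; do apply_mesh_params wlan0; sleep 60; done` would then reapply the values once a minute; the interval is a guess and can be tuned.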
I know.
I tried connecting 2 APs to the same switch and giving both the same SSID and password, but some devices, like iOS devices or laptops, can't automatically switch to the nearest AP. That's why I'm trying to use mesh.
thanks. and why do you even check and not just set them?
still, the script would be of interest
I wrote now this:
#!/bin/sh
# Read the current values (the interface name wlan0 is hard-coded).
mesh_fwding=$(iw dev wlan0 get mesh_param mesh_fwding)
mesh_gate_announcements=$(iw dev wlan0 get mesh_param mesh_gate_announcements)
mesh_rssi_threshold=$(iw dev wlan0 get mesh_param mesh_rssi_threshold)
echo "the current forwarding is: $mesh_fwding"
echo "the current gate announcements is: $mesh_gate_announcements"
echo "the current rssi threshold is: $mesh_rssi_threshold"
# Variables are quoted and compared with POSIX "=" so an empty value cannot break the test.
if [ "$mesh_fwding" = "0" ]; then
    iw dev wlan0 set mesh_param mesh_fwding '1'
    echo "forwarding set"
else
    echo "forwarding already set"
fi
if [ "$mesh_gate_announcements" = "0" ]; then
    iw dev wlan0 set mesh_param mesh_gate_announcements '1'
    echo "announcements set"
else
    echo "announcements already set"
fi
# iw reports the threshold with its unit, e.g. "-80 dBm".
if [ "$mesh_rssi_threshold" != "-80 dBm" ]; then
    iw dev wlan0 set mesh_param mesh_rssi_threshold '-80'
    echo "threshold set"
else
    echo "threshold already set"
fi
I guess yours is similar. I then put this on the scheduled tasks to run it every 5min:
*/5 * * * * /root/mesh_parameter.sh
so far I only have this set on the mesh nodes, not the master. But I guess it would be a good idea there as well?
#!/bin/sh
. /lib/functions.sh
# Apply one entry of the mesh_param list to the given device.
# $value is left unquoted on purpose so a "param value" entry still splits into two words.
parse_list() {
    local value="$1"
    local _device="$2"
    iw dev "$_device" set mesh_param $value
}
# Called once per wifi-iface section in /etc/config/wireless.
parse_interface() {
    local section="$1"
    local _mode
    local _mesh_param
    local _mesh_id
    local _device
    config_get _mode "$section" mode
    # Not mesh then exit
    [ "$_mode" != "mesh" ] && return 0
    config_get _mesh_param "$section" mesh_param
    # No mesh_param list then exit
    [ -z "$_mesh_param" ] && return 0
    config_get _mesh_id "$section" mesh_id
    # Poll until the mesh interface shows up in iwinfo, then apply the list once.
    while true; do
        sleep 10
        _device=$(iwinfo | grep "$_mesh_id" | awk '{print $1}')
        [ -z "$_device" ] && continue
        config_list_foreach "$section" mesh_param parse_list "$_device"
        break
    done
}
config_load wireless
config_foreach parse_interface wifi-iface
I'd be interested if someone has a better solution. The hotplug event runs BEFORE the mesh is up so you can't run the iw commands right away. They have to run later. Also my script will run for each mesh the system has and process ALL meshes each time.
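One possible refinement, offered only as a sketch, is to bound the wait so the script gives up if the mesh never comes up instead of looping forever. The helper name `wait_for_mesh_dev` and the retry counts are assumptions; the iwinfo lookup is the same one used in the script above.

```shell
#!/bin/sh
# Sketch: wait for the mesh interface to appear, but give up after a limit.
# wait_for_mesh_dev is a hypothetical helper; 30 tries and 2 s are assumed defaults.
wait_for_mesh_dev() {
    local mesh_id="$1"
    local tries="${2:-30}"
    local delay="${3:-2}"
    local dev
    while [ "$tries" -gt 0 ]; do
        # First column of the iwinfo line that mentions the mesh ID
        dev=$(iwinfo | grep -F "$mesh_id" | awk '{print $1}' | head -n 1)
        if [ -n "$dev" ]; then
            echo "$dev"
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}
```

It could be used as `dev=$(wait_for_mesh_dev "$_mesh_id") || return 0`, after which the iw commands are run against `$dev`.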
My /etc/config/wireless contains:
option mesh_fwding '1'
option mesh_rssi_threshold '-80'
list mesh_param 'mesh_gate_announcements=1'
Any mesh parameters that aren't recognized properly can be added to the mesh_param list.
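For anyone who prefers the command line to editing the file, entries can be appended to that list with uci. This is a sketch only: the helper name `add_mesh_param` and the section name used in the example are hypothetical, so substitute the name of your own mesh wifi-iface section.

```shell
#!/bin/sh
# Sketch: append one "param=value" entry to the mesh_param list of a wifi-iface section.
# add_mesh_param is a hypothetical helper; the section name is supplied by the caller.
add_mesh_param() {
    local section="$1"
    local entry="$2"
    uci add_list "wireless.$section.mesh_param=$entry"
    uci commit wireless
}
```

For example, `add_mesh_param mesh0 'mesh_gate_announcements=1'` assumes the section is named `mesh0`; the change takes effect the next time the script that reads the list runs.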
I ran into this once but did something else instead of analyzing the issue. I think I can summarize:
The problem exists when one of the APs is not in wireless range of another AP that is wired, only of one that is itself wireless-only. For example:
        AP 1 ----------- AP 2 ------------ AP 3
wired  /      802.11s          802.11s
      /
Internet Router
where AP 3 will have problems with DHCP for itself and downstream devices, but no problem communicating upstream IF the address is set manually.
And for anyone with total control of their network looking for a simpler workaround, I would suggest trying to make all points totally static.
That includes, for each AP, setting the address and gateway statically, and then also having each AP run a DHCP server that serves a different range within your available address pool.
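A sketch of what that could look like with uci, assuming the default `lan` sections in the network and dhcp configs and a /24 network. The helper name `configure_static_ap` and all addresses in the example are made up for illustration.

```shell
#!/bin/sh
# Sketch: give one AP a static address and its own slice of the DHCP pool.
# configure_static_ap is a hypothetical helper; the "lan" section names are assumed.
configure_static_ap() {
    local ip="$1"      # static address of this AP
    local gw="$2"      # address of the internet router
    local start="$3"   # first host offset this AP hands out
    local limit="$4"   # number of addresses this AP hands out
    uci set network.lan.proto='static'
    uci set network.lan.ipaddr="$ip"
    uci set network.lan.netmask='255.255.255.0'
    uci set network.lan.gateway="$gw"
    uci set dhcp.lan.start="$start"
    uci set dhcp.lan.limit="$limit"
    uci commit network
    uci commit dhcp
}
```

For instance, AP 2 might run `configure_static_ap 192.168.1.2 192.168.1.1 100 50` and AP 3 `configure_static_ap 192.168.1.3 192.168.1.1 150 50`, so the two pools do not overlap. Depending on the setup, the `force` option of the dhcp section may also be needed so that each server keeps answering when it detects another DHCP server on the segment.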
I have no problem communicating in each direction. I have never had this issue, neither on OpenWrt 19, nor on 21, nor on the current master.
What kind of configurations are you guys using?
Are you sure your bridge is configured correctly? All mesh and wlan interfaces must be bridged together for this to work.
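One way to check this is to list the ports the kernel has attached to the bridge, which appear under `/sys/class/net/<bridge>/brif`. The sketch below assumes the bridge is called `br-lan`; the helper names and the optional sysfs root argument are only there for illustration.

```shell
#!/bin/sh
# Sketch: list the ports of a bridge and test whether an interface is one of them.
# bridge_members / is_bridged are hypothetical helpers; br-lan is an assumed default.
bridge_members() {
    local br="${1:-br-lan}"
    local root="${2:-/sys/class/net}"
    ls "$root/$br/brif" 2>/dev/null
}

is_bridged() {
    local ifname="$1"
    bridge_members "$2" "$3" | grep -qxF "$ifname"
}
```

For example, `is_bridged wlan0 br-lan || echo "wlan0 is not in br-lan"` flags a mesh or AP interface that was left out of the bridge.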
All wifi interfaces and mesh interfaces are bridged with the LAN ports.
Does that mean you're running all nodes on the same interconnected and fully bridged (layer 2) LAN / Ethernet? So your ARPs and DHCP travels over your LAN / Ethernet between all devices?
If that's the case, then that's not what the people here are talking about. They use the 802.11s mesh as the primary layer 2 interconnect, and the report is about ARP and DHCP travelling only one hop through the mesh, not multiple hops (which it should if the mesh is to be a full layer 2 network).