802.11s: Strange behaviour, esp. packets blocked on 2.4GHz, whereas it works on 5GHz

Hello all,

I have set up a wide-spread testbed for 802.11s in a farm, 50m*30m area, buildings with sturdy walls, some of which are 80cm thick.
The first installation was a mesh added on top of the 2.4GHz WLAN infrastructure, running a mix of WDR4300, CPE-210, DHP-1565 and WR710N on OpenWrt CC 15.05 and DD trunk.
Integrating a Fritzbox 7340 didn't work (its meshing is obviously not 802.11s, at least not plain vanilla), and a Teltonika RUT-950 also failed (its OpenWrt BB 14.07 is too old, and new wpad/authsae builds fail because of libc deps I don't want to mess with on this box).

The mesh was encrypted, channel 11, HT20 (so 802.11g/n), with no tuning whatsoever.
This worked more or less, but had some nasty effects: paths being chosen although better paths via another hop were available, very slow round-trip times, and a generally cloggy network "feeling".

Strange problem 1)
Neighboring stations are in ESTAB state; however, when starting any communication, sometimes a ping a->b does not get through (resp. the ARP request for b from a is received by b). Doing the same from b->a works fine: ARP request sent, ARP reply received, ICMP goes back and forth.
Question: Is some tuning necessary for broadcast flooding? Or more generally, what wireless/IP/ARP tuning would be recommended for stability (and a bit of performance) when meshing with 802.11s?
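
For anyone who wants to check the same thing: peer-link state and the paths actually chosen can be read from iw. This is only a minimal sketch, assuming iw is installed and the mesh interface is called mesh0 (a placeholder); the helper name estab_peers is made up for illustration.

```shell
# Print the MAC address of every peer whose mesh peer link is ESTAB.
# Expects the output of `iw dev <iface> station dump` on stdin.
estab_peers() {
    awk '$1 == "Station" { mac = $2 }
         $1 == "mesh" && $2 == "plink:" && $3 == "ESTAB" { print mac }'
}

# Usage on a node (mesh0 is a placeholder interface name):
#   iw dev mesh0 station dump | estab_peers
#   iw dev mesh0 mpath dump    # next hop and metric per destination
```

Comparing the mpath dump on a and b shows whether both ends agree on the next hop for each other.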

To include a different project (Airmax p-t-mp links overlaying a small city) into this testing, I grabbed some hardware together to build another, distinct test bed.
Now, a number of Ubiquiti Nanostation M5, M2, Loco M5, Loco M2, two WDR4300 and a 1043ND running LEDE 17.01.2 are used.
They all run on ath9k wireless.

Strange behaviour 2):
The 5GHz boxes using channel 48 (M5, Loco M5, WDR4300) only started talking on the encrypted mesh after I set them all to unencrypted and then back to encrypted. After having done this once, they work without problems even after a reboot. Currently I cannot power them off to really cold-start the boxes, though.
The config is the very same as in the beginning, yet now it works, which it did not before.
Question: Has anyone seen something like this?

Strange problem 3):
The WDR4300 carries a "managed network"/WLAN client on its phy0.
With phy1 on channel 48, the mesh works fine, apart from point 2) above.
With channel 11 on phy0, non-encrypted traffic works fine.
However, encrypted traffic (here ARP requests) is sent out, and received and answered on the other side, yet the replies are never received back.

mesh2     Link encap:Ethernet  HWaddr FA:1A:67:CA:82:29
      inet addr:10.0.2.23  Bcast:10.0.2.255  Mask:255.255.255.0
      inet6 addr: fd87:5b6f:a5a1:20::1/60 Scope:Global
      inet6 addr: fe80::f81a:67ff:feca:8229/64 Scope:Link
      UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
      RX packets:0 errors:0 dropped:0 overruns:0 frame:0
      TX packets:13 errors:0 dropped:0 overruns:0 carrier:0
      collisions:0 txqueuelen:1000
      RX bytes:0 (0.0 B)  TX bytes:1444 (1.4 KiB)

Does anyone have an idea on this? Increased debugging level on the authsae daemon didn't give any meaningful output imho.
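
The symptom in the counters above (TX 13, RX 0) can also be checked from a script. A rough sketch, assuming the standard Linux sysfs counters under /sys/class/net; the function name rx_stalled is hypothetical, and tcpdump has to be installed separately.

```shell
# Succeed if an interface has transmitted packets but received none.
# $1: interface name, $2: sysfs base dir (defaults to /sys/class/net).
rx_stalled() {
    stats="${2:-/sys/class/net}/$1/statistics"
    rx=$(cat "$stats/rx_packets") || return 2
    tx=$(cat "$stats/tx_packets") || return 2
    [ "$tx" -gt 0 ] && [ "$rx" -eq 0 ]
}

# Usage:
#   rx_stalled mesh2 && echo "mesh2: TX only, nothing received"
#   tcpdump -n -e -i mesh2 arp    # watch ARP requests/replies on each side
```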

Best regards,
Michael

PS: Setting static ARP entries on both sides solves the problem, so something seems to be wrong with the ARP requests/responses here.
Has anyone experienced, and solved, something like this?
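
For completeness, this is roughly how such a static entry can be set. A sketch only, assuming the full iproute2 ip is available (BusyBox builds may lack "ip neigh replace"); pin_arp and DRY_RUN are names invented for this example.

```shell
# Pin a neighbour's MAC address so that no ARP exchange is needed.
# $1: peer IPv4 address, $2: peer MAC address, $3: local mesh interface.
# With DRY_RUN=1 the command is only printed, not executed.
pin_arp() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo ip neigh replace "$1" lladdr "$2" dev "$3" nud permanent
    else
        ip neigh replace "$1" lladdr "$2" dev "$3" nud permanent
    fi
}

# Usage on the peer node, pointing at the mesh2 address shown above
# (the interface name on the peer is an assumption):
#   pin_arp 10.0.2.23 FA:1A:67:CA:82:29 mesh2
```

This is a workaround for testing, not a fix: it only hides the lost ARP replies.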

Ok, after tracing and lots of trying, the problem was resolved.
It seems that with the ath9k driver beacons are transmitted fine, but management frames were encrypted/decrypted wrongly in the context of authsae. As a result the ARP requests went out, but the incoming ARP replies never made it, because they had a wrong FCS from multiple partial encryption on the sender and/or decryption on the receiver side.
Sometimes, not always, it seemed to work when all wifi interfaces were started up in exactly the same order on the nodes. Apparently the erroneous encryption was then decrypted symmetrically, in the same way.

Ultimately this means that ath9k currently does not work properly together with 802.11s in all cases.

Solution: turn off hardware encryption:

echo "ath9k nohwcrypt=1" > /etc/modules.d/ath9k
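
The option is only read when the module is loaded, so it takes effect after a reboot (or after reloading ath9k). To confirm it was picked up, something like the following should work. It is a sketch that assumes the parameter is exposed read-only under /sys/module, as it is on mainline kernels; hwcrypt_disabled is a made-up helper name.

```shell
# Succeed if ath9k was loaded with hardware crypto disabled.
# $1: parameter file (defaults to the sysfs path of the module parameter).
hwcrypt_disabled() {
    param="${1:-/sys/module/ath9k/parameters/nohwcrypt}"
    [ -r "$param" ] && [ "$(cat "$param")" = "1" ]
}

# Usage after a reboot:
#   hwcrypt_disabled && echo "ath9k is using software crypto"
```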