WDS Configuration on 21.02

Hi all,

I am having trouble configuring a wireless link with WDS between 2 of my access points. Everything worked fine with 21.02_rc3 but has stopped working with 21.02_rc4 and 21.02.0.

My layout is below. AP3 is a wireless access point connected via a cable to my main router. AP3 is set up as a plain access point (no DHCP, no firewall, etc.). AP4 used to connect to AP3 with a WDS connection so I could extend my network. After upgrading to 21.02.0 it seems like AP4 can connect to AP3, but no traffic flows in either direction between AP3 and AP4.

I have tried different settings for DHCP forwarding, with rebind protection on and off, but haven't been able to get it to work. I don't think I am the only one having this issue.

Any help would be greatly appreciated. Thanks

I think it's busted.
https://bugs.openwrt.org/index.php?do=details&task_id=3961&order=dateopened&sort=desc

@chadneufeld You don't mention your hardware which I reckon is essential. E.g. for me, on an mt76 based setup (MT7613 and MT7615 radios) WDS is working fine.

I'm not following the release schedule, but I'm running on post RC4 code.

Hi there, I have a problem with my WDS connection as well since upgrading my two tl-1043nd_v2 to OpenWrt 21.02.0 stable. The connection between the two routers gets established, but I am not able to transfer data through the link. I found a workaround that works for me until the bug gets fixed; maybe it works for you as well:

When the routers are booted and the WDS connection seems to be established but no data gets through, what works on my setup is to go to the wireless tab in LuCI on both the WDS AP and the WDS client, disable the wireless connection, and immediately enable it again. This has to be done on both routers. After that, the WDS bridge works fine, at least on my setup. Every change in the settings and every reboot makes this procedure necessary again.

No beautiful or stable fix, but worked for me through the day. Maybe it works on your setup too.
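
For anyone who prefers SSH over LuCI, a rough command-line version of the same toggle might look like the sketch below. It assumes that OpenWrt's `wifi down` / `wifi up` have the same effect as disabling and re-enabling the interface in LuCI, which has not been verified for this bug. `restart_wds_radio`, `WIFI_CMD` and `SETTLE_SECS` are made-up names for illustration.

```shell
#!/bin/sh
# Untested sketch of the disable/enable workaround from the command line.
# Assumption: "wifi down" followed by "wifi up" is equivalent to the LuCI toggle.
# WIFI_CMD is a hypothetical override so the sequence can be dry-run off-device.
WIFI_CMD="${WIFI_CMD:-wifi}"
# Hypothetical pause between down and up; the value is a guess.
SETTLE_SECS="${SETTLE_SECS:-5}"

restart_wds_radio() {
    # Take all wireless interfaces down, wait briefly, bring them back up.
    $WIFI_CMD down || return 1
    sleep "$SETTLE_SECS"
    $WIFI_CMD up
}
```

As with the LuCI procedure, `restart_wds_radio` would have to be run on both the WDS AP and the WDS client after every reboot.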

Have a good night

@Borromini - I have a Unifi AC LR AP as the WDS Access Point and an Archer C7 V2 as the WDS Client. Both are Atheros. The WDS link works fine in RC3, but not in RC4 or 21.02.0. I'm wondering if there is a setting or something that may have changed between releases. Would you be willing to share your wifi and dhcp settings?

@c-streif - I just had a quick look and it looks like your TL-1043nd has the same or very similar hardware to my access points. I will give your workaround a try.

Thanks

Sure will do so when I get home. I had a mixed mt76/ath10K WDS working as well at some point but that was with a sub-par ath10k device (Linksys EA6350 v3) so I can't check if that still works.

@chadneufeld - Give it a try, it's not beautiful, but it worked for me as a quick fix. Today, I "downgraded" via luci-sysupgrade to 21.02rc3. My wds-connection works now without any problems and I don't need to enable and disable the wireless connection after every reboot. I kept my settings at the downgrade as well, so it was an even better fix for me. Nevertheless, I hope the bug gets fixed. Best regards

@chadneufeld I'm not sure why you think the DHCP settings are relevant, since the WDS operates on another layer? It's supposed to be a transparent bridge, from what I know. I have both the WDS master and client set to a fixed IP and they have their gateways and route hardcoded.

Wireless settings WDS master:

config wifi-iface 'wds_radio0'
	option device 'radio0'
	option network 'lan'
	option mode 'ap'
	option ssid 'xxx'
	option encryption 'sae'
	option isolate '1'
	option wpa_disable_eapol_key_retries '1'
	option wds '1'
	option hidden '1'
	option key 'xxxx'
	option ieee80211w '2'

Wireless settings WDS client:

config wifi-iface 'wds_radio0'
	option device 'radio0'
	option network 'lan'
	option ssid 'xxx'
	option mode 'sta'
	option wds '1'
	option encryption 'sae'
	option key 'xxx'
	option ieee80211w '2'

Network setup looks like this (this is MT7621, which is on DSA, and ath79 isn't yet):

config interface 'lan'
        option type 'bridge'
        option ifname 'lan1 lan2 lan3 lan4'
        option proto 'static'
        option netmask '255.255.255.0'
        option ip6assign '60'
        option ipaddr '10.0.0.21/24'
        option dns '10.0.0.1'
        option gateway '10.0.0.1'

Any questions let me know.

Thanks @Borromini

I misspoke and meant DNS settings, not DHCP. I read on the forum about disabling rebind protection and setting up DNS forwards.

I’ll give your settings a try.

Hi @Borromini. I couldn't get WDS working with 21.02.0 and the network/wireless setting you posted.

Everything works fine with 21.02_rc3. I'll probably stick with rc3 or try a powerline adapter and update to 21.02.0.

I've pulled the changelog between RC3 and stable, because I myself haven't seen any commit come along that might cause this. But I might be wrong of course. And I have no ath10k setups here myself.

$ git log 2bc192c3f4..b2ae423314 --oneline
b2ae423314 (tag: v21.02.0) OpenWrt v21.02.0: adjust config defaults
5cc0535800 ath79: add support for onion omega
085c67762d kernel: bump 5.4 to 5.4.143
ff31cfb856 openssl: bump to 1.1.1l
5bfb9c30a1 prereq-build: require python3-distutils
f78017006b uboot-layerscape: fix dtc compilation on host gcc 10
8f039acee4 uboot-at91: fix dtc compilation on host gcc 10
378769b555 kernel: bump 5.4 to 5.4.142
662401d903 ipq40xx: fix Edgecore ECW5211 boot
61c65acbda ath79: kernel: Add missing quote to drivers/mfd/Kconfig
25d9fe8468 bcm27xx-userland: update to latest version
35eb06066e bcm27xx-userland: factor out a -dev package
750b966866 x86: kernel: set NR_CPUS to 512
94efa1c612 fritz-tools: fix returning wrong values due to strncmp usage
d9be07169e mbedtls: update to 2.16.11
f407b2f43c mvebu: armada-37xx: add patch to forbid cpufreq for 1.2 GHz
b254bd697d Revert "mvebu: 5.4 fix DVFS caused random boot crashes"
4003eeab35 dnsmasq: reset EXTRA_MOUNT in the right place
6ca34c5c0c dnsmasq: fix more dnsmasq jail issues
b88ab44036 dnsmasq: rework jail mounts
8ef5894197 dnsmasq: use local option for local domain parameter
da5fd91073 dnsmasq: add ignore hosts dir to dnsmasq init script
9531e70708 OpenWrt v21.02.0-rc4: revert to branch defaults
134ac824c5 (tag: v21.02.0-rc4) OpenWrt v21.02.0-rc4: adjust config defaults
2d5ee43dc6 kernel: bump 5.4 to 5.4.137
a205de5594 ramips: mt76x8: add missing config symbol
8abe67d6d2 x86: move Kconfig symbol to common config
2e1a5a4353 generic: add missing Kconfig symbol
941ba3ffc4 ath79: fix JT-OR750i switch LED assignment
17cb9a9a9e ath79: enable missing pinmux for JT-OR750i
a5850c049e ath79: add support for Joy-IT JT-OR750i
55d9c020a1 netifd: update to the latest version
089efd61e9 netifd: update to the latest version
60fad8f82b glibc: update to latest 2.33 HEAD (bug 28011)
c58afca1aa glibc: update to latest 2.33 HEAD (BZ #27646, bug 27896, BZ #15271)
249aeaa9d8 dnsmasq: distinct Ubus names for multiple instances
a1d50e7b45 kernel: bump 5.4 to 5.4.132
88c8d0a219 dnsmasq: add /tmp/hosts/ to jail_mount
4633471d74 odhcpd: fix invalid DHCPv6 ADVERTSIE with small configured leasetime (FS#3935)
df4feb1655 ipq40xx: fix FRITZRepeater 1200 RGMII delay
f3f70fb956 netifd: update to the latest version
23cde9d12a mpc85xx: add missing Kconfig symbols
fe498dd3f1 netifd: update to the latest version
38cdc57be6 mediatek: add missing config symbols
6073d2c02a generic: add missing config symbols
8921e36ed8 iwinfo: move device info into -data package
d3278c4343 build: ensure that dash isn't prepended twice to abi version suffix
47f617ef8d build: prepend ABI suffixes with a dash if package name ends with digit
febf6db0d0 ath79: add missing MTD_NAND_RB91X symbol
983fcc42a4 ath79: add missing GPIO_LATCH symbol
0ad49d368b ath79: mikrotik: fix beeper phantom noise on RB912
ffa943f0b9 ath79: ar934x: fix mounting issues if subpage is not supported
88e1c9b0b5 ath79: add support for MikroTik RouterBOARD 912UAG-2HPnD
bd2e070557 ath79: add NAND driver for MikroTik RB91xG series
43723e6db9 ath79: add gpio-latch driver for MikroTik RouterBOARDs
3eb34bc251 hostapd: make wnm_sleep_mode_no_keys configurable
89d21b7f62 hostapd: make country3 option configurable
72f0733123 ltq-deu: Mark lantiq DEU broken
b0424190ef iwinfo: build with nl80211 backend only and make shared
d723002d84 treewide: unmark selected packages nonshared
86f6171788 ath10k-ct: fix typo in Makefile
24cfa5005e ath10k-ct: update to latest version
69c10497c7 kernel/modules: move act_gact into kmod-sched-core
fc4b5411b3 package/comgt: Handle bind/unbind events
d666ebcaa3 ubus: update to the latest version
a9100f2196 base-files: wifi: tidy up the reconf code
b27b63b082 base-files: wifi:  swap the order of some ubus calls
6f13a39035 mac80211: print an error if wifi teardown fails
9302e63d1a mac80211: always call wireless_set_data  (FS#3784)
bea9380149 mac80211: fix no_reload logic (FS#3902)
ccbe535604 mac80211: backport fix for nl80211 control port tx (fixes FS#3857)
4c29ff7cb8 mac80211: add support for 802.3 encap offload with software rate control
a078037ace mac80211: improve rate control performance
9fa925362f busybox: sysntpd: add trigger to reload server
a75928d125 busybox: sysntpd: option to bind server to iface
e16a45f258 iwinfo: update to latest Git HEAD
0c51b265bf iwinfo: update to latest Git HEAD
85cef1cf22 kernel: bump 5.4 to 5.4.128
e171d11f55 libusb: Fix parsing of descriptors for multi-configuration devices
3d62b5d5c6 base-files: fix /tmp/TZ when zoneinfo not installed
3047df2317 base-files: fix zoneinfo support
ab5010d170 exfat: update to 5.12.3
72d93c1ba4 realtek: Fix failsafe mode
7a5a247c1f base-files: failsafe: Remove the VLAN modifier from interface name
c0fdfd15fc base-files: failsafe: Fix IP configuration
98b1a6435f kernel: Backport patch to automatically bring up DSA master when opening user port
ec780bdb92 kernel-5.4: backport latest patches for wireguard
82c700de67 hostapd: fix handling of the channel utilization options
1247a6bb35 bcm4908: fix Ethernet broken state after interface restart
25daa921da bcm4908: add kmod-gpio-button-hotplug
74dbf3412b base-files: fix typo in config_generate MAC check
125deb4d78 base-files: set MAC for bridge ports (devices) instead of bridge itself
e410ef8389 hostapd: wolfssl: add RNG to EC key
f6d8c0cf2b wolfssl: always export wc_ecc_set_rng
56228e9393 ath79: don't autodetect AR8033 PHY capabilities
2e157714a8 build,json: fix generation with empty profiles
8add3e139c build: preserve profiles.json between builds
b2a3df91fa qos-scripts: add ifbN device before setting the link up
3d0ed7d763 mac80211: fix an issue with wds links on 802.11ax devices
7a4bd9cc51 ath79: use dynamic partitioning for TP-Link CPE series
3839a4c7e9 mac80211: fix minstrel sample time check
3921f213e5 iw: update to 8fab0c9e
20f66649dd mt76: update to the latest version
05a8bf04ec mac80211: sync nl80211.h with upstream and backport a WPA3 related commit
072d0afb8f ugps: start also in case device is absent
25c75424e7 ugps: update to git HEAD
aeb7b57798 OpenWrt v21.02.0-rc3: revert to branch defaults

Edit: there has been an ath10k-ct bump though, I'd try rolling those back to see if that is interfering. Barring that, might be one of the netifd bumps maybe.

What about this one, Borromini? If you are able to roll back a particular change, isn't it worth trying as well?

That's for AX devices, and ath10k is AC.

Oops, sorry, I didn't know ax stands for Wi-Fi 6...

No worries. We all try to help :slightly_smiling_face:

The problem is on ath9k as well, not just 10k.
Even though the tplink devices I've seen the problem on are still using swconfig, I strongly suspect it's something to do with the DSA changes as they are the most obvious areas that impact on bridge/vlan/packet movement. ymmv

Thanks everyone. I've decided to try compiling my own firmware to see if I can figure out which commit breaks WDS.

Unfortunately, I'm completely new to using git and compiling openwrt. Here is what I have done so far:

  • create a local git branch based off the RC3 tag
    git checkout tags/v21.02.0-rc3 -b wds_test

  • Pulled changes from the remote repository
    git fetch
    git merge
    ./scripts/feeds update -a
    ./scripts/feeds install -a

  • Setup the configuration file
    make menuconfig

  • compile firmware

What I was hoping you could help me out with is how I can easily step between different commits between RC3 and RC4. I want to start in the middle between RC3 and RC4 and see if that works, then keep stepping halfway between the points where I know WDS works and where it doesn't. I found this but am not sure if it's the best approach:

  • git reset --hard <commit_id>

I don't know if this will reference the 21.02 branch or back to the master branch.

Thanks

There's no need to reset to a specific commit. You can use:

$ git checkout $commit_id

Which will return:

$ git checkout 134ac824c5
Note: switching to '134ac824c5'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 134ac824c5 OpenWrt v21.02.0-rc4: adjust config defaults

As you can see, git itself already tells you how to revert back to the default state (HEAD) of the branch. It won't be switching branches without you telling it so explicitly.

Edit: @chadneufeld What you could also try (and which would be more fine-grained) is reverting specific commits, build an image, see if that helps. As easy as:

$ git revert 134ac824c5

If it turns out that doesn't do anything, you can do a:

$ git reset --hard HEAD~1

That might return you to 21.02 HEAD instead of the 21.02.0 tag (not sure), but at this point changes beyond 21.02.0 are minimal (one mvebu commit which doesn't concern your devices).

After some trial and error I think I have found the offending commit. There are 4 netifd commits that happened between RC3 and RC4. The earliest one, fe498dd3f1, causes issues with the stability of the WDS link on ath9k for me. I wasn't able to roll back just that commit; I had to roll back all 4. Git threw an error if I tried to revert fe498dd3f1 without reverting the other 3 commits.

55d9c020a1 netifd: update to the latest version
089efd61e9 netifd: update to the latest version
f3f70fb956 netifd: update to the latest version
fe498dd3f1 netifd: update to the latest version

I think these are the individual fixes within the larger fe498dd3f1 commit. @Borromini, is there a way that I could work with these smaller commits inside of the main commit?

61a71e5e49c3 bridge: dynamically create vlans for hotplug members
cb6ee9608e10 bridge: fix dynamic delete of hotplug vlans
7f199050f395 wireless: pass the real network ifname to the setup script
50381d0a2998 bridge: allow adding/removing VLANs to configured member ports via hotplug
f12b073c0cc3 wireless: add some comments to functions
b0d090688302 bridge: fix setting pvid for updated vlans
ff3764ce28e0 device: move hotplug handling logic from system-linux.c to device.c
16bff892f415 ubus: add a dummy mode ubus call to simulate hotplug events
7f30b02013f2 examples: make dummy wireless vif names shorter
013a1171e9b0 device: do not treat devices with non-digit characters after . as vlan devices
f037b082923a wireless: handle WDS per-sta devices
db0fa24e1c17 bridge: fix enabling hotplug-added VLANs on the bridge port
4e92ea74273f bridge: bring up pre-existing vlans on hotplug as well
1f283c654aeb bridge: fix hotplug vlan overwrite on big-endian systems

Glad you narrowed it down! It's pretty normal you can't roll back just the one commit, since those build on the previous ones. When you check the commit log, you'll see the commits in the external git tree that are part of the bump:

$ git show fe498dd3f1
commit fe498dd3f108de494594ae8e0eba207fdbf14594
Author: Felix Fietkau <nbd@nbd.name>
Date:   Fri Jun 4 09:11:37 2021 +0200

    netifd: update to the latest version
    
    61a71e5e49c3 bridge: dynamically create vlans for hotplug members
    cb6ee9608e10 bridge: fix dynamic delete of hotplug vlans
    7f199050f395 wireless: pass the real network ifname to the setup script
    50381d0a2998 bridge: allow adding/removing VLANs to configured member ports via hotplug
    f12b073c0cc3 wireless: add some comments to functions
    b0d090688302 bridge: fix setting pvid for updated vlans
    ff3764ce28e0 device: move hotplug handling logic from system-linux.c to device.c
    16bff892f415 ubus: add a dummy mode ubus call to simulate hotplug events
    7f30b02013f2 examples: make dummy wireless vif names shorter
    013a1171e9b0 device: do not treat devices with non-digit characters after . as vlan devices
    f037b082923a wireless: handle WDS per-sta devices
    db0fa24e1c17 bridge: fix enabling hotplug-added VLANs on the bridge port
    4e92ea74273f bridge: bring up pre-existing vlans on hotplug as well
    1f283c654aeb bridge: fix hotplug vlan overwrite on big-endian systems

I'm betting your issue is caused by one of the wireless changes, the repo is here. A final test would be to build directly from the git repo instead of using the git tarball the build environment pulls. At this point, you can just recompile the single netifd package and upgrade/downgrade it with opkg to your liking. No need to keep reflashing firmwares.
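
To narrow the bump down to just the wireless-related upstream commits, a small filter over the commit body can help. This is only a sketch: `filter_bump` is a made-up helper, and it assumes every line of the bump message has the form `<hash> <subsystem>: <subject>`, as in the output above.

```shell
#!/bin/sh
# filter_bump is a hypothetical helper, not part of git or OpenWrt.
# It reads a bump commit body on stdin ("<hash> <subsystem>: <subject>" per
# line) and prints the hashes whose subsystem matches the first argument.
filter_bump() {
    awk -v want="$1:" '$2 == want { print $1 }'
}

# On a real tree the input could come from something like:
#   git show -s --format=%b fe498dd3f1 | filter_bump wireless
```

For the fe498dd3f1 message shown above, this should leave three candidates: 7f199050f395, f12b073c0cc3 and f037b082923a.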

My money would be on the f037b082923a netifd commit. I've tried myself to revert just that in the netifd tree but it won't let me, so it's time to call in the cavalry. I'd recommend you try IRC and ping @nbd there to see if he can assist; he committed fe498dd3f1 and did a lot of work on refactoring the networking code. Not sure if he replies on/reads the forum, and IRC makes for an easier back and forth (more realtime than a forum).

I'll include a quick write-up for completeness' sake on how to build straight from the git tree. It might expedite things if you already test a few commits.

  1. Enable CONFIG_SRC_TREE_OVERRIDE in the buildroot settings.
  2. Grab the netifd git tree and link it into your OpenWrt buildroot as follows:
    $ git clone git://git.openwrt.org/project/netifd.git
    $ cd path/to/21.02_tree/
    $ ln -sv /path/to/netifd_git_tree package/network/config/netifd/git-src/
  3. Compile netifd:
    $ make package/netifd/{clean,compile} V=s
  4. Install the new netifd package on your router and test.

If you need to test a specific earlier commit, extend 2. with the following commands:

$ cd /path/to/netifd_git_tree
$ git checkout $git_hash

To revert a specific commit:

$ git revert f037b082923a

After that you can run 3. again. What I'd recommend is you build from netifd source before prodding nbd (he'll probably ask you to do so anyway so you can test patches at some point). Commit 013a117 is the commit prior to f037b08 (the WDS change), so if you check out that first commit, build from there, and it works, then build an image from commit db0fa24 (the commit after), you can be pretty sure the WDS one is the one causing trouble.
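
The checkout-and-rebuild loop above could be wrapped in a small helper, sketched below. `build_netifd_at`, `run`, `NETIFD_TREE`, `OPENWRT_TREE` and `DRY_RUN` are made-up names, the paths are the same placeholders as in the steps, and it assumes `make -C` behaves the same as running `make` from inside the buildroot. With `DRY_RUN=1` it only prints the commands, which is a cheap way to check the paths first.

```shell
#!/bin/sh
# build_netifd_at is a hypothetical helper; NETIFD_TREE and OPENWRT_TREE are
# placeholder paths, and DRY_RUN=1 only prints the commands instead of running.
NETIFD_TREE="${NETIFD_TREE:-/path/to/netifd_git_tree}"
OPENWRT_TREE="${OPENWRT_TREE:-/path/to/21.02_tree}"

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

build_netifd_at() {
    # Check out the requested netifd commit in the linked source tree...
    run git -C "$NETIFD_TREE" checkout "$1" || return 1
    # ...then rebuild only the netifd package (assumes make -C works here).
    run make -C "$OPENWRT_TREE" package/netifd/clean package/netifd/compile V=s
}
```

For the test described above, that would be `build_netifd_at 013a1171e9b0` first, then `build_netifd_at db0fa24e1c17`, installing and testing the resulting package after each build.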
