No wifi roaming

I have multiple access points, all running the same ssids. I’m not sure why my phone isn’t roaming across access points. I also have 802.11r enabled.

I get ~200 Mbps download/transfer speeds on, say, access point 1. I’m testing this using iperf. When I move away from access point 1, the speed drops to 0 even though I’m standing right next to access point 2.

Here I’m moving from AP 1 to AP 2 while the phone remains connected with AP 1. It’ll remain at 0.00 bits/sec. It’s not going to change its AP.

[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  21.0 MBytes   176 Mbits/sec                  
[  5]   1.00-2.00   sec  14.4 MBytes   121 Mbits/sec                  
[  5]   2.00-3.00   sec  13.5 MBytes   113 Mbits/sec                  
[  5]   3.00-4.00   sec  20.1 MBytes   169 Mbits/sec                  
[  5]   4.00-5.00   sec  17.5 MBytes   147 Mbits/sec                  
[  5]   5.00-6.00   sec  10.1 MBytes  84.9 Mbits/sec                  
[  5]   6.00-7.00   sec  11.9 MBytes  99.6 Mbits/sec                  
[  5]   7.00-8.00   sec  6.62 MBytes  55.6 Mbits/sec                  
[  5]   8.00-9.00   sec  8.12 MBytes  68.2 Mbits/sec                  
[  5]   9.00-10.00  sec  8.75 MBytes  73.4 Mbits/sec                  
[  5]  10.00-11.00  sec  18.0 MBytes   151 Mbits/sec                  
[  5]  11.00-12.00  sec  20.0 MBytes   168 Mbits/sec                  
[  5]  12.00-13.00  sec  19.8 MBytes   166 Mbits/sec                  
[  5]  13.00-14.00  sec  13.2 MBytes   111 Mbits/sec                  
[  5]  14.00-15.00  sec  8.62 MBytes  72.4 Mbits/sec                  
[  5]  15.00-16.00  sec  5.50 MBytes  46.1 Mbits/sec                  
[  5]  16.00-17.00  sec  6.12 MBytes  51.4 Mbits/sec                  
[  5]  17.00-18.00  sec  5.38 MBytes  45.1 Mbits/sec                  
[  5]  18.00-19.00  sec  4.75 MBytes  39.9 Mbits/sec                  
[  5]  19.00-20.00  sec  2.00 MBytes  16.8 Mbits/sec                  
...             
[  5]  36.00-37.00  sec   384 KBytes  3.15 Mbits/sec                  
[  5]  37.00-38.00  sec  0.00 Bytes  0.00 bits/sec                  
[  5]  38.00-39.00  sec  0.00 Bytes  0.00 bits/sec                  
[  5]  39.00-40.00  sec   768 KBytes  6.29 Mbits/sec                  
[  5]  40.00-41.00  sec   256 KBytes  2.10 Mbits/sec                  
[  5]  41.00-42.00  sec  0.00 Bytes  0.00 bits/sec                  
[  5]  42.00-43.00  sec  0.00 Bytes  0.00 bits/sec                  
[  5]  43.00-44.00  sec  0.00 Bytes  0.00 bits/sec                  
[  5]  44.00-45.00  sec  0.00 Bytes  0.00 bits/sec                  
...

The only way to get the phone to ‘roam’ is to disable and enable wifi at which point it connects to (the closer) AP 2. That’ll give the same ~200Mbps speeds.
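To put numbers on the stall rather than eyeballing the output, the per-second lines can be parsed and the dead intervals listed. This is only a sketch: it assumes the iperf3 interval format shown above, and `parse_intervals`, `stalled_intervals` and the 1 Mbit/s floor are hypothetical names and values chosen for illustration.

```python
import re

# Matches iperf3 interval lines such as:
# [  5]  37.00-38.00  sec  0.00 Bytes  0.00 bits/sec
LINE_RE = re.compile(
    r"\[\s*\d+\]\s+(?P<start>\d+\.\d+)-(?P<end>\d+\.\d+)\s+sec\s+"
    r"\S+\s+\w+\s+(?P<rate>\d+(?:\.\d+)?)\s+(?P<unit>[KMG]?)bits/sec"
)
SCALE = {"": 1.0, "K": 1e3, "M": 1e6, "G": 1e9}


def parse_intervals(text):
    """Return a list of (start_s, end_s, bits_per_second) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            rate = float(m["rate"]) * SCALE[m["unit"]]
            rows.append((float(m["start"]), float(m["end"]), rate))
    return rows


def stalled_intervals(rows, floor_bps=1e6):
    """Return the (start_s, end_s) intervals whose bitrate fell below floor_bps.

    floor_bps is an arbitrary illustrative cutoff, not an iperf setting.
    """
    return [(start, end) for start, end, bps in rows if bps < floor_bps]
```

Feeding the saved iperf output to `parse_intervals` and then `stalled_intervals` gives the time window in which the client was stuck, which can be lined up against AP log timestamps.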

Roaming is largely a client side function. The phone (or other device) will use its own internal logic to determine when to roam.

Disable 802.11r. It can cause more harm than good in general, and shouldn't be used at all until the foundational stuff is verified.

The first thing to do is to make sure that:

  • Both APs use the same SSID + encryption type + passphrase.
  • Ideally, both APs will also be dual-band (2G + 5G) with the same conditions as above per band
  • Each AP uses a different and non-overlapping channel for each band.
  • Each AP is set to the lowest power that will still cover the desired area. You want to reduce the area of coverage overlap so that the client device(s) are more likely to want to roam. If the distant AP still provides reasonable coverage even when you're right near the other, the client may not choose to roam.
  • 802.11r, as well as 802.11k and 802.11v are all disabled.
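Part of the checklist above can be checked mechanically once the radio settings are written down. The following is a rough sketch, not an OpenWrt tool: the `Radio` record and `check_roaming_basics` are hypothetical names, it assumes 20 MHz channels (2.4 GHz channels 1 to 13, or 5 GHz), and it does not look at passphrases or transmit power.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Radio:
    ap: str          # hypothetical AP label, e.g. "ap1"
    band: str        # "2g" or "5g"
    ssid: str
    encryption: str
    channel: int     # assumed to be a 20 MHz channel


def center_mhz(radio):
    """Center frequency of a 20 MHz channel in the 2.4 or 5 GHz band."""
    base = 2407 if radio.band == "2g" else 5000
    return base + 5 * radio.channel


def check_roaming_basics(radios):
    """Return a list of human-readable problems; empty means the checks pass."""
    problems = []
    for band in sorted({r.band for r in radios}):
        group = [r for r in radios if r.band == band]
        if len({(r.ssid, r.encryption) for r in group}) > 1:
            problems.append(f"{band}: SSID or encryption differs between APs")
        for a, b in combinations(group, 2):
            # Two 20 MHz channels overlap when their centers are < 20 MHz apart.
            if a.ap != b.ap and abs(center_mhz(a) - center_mhz(b)) < 20:
                problems.append(
                    f"{band}: {a.ap} ch{a.channel} overlaps {b.ap} ch{b.channel}"
                )
    return problems
```

For example, two 2.4 GHz radios on channels 1 and 6 pass, while channels 1 and 3 are reported as overlapping.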

You can also post your configs here for review (please post from both devices):

Please connect to your OpenWrt device using ssh and copy the output of the following commands and post it here using the "Preformatted text </>" button (this works best in the 'Markdown' composer view):

Remember to redact passwords, VPN keys, MAC addresses and any public IP addresses you may have:

ubus call system board
cat /etc/config/network
cat /etc/config/wireless
cat /etc/config/dhcp
cat /etc/config/firewall

With these disabled, the behavior is the same. I had r and k but did not have v enabled. I’m actually interested if enabling v would make a difference. Might try that later.

The APs all have the same SSID, encryption, etc. All are running dual band. I’ve previously partitioned the spectrum but could arguably improve upon that.

The AP with which the phone is connected is logging BEACON-REQ-TX-STATUS events associated with the phone’s mac every second or so. What is curious is that the phone’s mac never associates with a BEACON-RESP-RX message. Other mac addresses (which look like mediastreamers) get both the BEACON-REQ-TX-STATUS and BEACON-RESP-RX associated with their mac address.
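One way to turn that observation into a list is to count the two events per MAC address and report the clients that were asked for a beacon report but never answered. This is a sketch only: it assumes each log line contains the event name followed directly by the client MAC, and `silent_clients` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed line shape: "... BEACON-REQ-TX-STATUS <mac> ..." or "... BEACON-RESP-RX <mac> ..."
EVENT_RE = re.compile(
    r"(?P<event>BEACON-REQ-TX-STATUS|BEACON-RESP-RX)\s+"
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})",
    re.IGNORECASE,
)


def silent_clients(log_text):
    """Return sorted MACs that received beacon requests but sent no response."""
    requests, responses = Counter(), Counter()
    for m in EVENT_RE.finditer(log_text):
        mac = m["mac"].lower()
        if m["event"].upper() == "BEACON-REQ-TX-STATUS":
            requests[mac] += 1
        else:
            responses[mac] += 1
    return sorted(mac for mac in requests if responses[mac] == 0)
```

A client that shows up here is either ignoring 802.11k beacon requests or not supporting them, which would fit the phone's behavior described above.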

At this point, the best bet would be to post the configs.

There is no dhcp server or firewall on the APs. dhcp addresses for wifi clients come from a switch.

Is there anything specific in the AP’s wireless settings that would prevent a client from roaming? I get more the impression the client is simply refraining from roaming.

It's worth us reviewing the configs for these files, too, just to make sure. The whole picture becomes more clear when we see the entirety of the major configuration items.

You have a switch handling your DHCP (as compared to your main router or a different host such as a Pi-hole)? It's certainly okay from a technical perspective assuming it's correctly configured, but it is a bit unusual.

Maybe... there are tons of things you could have in your configuration that could theoretically cause issues.... or it could just be the client device(s). But seeing the config is the only practical way to know for sure.

Are we not supposed to use something like usteer if advanced roaming features (802.11r+v+k) are enabled?

I’ve run usteer. I’ve been running dawn for a while.

Ultimately the client has to initiate the roaming which it never does. Perhaps the wifi settings could have something to do with this. Not seeing how the network topology or otherwise would have anything to do with failure of initiating a wifi roaming transition.

I agree it is unusual. Like people doing L3 switching.

Nitpick.
On layer 2, we do switching, with an fdb (forwarding database) that maps MAC addresses to Ethernet devices and interfaces. Use bridge fdb show to see it, for instance (ip neighbor show lists the IP-to-MAC neighbor table instead).
On layer 3 we do routing. Full stop. Use ip -4 route show and ip -6 route show.

The industry is trying to sell layer 3 switches, which are switches with limited router capabilities.

Linux is able to do more or less anything from the above and even more obscure things. It's up to you.
End of nitpick.
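The layer 2 versus layer 3 distinction in this nitpick comes down to two different lookups: an exact match on the destination MAC versus a longest-prefix match on the destination IP. A toy sketch, where the table contents and function names are made up for illustration and have nothing to do with how the kernel or an ASIC stores them:

```python
import ipaddress


def switch_frame(fdb, dst_mac):
    """Layer 2: exact-match lookup of the destination MAC; unknown MACs are flooded."""
    return fdb.get(dst_mac.lower(), "flood")


def route_packet(routes, dst_ip):
    """Layer 3: pick the next hop of the most specific matching prefix."""
    dst = ipaddress.ip_address(dst_ip)
    matches = [(net, hop) for net, hop in routes if dst in net]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)[1]
```

With routes for `0.0.0.0/0`, `10.0.0.0/8` and `10.1.0.0/16`, a packet to `10.1.2.3` takes the `/16` next hop, while a frame to an unlearned MAC is simply flooded.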

I personally also don't recommend usteer or other AP based logic to attempt to force clients to do certain things when it comes to roaming and the like -- at least not until the base roaming behavior is known to be working well (at which point these other techniques can sometimes improve roaming performance).

That said, without the configs, we can't really help diagnose anything. So I would recommend that you post those if you want concrete help.

I just tested roaming using a laptop (fedora linux) which shows similar behavior to my phone. I’m getting more the impression the (client side) aggressiveness of the roaming is too weak. The laptop doesn’t remain completely sticky and after several minutes it will actually switch to a different (closer) ap. For all I know the phone does that as well but it seems like it takes forever.

I’m now looking into changing the aggression factor (if that’s what it’s called) when it comes to roaming. I should be able to change that on the laptop for testing. Probably not immediately adjustable on these ‘smart’ phones.

As far as configs go: the network config is vast and sharing it is going to lead to a whole bunch of tangentially related questions. Is there something specific we’re looking for? If you believe the client is wholly responsible for deciding whether to roam, which seems to be somewhat of a given in this context, then what would we be looking for in the configs on the wifi/AP or even the (upstream) network side?

Perhaps I’m mistaken, but there should (also) be some kind of diagnosis possible from monitoring network traffic, specifically 802.11 traffic.

LOL. I’m totally on your side and much prefer what’s possible in linux, notably using netfilter but also other features and aspects. Compared to that a switch looks like it’s in the stone age.

However, we or at least I accept this because a (mature) switch doing L3 is doing L3 at line speed. Modern day linux box may perform well but it’s going to struggle with L3 when you move towards or beyond 10Gb speeds. A good switch will do that, some at higher speeds, across multiple links, with acls and L3, and not even really break a sweat.

There are so many possible configuration variables that one could have that could cause problems -- so we really cannot guess.

If your config is either so unusual or so malformed as to prompt tangential questions, it might be a sign that your configuration is suspect. But honestly, we can't assess that without seeing the configs.

Roaming is nuanced as there are 3 different approaches:

  • it's possible to cooperatively encourage/trigger roaming (802.11k/v), but not all devices play nice with this method.
  • some methods 'force' roaming by kicking a client device off one AP with the hope that it will roam properly to the next. This also sometimes causes problems and is not recommended.
  • and the method I'm recommending is to simply set up the APs such that you create an environment that encourages clients to roam by nature of the radio config -- namely: channel selection, power levels, and to the degree possible, the physical placement of the APs.

The 802.11r and 802.11k/v standards can be useful additions in certain circumstances (although I've also said that they can cause some devices to not work properly), but they can only be useful when applied on top of a well configured RF environment... The radio configuration is the foundation upon which your wifi performance hinges.

We're now over 11 posts and 10 hours into this thread, and you haven't shared your configs yet despite the fact that it was requested/suggested in the very first reply.

Is there a reason you are so reluctant to share your config?? It's really the only way that we can help.

Generally phones only look for a better access point if the signal strength for the current one drops below a threshold - Roaming RSSI Threshold. They aren’t constantly trying to roam to the fastest access point.
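That threshold behavior can be sketched as a small decision function. The numbers here (-70 dBm threshold, 8 dB minimum gain) and the name `choose_ap` are purely illustrative assumptions; real clients use vendor-specific values and extra logic.

```python
def choose_ap(current_bssid, current_rssi_dbm, scan,
              threshold_dbm=-70, min_gain_db=8):
    """Return the BSSID the client ends up on.

    scan maps candidate BSSID -> RSSI in dBm. The defaults are hypothetical.
    """
    if current_rssi_dbm >= threshold_dbm:
        # Signal is still "good enough": the client does not even look around.
        return current_bssid
    best_bssid, best_rssi = max(
        scan.items(),
        key=lambda item: item[1],
        default=(current_bssid, current_rssi_dbm),
    )
    # Only move if the candidate is clearly better than the current AP.
    if best_rssi - current_rssi_dbm >= min_gain_db:
        return best_bssid
    return current_bssid
```

Under these assumptions a client at -60 dBm stays on its AP even when a neighbor is at -40 dBm, which is why lowering AP transmit power (so the old AP drops below the threshold sooner) tends to help.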

Try putting all the APs on the same channel and at max power.

No. With Mellanox ConnectX cards you easily push 100 gig. If your card and the driver support it, much is offloaded to the card...

There is for instance OpenWrt support for mellanox sn2100. 32x 100 gig does routing and switching.

That is bad advice. Max power will only make the client hang onto the old AP longer (despite the AP not being able to hear it), and putting the APs on the same channel will reduce throughput because they will be sharing the same frequency.

I believe we’ve covered how the configs may affect roaming and with features turned off and still no roaming, sure, there are perhaps things in the config that may affect roaming but I don’t believe they would affect things to this extent as demonstrated with the iperf output. It’s really looking more like a client issue.

Honestly, I’m scratching my head as to why you’re persisting in propositions that I don’t consider to be true, e.g. if you’re asking for dhcp and firewall config and I explain there is no dhcp or firewall (on the openwrt APs) and you still want to see the configs. I don’t necessarily mind sharing those configs but it really feels strange that you continue to ask for config for things that are not even in the system. I have a flat tire and you want to check the traction control. Mmm, ok. I don’t have traction control on this vehicle but you still want to check traction control config?

Do you really believe there is only 1 way to diagnose this issue?

I feel there are other diagnostic procedures, which I mentioned previously. To continue the vehicle analogy; you want to check the ecu’s shift mapping and I believe it’s more useful to check the vehicle’s runtime behavior (using some form of traffic monitoring which I previously alluded to).

At this moment tweaking client parameters seems most fruitful, which I’ll be doing later this week.

Is iptables running on the mellanox card? We’re talking L3 switching. You may have some linux or else a host system that can keep up but it’s fundamentally flawed when the packets have to go from the backplane to the cpu (and back) compared to an L3 switch that’s doing the majority of L3 activity on the backplane (in some asic).

Perhaps thinking of the difference between modems and soft modems (what were sometimes called win-modems) may illustrate the difference. I’m sure there are network cards that make hosts behave more like switches, and conversely, there are switches that behave more like hosts, such as soft switches.

A proper driver can offload some functions to these cards, yes.
Bridging is CPU limited on x86, and somewhere around 22 Gbit at 4 GHz is possible.
The SN2100 has had Linux support since its release in 2017 or so, and L2 switching is offloaded.
I have been running this switch and the cards with Linux. And yes, even if you have netfilter rules applied you can route these 100 Gbit on x86.

You'd be amazed how many people say these services are disabled but actually have configs that could become 'reactivated' in certain circumstances. I ask for the entire config so that I can get a proper and complete picture.

To be completely clear, I have seen situations where people have reconfigured other parts of the system in ways that can affect the part they are trying to troubleshoot.

Do you also refuse to let a doctor look at your blood-work when you're sick with something that is non-obvious and could be symptomatic of multiple things and/or maybe even something more serious?

We have not seen any of your configs at this point. Not even the /etc/config/wireless file (which would be the bare minimum). We don't even know what device you're using, or even what version of OpenWrt!!

My questions:
a) What is so difficult about posting your config?
b) Do you have such massive secrets inside your config that it would either be difficult or impossible to redact?
c) Why are you refusing to provide your configurations?

Yes, it could be a client issue, but OTOH, you're asking what you can do on your APs to verify/prove that and/or what you can change on the APs to make the client more likely to roam properly.

You can keep making tweaks to your config for as long as you like, but as long as you're the only person who can see those configs, there is no way anyone else can know how you have things set up. As such, the rest of us will need to rely on our crystal balls (mine is in the shop... its hydraulic system sprung a leak, but, I've been able to help users fix hundreds and hundreds of issues when they do post their configs).

It seems to me that there is no purpose in keeping this thread open as it is pointless for all of us to sit here guessing what might be wrong with your configuration.

As such, I'm closing this thread with a 24 hour timer. If you decide to post your config, I'll keep the thread open (or reopen it). Without the configs, we can't do anything useful (and in fact, I'm beginning to wonder if you're even using official OpenWrt in the first place).

I appreciate your approach and understanding of these situations. In 99% of these forum posts your approach is sound and valid. Here’s a config:

ath10k-board-qca9888 ath10k-firmware-qca9888-ct base-files busybox ca-bundle cgi-io -dnsmasq dropbear ebtables ethtool -firewall4 freeradius3-common fstools fwtool getrandom iw iwinfo -jansson4 jshn jsonfilter kernel kmod-ath kmod-ath10k-ct kmod-ath9k kmod-ath9k-common kmod-cfg80211 kmod-crypto-acompress kmod-crypto-aead kmod-crypto-ccm kmod-crypto-cmac kmod-crypto-crc32c kmod-crypto-ctr kmod-crypto-gcm kmod-crypto-gf128 kmod-crypto-ghash kmod-crypto-hash kmod-crypto-hmac kmod-crypto-manager kmod-crypto-null kmod-crypto-rng kmod-crypto-seqiv kmod-crypto-sha512 kmod-gpio-button-hotplug kmod-hwmon-core kmod-lib-crc-ccitt kmod-lib-crc32c kmod-lib-lzo kmod-mac80211 kmod-random-core kmod-slhc libatomic1 libattr libblobmsg-json20230523 libc libcap libgcc1 libiwinfo-data libiwinfo20230701 libjson-c5 libjson-script20230523 liblucihttp-ucode liblucihttp0 libmbedtls12 libmnl0 libncurses6 -libnftnl11 libnl-tiny1 libopenssl-conf libopenssl-legacy libopenssl3 libpcap1 libpthread libreadline8 libtalloc libubox20230523 libubus20230605 libuci20130104 libuclient20201210 -libucode20220812 libucode20230711 libustream-mbedtls20201210 libwolfssl5.6.4.e624513f logd -luci -luci-app-firewall -luci-app-opkg -luci-base -luci-light -luci-mod-admin-full -luci-mod-network luci-mod-status luci-mod-system -luci-proto-ipv6 -luci-proto-ppp -luci-ssl -luci-theme-bootstrap mtd netifd -nftables-json -odhcp6c -odhcpd-ipv6only openwrt-keyring opkg -ppp -ppp-mod-pppoe procd procd-seccomp procd-ujail px5g-mbedtls rpcd rpcd-mod-file rpcd-mod-iwinfo rpcd-mod-luci rpcd-mod-rrdns rpcd-mod-ucode swconfig tcpdump terminfo uboot-envtools ubox ubus ubusd uci uclient-fetch ucode ucode-mod-fs ucode-mod-html ucode-mod-math ucode-mod-nl80211 ucode-mod-rtnl ucode-mod-ubus ucode-mod-uci ucode-mod-uloop -uhttpd -uhttpd-mod-ubus urandom-seed urngd usign wireless-regdb wpad-wolfssl -wpad-basic-mbedtls -kmod-nf-conntrack -kmod-nf-conntrack6 -kmod-nf-flow -kmod-nf-log -kmod-nf-log6 -kmod-nf-nat -kmod-nf-reject -kmod-nf-reject6 -kmod-nfnetlink -kmod-nft-core -kmod-nft-fib -kmod-nft-nat -kmod-nft-offload -uci-firewall umdns dawn

I’m having trouble with my vision. Could be blood related but let’s first diagnose the vision.

That’s what I believed would be most relevant when I started this thread. During this thread’s runtime I believe it became clear that the config, especially without any of the steering features, is probably not where the troubleshooting should continue.

It seems you yourself would or could argue that it’s really the client that needs to initiate the roaming. Sure, there are things on the APs that could affect that, but at this point it’s way more fruitful to diagnose the behavior of the client.

You’re saying the only way to diagnose this issue is by knowing the configs and you persist in that even when it’s pretty obvious the configs will have little or nothing to do with this issue. I can talk to you all day about the merits of having dhcp running on a switch, but it’s not only not relevant but in your comments on that you’re preaching to the choir. Again, I don’t mind discussing that but it’s not relevant. Posting configs is not going to identify the problem and if that was the only outcome I would have no issue (other than having to remove passphrases, etc.) in posting my config.

Your choice. I believe that there is not only a different way to diagnose but it’s also better and less of a waste of time. So far, I’ve not received any feedback on monitoring network traffic. I’ll figure that out. I’m a bit startled that if you have a subtitle such as ‘Guru’ that you’re allowing your fundamental logic to go flawed by arguing there is only 1 way to diagnose the problem.

I not only don’t mind but I also would be very happy in posting my results of client tweaking but I might not make that within 24 hours.