Dawn: a decentralized wireless controller

Ah! Interesting about MBO… Will read a bit and likely add that.

In terms of DTIM, although most of my devices are Apple, I have a few IoT devices. If I recall correctly, back when I was running an R7000 on Tomato, a couple of the cheap IoT devices started acting weird when I raised DTIM…

It’s been a while so my memory is foggy on that… Might try it again. :)

Great point, though I can say confidently that I have multiple IoT devices as well and haven’t considered the DTIM interval with them to date. Everything has been running smoothly here. But definitely a YMMV situation, no doubt.

Best of luck with any new settings!

Found this commit re MBO... Maybe best to leave it at the default (or at least turn it on knowingly) as of "today":

Fri Feb 24 02:39:19 2023 daemon.notice hostapd: wlan2g: BEACON-RESP-RX ac:d6:18:0d:47:4d:3e 04 0000000000000000000000008000000000000000000000000000
Fri Feb 24 02:39:19 2023 daemon.info dawn: Received NULL MAC! Client is strange!

Why does this always occur?

Hello,
I just set up Dawn and followed all the recommended changes to the config file.
I do seem to have a problem with Android devices that are in sleep mode. They are connected to the AP and show up in the list of connected devices; however, I get these messages in the log every 5 seconds for each Android device:

Thu Mar  2 22:12:44 2023 daemon.notice hostapd: Beacon request: xx:xx:xx:xx:xx:xx is not connected
Thu Mar  2 22:12:44 2023 daemon.warn dawn: Client / BSSID = xx:xx:xx:xx:xx:xx / yy:yy:yy:yy:yy:yy: BEACON REQUEST failed

Any idea what's going on? I suspect the kicks are not actually sent because hostapd thinks the STA is not authenticated/connected for some reason.
This is on the latest stable with wpad-wolfssl.
Thanks!
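
To see which clients are producing these warnings and how often, the log can be summarized per MAC address. This is a minimal sketch that assumes the DAWN warning keeps the exact wording shown above; `count_beacon_failures` is a made-up helper name, not part of DAWN.

```shell
#!/bin/sh
# Count "BEACON REQUEST failed" warnings per client MAC.
# stdin:  syslog-style lines (for example the output of logread)
# stdout: "<count> <client-mac>", highest count first
count_beacon_failures() {
    # Assumed DAWN line format, copied from the log excerpt in this post:
    #   ... dawn: Client / BSSID = <client> / <bssid>: BEACON REQUEST failed
    sed -n 's|.*Client / BSSID = \([^ ]*\) / .*BEACON REQUEST failed.*|\1|p' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2 }'
}
```

On the router this would be used as `logread | count_beacon_failures`.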

My logs are also flooded with this and from what I gathered, this might not be directly related to DAWN but seems to be a bug in the cfg80211 kernel module (driver) related to 802.11r. Some forum threads suggest that it's a regression that happened after OpenWrt 19. Unfortunately, I can't downgrade to verify because my devices (TOTOLINK X5000R) have only been supported since OpenWrt 21. There's a patch online (not upstream, unfortunately) and rumor has it that it could mitigate the bug.

This is why I'm trying to figure out how to build OpenWrt from source. I haven't succeeded yet, but here are the details: "Daemon.err hostapd: nl80211: kernel reports: key addition failed - is this a problem?" (#45 by fodiator).

At least, that's my initial impression. I've been googling so much that I don't remember how I came up with the connection to that forum thread, though. Maybe somebody like @PolynomialDivision would be able to give better pointers...

Note that https://github.com/openwrt/openwrt/issues/7907 (ex. https://bugs.openwrt.org/index.php?do=details&task_id=3159) suggests that it is recommended to set FT Protocol to FT over the air and Reassociation Deadline to 20000, specifically referencing this sleep mode case (though, mentioning iOS, not Android).
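
For anyone who wants to try those two settings from the command line, here is a minimal sketch. It assumes the standard OpenWrt wireless options `ft_over_ds` (0 means FT over the air) and `reassociation_deadline`; the function name `ft_over_air_batch` and the section name `default_radio0` used below are only examples, so substitute your own wifi-iface sections.

```shell
#!/bin/sh
# Print `uci batch` commands for the two settings suggested in the issue:
# FT over the air and a reassociation deadline of 20000.
# $1 = wifi-iface section name (hypothetical example: default_radio0)
ft_over_air_batch() {
    iface="$1"
    cat <<EOF
set wireless.${iface}.ft_over_ds='0'
set wireless.${iface}.reassociation_deadline='20000'
EOF
}
```

Something like `ft_over_air_batch default_radio0 | uci batch && uci commit wireless && wifi reload` would then apply it, once per 802.11r-enabled interface on each AP.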

I've not really confirmed if this is related to Dawn or 802.11k/v in general, but after some time my (Android) devices just "supposedly" lose internet connectivity. They drop off of WiFi, and attempting to reconnect seems to throw "Connected, no internet access" and then they drop again.

What's odd is that in the short amount of time they are connected, they are able to be pinged, and also themselves seem to be able to ping (and use DNS) addresses.

I'm currently re-trying with 802.11k/v and Dawn disabled. There's nothing obvious in logs. Devices are the Xiaomi R3G and Redmi 2100, so both ramips/mt7621

I'm running self-compiled Snapshots from the same git HEAD on both devices.

I haven't seen anything super obvious on either my OpenWrt devices, nor my Android logs. Just basic auth, DHCP and then disassociation after a few seconds. Only restarting seems to reliably make it work for a bit again

I've been seeing something that feels similar. The TOTOLINK X5000R is also ramips/mt7621. One example was an iPad that seemed to have stopped getting packets through WiFi while still showing as connected on the device. Also, some DNS queries seem to get stuck and time out from time to time. I wasn't able to track this down. It is happening on 21.03 and I think on 22.03 too.

One of my issues was that the mt76 upstream sets the same MAC address for the 5G radio: https://github.com/openwrt/openwrt/issues/8861. This is bad since I had 3 identical APs that advertised the 5G network with the same BSSID. According to my googling, this usually causes a lot of problems with AP confusion, where one AP authenticates an STA but the others send deauth since they also hear traffic addressed to the same BSSID. It was an interesting experience since the 2.4G radios with the same SSID have different BSSIDs, as they should. So I applied https://github.com/openwrt/openwrt/pull/4738 for now, as a workaround.
Still, I'm seeing the bug I described in the previous post too.

Note that I've tried the latest OpenWrt snapshots while debugging my issues and WiFi seemed much less stable, but I haven't checked why.

The logs I've seen don't reveal whether the problems I'm seeing are DAWN-related or lower-level either...

Doesn't seem to be an issue here - primary router's 2.4 GHz ends with :F2 and 5 GHz ends with :F3, and the AP's with 59 and 5A, so no duplicates

Hmmm: Fix mt76 crash issue, Resolves #763 by Brain2000 · Pull Request #764 · openwrt/mt76 (github.com)

Yeah the good news is that it doesn't seem to be caused by 802.11k/v, nor Dawn.

The bad news is that it seems like a MT7621 issue, however I'm not sure if it's the WiFi or Switch part. I'm assuming switch, as both 2.4 and 5 GHz are inaccessible after it runs into issues

The warning could be related to dawn. Sounds like it is requesting beacon reports from a station which already roamed somewhere else. Probably a sanity check before requesting those beacon reports would fix it.

Hi!
I installed Dawn on all 3 of my APs with the current release, but in Network Overview I see only the local AP.
On all of them I get:

Mon Apr 3 13:46:13 2023 daemon.warn dawn: Failed to lookup ID!
Mon Apr 3 13:46:13 2023 daemon.warn dawn: Failed to lookup ID!

This happens when I start up the AP, but dawn is still running.

Ciao Gerd

OpenWrt 22.03.5
wpad
dawn
I have 2 dumb APs and a router with DHCP.

My DAWN config


config local
	option loglevel '0'
config network
	option broadcast_ip '192.168.22.255'
	option broadcast_port '1025'
	option tcp_port '1026'
	option network_option '2'
	option shared_key 'Niiiiiiiiiiiiick'
	option iv 'Niiiiiiiiiiiiick'
	option use_symm_enc '0'
	option collision_domain '-1'
	option bandwidth '-1'
config hostapd
	option hostapd_dir '/var/run/hostapd'
config times
	option con_timeout '60'
	option update_client '2'
	option remove_client '15'
	option remove_probe '30'
	option remove_ap '5'
	option update_hostapd '2'
	option update_tcp_con '2'
	option update_chan_util '2'
	option update_beacon_reports '5'
config metric 'global'
	option min_probe_count '2'
	option bandwidth_threshold '0'
	option use_station_count '0'
	option max_station_diff '1'
	option eval_probe_req '0'
	option eval_auth_req '0'
	option eval_assoc_req '0'
	option kicking '2'
	option kicking_threshold '20'
	option deny_auth_reason '1'
	option deny_assoc_reason '17'
	option min_number_to_kick '2'
	option chan_util_avg_period '3'
	option duration '0'
	option rrm_mode 'tap'
	option set_hostapd_nr '2'
config metric '802_11g'
	option initial_score '100'
	option ht_support '5'
	option vht_support '5'
	option no_ht_support '0'
	option no_vht_support '0'
	option rssi '15'
	option rssi_val '-40'
	option low_rssi_val '-50'
	option low_rssi '-1'
	option chan_util '0'
	option chan_util_val '1'
	option max_chan_util '-1'
	option max_chan_util_val '300'
	option rssi_weight '0'
	option rssi_center '-45'

config metric '802_11a'
	option initial_score '50'
	option ht_support '5'
	option vht_support '5'
	option no_ht_support '0'
	option no_vht_support '0'
	option rssi '15'
	option rssi_val '-60'
	option low_rssi_val '-80'
	option low_rssi '-15'
	option chan_util '0'
	option chan_util_val '140'
	option max_chan_util '-15'
	option max_chan_util_val '170'
	option rssi_weight '0'
	option rssi_center '-70'

.......................................................................................
I connect my Honor 6A (it has 802.11k/v but no r) to WiFi and stand in the middle between the APs.
kicking 2
rssi_center '-45' (set for stress testing)

..................................................................................................................................
Question: in option max_chan_util_val '140', what unit is the 140? Mb/s?
Also, what units are these? Seconds or milliseconds?

	option con_timeout '60'
	option update_client '2'
	option remove_client '15'
	option remove_probe '30'
	option remove_ap '5'
	option update_hostapd '2'
	option update_tcp_con '2'
	option update_chan_util '2'
	option update_beacon_reports '5'
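
On the 140 question: as far as I know, hostapd reports channel utilization on a 0-255 scale, and the sketch below assumes DAWN's `chan_util_val` / `max_chan_util_val` thresholds use that same scale rather than Mb/s. Please treat that as an assumption and check DAWN's configuration notes for your version. `chan_util_percent` is a made-up helper that converts such a value to a rough percentage.

```shell
#!/bin/sh
# Convert a channel-utilization value to a whole percentage (rounded down).
# ASSUMPTION: the value is on a 0-255 scale, as hostapd reports it.
chan_util_percent() {
    echo $(( $1 * 100 / 255 ))
}
```

Under that assumption, `chan_util_percent 140` gives 54 and `chan_util_percent 170` gives 66, i.e. the two thresholds in the 802_11a block above would sit at roughly 54% and 66% airtime.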

................................................................................................
So DAWN kicks it like hell.

But when I connect a second smartphone to WiFi, it stops kicking: one smartphone goes to AP1 and the other to AP2, they more or less stick to these APs, and DAWN doesn't want to kick them.
When I disconnect one, the other switches immediately.
So I think there is some kind of balancing at work (but kicking is 2).
I need to disable any balancing in DAWN, so that it doesn't matter how busy the current AP is.
Also, while it is kicking the smartphone, it doesn't kick other clients. Why?

DAWN works normally only with 1 device.

@ZebraOnPC what's that app on your screenshot?

FTR I rolled out v23.05.0-rc2 across my dumb APs, this is still happening:

Hey @PolynomialDivision, I noticed that some clients appear only once in the hearing map. For example, my OnePlus 10 Pro 5G. It only shows up as “seen” by the AP that it's directly connected to. There are two other APs that don't display it in the hearing map, but they are definitely close and accessible.
If I turn the phone's Wi-Fi off and on again, it may connect to a different AP from the same location. I can also kick it manually by hitting Disconnect in LuCI, and it then migrates to a different AP or a different radio on the same one. But then, it usually keeps being seen only by the new AP it reconnected to.
Any debugging ideas?

That is an old problem; I have noticed it too. The funny thing is that all 3 of my APs can hear other smartphones that aren't even connected to the wireless network, but once a smartphone is connected, only 2 APs can see it (the last and the current AP). And the last AP isn't really seeing it either: the entry just appears for 10 seconds, but then it gets stuck.

The app is WiFiman from Ubiquiti.

Using DTIM 3 might help Apple devices, but I had worse results with other vendors (Samsung, Amazon Fire, Google Pixel, ...). Now I am back on 2.

Got DAWN working pretty well on my new pair of mesh routers (running 23.05.0-rc2 OpenWrt), but not without some tweaks in addition to following the 'official' setup guide:

  1. Followed these proposed changes to the dawn config file, ̶e̶x̶c̶e̶p̶t̶ ̶I̶ ̶s̶e̶t̶ ̶k̶i̶c̶k̶i̶n̶g̶ ̶t̶o̶ ̶'̶3̶'̶ ̶i̶n̶s̶t̶e̶a̶d̶ ̶o̶f̶ ̶'̶1̶'̶. (Edit: seems keeping kicking at '1' makes kicking less jittery and minimizes the chance of disconnections while transitioning) ̶a̶n̶d̶ ̶I̶ ̶l̶e̶a̶v̶e̶ ̶b̶a̶n̶d̶w̶i̶d̶t̶h̶_̶t̶h̶r̶e̶s̶h̶o̶l̶d̶ ̶a̶t̶ ̶'̶6̶'̶. (Edit: setting it at '6' actually means as long as your currently connected AP has a throughput of greater than 6 Mbits/s, regardless of how weak the signal is, no kicking will take place. I have now actually changed the value to '0' to indicate to DAWN that I only want to consider the scores of each AP, and do not take into consideration the throughput of currently connected AP)

  2. Per this, I changed duration from the default '0' to '120'.

  3. I̶ ̶h̶a̶d̶ ̶t̶o̶ ̶c̶h̶a̶n̶g̶e̶ ̶b̶o̶t̶h̶ ̶t̶h̶e̶ ̶d̶e̶n̶y̶_̶a̶u̶t̶h̶_̶r̶e̶a̶s̶o̶n̶ ̶a̶n̶d̶ ̶d̶e̶n̶y̶_̶a̶s̶s̶o̶c̶_̶r̶e̶a̶s̶o̶n̶ ̶t̶o̶ ̶'̶0̶'̶.̶

  4. I needed to add a startup script to restart umdns each time after booting the routers in order for info on other mesh routers to populate (this can be checked from the command line with ubus call umdns browse):

#!/bin/sh
# Give the router time to finish booting, then restart umdns
sleep 30
/etc/init.d/umdns restart

I needed to do 2, 3, and 4 above before any kicking action was actually observed on my devices. And step 1 above helped with kicking more aggressively and with preferring 5G.

  5. My Android phone does not like 802.11r turned on. I had to turn off 802.11r before seamless roaming could be achieved. Otherwise, when kicking is done, the phone loses WiFi for a second before reconnecting to the AP with the stronger signal. 802.11k and v are turned on. Without 802.11r the transition is in theory not as quick, but the WiFi connection is still maintained throughout each transition.

If things go ok for the next few days I will likely add another router of the same model to the mesh.
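
For reference, the DAWN values from steps 1 and 2 above could be set from the command line roughly like this. It is a sketch, not an official recipe: the section name `global` is taken from the `config metric 'global'` block in the DAWN config posted earlier in this thread, and `dawn_tweaks_batch` is just a made-up helper name.

```shell
#!/bin/sh
# Print `uci batch` commands for the DAWN tweaks from steps 1 and 2:
# kicking '1', bandwidth_threshold '0', duration '120'.
# ASSUMPTION: the options live in a section named 'global', as in the
# "config metric 'global'" block shown earlier in this thread.
dawn_tweaks_batch() {
    cat <<'EOF'
set dawn.global.kicking='1'
set dawn.global.bandwidth_threshold='0'
set dawn.global.duration='120'
EOF
}
```

Running `dawn_tweaks_batch | uci batch && uci commit dawn && /etc/init.d/dawn restart` on each router should then apply it, assuming the dawn package's init script is installed in the usual place.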

When dawn 'kicks' a device to another AP, the log says this:

Client xx:xx:xx:xx:xx:xx: Kicking as no active transmission data for client, and / or limit of 0 is OK.

(this is displayed when you set bandwidth_threshold in dawn's config file to '0')

Does this imply that no kicking will ever take place when there is active data transmission? If so, is there any option to tell dawn to do kicking even when a device is transmitting data?

Case in point: I am watching youtube, and while watching I bring the phone to another area where a better AP is waiting for the device to roam to.

EDIT: Answering my own question - doesn't matter. DAWN sends out bss-transition commands to kick clients around regardless of whether any active data transmission is taking place. (at least when bandwidth_threshold is set to '0')
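
To make the role of bandwidth_threshold concrete, here is a simplified sketch of the decision as it has been described in this thread. It is not DAWN's actual code: `should_kick` and its arguments are made up for illustration, and it ignores other options such as kicking_threshold and min_number_to_kick.

```shell
#!/bin/sh
# Simplified model of the kicking decision discussed in this thread.
# Args: $1 = score of the current AP
#       $2 = score of the best other AP
#       $3 = client throughput in Mbit/s
#       $4 = bandwidth_threshold (0 = ignore throughput, compare scores only)
# Exit status 0 means "kick" (send a BSS transition request).
should_kick() {
    cur_score="$1"; best_score="$2"; client_mbit="$3"; threshold="$4"

    # No better AP available: nothing to do.
    [ "$best_score" -gt "$cur_score" ] || return 1

    # Threshold 0: only the scores matter, even during active transfers.
    [ "$threshold" -eq 0 ] && return 0

    # Otherwise leave the client alone while it moves more than the threshold.
    [ "$client_mbit" -le "$threshold" ]
}
```

With a threshold of 6, a phone streaming at 10 Mbit/s stays put even if a better AP exists (`should_kick 50 80 10 6` fails), while with a threshold of 0 the same phone is asked to move (`should_kick 50 80 10 0` succeeds).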