Daemon.err hostapd: nl80211: kernel reports: key addition failed - is this a problem?

Tested again with OpenWrt 22.03.0 r19685-512e76967f. Sadly this issue has made it into the Stable release. :slightly_frowning_face:

Looking forward to a brighter future one day when Teams/WhatsApp calls don't drop when walking across the house.

Yes, me too. It’s interesting that nobody knows the reason for this issue, and it has been there for years…

Back from Stable to a snapshot release, and sad to say that "key addition failed" is still there with OpenWrt SNAPSHOT r20655-c3e7d86d2b. :frowning_face:

And still on OpenWrt SNAPSHOT r20746-a67f484e67.

Is there anyone out there?

Also seeing this (with RADIUS and VLANs) on 22.03.0-rc6. However, as I have FT enabled on both radios (2.4 & 5 GHz), I believe my situation is mitigated (routinely one fails, but the other succeeds).

It would seem that this may be caused by a timeout (?). Looking at the nl80211 driver code, it seems to be either that or running out of memory. Also, some time back Michael Braun commented on the time it takes to decrypt the key (with some suggestions, see HowTo enable WiFi roaming with hostapd and VLANs - FeM Blog (tu-ilmenau.de)).

I'll try adding some debug statements to see what the driver is reporting... May take a while, as it will be my first compile for OpenWrt :wink:

After adding some prints to the driver, it would seem the failure originates in build_dir/target-mips_24kc_musl/linux-ath79_generic/backports-5.15.58-1/net/mac80211/cfg.c in ieee80211_add_key():

        if (!sta || !test_sta_flag(sta, WLAN_STA_ASSOC)) {
                ieee80211_key_free_unused(key);
                err = -ENOENT;
                goto out_unlock;
        }

The TODO in the comment above this check seems clear: "accept the key if we have a station entry and add it to the device after the station." This was noted before here: 802.11r Fast Transition how to understand that FT works? - #62 by cotequeiroz

Now figuring out how to do that :wink:

--- update ---
According to patch #1626e0fa :

During FT roaming, wpa_supplicant attempts to set the key before association. This used to be rejected, but as a side effect of my commit 66e67e41 ("mac80211: redesign auth/assoc") the key was accepted causing hardware crypto to not be used for it as the station isn't added to the driver yet. It would be possible to accept the key and then add it to the driver when the station has been added. However, this may run into issues with drivers using the state-based station adding if they accept the key only after association like it used to be. For now, revert to the behaviour from before the auth and assoc change.

Hence it would seem that the worst case may be to lose hardware key decryption... I guess I'll just see what my router does with the ASSOC condition removed...

--- update 2 ---
Removing the WLAN_STA_ASSOC test indeed gets rid of the error message. Roaming seems to be a bit faster (roaming between AP 1 with this change, and an identical AP 2 without this change). Will keep an eye on this to see if any negative effects pop up in daily use.

Hope this makes it into a stable release soon. I've tried 21.x, 22.x and Snapshot builds; different HW, FT over DS/Air, 802.11w/v/k on/off... nothing helped. FT/roaming just does not work.

@zagi-tng I had FT working well before the switch to DSA (i.e. under 19.07 and, I think, early 21.02 before DSA came to Lantiq).

It was probably before this patch

However, after all these years it is still present. Incredible :frowning:

@fodiator how is it working?
Thanks

@fodiator loving your work. Of course, getting rid of the error message isn't the goal. It is that FT works.

Seems to work well, not noticing any performance issues. Now trying to build a replacement patch matching the kernel .vermagic for others to test.

FT is working here. It’s still a ‘regression’ to the behaviour of the previous OpenWrt release, but it seems good enough for my use.

I’ll keep investigating in parallel how to do it in accordance with the TODO in the cfg.c file.

--- update ---
Managed to create a 'release-equivalent' image @ 22.03, so able to use online repository for packages (not sure about kmods, as one may need the entire dependency chain...).

I've switched my Archer C7 v2 routers over to use this version, and would be happy to share if anyone is interested (also to receive suggestions on how to share :wink: ).

Here's the manifest for reference:

contents of .manifest

ath10k-board-qca988x - 20220411-1
ath10k-firmware-qca988x - 20220411-1
base-files - 1490-r19685-512e76967f
busybox - 1.35.0-3
ca-bundle - 20211016-1
cgi-io - 2022-08-10-901b0f04-21
dnsmasq - 2.86-14
dropbear - 2022.82-2
firewall4 - 2022-09-01-f5fcdcf2-1
fstools - 2022-06-02-93369be0-2
fwtool - 2019-11-12-8f7fe925-1
getrandom - 2021-08-03-205defb5-2
hostapd-common - 2022-01-16-cff80b4f-11
iw - 5.16-1
iwinfo - 2022-08-19-0dad3e66-1
jansson4 - 2.13.1-2
jshn - 2022-05-15-d2223ef9-1
jsonfilter - 2018-02-04-c7e938d6-1
kernel - 5.10.138-1-abf0c66378f3d0588b20489662c12426
kmod-ath - 5.10.138+5.15.58-1-1
kmod-ath10k - 5.10.138+5.15.58-1-1
kmod-ath9k - 5.10.138+5.15.58-1-1
kmod-ath9k-common - 5.10.138+5.15.58-1-1
kmod-cfg80211 - 5.10.138+5.15.58-1-1
kmod-crypto-aead - 5.10.138-1
kmod-crypto-ccm - 5.10.138-1
kmod-crypto-cmac - 5.10.138-1
kmod-crypto-crc32c - 5.10.138-1
kmod-crypto-ctr - 5.10.138-1
kmod-crypto-gcm - 5.10.138-1
kmod-crypto-gf128 - 5.10.138-1
kmod-crypto-ghash - 5.10.138-1
kmod-crypto-hash - 5.10.138-1
kmod-crypto-hmac - 5.10.138-1
kmod-crypto-manager - 5.10.138-1
kmod-crypto-null - 5.10.138-1
kmod-crypto-rng - 5.10.138-1
kmod-crypto-seqiv - 5.10.138-1
kmod-crypto-sha256 - 5.10.138-1
kmod-gpio-button-hotplug - 5.10.138-3
kmod-lib-crc-ccitt - 5.10.138-1
kmod-lib-crc32c - 5.10.138-1
kmod-mac80211 - 5.10.138+5.15.58-1-1
kmod-nf-conntrack - 5.10.138-1
kmod-nf-conntrack6 - 5.10.138-1
kmod-nf-flow - 5.10.138-1
kmod-nf-log - 5.10.138-1
kmod-nf-log6 - 5.10.138-1
kmod-nf-nat - 5.10.138-1
kmod-nf-reject - 5.10.138-1
kmod-nf-reject6 - 5.10.138-1
kmod-nfnetlink - 5.10.138-1
kmod-nft-core - 5.10.138-1
kmod-nft-fib - 5.10.138-1
kmod-nft-nat - 5.10.138-1
kmod-nft-offload - 5.10.138-1
kmod-ppp - 5.10.138-1
kmod-pppoe - 5.10.138-1
kmod-pppox - 5.10.138-1
kmod-slhc - 5.10.138-1
libblobmsg-json20220515 - 2022-05-15-d2223ef9-1
libc - 1.2.3-4
libgcc1 - 11.2.0-4
libiwinfo-data - 2022-08-19-0dad3e66-1
libiwinfo-lua - 2022-08-19-0dad3e66-1
libiwinfo20210430 - 2022-08-19-0dad3e66-1
libjson-c5 - 0.15-2
libjson-script20220515 - 2022-05-15-d2223ef9-1
liblua5.1.5 - 5.1.5-10
liblucihttp-lua - 2022-07-08-6e68a106-1
liblucihttp0 - 2022-07-08-6e68a106-1
libmnl0 - 1.0.5-1
libnftnl11 - 1.2.1-1
libnl-tiny1 - 2021-11-21-8e0555fb-1
libpthread - 1.2.3-4
libubox20220515 - 2022-05-15-d2223ef9-1
libubus-lua - 2022-06-01-2bebf93c-1
libubus20220601 - 2022-06-01-2bebf93c-1
libuci20130104 - 2021-10-22-f84f49f0-6
libuclient20201210 - 2021-05-14-6a6011df-1
libucode20220812 - 2022-08-29-344fa9e6-1
libustream-wolfssl20201210 - 2022-01-16-868fd881-1
libwolfssl5.4.0.ee39414e - 5.4.0-stable-5
logd - 2021-08-03-205defb5-2
lua - 5.1.5-10
luci - git-20.074.84698-ead5e81
luci-app-firewall - git-22.089.67563-7e3c1b4
luci-app-opkg - git-22.154.41881-28e92e3
luci-base - git-22.245.77528-487e58a
luci-lib-base - git-20.232.39649-1f6dc29
luci-lib-ip - git-20.250.76529-62505bd
luci-lib-jsonc - git-22.097.61921-7513345
luci-lib-nixio - git-20.234.06894-c4a4e43
luci-mod-admin-full - git-19.253.48496-3f93650
luci-mod-network - git-22.244.54818-b13d8c7
luci-mod-status - git-22.189.48501-6731190
luci-mod-system - git-22.140.66206-02913be
luci-proto-ipv6 - git-21.148.48881-79947af
luci-proto-ppp - git-21.158.38888-88b9d84
luci-theme-bootstrap - git-22.141.59265-d8ecf48
mtd - 26
netifd - 2022-08-25-76d2d41b-1
nftables-json - 1.0.2-2.1
odhcp6c - 2022-08-05-7d21e8d8-18
odhcpd-ipv6only - 2022-03-22-860ca900-1
openwrt-keyring - 2022-03-25-62471e69-3
opkg - 2022-02-24-d038e5b6-1
ppp - 2.4.9.git-2021-01-04-3
ppp-mod-pppoe - 2.4.9.git-2021-01-04-3
procd - 2022-06-01-7a009685-1
procd-seccomp - 2022-06-01-7a009685-1
procd-ujail - 2022-06-01-7a009685-1
px5g-wolfssl - 4
rpcd - 2022-08-24-82904bd4-1
rpcd-mod-file - 2022-08-24-82904bd4-1
rpcd-mod-iwinfo - 2022-08-24-82904bd4-1
rpcd-mod-luci - 20210614
rpcd-mod-rrdns - 20170710
swconfig - 12
uboot-envtools - 2022.01-31
ubox - 2021-08-03-205defb5-2
ubus - 2022-06-01-2bebf93c-1
ubusd - 2022-06-01-2bebf93c-1
uci - 2021-10-22-f84f49f0-6
uclient-fetch - 2021-05-14-6a6011df-1
ucode - 2022-08-29-344fa9e6-1
ucode-mod-fs - 2022-08-29-344fa9e6-1
ucode-mod-ubus - 2022-08-29-344fa9e6-1
ucode-mod-uci - 2022-08-29-344fa9e6-1
uhttpd - 2022-08-12-e3395cd9-1
uhttpd-mod-ubus - 2022-08-12-e3395cd9-1
urandom-seed - 3
urngd - 2020-01-21-c7f7b6b6-1
usign - 2020-05-23-f1f65026-1
wireless-regdb - 2022.06.06-1
wpad - 2022-01-16-cff80b4f-11

Thanks @fodiator, I'm definitely interested. Not only in a release-equivalent image, but also in gaining some of the knowledge you have in order to get this far. Could you please share the TODO from the cfg.c file?

It seems to me that the challenge of OpenWrt is understanding where each and every component comes from. Why is this not already sorted out in the upstream Linux components that OpenWrt relies on?

Here is the TODO excerpt:

/*
 * The ASSOC test makes sure the driver is ready to
 * receive the key. When wpa_supplicant has roamed
 * using FT, it attempts to set the key before
 * association has completed, this rejects that attempt
 * so it will set the key again after association.
 *
 * TODO: accept the key if we have a station entry and
 *       add it to the device after the station.
 */

I've included patch 100-allow_key_without_assoc.patch under
[buildroot]/openwrt/package/kernel/mac80211/patches/subsys

patch content
Used to remove FT key addition failed error on Archer C7 v2

Index: backports-5.15.58-1/net/mac80211/cfg.c
===================================================================
--- backports-5.15.58-1.orig/net/mac80211/cfg.c
+++ backports-5.15.58-1/net/mac80211/cfg.c
@@ -466,11 +466,15 @@ static int ieee80211_add_key(struct wiph
                 * TODO: accept the key if we have a station entry and
                 *       add it to the device after the station.
                 */
-               if (!sta || !test_sta_flag(sta, WLAN_STA_ASSOC)) {
+               if (!sta) {
+                       sdata_info(sdata, "mwv1 - no sta\n");
                        ieee80211_key_free_unused(key);
                        err = -ENOENT;
                        goto out_unlock;
                }
+               if (!test_sta_flag(sta, WLAN_STA_ASSOC)) {
+                       sdata_info(sdata, "mwv2 - no sta assoc\n");
+               }
        }

        switch (sdata->vif.type) {

This will automatically be picked up when (re)building from source.

Also, I now have ftrace enabled for my ath10k driver to improve my understanding of how to follow the TODO properly.

How can we get this pulled into the main release?
-Or/And-
How can we build this for each of our own?

I'd be happy to supply a patch file; it's trivial code-wise.

Not sure if this will be accepted as a patch for main release, as it may have the side effect of non-hardware decryption.

For what it's worth, I am tracing the hostapd-mac80211 interplay. It's clear that there is provision for fail-retry settings (depending on ASSOC state), which I need to have a closer look at... Will update here once I understand this better.

--- update ---
The patch above removes the need for the retry cycle in my 802.1x + VLAN setup.

@fodiator understand that it might take longer to get this to a patch that will be accepted.

Meantime, how might we replicate what you have done to get it onto our own setups? Some pointers on where to get going would be helpful. Is it sufficient to follow the Developer Guide on the Wiki?

Any thoughts as to why this hasn't been corrected some time ago already?

@andybjackson building the patch should be fairly straightforward.

Assuming your build environment is set up as per Wiki, you can build the release equivalent as described here:

If you copy the patch content from above, place it in [buildroot]/openwrt/package/kernel/mac80211/patches/subsys, and run make world, you should get the sysupgrade.bin under [buildroot]/bin/targets/[path to your device].

You can confirm the patch is part of your build by running: strings [buildroot]/[yourtarget]/[yourlinux]/backports-5.15.58-1/net/mac80211/mac80211.ko | grep mwv

In the meantime I am still looking to improve the solution by adding the STA to the driver as per the note above. I have been able to get insight into the hostapd code. Hoping to have an implementation within a few weeks.

According to a comment, the reason this code was 'corrected' seems to be that, on some systems, hardware decryption is not active when the key is added before the station is associated.

@fodiator did you actually submit the patch?
If so, could you please update this thread with that information?

Haven’t made a pull request yet. Instead I’ve been tracing the mac80211 code to understand how to follow the TODO suggestion.

I think I understand now what to do (accept key on first offer, then on assoc replace key and load to hw if not already done).

Unfortunately I have not yet had time to implement and test this. Hopefully in the next week or so. If successful, I’ll make that a PR.

If you want the patch right now, you can just copy the contents above :+1:.
