AC mode on Raspberry Pi 4 WiFi

Yup. I'll diff what you have against upstream to get a sense of where the problems are.

I'm trying to figure out what actually calls that script so I can see the data that is passed here:

init_wireless_driver "$@"

thinking about it some more...

the C stuff ( cfg80211.ko )... is just if not more a likely cantidate... why else would the thing start scanning and dump itself back to client mode...

i.e.;

cat /sys/kernel/debug/ieee80211/phy0/ht40allow_map

haven't been using the buildroot much for the pi... but spinning up a full debug image of the above might be worth a shot ( failing the pro's weighing in )

It seems unlikely it's the kernel modules - there's little to no difference between mainline and what Ubuntu runs on a Pi 4 at this point for cfg80211 along with brcmfmac, and at least

iw phy phy0 info

gives nearly identical results other than some small countrycode oddities - and those disappear (but our problems do not) if you copy a regdb from an Ubuntu machine.

I'm wondering if netifd has some code I haven't found that adds .sh to certain items automatically which is why grepping for "mac80211.sh" doesn't result in much that is useful.

1 Like

hmmm... well my modded 80211.sh is no good today...

either way... show some insight... some random errs...

[ 2644.501258] ieee80211 phy0: brcmf_cfg80211_start_ap: Set Channel failed: chspec=57386, -52
[ 2650.524544] ieee80211 phy0: brcmf_cfg80211_start_ap: Set Channel failed: chspec=57386, -52

#Fri Aug  7 17:44:07 2020 daemon.notice netifd: radio0 (8955): + iw phy phy0 set txpower auto
#Fri Aug  7 17:44:07 2020 daemon.notice netifd: radio0 (8955): command failed: I/O error (-5)
#Fri Aug  7 17:44:07 2020 daemon.notice netifd: radio0 (8955): + '[' -n  ]


No force 40 and b rates on ( default )

Fri Aug  7 18:35:19 2020 kern.info kernel: [ 4390.387121] device wlan0 entered promiscuous mode
Fri Aug  7 18:35:19 2020 daemon.notice hostapd: wlan0: interface state UNINITIALIZED->COUNTRY_UPDATE
*****Fri Aug  7 18:35:19 2020 daemon.notice hostapd: wlan0: interface state COUNTRY_UPDATE->HT_SCAN
Fri Aug  7 18:35:19 2020 kern.err kernel: [ 4390.393482] ieee80211 phy0: brcmf_run_escan: error (-52)
Fri Aug  7 18:35:19 2020 kern.err kernel: [ 4390.398803] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-52)
Fri Aug  7 18:35:19 2020 kern.info kernel: [ 4390.921464] br-lan: port 2(wlan0) entered disabled state
Fri Aug  7 18:35:19 2020 kern.info kernel: [ 4390.930863] device wlan0 left promiscuous mode
Fri Aug  7 18:35:19 2020 kern.info kernel: [ 4390.935510] br-lan: port 2(wlan0) entered disabled state
Fri Aug  7 18:35:19 2020 kern.err kernel: [ 4390.963329] ieee80211 phy0: brcmf_run_escan: error (-52)
Fri Aug  7 18:35:19 2020 kern.err kernel: [ 4390.968801] ieee80211 phy0: brcmf_cfg80211_scan: scan error (-52)
Fri Aug  7 18:35:19 2020 daemon.err hostapd: Failed to request a scan of neighboring BSSes ret=-52 (No error information)
******Fri Aug  7 18:35:19 2020 daemon.notice hostapd: wlan0: interface state HT_SCAN->DISABLED
Fri Aug  7 18:35:19 2020 daemon.notice hostapd: wlan0: AP-DISABLED
Fri Aug  7 18:35:19 2020 daemon.err hostapd: wlan0: Unable to setup interface.
Fri Aug  7 18:35:19 2020 daemon.notice hostapd: nl80211: deinit ifname=wlan0 disabled_11b_rates=0
Fri Aug  7 18:35:19 2020 daemon.notice hostapd: nl80211: Failed to remove interface wlan0 from bridge br-lan: Invalid argument
Fri Aug  7 18:35:20 2020 daemon.notice hostapd: wlan0: CTRL-EVENT-TERMINATING
Fri Aug  7 18:35:20 2020 daemon.err hostapd: hostapd_free_hapd_data: Interface wlan0 wasn't started
Fri Aug  7 18:35:20 2020 daemon.notice netifd: radio0 (13342): Command failed: Invalid argument
Fri Aug  7 18:36:03 2020 daemon.notice netifd: radio0 (13507): Command failed: Invalid argument
Fri Aug  7 18:36:04 2020 daemon.notice netifd: radio0 (13507): command failed: No error information (-524)
Fri Aug  7 18:36:04 2020 daemon.notice netifd: radio0 (13522): command failed: No error information (-524)
Fri Aug  7 18:36:04 2020 daemon.notice netifd: radio0 (13522): command failed: I/O error (-5)
Fri Aug  7 18:36:09 2020 daemon.notice hostapd: Configuration file: /var/run/hostapd-phy0.conf (phy wlan0) --> new PHY
Fri Aug  7 18:36:09 2020 kern.info kernel: [ 4440.089994] br-lan: port 2(wlan0) entered blocking state
Fri Aug  7 18:36:09 2020 kern.info kernel: [ 4440.095335] br-lan: port 2(wlan0) entered disabled state


force 40 and b rates off

Fri Aug  7 18:36:09 2020 kern.info kernel: [ 4440.100813] device wlan0 entered promiscuous mode
Fri Aug  7 18:36:09 2020 daemon.notice hostapd: wlan0: interface state UNINITIALIZED->COUNTRY_UPDATE
********Fri Aug  7 18:36:09 2020 daemon.notice hostapd: wlan0: interface state COUNTRY_UPDATE->ENABLED
Fri Aug  7 18:36:09 2020 daemon.notice hostapd: wlan0: AP-ENABLED
Fri Aug  7 18:36:09 2020 kern.err kernel: [ 4440.110978] ieee80211 phy0: brcmf_cfg80211_start_ap: Set Channel failed: chspec=57386, -52
Fri Aug  7 18:36:15 2020 kern.err kernel: [ 4446.119866] ieee80211 phy0: brcmf_cfg80211_start_ap: Set Channel failed: chspec=57386, -52
Fri Aug  7 18:36:16 2020 daemon.notice hostapd: Remove interface 'wlan0'
Fri Aug  7 18:36:16 2020 daemon.notice hostapd: wlan0: interface state ENABLED->DISABLED
Fri Aug  7 18:36:16 2020 daemon.notice hostapd: wlan0: AP-DISABLED
Fri Aug  7 18:36:16 2020 daemon.notice hostapd: wlan0: CTRL-EVENT-TERMINATING
Fri Aug  7 18:36:16 2020 daemon.notice hostapd: nl80211: deinit ifname=wlan0 disabled_11b_rates=0
Fri Aug  7 18:36:16 2020 kern.info kernel: [ 4447.958343] device wlan0 left promiscuous mode
Fri Aug  7 18:36:16 2020 kern.info kernel: [ 4447.962908] br-lan: port 2(wlan0) entered disabled state
Fri Aug  7 18:36:17 2020 daemon.notice netifd: radio0 (13696): command failed: No error information (-524)
Fri Aug  7 18:36:17 2020 daemon.notice netifd: radio0 (13714): command failed: No error information (-524)
Fri Aug  7 18:36:17 2020 daemon.notice netifd: radio0 (13714): command failed: I/O error (-5)
Fri Aug  7 18:36:22 2020 daemon.notice hostapd: Configuration file: /var/run/hostapd-phy0.conf (phy wlan0) --> new PHY
Fri Aug  7 18:36:22 2020 kern.info kernel: [ 4453.570751] br-lan: port 2(wlan0) entered blocking state
Fri Aug  7 18:36:22 2020 kern.info kernel: [ 4453.576100] br-lan: port 2(wlan0) entered disabled state
Fri Aug  7 18:36:22 2020 kern.info kernel: [ 4453.581558] device wlan0 entered promiscuous mode
Fri Aug  7 18:36:22 2020 daemon.notice hostapd: wlan0: interface state UNINITIALIZED->COUNTRY_UPDATE
Fri Aug  7 18:36:22 2020 kern.err kernel: [ 4453.591575] ieee80211 phy0: brcmf_cfg80211_start_ap: Set Channel failed: chspec=57386, -52
Fri Aug  7 18:36:22 2020 daemon.notice hostapd: wlan0: interface state COUNTRY_UPDATE->ENABLED
Fri Aug  7 18:36:22 2020 daemon.notice hostapd: wlan0: AP-ENABLED
Fri Aug  7 18:36:28 2020 kern.err kernel: [ 4459.600639] ieee80211 phy0: brcmf_cfg80211_start_ap: Set Channel failed: chspec=57386, -52
Fri Aug  7 18:36:34 2020 kern.err kernel: [ 4465.622565] ieee80211 phy0: brcmf_cfg80211_start_ap: Set Channel failed: chspec=57386, -52


EDIT: b rates on has no effect... so it's only the force 40... and something to do with how that influences initialisation

There is a note in the rtl8812ac driver from aircrack-ng that says the driver has trouble with the way openwrt create and destroy wlan interface on bring up.
This driver has the same problematic behaviour of the rpi4 broadcom driver.
Maybe it is somehow related

2 Likes

Oh, useful info there!

As to what is calling mac80211.sh - it looks like netifd traverses /lib/netifd to find all .sh scripts:

https://git.openwrt.org/?p=project/netifd.git;a=blob;f=handler.c;h=8608a9751ea9f5705b182b762cb0a9b5b8090852;hb=HEAD

Enabling netfid debug output - #3 by jeff is likely to be useful

Edit: I'm getting debug info out of netifd, unfortunately right now I'm not located at my VHT80 test location, VHT20 is working fine where I am. Although it's not seeing some APs on channel 153 that my Android phone can see, I'm not sure if this is due to signal strength or is related to our issue.

When I get to my VHT80 test location I'll drop that AP down to a lower channel to see what happens.

Edit: At least as a client, VHT40 in lower channels works, when I looked at wifianalyzer on my phone I noticed they were running at least VHT40 and editing the config in LuCi to match that works.

Edit 2: Never mind, setting VHT40 seems to cause an I/O error from the driver, but it still connects fine, just at VHT20 instead
Edit 3: Maybe just HT20... A GL-AR750S on the same network reports:

135.0 Mbit/s, 40 MHz, MCS 6, Short GI
150.0 Mbit/s, 40 MHz, MCS 7, Short GI

The Pi reports:

150.0 Mbit/s, 20 MHz
200.0 Mbit/s, 20 MHz

No MCS or GI information - I'm not sure exactly what that means

Edit 4: But

iw dev wlan0 info

returns a channel width of 40 MHz?

1 Like

An additional bit of information:

The inability to see my home router is definitely a frequency issue. For some reason, higher channels are not seen in the scan results despite the countrycode being set to US and an updated regdb that (at least) gives sensible results from

iw phy phy0 info

Obviously it doesn't work with the old regdb either. I've seen reports that on some OSes (wasn't the case for my Ubuntu unit) you need to set the countrycode in Raspbian first for some strange reason.

yeah... that hack above to manually get 80 working wanted some manual country set action... pretty sure there is a 'timing/sequencing-setup' thing going on... [in addition to the other capability limitation/matching thing]...

but... there is also a fair chance there is a 'stale state' thing going on in master since a few weeks plus... as well...

so that's 3 finnicky glitches... but the one you mention ( freqs ) is the most pertinent...

"stale state"?

I've tried seting the roamoff parameter when the module is loaded to 0 - no effect. (That is one thing I noticed was different in the Ubuntu kernel)

If you pass a frequency of 5785 (channel 157, which the iw phy info says should be valid, and is valid in the US) when trying to scan using

iw dev wlan0 scan freq 5785
  • it behaves as if you didn't pass a frequency. No errors anywhere.

Pass a frequency like 2412 and the freq option takes effect

country AU: DFS-ETSI
        (2400 - 2483 @ 40), (N/A, 36), (N/A)
        (5150 - 5250 @ 80), (N/A, 23), (N/A), NO-OUTDOOR, AUTO-BW
        (5250 - 5350 @ 80), (N/A, 20), (0 ms), NO-OUTDOOR, DFS, AUTO-BW
        (5470 - 5600 @ 80), (N/A, 27), (0 ms), DFS
        (5650 - 5730 @ 80), (N/A, 27), (0 ms), DFS
        (5730 - 5850 @ 80), (N/A, 36), (N/A)
        (57000 - 66000 @ 2160), (N/A, 43), (N/A), NO-OUTDOOR
country US: DFS-FCC
        (2400 - 2483 @ 40), (N/A, 30), (N/A)
        (5150 - 5250 @ 80), (N/A, 23), (N/A), AUTO-BW
        (5250 - 5350 @ 80), (N/A, 23), (0 ms), DFS, AUTO-BW
        (5470 - 5730 @ 160), (N/A, 23), (0 ms), DFS
        (5730 - 5850 @ 80), (N/A, 30), (N/A)
        (57240 - 71000 @ 2160), (N/A, 40), (N/A)

notice the @80

Yup. Same for my regdb pulled from an Ubuntu installation:

global
country US: DFS-FCC
        (2402 - 2472 @ 40), (N/A, 30), (N/A)
        (5170 - 5250 @ 80), (N/A, 23), (N/A), AUTO-BW
        (5250 - 5330 @ 80), (N/A, 23), (0 ms), DFS, AUTO-BW
        (5490 - 5730 @ 160), (N/A, 23), (0 ms), DFS
        (5735 - 5835 @ 80), (N/A, 30), (N/A)
        (57240 - 63720 @ 2160), (N/A, 40), (N/A)

But while "iw list" and "iw phy phy0 info" and "iw reg get" give the data you expect, it's almost as if the firmware in the chip is behaving differently.

I can't find anything relevant in the kernel source - it appears that the Ubuntu kernel is not using any of the customizations seen in the raspi.org kernels, and the Ubuntu kernel works well.

yeah... auto >>> set to 20>ok set it to 40>ok ( both 36 )

set to 80 >>>

[  258.351588] ieee80211 phy0: brcmf_cfg80211_start_ap: Set Channel failed: chspec=57386, -52
[  264.375726] ieee80211 phy0: brcmf_cfg80211_start_ap: Set Channel failed: chspec=57386, -52

set it back to 40 >>> crashes into 'client'...

( sure, you expect 80 to be an issue... but not being able to undo that change without re-instantiating the modules is a concern )

related ... i think i owe an apology to the team... these errors are popping up on all platforms spanning back more than a year... but yes... some distro's seem to have found ways to massage things in to submission.... this one goes in the 'time will fix it basket' I think...

Yeah. I'm wondering if the Ubuntu kernel source on raspis is different from the "normally documented" kernel source... Half of Ubuntu's documentation regarding obtaining kernel source is sadly out of date (notably, anything involving apt-get source fails to actually pull down kernel source for me)

The official raspi kernel source does have some patches related to roaming mode and country code changes, although the commit descriptions imply it shouldn't affect an override coming from userspace, just the defaults.

1 Like

https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux-raspi/+git/focal/commit/drivers/net/wireless/broadcom/brcm80211?id=321ba7449a30d88ac5b93cae413cc913c3b37fe3 is interesting. The commit message says that it adds support for a device we don't have in a pi4, however:

"brcmfmac: Use original country code as a fallback

Commit 73345fd212980d2e28a5c6d83801c903bd773680:

brcmfmac: Configure country code using device specific settings

prevents region codes from working on devices that lack a region code
translation table. In the event of an absent table, preserve the old
behaviour of using the provided code as-is.

Signed-off-by: Phil Elwell phil@raspberrypi.org"

Sounds a lot like our problem, and the patch touches code that is not specific to a 43341...

1 Like

they are running firmware:7.45.202 ( vs 206 ) with 43455-pwropt action... must be a reason they haven't bumped...

to be fair... master is master...

I finally got tracing working to potentially see if the missing patches might be relevant, but go figure, it's hard to turn on tracing for the relevant line without causing massive logspam.

I'm getting very tempted to just apply the patches Ubuntu and RPi use and see what happens.

I did determine that scan results are definitely a firmware thing - userspace sends the correct channel list to the device, and the firmware seems to ignore it silently, or if not, there's so much logspam I don't see the error response.

1 Like

( fyi... the latest master bumped some wireless code )

this looks like a fairly clean init(from boot)... except for the last line or few... didn't see it much yesterday / before... ( ACS scan )

Summary
Tue Aug 11 15:34:22 2020 daemon.notice hostapd: Configuration file: /var/run/hostapd-phy0.conf (phy wlan0) --> new PHY
Tue Aug 11 15:34:22 2020 kern.info kernel: [   10.085042] br-lan: port 2(wlan0) entered blocking state
Tue Aug 11 15:34:22 2020 kern.info kernel: [   10.090381] br-lan: port 2(wlan0) entered disabled state
Tue Aug 11 15:34:22 2020 kern.info kernel: [   10.095821] device wlan0 entered promiscuous mode
Tue Aug 11 15:34:22 2020 daemon.notice hostapd: ACS: Automatic channel selection started, this may take a bit
Tue Aug 11 15:34:22 2020 daemon.notice hostapd: wlan0: interface state UNINITIALIZED->ACS
Tue Aug 11 15:34:22 2020 daemon.notice hostapd: wlan0: ACS-STARTED
Tue Aug 11 15:34:24 2020 daemon.notice hostapd: ACS: Survey is missing channel time
Tue Aug 11 15:34:24 2020 daemon.notice hostapd: ACS: Survey is missing RX and busy time (at least one is required)
Tue Aug 11 15:34:24 2020 daemon.notice hostapd: ACS: Survey is missing channel time
Tue Aug 11 15:34:24 2020 daemon.notice hostapd: ACS: Survey is missing RX and busy time (at least one is required)
Tue Aug 11 15:34:24 2020 daemon.notice hostapd: ACS: Survey is missing channel time

unless it's just a client thing... also some interest background re: cypress clm blobs... also, those guys use rfkill and alternate region code...

Pulling in some of the patches from the rpi kernel sources don't seem to have done anything, unless you have to use both those AND recent blobs from https://github.com/RPi-Distro/firmware-nonfree/tree/master/brcm (Nope, using most recent blobs with those patches is failing. Now attempting to set ccode in the nvram text file and crank up tracing...)

That's interesting regarding the firmware blobs - although openwrt does pull in the Cypress distribution by default, not the linux-firmware blobs.

######debian ( lpc_pwropt-43455 )
7.45.202 (r724630 CY)       Mar  2 2020         2020-03-02 23:19:15         DVID 01-37031479
##### raspi64  ( lpc_pwropt-43455 )
7.45.202 (r724630 CY)       Mar  2 2020         2020-03-02 23:19:15         DVID 01-37031479
### ubuntu not sure which one
43455c0 Version: 7.45.202 (r724630 CY) CRC: 4b9a9ceb Date: Mon 2020-03-02 23:32:43 PST Ucode Ver: 1043.2139 FWID 01-72f6ece2
?> 43455c5-roml 7.84.17.1 (r871554) CRC: 72494685 Date: Thu 2020-05-14 17:41:11 KST Ucode Ver: 1043.20424 FWID: 01-3d9e1d87
### owrt latest -> macbump
7.45.206 (r725000 CY)       Mar 23 2020         2020-03-23 02:06:14         DVID 01-2c52dea2

rpi-distro has 206 also, I won't be able to check Ubuntu 20.04 until tonight.

Also, i bumped up LOG_BUF_SHIFT in the kconfig so I can get all of the logspam, and it turns out that my attempts to patch the kernel didn't actually apply... This is the first time I've tried to patch a kernel within OpenWRT's build system. I'm in the process of rebuilding a second attempt now.