Users needed to test Wi-Fi stability on Linksys WRT3200ACM & WRT32X on OpenWrt 21.02

cat /sys/kernel/debug/ieee80211/phy0/mwlwifi/tx_amsdu
cat /sys/kernel/debug/ieee80211/phy1/mwlwifi/tx_amsdu
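
The two reads above can be wrapped in a small loop so that every phy is checked in one go. This is only a sketch: the function name `show_amsdu` and its optional directory argument are made up for illustration, and the exact text that mwlwifi prints in `tx_amsdu` is not assumed here.

```shell
#!/bin/sh
# Print the mwlwifi tx_amsdu debugfs value for every phy that exposes one.
# $1 (optional, hypothetical) overrides the debugfs root, e.g. for a dry run.
show_amsdu() {
    root="${1:-/sys/kernel/debug/ieee80211}"
    for f in "$root"/phy*/mwlwifi/tx_amsdu; do
        [ -r "$f" ] || continue    # glob did not match, or file not readable
        # Path is <root>/phyN/mwlwifi/tx_amsdu, so go two levels up for "phyN".
        phy=$(basename "$(dirname "$(dirname "$f")")")
        printf '%s: %s\n' "$phy" "$(cat "$f")"
    done
    return 0
}

show_amsdu
```

On a device without the mwlwifi debugfs entries the loop simply prints nothing.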

My AMSDU is enabled so I guess the next step now is finding out why WPA3 doesn't work?

My understanding is that WPA3 will not work. I tried this many ways and Wi-Fi kept crashing to the point that the router would hang (WRT3200ACM). I'm now using WPA2 in AES-only mode.
I need to test and report back on my setup, to see if the AMSDU setting is actually doing anything or if there is something else I set that allowed 21.02 to work for me (iPhone X, iPad, etc.).

RE: WPA3


Sounds like WPA3 not working is a driver issue, and since they haven't been supported for a while now, I think it's safe to say that it's not going to happen. Thanks for the link.

General update: I'm working on a new build now. I'm wrestling with some dependency building issues (looks like fakeroot has a known history of issues around the commit timeline that I'm currently testing).

I've jumped to a commit that hopefully fixes the fakeroot build problems, and will try that out. If I need to go backwards, there's a possible patch I can add per this thread: Unable to build toolchain (fakeroot fails, perhaps others after it) - #3 by darksky

I'll post a new build once I've got things sorted.

Edit: New build is up - https://openwrt.austindw.com/linksys-wrt/wifi-hang-bisect/r15791/

I have not tested this personally just yet.

I got the build working, but haven't had time to flash it on my WRT3200ACM. I hope to get around to doing so later today or tomorrow morning, but not sure when just yet.

@WildByDesign feel free to test this build, or wait until I've flashed it on my own device. It contains the Advanced Reboot package, so it should be easy to swap back if any issues occur. Remember to not include any config when you flash, as there's a serious risk it's not compatible with what you already have.

I'd strongly suggest keeping WPA3 out of this attempt to fix the ongoing issues with mwlwifi. WPA3/IEEE 802.11w never worked, so this isn't a regression and there is nothing to be fixed here. Focus first on the things that ought to work because they did in the past (with older OpenWrt versions), since that's the aspect that may actually be fixable: WPA2/CCMP, no 802.11w at all. Ideally, try to identify a vanilla version that did work (not a community build, as those may contain custom patches muddying the test results) and then try to bisect the breakage point.

--
I have faint memories of community builds (don't remember which; I never had this particular hardware and only watched from the sidelines) including patches to disable A-MSDU completely. Those were never part of vanilla OpenWrt, but they may make all the difference, so it's important to find a common base of what works and what doesn't.


@adworacz Thank you for sharing another build.

I was able to reproduce the exact same issue quite easily on r15791. It was about the same with approximately 3 wireless cutouts within a 10 minute period of time.

If I remember correctly, the kernel bump within this build was from mid-February 2021. I have also tested builds from early March 2021, and they also exhibited the wireless cutouts.

This narrows the timeline down a bit more:
r13342 (David's last build - May 2020) GOOD
r15791 (mid-February 2021) BAD

DSA was implemented in mvebu only a couple of weeks after David's last build. That was the main reason why David stopped his builds, if I recall correctly.

  • Could the implementation of DSA cause wireless cutouts at all? Do wireless packets go through DSA? Could the wireless packets be getting hung up somewhere?

Interestingly, last night I was looking on Wayback Machine (Archive.org) and came across build r13583 which is from June 16, 2020. Unfortunately there were no builds for WRT32X at the time. But I did download the images for WRT3200ACM. This build would be one of the first successful builds for mvebu that contains DSA implementation.

I can test this build (r13583) tomorrow night and that can at least help determine whether or not DSA could be playing a role in this issue. I'm way too tired right now to test another build tonight but I will definitely test this tomorrow night.


Ahhh thanks for testing r15791 so quickly! I won't even bother installing it on my own router then. It's really convenient that you have a device that reproduces the issue so quickly.

I'll get a new build up with a bisect between r13342 and r15791 for you to test. I'll be away for work for a week so I won't be able to make another build in the short term, but should be able to next week.

One interesting note - I did get one single cutout with r13342 on my iPhone 8. But that was one in an entire week. I'm actually not as concerned as I normally would be, because I've seen my iPhone 8 cut out in two other places: once at a hotel, and once at a friend's place who's running a Unifi AP6. Both times it cut out just once and was fine afterwards. So there's definitely something funny about the latest iOS updates + the wifi chip(s), but it doesn't change the fact that it's happening a lot more with certain builds of OpenWrt than others.

Okay another bisect build is up - r14566: https://openwrt.austindw.com/linksys-wrt/wifi-hang-bisect/r14566/

I had to patch fakeroot using this patch to get builds to work, but luckily that went off without a hitch: https://bugs.archlinux.org/task/69572?getfile=19705

Haven't loaded this on my router yet, so the usual warnings apply. I'm getting ready to bounce out of town, so apologies for not being able to test just yet.


Thanks for compiling another build. This should definitely help us get closer to finding the issue. I will absolutely give this build a good test later tonight and provide some feedback here. Have a good, safe trip.


So today I was able to test two builds. I tested r13583, which is the snapshot build that I found on Wayback Machine. I spent approx. 3 hours testing that build and was unable to reproduce the wifi cutouts at all. This build had DSA implemented.

After that, I spent approx. 30 minutes testing your build from today, r14566. I was almost immediately getting wifi cutouts on my iPhone. Same issue, at approx. 3 cutouts per 10-minute period. After 9 or 10 cutouts within half an hour, I could no longer tolerate them. As always, the laptop was fine and the iPhone got the wifi cutouts.
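
A rate like "3 cutouts per 10 minutes" could also be logged from the router instead of counted by hand. The following is a rough sketch, not something used in this thread: `probe_client`, `count_cutouts`, the log format and the client IP are all hypothetical. A sleeping phone may miss pings without the Wi-Fi having dropped, so the count is only a rough indicator.

```shell
#!/bin/sh
# Log one "<epoch-seconds> up|down" line per probe of a wireless client.
# $1 = client IP (hypothetical), $2 = number of probes (default 600).
probe_client() {
    host="$1"; n="${2:-600}"
    while [ "$n" -gt 0 ]; do
        if ping -c 1 -W 1 "$host" >/dev/null 2>&1; then s=up; else s=down; fi
        printf '%s %s\n' "$(date +%s)" "$s"
        n=$((n - 1))
        sleep 1
    done
}

# Count cutouts in such a log read from stdin: each up -> down transition
# is one cutout, so a run of consecutive "down" lines is counted once.
count_cutouts() {
    awk '$2 == "down" && prev == "up" { n++ } { prev = $2 } END { print n + 0 }'
}
```

Possible usage, with a made-up address: `probe_client 192.168.1.50 600 > /tmp/cutouts.log`, then `count_cutouts < /tmp/cutouts.log`. With one probe per second, 600 probes cover a little over ten minutes.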

Updated Builds List

r13342 (David's last build - May 2020) GOOD
r13583 (found on Wayback Machine - June 2020) GOOD
r14566 (September 2020) BAD
r15791 (February 2021) BAD
21.02.x (All RCs and final stable) BAD

Therefore the issue is now narrowed down between:

r13583 (June 16, 2020) and r14566 (September 25, 2020)


In that case I'd suggest going back to r13583, ideally building it from source yourself to confirm the finding, and testing it for at least ~a week. Once you are sure that your (self-built) r13583 is fine, you can git bisect between r13583 and r14566. The good news: as you're past the DSA introduction, you can retain configs between tests (making testing a lot easier).
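
To get a feel for how much testing a bisect between r13583 and r14566 involves, the arithmetic can be sketched as below. It assumes that the difference between two rNNNNN numbers roughly equals the number of commits between them; the helper names `bisect_mid` and `bisect_steps` are made up. In a real source tree, `git bisect start`, `git bisect good <commit>` and `git bisect bad <commit>` pick the midpoints automatically.

```shell
#!/bin/sh
# Midpoint between a good and a bad revision number (integer division).
bisect_mid() { echo $(( ($1 + $2) / 2 )); }

# Worst-case number of builds still to test: smallest k with 2^k >= (bad - good).
bisect_steps() {
    span=$(( $2 - $1 )); k=0; p=1
    while [ "$p" -lt "$span" ]; do
        p=$((p * 2)); k=$((k + 1))
    done
    echo "$k"
}

bisect_mid 13583 14566      # next revision number to try
bisect_steps 13583 14566    # roughly how many test rounds remain
```

For this range the midpoint is r14074 and about 10 rounds are needed, which is why each round has to be judged as quickly and reliably as possible.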

Thank you for your time, I appreciate it.

Sorry, I should have been more specific about build r13583. It wasn't some community build that I found off Wayback Machine. It was an official OpenWrt snapshot build and I was able to verify the SHA256 hash. So it was a clean/vanilla build of OpenWrt from snapshots and therefore no worry of unofficial patches and such.

Although I have been an avid OpenWrt user for 10-15 years, I do not have a build environment nor have I built OpenWrt before and unfortunately setting up a build environment and compiling builds is something that I have zero experience with.

I agree 100% with your suggestion of testing images for 1 week for more thorough testing. But I cannot do that because my kids have to do virtual learning Mon-Fri for approx. 8 hours each of those weekdays. And I have a specific build setup on the other partition that I use during weekdays for reliability and performance that is needed.

That is why I can only commit to this more rapid-testing kind of strategy where I can commit to approx. 3 hours every night. @adworacz is able to do the 1 week per build testing of the builds that my rapid-testing flushes out as potentially good builds.

By the way, you have always provided detailed and thorough answers to user questions and problems over the years and I have benefited greatly in learning from your posts. I greatly appreciate your commitment and effort that you've always had toward this community.

Cheers!

@WildByDesign good to hear your testing updates!

I’ll create a new bisect build between the new known commits on Saturday or Sunday this weekend.

I can create a build specific to your new known good commit and test longer term on my WRT3200ACM.


I just realized now that these two commits that I suspected as potential for causing this issue are still within the range of bad builds.

@adworacz Would it make sense, on the weekend, to compile a build that contains these? Or maybe a build that is a day or two before these commits?

Yup I can make a few different builds to test out before/during those commits.

I spent the past week doing more research and testing on the possibility of those A-MSDU and MPDU patches causing the wireless cutouts. I was able to force the wireless chipset to use different max MPDU values and different A-MSDU values, including disabled.

config wifi-device 'radio0'
        option type 'mac80211'
        option vht_max_mpdu '7991'
        option max_amsdu '0'

Other tested options:

        option vht_max_mpdu '3895'
        option vht_max_mpdu '11454'

  • Setting max_amsdu to '0' does not disable AMSDU, it lowers AMSDU to 3839 octets from 7935 octets.

config wifi-device 'radio1'
        option type 'mac80211'
        option max_amsdu '0'
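
The same options can be set from the command line instead of editing /etc/config/wireless by hand. This is a minimal sketch: the helper `amsdu_batch` is hypothetical and only prints commands in the form that `uci batch` reads from stdin, so it changes nothing by itself.

```shell
#!/bin/sh
# Emit `uci batch` commands for one radio.
# $1 = radio section name, $2 = max_amsdu value, $3 = optional vht_max_mpdu value.
amsdu_batch() {
    printf "set wireless.%s.max_amsdu='%s'\n" "$1" "$2"
    if [ -n "$3" ]; then
        printf "set wireless.%s.vht_max_mpdu='%s'\n" "$1" "$3"
    fi
    printf 'commit wireless\n'
}

amsdu_batch radio0 0 7991
amsdu_batch radio1 0
```

On the router, something like `amsdu_batch radio0 0 7991 | uci batch` followed by `wifi` should apply the change; the section names radio0 and radio1 are taken from the config above and may differ on other setups.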

I also tested the well known method for completely disabling A-MSDU:

echo "0" >> /sys/kernel/debug/ieee80211/phy0/mwlwifi/tx_amsdu
echo "0" >> /sys/kernel/debug/ieee80211/phy1/mwlwifi/tx_amsdu

These options were all reflected correctly in the system via /var/run/hostapd-phy0.conf and /var/run/hostapd-phy1.conf. So at least we know that those options work, and we can play around with them in the future to find the best performance for our WRT routers. However, none of them works as a workaround for the wireless cutout issue.
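
One way to check this is to pull the capability lines out of the generated hostapd configs. The sketch below assumes the second radio's file is /var/run/hostapd-phy1.conf and uses a made-up helper name, `show_capab`. In hostapd's config format, the A-MSDU limit appears as a flag such as [MAX-AMSDU-7935] in ht_capab and the MPDU limit as [MAX-MPDU-7991] or [MAX-MPDU-11454] in vht_capab.

```shell
#!/bin/sh
# Print the ht_capab / vht_capab lines from generated hostapd configs.
# With no arguments, the two per-radio files are used (second path is an assumption).
show_capab() {
    [ "$#" -gt 0 ] || set -- /var/run/hostapd-phy0.conf /var/run/hostapd-phy1.conf
    for conf in "$@"; do
        [ -r "$conf" ] || continue    # skip files that do not exist here
        echo "== $conf"
        grep -E '^(ht|vht)_capab=' "$conf" || true
    done
    return 0
}

show_capab
```

Comparing the output before and after a config change shows whether the option actually reached hostapd.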

So throughout the week, I ended up testing every single option above and in every possible combination in between. I always rebooted the WRT3200ACM router after each change to ensure the cleanest testing attempt.

Unfortunately, despite all of the attempts, the wireless cutouts continued with zero benefits.

@adworacz I no longer think that those two commits are the culprit. So I don't think that creating builds specifically around those commits is as beneficial as I once thought. I think that you should follow your original plan and create multiple builds that are spaced out more evenly throughout that time period.

The time period still remains: r13583 (June 16, 2020) and r14566 (September 25, 2020)

There are still so many commits within that range that can potentially cause problems. mac80211 had many updates within that time period, including the major version update to 5.8 among others. Also lots of kernel bumps as well. So the bug is somewhere within that range, however, it is still a complete mystery at this time.

Thanks to all providing their precious time, experience and deep knowledge. I don't want to interrupt the technical discussion going on here.

Just wanted to report that I tried v21.02.1 on my WRT3200ACM. The Wi-Fi dropouts are still there; in my experience they are even worse, or rather more frequent, than before.


If it helps with testing, I have an extra WRT32X and an extra WRT3200 that I picked up specifically to migrate my 2 existing WRT32Xs to the 21.x train, because of the switch from swconfig to DSA. I could quite easily put test code on one of them, temporarily swap it in for one of my production devices, and run a specific set of tests if someone defines what needs to be tested.