Upstream repo history of the 88W8964 BLOB, comments and concerns along the journey.
Fascinating - thanks for doing your own bisect, albeit from the firmware side.
What's interesting is that David's last build (and the build I'm working on), used the latest firmware, 9.3.2.12, from Feb 06 2020.
If the latest firmware worked at one point, then that implies a breakage occurred between two components. One being the firmware, and another being... the kernel perhaps?
We'll find out more soon. I should be able to finish my first test build tonight. I'll flash it onto my own router first, to make sure everything seems fine. If it seems okay, I'll post it, and work on the next build which splits the known "good" and known "bad" commits in half.
Actually, I am experiencing the same problems on a WRT1200AC.
Welp I've successfully installed my custom image on my WRT3200ACM. No dropouts yet but I don't expect any. This just sets the baseline to prove that in fact these previous builds (as created by Davidc502) were unaffected.
I'll be honest, I'm kind of shocked at how painless it actually was to get it all working, including builds, reflash, and my scripted uci setup (which just needed some minor tweaking to deal with the rename of the wan interface from wan to eth1.2 due to the change in switch architecture).
EDIT: If anyone's curious, I've thrown this build up for others to see. I only made builds for the WRT3200ACM and WRT32X for now. Just a warning: I did not include all the same packages that David did. You can see what I did include in the *.manifest file.
Use at your own risk. There's not much point right now, as we're just doing a bug bisect, so you aren't gaining anything.
I intentionally left a lot of things at OpenWRT defaults in terms of package selection etc. I just added a few packages that I use now (things like banIP, SQM, and BCP38).
Builds: https://openwrt.austindw.com/linksys-wrt/wifi-hang-bisect/r13342/
Pics or it didn't happen proof:
Do yourself a favour, test it a few days longer than planned, nothing is worse than an early false negative and getting derailed into the wrong direction.
Completely agree. Figure I'll give it 3-5 days, maybe a week.
Unfortunately, my testing into previous firmware blobs was unsuccessful in the end. I apologize if I had gotten any hopes up over this. Firmware 9.3.1.2 was successful for approx. 3 hours, but every attempt after that has failed within 10 minutes. It seems that 3 hours without wifi cutouts on 21.02.0 was comparable to winning the lottery.
I ended up testing firmware 9.3.2.3 after that and, of course, it failed within 10 minutes. So it is clear that a temporary workaround does not exist within previous firmware blobs.
Sorry everyone.
I think that @adworacz is on the right track to pinning this down in a more definitive way, and also that the issue must be within the kernel or possibly mac80211.
One interesting thing that I wanted to point out, though, is that I never lost wireless connectivity on my laptop during all of my testing. It was always the iPhone Xr that suffered the wifi cutouts, and I always had to toggle its wifi off/on to resolve them. So over the course of some 40-50 wifi cutouts during my testing, my laptop never cut out.
It seems that a large percentage of devices that trigger this issue are iPhones and iPads. I've seen others saying some Android devices as well. So mobile devices seem particularly affected.
It's a shame that Marvell/Linksys/NXP couldn't simply leave one developer on mwlwifi for occasional patches. Especially for a device that retails for $399 here in Canada at the time that I purchased mine.
Sorry for the rant. I'm just frustrated.
This has been my experience as well. My laptop and desktop work just fine over wifi, but my iPhone and an IoT device exhibit the hanging issues.
So far, my test build is running stable. Haven't hit a hanging issue yet. Gonna give it a few more days and then I'll test the next build.
Are you guys monitoring DHCP lease time on both sides? Maybe there is a miscommunication and a lease expires.
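For anyone who wants to check the router side of this, here is a minimal sketch. It assumes the stock dnsmasq lease file format (expiry epoch, MAC, IP, hostname, client ID per line) and the usual OpenWrt location /tmp/dhcp.leases. The helper name lease_remaining and the sample line are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: for each dnsmasq lease line read from stdin, print
# "hostname ip seconds-until-expiry" relative to the epoch time passed as $1.
# Assumed line format: <expiry-epoch> <mac> <ip> <hostname> <client-id>
lease_remaining() {
    awk -v now="$1" '{ printf "%s %s %d\n", $4, $3, $1 - now }'
}

# Demo with a made-up lease line (not real data): expiry 1000, "now" 400.
printf '1000 aa:bb:cc:dd:ee:ff 192.168.1.10 iphone 01:aa\n' | lease_remaining 400

# On the router itself (assumed default lease file location):
# lease_remaining "$(date +%s)" < /tmp/dhcp.leases
```

If a cutout lines up with a remaining time near zero or negative for the affected client, that would point toward a lease problem. If the lease still has hours left when the hang occurs, DHCP is probably not the cause.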
Thank you for confirming. As always, I appreciate the time and effort that you are putting into this.
Since I had kept builds of r15878 from March 6, 2021, I finally had time to give it a try last night. My hope was that I could help narrow down a timeline for you. I was hoping that it would not trigger the wifi cutouts, and that would put the bad commit somewhere between March 6, 2021 and April 23, 2021 (RC1).
However, r15878 did trigger the issue of wifi cutouts within 10 minutes on my iPhone Xr that seems especially prone to triggering this issue. Each time, as usual, required toggling the wifi off/on. Therefore the bad commit is before March 6, 2021, not after. Unfortunately still a large time period for your bisecting.
As I understand it, you are testing each build for approx. 3-5 days for thoroughness. If you want to build some images ahead of time, while you are still in testing, I am willing and happy to do some testing of your other builds, particularly since my iPhone Xr seems to trigger the issue rather quickly. I can commit to 3-4 hours of testing each night per build and provide you with some feedback. Then, depending on which ones trigger issues or not, it might help us narrow things down a bit, and at that point do the more thorough 3-5 days of testing on fewer builds.
Let me know what you think. I would like to help if possible. At least as much as I can help as a non-developer.
Thanks for the offer of help, @WildByDesign I will definitely accept it.
The 3-5 days is just for the initial build, or any builds that don't exhibit hanging issues. If we install a build and notice a hanging issue, then that's a "bad" build, and we'll continue the bisect. That might mean we can get through several bad builds in rapid succession.
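To give a feel for how many builds that could mean, here is a rough sketch of the halving arithmetic. It assumes r13342 (the baseline build linked above) holds up as "good", that r15878 is "bad" as reported, and that both revision numbers sit on the same linear history, which may not be exactly true. The function names are made up for illustration; in practice git bisect picks the actual commits.

```shell
#!/bin/sh
# Hypothetical helper: next revision to build, halfway between the
# last known-good and the first known-bad revision number.
bisect_midpoint() {
    good=$1
    bad=$2
    echo $(( good + (bad - good) / 2 ))
}

# Hypothetical helper: rough number of builds still needed before a single
# revision is isolated, i.e. ceil(log2(bad - good)), by repeated halving.
bisect_steps() {
    span=$(( $2 - $1 ))
    steps=0
    while [ "$span" -gt 1 ]; do
        span=$(( (span + 1) / 2 ))
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

# Assumed endpoints: r13342 good, r15878 bad.
bisect_midpoint 13342 15878
bisect_steps 13342 15878
```

Under those assumptions the first bisect build would land around r14610, with roughly 12 builds needed in total. Each "good" result costs several days of soak testing, while each "bad" result can be called as soon as a hang shows up.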
So far my initial build is proving to be stable, so things are looking good. Once a few more days pass, I'll create a new build of the bisect and we can start testing there. It's still a little bit of a manual process to get the builds working properly.
I'll be sure to post the bisect build here for everyone to try.
Just so I know in advance, are there any special packages you need for your testing @WildByDesign? We can't really install packages on these custom builds after the fact, since we don't have a reliable source to pull from. So I need to either build all the required packages into the image, or build them as modules so we can install them individually and manually.
You're welcome. I am feeling very optimistic about this approach.
The only package that is really crucial for me when testing various 21.02.x builds is luci-app-advanced-reboot. That way, once things go sideways I can always reboot back into David's last build on the other partition because I need stable wifi during the day for my family. Night time is when I get a chance to do more testing.
Yup - I'm using the same luci-app-advanced-reboot package myself, so all of my builds will have it.
Sounds good then! I'll prep a new build on either Friday or Saturday.
I'm sure you have seen this post before, but I found it just a few weeks ago... I had success using this setting with no disconnects on a WRT3200ACM. The script disables AMSDU. Stability for >2 weeks on the 21.02 release. Without this, I had the same issue everyone else describes, with iPhone X, iPad, etc. showing connected to WiFi but having no response for several minutes.
I'm not sure what this points to other than a possible incompatibility between the iPhone and the firmware, or maybe a kernel issue with handling this A-MSDU setting.
A-MSDU is an interesting lead. I always had the impression that those commands were not needed for WRT3200. But maybe they are needed now.
I found only one possible suspect commit between David’s last build and 21.02.x series (regarding A-MSDU):
Link: https://github.com/openwrt/openwrt/commit/673062fc56c05ec613f416b0da7325be46e65a29
I just tried RMilk's suggestion regarding AMSDU on a WRT3200ACM with 21.02.0. Unfortunately, Wifi still has dropouts. As described above, Wifi becomes stable on my WRT3200ACM when I disable WMM (but this prevents some clients from connecting at all).
Maybe there are several different issues, all leading to Wifi instability, and having different causes.
Some further reference:
mac80211: do not allow bigger VHT MPDUs than the hardware supports · openwrt/openwrt@caf7277 (github.com)
and
mac80211: allow bigger A-MSDU sizes in VHT, even if HT is limited · openwrt/openwrt@673062f (github.com)
Both commits were added a day apart to the 21.02.0 branch in September 2020 and both were later tagged for 21.02.0 and 21.02.0-RC1 releases.
Patches were removed with:
mac80211: Update to version 5.10-rc6-1 · openwrt/openwrt@12424ed (github.com)
in February 2021. Well, one of the patches was removed with that and the other patch removed in another commit. Both patches were upstreamed as far as I understand.
These patches are potentially suspect in this issue. Just a theory at the moment anyway.
Does anyone know if Master branch still has wifi cutouts?
(I've seen some users say Yes, and some say No).
If this image is still ticking I wonder if you could post the link from the output of:
dmesg | grep -i pci | nc termbin.com 9999
curious as to timeline of PCI changes.
Yes master still has WiFi cutouts. For me anyways.
And Divested’s build has a patch that explicitly disables AMSDU using these same commands, and there’s still cutouts on my WRT3200ACM. This was on master and 21.02, I patched both.