Wireless instability on Netgear R7800 (19.07.2 migrated from 18.06.4)

Would indeed be interesting. Perhaps your curiosity wins and you try again :- )
Strange that there are not more complaints.

PS: KRACK mitigation in both scenarios enabled?

Yes, on for both, but you're making me doubt myself now; maybe I didn't configure it on every wireless interface...

The only difference I see is probably that with 18.06.4, KRACK is on, but with 19.07.2, Wireless is on KRACK... :smirk:

I will retest all that at a less critical time; I'm on pager duty this week.

That sounds really strange: I am using firmware-5.bin_10.4-3.9.0.2-00086 with a matching board file and everything works fine.

Just to be sure, we are talking about the board-2.bin file, right?

Does "matching" mean each version has its own board-2.bin file? I only see the board-2.bin file in https://github.com/kvalo/ath10k-firmware/raw/master/QCA9984/hw1.0. Is there something I'm not doing correctly?

Yeah, that one.

Same issue here, replace family with 'Roommates'.

I've been running 19.07.0 r10860 though, and I'm sad to hear that the two point releases didn't improve anything.

I may just downgrade to 18.06.4 as well. Just posting here and subscribing to see if I can help find a fix.

Essentially, this is what has to be done on 19.07.2 to follow up on the answer from @fantom-x:

opkg update
opkg remove ath10k-firmware-qca9984-ct kmod-ath10k-ct
opkg install wget ath10k-firmware-qca9984 kmod-ath10k
cd /lib/firmware/ath10k/QCA9984/hw1.0/
mv board-2.bin board-2.bin.bk && mv firmware-5.bin firmware-5.bin.bk
wget -O firmware-5.bin https://github.com/kvalo/ath10k-firmware/raw/master/QCA9984/hw1.0/3.9.0.2/firmware-5.bin_10.4-3.9.0.2-00086 --no-check-certificate
wget https://github.com/kvalo/ath10k-firmware/raw/master/QCA9984/hw1.0/board-2.bin --no-check-certificate

If you wish to apply this fix, this is what you would have to run in the CLI, then reboot the router. Just paste it line by line.

EDIT: Updated to also include the replacement of kmod-ath10k-ct that I mentioned in this post.
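
Since the router loads these files at boot, it may be worth checking that both downloads actually landed before rebooting; a failed download can leave a missing or empty file. The following is only a minimal sketch: the function name check_fw_files is hypothetical, and it only tests that the two files exist and are non-empty, not that their contents are valid firmware.

```shell
# Hypothetical helper: return 0 only if firmware-5.bin and board-2.bin
# both exist and are non-empty in the directory given as $1.
check_fw_files() {
    dir="$1"
    status=0
    for f in firmware-5.bin board-2.bin; do
        if [ -s "$dir/$f" ]; then
            echo "ok: $f"
        else
            echo "missing or empty: $f"
            status=1
        fi
    done
    return "$status"
}

# Example use on the router (path taken from the commands above):
#   check_fw_files /lib/firmware/ath10k/QCA9984/hw1.0 && reboot
```

If the check fails, the .bk copies made by the mv step can be moved back before rebooting.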

Thanks for the update. I'll give it a try on mine later today and see if it helps with wireless stability.

Still have to throw 19.07.2 on it first.

As per @Doppel-D, 18.06.8 seems to work well too, so that would be a better option than 18.06.4, since there are quite a few worthwhile CVE fixes between these releases, among other things.

Yeah I haven't looked at the 18.X releases yet, as I got my new router right after 19.X was finalized. Upgraded from a DDWRT Shibby fork IIRC. I'll pick up whatever is latest if I decide to go that route.

Note: This is only a partial solution; see the complete solution in this post.

I've reinstalled 19.07.2 with the Non-CT firmware. We'll see how it goes tomorrow. I've also done this:

opkg update
opkg remove kmod-ath10k-ct
opkg install kmod-ath10k

That looked appropriate since I should be running Non-CT everywhere. This command helped me find it:

root@Mercure03:~# opkg list-installed | grep ath10k
ath10k-firmware-qca9984 - 20190416-1
kmod-ath10k - 4.14.171+4.19.98-1-1
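
To tell at a glance whether a router is on CT, non-CT, or an accidental mix of both, the listing above can be piped through a small classifier. This is a rough sketch, not an official tool: the function name ath10k_flavor is hypothetical, and it assumes CT packages can be recognised by "-ct" in their names, as with the kmod-ath10k-ct and ath10k-firmware-qca9984-ct packages mentioned in this thread.

```shell
# Hypothetical helper: read "name - version" lines on stdin and print
# "ct", "non-ct", "mixed", or "none" depending on the ath10k packages seen.
ath10k_flavor() {
    ct=0
    plain=0
    while read -r name rest; do
        case "$name" in
            kmod-ath10k-ct*|ath10k-firmware-*-ct*) ct=1 ;;
            kmod-ath10k|ath10k-firmware-*) plain=1 ;;
        esac
    done
    if [ "$ct" = 1 ] && [ "$plain" = 1 ]; then
        echo mixed
    elif [ "$ct" = 1 ]; then
        echo ct
    elif [ "$plain" = 1 ]; then
        echo non-ct
    else
        echo none
    fi
}

# Example use on the router:
#   opkg list-installed | grep ath10k | ath10k_flavor
```

A result of "mixed" would suggest that the driver and the firmware package do not come from the same family.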

So far, so good with the changes to Non-CT!

It was a day of heavy usage from 2 gaming consoles (wired/wireless), 1 wireless gaming PC, 1 streaming tablet (wireless) and 2 remote work PC stations (wired), along with VPNs, softphone calls, conference calls and WebEx meetings.

No disconnections, rock-solid connections everywhere. The system log is free of errors and warnings, except for:

daemon.err procd: unable to find /sbin/ujail: No such file or directory (-1)
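
For anyone who wants to run the same check, here is a minimal sketch of a filter that keeps only log lines at warning severity or worse. The function name log_problems is hypothetical, and the pattern assumes the "facility.level" field format seen in the log lines quoted in this thread (for example daemon.err or kern.warn).

```shell
# Hypothetical helper: keep only log lines whose facility.level field
# has severity warn or worse; reads the log on stdin.
log_problems() {
    grep -E ' [a-z0-9]+\.(emerg|alert|crit|err|warn) '
}

# Example use on the router:
#   logread | log_problems
```

Note that this trusts the severity each daemon reports, so a mislabelled message will still show up.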

I am getting them too, and I'm not sure whether they affect anything.

New warning tonight (on 19.07.2):

Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.603381] ------------[ cut here ]------------
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.603461] WARNING: CPU: 0 PID: 0 at backports-4.19.98-1/drivers/net/wireless/ath/ath10k/htt_rx.c:1179 0xbf3a0d10 [ath10k_core@bf38a000+0x48000]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.607124] Modules linked in: pppoe ppp_async ath10k_pci ath10k_core ath pppox ppp_generic nf_conntrack_ipv6 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_tcpmss xt_statistic xt_state xt_recent xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_HL xt_FLOWOFFLOAD xt_DSCP xt_CT xt_CLASSIFY slhc nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_flow_table_hw nf_flow_table nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache iptable_raw iptable_mangle iptable_filter ipt_ECN ip_tables crc_ccitt compat sch_cake nf_conntrack sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_tcindex cls_route
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.668997]  cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred ledtrig_usbport nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 ifb leds_gpio xhci_plat_hcd xhci_pci xhci_hcd dwc3 dwc3_of_simple ohci_platform ohci_hcd phy_qcom_dwc3 ahci ehci_platform sd_mod ahci_platform libahci_platform libahci libata scsi_mod ehci_hcd gpio_button_hotplug
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.704829] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.171 #0
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.727040] Hardware name: Generic DT based system
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.733121] Function entered at [<c030f1c4>] from [<c030b390>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.737891] Function entered at [<c030b390>] from [<c07c0664>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.743794] Function entered at [<c07c0664>] from [<c031fa98>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.749697] Function entered at [<c031fa98>] from [<c031fb84>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.755597] Function entered at [<c031fb84>] from [<bf3a0d10>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.761561] Function entered at [<bf3a0d10>] from [<bf3a2224>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.767405] Function entered at [<bf3a2224>] from [<bf3a29bc>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.773307] Function entered at [<bf3a29bc>] from [<bf3d608c>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.779236] Function entered at [<bf3d608c>] from [<c06a9660>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.785112] Function entered at [<c06a9660>] from [<c03015c8>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.791015] Function entered at [<c03015c8>] from [<c0324000>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.796916] Function entered at [<c0324000>] from [<c0362b60>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.802822] Function entered at [<c0362b60>] from [<c0301488>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.808722] Function entered at [<c0301488>] from [<c030bf8c>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.814626] Exception stack(0xc0a01f48 to 0xc0a01f90)
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.820552] 1f40:                   00000001 00000000 00000000 c0315100 ffffe000 c0a03cb8
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.825771] 1f60: c0a03c6c 00000000 00000000 c092ea28 00000000 00000000 c0a01f90 c0a01f98
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.834001] 1f80: c030854c c0308550 60000013 ffffffff
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.842231] Function entered at [<c030bf8c>] from [<c0308550>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.847351] Function entered at [<c0308550>] from [<c03589c8>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.853168] Function entered at [<c03589c8>] from [<c0358d10>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.859069] Function entered at [<c0358d10>] from [<c0900c54>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.865012] ---[ end trace 3bc62172a69aea2a ]---

Just to confirm, you are still running 19.07.2 non-CT, right?

Yes. I edited my previous post to make that clear.

This log entry (on 19.07.2) obviously has the wrong severity; it should be info, not err:

Sun Mar 22 22:15:18 2020 daemon.err uhttpd[2669]: luci: accepted login on / for root from 192.168.1.200

I'm going to let it run for a few more days before marking fantom-x's answer as the solution, because it seems to be pretty solid right now.

I can confirm that I have had the same Wi-Fi problem since January, when I decided to replace 17.01.5 with 19.07.0 with extended flash space. I reinstalled 18.06.6 the next day, but the problem remained on all home devices.

I switched my R7800 back to non-CT driver/firmware as well to resolve a connection issue with a Pixel 3.

I was also seeing full lockups that required power cycling on the CT driver/firmware.

I believe the switch to CT-based stuff was too soon.

Hi all

I migrated from OpenWRT 18.06.04 to 19.07.02 a week ago and have had no issues with wifi so far. I have 25+ wifi devices on my network (phones, laptops, IoT stuff...)

I believe it's the CT version, as I have these 2 packages installed:
ath10k-firmware-qca9984-ct
kmod-ath10k-ct

I chose to keep my config while upgrading; all I had to do was install the openvpn and sqm packages, and everything is working fine.

I had issues with wifi on 18.06.X, but found the root cause at that time. I was experimenting with network monitoring for my IoT devices using ulogd/syslog-ng (based on this article https://balagetech.com/monitor-network-traffic-openwrt-syslog-ng/), but that was writing so many logs to the flash memory. All wifi problems stopped when I stopped the ulogd daemon.