Wireless instability on Netgear R7800 (19.07.2 migrated from 18.06.4)

New warning tonight (on 19.07.2):

Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.603381] ------------[ cut here ]------------
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.603461] WARNING: CPU: 0 PID: 0 at backports-4.19.98-1/drivers/net/wireless/ath/ath10k/htt_rx.c:1179 0xbf3a0d10 [ath10k_core@bf38a000+0x48000]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.607124] Modules linked in: pppoe ppp_async ath10k_pci ath10k_core ath pppox ppp_generic nf_conntrack_ipv6 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_tcpmss xt_statistic xt_state xt_recent xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_HL xt_FLOWOFFLOAD xt_DSCP xt_CT xt_CLASSIFY slhc nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_flow_table_hw nf_flow_table nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache iptable_raw iptable_mangle iptable_filter ipt_ECN ip_tables crc_ccitt compat sch_cake nf_conntrack sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_tcindex cls_route
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.668997]  cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred ledtrig_usbport nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 ifb leds_gpio xhci_plat_hcd xhci_pci xhci_hcd dwc3 dwc3_of_simple ohci_platform ohci_hcd phy_qcom_dwc3 ahci ehci_platform sd_mod ahci_platform libahci_platform libahci libata scsi_mod ehci_hcd gpio_button_hotplug
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.704829] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.171 #0
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.727040] Hardware name: Generic DT based system
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.733121] Function entered at [<c030f1c4>] from [<c030b390>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.737891] Function entered at [<c030b390>] from [<c07c0664>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.743794] Function entered at [<c07c0664>] from [<c031fa98>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.749697] Function entered at [<c031fa98>] from [<c031fb84>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.755597] Function entered at [<c031fb84>] from [<bf3a0d10>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.761561] Function entered at [<bf3a0d10>] from [<bf3a2224>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.767405] Function entered at [<bf3a2224>] from [<bf3a29bc>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.773307] Function entered at [<bf3a29bc>] from [<bf3d608c>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.779236] Function entered at [<bf3d608c>] from [<c06a9660>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.785112] Function entered at [<c06a9660>] from [<c03015c8>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.791015] Function entered at [<c03015c8>] from [<c0324000>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.796916] Function entered at [<c0324000>] from [<c0362b60>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.802822] Function entered at [<c0362b60>] from [<c0301488>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.808722] Function entered at [<c0301488>] from [<c030bf8c>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.814626] Exception stack(0xc0a01f48 to 0xc0a01f90)
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.820552] 1f40:                   00000001 00000000 00000000 c0315100 ffffe000 c0a03cb8
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.825771] 1f60: c0a03c6c 00000000 00000000 c092ea28 00000000 00000000 c0a01f90 c0a01f98
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.834001] 1f80: c030854c c0308550 60000013 ffffffff
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.842231] Function entered at [<c030bf8c>] from [<c0308550>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.847351] Function entered at [<c0308550>] from [<c03589c8>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.853168] Function entered at [<c03589c8>] from [<c0358d10>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.859069] Function entered at [<c0358d10>] from [<c0900c54>]
Sat Mar 21 20:59:30 2020 kern.warn kernel: [156361.865012] ---[ end trace 3bc62172a69aea2a ]---

Just to confirm: you are still running 19.07.2 non-CT, right?

Yes. I edited my previous post to make that clear.

This log entry (on 19.07.2) obviously has the wrong severity; a successful login should be logged as info, not err:

Sun Mar 22 22:15:18 2020 daemon.err uhttpd[2669]: luci: accepted login on / for root from 192.168.1.200

I'm going to let it run for a few more days before marking fantom-x's post as the solution, because it seems to be pretty solid right now.

I can confirm that I have had the same Wi-Fi problem since January, when I decided to replace 17.01.5 with 19.07.0 with extended flash space. The next day I reinstalled 18.06.6, but the problem remained on all home devices.

I switched my R7800 back to non-CT driver/firmware as well to resolve a connection issue with a Pixel 3.

I was also seeing full lock ups that required power resets on the CT driver/firmware.

I believe the switch to CT-based stuff was too soon.


Hi all

I migrated from OpenWrt 18.06.4 to 19.07.2 a week ago and have had no issues so far with wifi. I have 25+ wifi devices on my network (phones, laptops, IoT stuff...)

I believe it's the CT version, as I have these 2 packages installed:
ath10k-firmware-qca9984-ct
kmod-ath10k-ct
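
For anyone who wants to check the same thing on their own router, here is a minimal sketch. The helper name ath10k_flavour is made up for this example; it only assumes that the input looks like `opkg list-installed` output, i.e. the package name is the first field on each line.

```shell
# Hypothetical helper: reads `opkg list-installed` output on stdin and
# reports whether the ath10k kernel module is the CT or the mainline build.
ath10k_flavour() {
    awk '
        $1 ~ /^kmod-ath10k-ct/ { ct = 1 }      # CT driver (any variant)
        $1 == "kmod-ath10k"    { plain = 1 }   # mainline (non-CT) driver
        END {
            if (ct && plain) print "both"
            else if (ct)     print "ct"
            else if (plain)  print "non-ct"
            else             print "none"
        }
    '
}

# On the router (assumes opkg is present, as on a stock OpenWrt image):
# opkg list-installed | ath10k_flavour
```

With the two packages listed above installed, this should report "ct".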

I chose to keep my config while upgrading, all I had to do is install openvpn and sqm packages, all working fine.

I had issues with wifi on 18.06.x, but found the root cause at that time. I was experimenting with network monitoring for my IoT devices using ulogd/syslog-ng (based on this article https://balagetech.com/monitor-network-traffic-openwrt-syslog-ng/), but that was writing a huge amount of logs to the flash memory. All wifi problems stopped when I stopped the ulogd daemon.

Hi guys,

19.07.2 for all below

Using dumb AP1 Archer c2600
(2 VLANS -> 2 SSIDs, 2x 2.4GHz + 2x 5 GHz, ie each SSID on 2.4 and 5GHz)
-> WiFi disappeared, no GUI or SSH access after a while
-> or laggy WiFi
-> shall I replace wifi firmware or go back to 18.06.8 ?
-> going to be fixed in 19.07.3 ? (quite some devices with qca998x)

Using dumb AP2 pcengines APU2e4 with 2x Compex WLE900VX (also qca988x) same VLAN and SSID stuff as for first AP1
-> iPhones 7 and SE have issues after a while with AP2 (and AP1)
-> overall less issues as for AP1 but not perfect (as with 18.06.4)
-> same questions as above, what do you recommend based on your experience ?

Using x86-64 as Router/Firewall (no wifi no qca988x)
i5-3470 4GB 128GB SSD
-> already down twice (in maybe 4 weeks), I'm worried...
-> maybe not a topic here, but has anybody else seen worse stability than with 18.06 ?

Security (19.07.2) over Stability (18.06.4) ?

thanks and

cheers blinton

Security and stability. Stability is less useful if your networks have known security weaknesses and can easily get compromised. Older unsupported OpenWrt releases with years old components can run perfectly stable but are open to well known security flaws.

OpenWrt 18.06 has recent security fixes with 18.06.8: https://openwrt.org/releases/18.06/start

If you experience stability problems with ath10k-ct, you could run classic ath10k non-ct in either 19.07.2 or 18.06.8. @DjiPi documented above how to switch to ath10k non-ct in 19.07.

You should decide for yourself what matches your needs.


Ok, thanks a lot for your recommendation.

Now I understand why I maybe had issues.
On AP2 (APU2 with Compex .. qca9880) I had to install kmod-ath10k manually anyway, as it's an x86-64.
But on AP1 (c2600 with qca9880) the openwrt image contains kmod-ath10k-ct. So I replaced it now with the non-ct version, i.e. kmod-ath10k.

I'll report back if the issues go away (maybe the iPhones were somehow "stuck" with AP1 and "didn't want" to change to AP2, roaming issue).

Now having both on the non-ct version.
-> Is it wise to turn on 802.11r on both APs, given each AP broadcasts 2 SSIDs on both 2.4 GHz and 5 GHz ? Will roaming work between the two APs but also between 2.4 GHz and 5 GHz ?

Another observation which I don't understand.
For testing purposes on AP2 (2x Compex with qca9880) I set each Compex to 5 GHz only (distinct SSIDs),

  • first Compex on LAN (lan port 1) (channel 100, 80 MHz)
  • second Compex (channel 52, 80 MHz) on VLAN (lan port 2)
    I expected to get 1 GBit combined (I have 1 GBit internet, typ. 933 MBit), but I get only 550ish (the same speed as when I run each Compex alone). Why ?
    (my topology is Router (19.07.2) - switch - switch - AP2 (19.07.2), where router and AP2 have LAN on port 1 and VLAN on port 2, and the switches are connected via LACP)

thanks and cheers blinton

Maybe you should open a new thread with your questions, as your specific configuration is rather off topic here now. It's not about R7800 wireless instability.


Hi,

I have the R7800 on DD-WRT and I'd like to switch to OpenWrt. So if I understand correctly, after the flash I need to run this script?
Thanks

opkg update
opkg remove ath10k-firmware-qca9984-ct kmod-ath10k-ct
opkg install wget ath10k-firmware-qca9984 kmod-ath10k
cd /lib/firmware/ath10k/QCA9984/hw1.0/
mv board-2.bin board-2.bin.bk && mv firmware-5.bin firmware-5.bin.bk
wget https://github.com/kvalo/ath10k-firmware/raw/master/QCA9984/hw1.0/3.9.0.2/firmware-5.bin_10.4-3.9.0.2-00086 --no-check-certificate
mv firmware-5.bin_10.4-3.9.0.2-00086 firmware-5.bin
wget https://github.com/kvalo/ath10k-firmware/raw/master/QCA9984/hw1.0/board-2.bin --no-check-certificate
opkg update
opkg remove kmod-ath10k-ct
opkg install kmod-ath10k

Just out of curiosity:
shouldn't you also replace the firmware package ath10k-firmware-qca9984-ct?


Good question,

For my Archer c2600, in
/lib/firmware/ath10k/QCA99X0/hw2.0/
and in
/rom/lib/firmware/ath10k/QCA99X0/hw2.0/
I found something like
firmware-5.bin
board.bin
board-2.bin

In the kernel log I found
10.4b-ct-9980-fW-012-17ba9833
which looks like the ct version ...
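
A rough way to make that check repeatable, as a sketch only: the helper name fw_flavour is invented here, and the rule it applies (CT builds carry "-ct-" in the version string) is an assumption based on the string above, not something taken from the driver documentation.

```shell
# Hypothetical helper: guesses the firmware flavour from its version string.
# Assumption: CT builds contain "-ct-" in the string, as in the kernel log
# line quoted above; anything else is treated as non-CT.
fw_flavour() {
    case "$1" in
        *-ct-*) echo "ct" ;;
        *)      echo "non-ct" ;;
    esac
}

fw_flavour "10.4b-ct-9980-fW-012-17ba9833"   # prints "ct"
fw_flavour "10.4-3.9.0.2-00086"              # prints "non-ct"
```

The second string is the version from the non-CT firmware file name used in the R7800 script earlier in this thread.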

So I went into


and found that for my wifi
QCA99X0/hw2.0
the firmware is 5 years old ...

So where do I get the latest non-ct version ?

Or can I choose the non-ct version here ? Is this a snapshot or compatible with the stable version 19.07.2 ? (looks like from 2019, i.e. more recent than the above)
https://downloads.openwrt.org/releases/19.07.2/packages/arm_cortex-a15_neon-vfpv4/base/ath10k-firmware-qca99x0_20190416-1_arm_cortex-a15_neon-vfpv4.ipk

(I guess these are the ct versions
https://downloads.openwrt.org/releases/19.07.2/packages/arm_cortex-a15_neon-vfpv4/base/ath10k-firmware-qca99x0-ct_2019-10-03-d622d160-1_arm_cortex-a15_neon-vfpv4.ipk
and
https://downloads.openwrt.org/releases/19.07.2/packages/arm_cortex-a15_neon-vfpv4/base/ath10k-firmware-qca99x0-ct-htt_2019-10-03-d622d160-1_arm_cortex-a15_neon-vfpv4.ipk)

Only if you find yourself in the same situation as me should you try this fix, or revert to the latest OpenWrt 18.06 release.

You are looking at the partial solution; please refer to the full solution details in this post.

I would gladly write out the commands you need to run, but not in this thread, as this might confuse people who jump to the bottom of the thread expecting the most recent patch, which would then not be for the R7800.

Could you start another thread for the C2600 so we can move those discussions related to your device on its own page please?


Added a separate topic for qca 99x0 or archer c2600

For your information, I've noticed these messages in the kernel log after applying the patch for non-CT:

[   11.626418] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/QCA9984/hw1.0/firmware-6.bin failed with error -2
[   20.317961] ath10k_pci 0001:01:00.0: Direct firmware load for ath10k/QCA9984/hw1.0/firmware-6.bin failed with error -2

This behaviour is normal and is explained in this post.
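
Error -2 is ENOENT (file not found): as far as I understand, the driver asks for firmware-6.bin first and falls back to firmware-5.bin when it is missing. If those two lines clutter your log while you look for real problems, a small filter along these lines might help. This is only a sketch; the function name drop_fw6_probe_noise is made up, and the pattern is taken from the two lines above.

```shell
# Hypothetical helper: reads kernel log lines on stdin and drops the
# harmless "firmware-6.bin failed with error -2" probe messages,
# passing every other line through unchanged.
drop_fw6_probe_noise() {
    sed '/Direct firmware load for .*firmware-6\.bin failed with error -2$/d'
}

# On the router:
# dmesg | grep ath10k | drop_fw6_probe_noise
```

Any other "Direct firmware load ... failed" line (for example for firmware-5.bin) is kept, since that would point to a real problem.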

To be sure: do I paste that script into an SSH session (PuTTY)? Line by line, or all in one go?
Thanks

Yes, line by line in putty.