Devices connected to WiFi become sluggish to load data (R7800)

Problem
Firstly, wired devices are unaffected. This problem only affects devices connected to my 5 GHz WiFi (iPhones, seen on two different ones so far). They eventually get really sluggish to load data when browsing in Safari or Chrome, or when using the Facebook app. When working properly, there is minimal lag when, for example, googling and waiting for the hit set to return, or simply scrolling through the feed within the Facebook app.

When I experience this "sluggish to load data" effect, I can wait 5-20+ seconds to see the hit set after hitting "go" on a Google search. Or I can see the Facebook app trying to load more data but eventually timing out.

Partial Solution
I can usually fix the problem by executing /etc/init.d/network restart on the OW router (R7800), but eventually the behavior returns and devices seem really slow again. Also, the restart does not always fix the problem.

Help to track down cause
Happy to post any config files/logs.

Here is the output from dmesg. I do see some errors there. Around 76471 seconds, there is an error relating to backports-4.19.120-1/net/wireless/util.c which might be related, or might be a red herring. As expected, I see that error echoed in the output of logread as well.

...
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.050020] ------------[ cut here ]------------
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.050076] WARNING: CPU: 0 PID: 11750 at backports-4.19.120-1/net/wireless/util.c:1147 0xbf309d88 [cfg80211@bf305000+0x37000]
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.053720] invalid rate bw=0, mcs=15, nss=4
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.065027] Modules linked in: pppoe ppp_async ath10k_pci ath10k_core ath pppox ppp_generic nf_conntrack_ipv6 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_tcpmss xt_statistic xt_state xt_recent xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_HL xt_FLOWOFFLOAD xt_DSCP xt_CT xt_CLASSIFY slhc nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_flow_table_hw nf_flow_table nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache iptable_raw iptable_mangle iptable_filter ipt_ECN ip_tables crc_ccitt compat sch_cake nf_conntrack sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_tcindex cls_route
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.117800]  cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred ledtrig_usbport nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 ifb usb_storage f2fs crc32_generic leds_gpio xhci_plat_hcd xhci_pci xhci_hcd dwc3 dwc3_of_simple ohci_platform ohci_hcd phy_qcom_dwc3 ahci ehci_platform sd_mod ahci_platform libahci_platform libahci libata scsi_mod ehci_hcd gpio_button_hotplug
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.156318] CPU: 0 PID: 11750 Comm: hostapd Not tainted 4.14.180 #0
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.178450] Hardware name: Generic DT based system
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.184708] Function entered at [<c030f1c4>] from [<c030b390>]
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.189563] Function entered at [<c030b390>] from [<c07c09c4>]
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.195380] Function entered at [<c07c09c4>] from [<c031f878>]
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.201195] Function entered at [<c031f878>] from [<c031f8d8>]
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.207012] Function entered at [<c031f8d8>] from [<bf309d88>]
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.212884] Function entered at [<bf309d88>] from [<bf31783c>]
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.218645] Function entered at [<bf31783c>] from [<bf324654>]
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.224461] Function entered at [<bf324654>] from [<bf32504c>]
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.230275] Function entered at [<bf32504c>] from [<c06e1a1c>]
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.236093] Function entered at [<c06e1a1c>] from [<c06e0238>]
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.241906] Function entered at [<c06e0238>] from [<c06e0a10>]
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.247720] Function entered at [<c06e0a10>] from [<c06dfa48>]
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.253537] Function entered at [<c06dfa48>] from [<c06dfe64>]
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.259354] Function entered at [<c06dfe64>] from [<c0689d5c>]
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.265171] Function entered at [<c0689d5c>] from [<c068a5c4>]
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.270984] Function entered at [<c068a5c4>] from [<c0307b60>]
Sat May 23 11:28:45 2020 kern.warn kernel: [76471.276894] ---[ end trace 2f58026363a19bec ]---
...
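
To check whether warnings like the one above line up with the slow periods, the log can be narrowed down to wireless-related kernel messages. This is only a rough sketch: the function name `wifi_kernel_events` and the list of patterns are illustrative choices, not something OpenWrt provides.

```shell
# Keep only log lines that mention the wireless stack or a kernel WARNING.
# The pattern list is an assumption; extend it to match what your logs show.
wifi_kernel_events() {
    grep -E 'ath10k|cfg80211|mac80211|invalid rate|WARNING:'
}

# On the router this could be fed from either log source:
#   logread | wifi_kernel_events
#   dmesg   | wifi_kernel_events
```

Comparing the timestamps of the surviving lines with the times the clients felt slow should show quickly whether the two are related.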

I restarted the network as shown around 141151 seconds which gave the output there.


I should note that I think the kernel WARN in dmesg above is unrelated. I am experiencing the problem now on my iphones and I do not see that kernel WARN in dmesg at all :confused:

To fix the problem, I actually rebooted the router about 4 hours ago. Here is dmesg.

I noticed that right after the reboot, the iPhones again experienced the lag I described above. The only line that seems out of place in dmesg is this one, which I do not understand:

[   72.608110] ath10k_pci 0000:01:00.0: Invalid peer id 1 or peer stats buffer, peer: dc211200  sta:   (null)

When I googled that error, I found this bug report... is it related to my problem?

Note - I waited a few hours but still, the laggy web browsing was present on the iphones. I just ran /etc/init.d/network restart and the iphones (wifi) are working as expected with very fast responses. Here is dmesg again after I restarted it (around 5328 seconds).

Opened the following report against ath10k-ct, although that might not be what is actually wrong:

I flashed R7800-master-r13313-6934b20912-20200520-1850 and found that the problem is present under it as well.

Thoughts are appreciated.

What modem / type of internet / isp speed are you running? SQM on?

Post:

  1. cat /etc/config/wireless

  2. your wired vs. wireless speedtest with high resolution bufferbloat:

http://www.dslreports.com/speedtest

Arris T25/Cable/ 200 down and 12 up. SQM is on. Will post 1 and 2 in an hour or so when back home. Do you want 2 with or without SQM enabled?

It'd be interesting to compare with SQM on and off.

If the tests look fine consider testing your modem too since it is a puma 7 chipset (Some people are reporting lag): http://www.dslreports.com/tools/puma6

@ACwifidude -

config wifi-device 'radio0'
	option type 'mac80211'
	option hwmode '11a'
	option path 'soc/1b500000.pci/pci0000:00/0000:00:00.0/0000:01:00.0'
	option country 'US'
	option htmode 'VHT40'
	option legacy_rates '0'
	option channel '48'

config wifi-iface 'default_radio0'
	option device 'radio0'
	option network 'lan'
	option mode 'ap'
	option ssid 'blast'
	option encryption 'psk2+ccmp'
	option key '***'
	option wpa_disable_eapol_key_retries '1'
	option macfilter 'allow'
	list maclist '***'
	list maclist '***'
	list maclist '***'
	list maclist '***'
	list maclist '***'
	list maclist '***'

config wifi-device 'radio1'
	option type 'mac80211'
	option hwmode '11g'
	option path 'soc/1b700000.pci/pci0001:00/0001:00:00.0/0001:01:00.0'
	option country 'US'
	option htmode 'HT40'
	option channel '11'
	option disabled '1'

config wifi-iface 'default_radio1'
	option device 'radio1'
	option mode 'ap'
	option encryption 'psk2+ccmp'
	option wpa_disable_eapol_key_retries '1'
	option network 'guest'
	option ssid '24_guest'
	option key '***'
	option disabled '1'

config wifi-iface 'wifinet0'
	option device 'radio0'
	option mode 'ap'
	option ssid '50_guest'
	option encryption 'psk2+ccmp'
	option wpa_disable_eapol_key_retries '1'
	option network 'guest'
	option key '***'
	option macfilter 'allow'
	list maclist '***'
	list maclist '***'
	list maclist '***'
	list maclist '***'
	list maclist '***'
	list maclist '***'

config wifi-iface 'wifinet1'
	option device 'radio1'
	option mode 'ap'
	option ssid 'caco'
	option network 'lan'
	option encryption 'psk2+ccmp'
	option wpa_disable_eapol_key_retries '1'
	option macfilter 'allow'
	list maclist '***'
	list maclist '***'
	list maclist '***'
	list maclist '***'
	list maclist '***'
	list maclist '***'
	option key '***'
	option disabled '1'

  2. Test results with SQM enabled and disabled, wired and wireless:

SQM enabled wired PC
SQM enabled wireless (iphone)
SQM disabled wired PC
SQM disabled wireless (iphone)

Try two things-

  1. Set your 5 GHz radio to 80 MHz width. Might as well get full speed to the AP.

  2. Turn off your MAC filter. MAC filters give a false sense of security and cause issues with iOS devices and WiFi. Apple products spoof their MAC address when communicating with access points ever since iOS 8, so a good part of your struggle is probably the devices trying to authenticate to an access point that is demanding the true MAC. I don't agree with all the recommendations in their article but the MAC filter section makes sense:

Last thing, not WiFi related: your non-SQM wired test was really odd. Nearly 1 second of lag (a D grade for bufferbloat!) is quite a bit and would produce noticeable lag in loading things. Try the puma6 test and see if your modem is acting up.


Thanks for the suggestions. Trying 80 MHz/no filtering now.
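
For reference, both suggestions can also be applied from the shell instead of LuCI. The sketch below only prints the `uci` commands so they can be reviewed before anything is changed. The helper name `print_wifi_changes` is made up for illustration, and the section names are the ones from the config posted above.

```shell
# Print (do not run) the uci commands for the two suggested changes:
# 80 MHz channel width on the radio and no MAC filtering on its SSIDs.
# $1 = wifi-device section, remaining arguments = wifi-iface sections.
print_wifi_changes() {
    radio=$1
    shift
    echo "uci set wireless.${radio}.htmode='VHT80'"
    for iface in "$@"; do
        echo "uci set wireless.${iface}.macfilter='disable'"
    done
    echo "uci commit wireless"
    echo "wifi reload"
}

# Section names taken from the /etc/config/wireless posted earlier.
print_wifi_changes radio0 default_radio0 wifinet0
```

Once the printed commands look right, they can be pasted into the router's shell. The existing `maclist` entries stay in the config but are ignored while `macfilter` is set to `disable`, so the filter is easy to turn back on later.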

I tried the puma6 test: bad.

I will be returning this modem.


Yikes. That would explain the lag! Best of luck with a non-puma modem.

Yes, the shitty Intel chipset you mentioned in that $250 modem is horrible! Thanks for your help to uncover this unrelated problem! I will return this modem and get a different one.

That said, I am still perplexed by the initial problem I posted relating to the cause of laggy wifi on some devices.

A functioning modem and getting rid of MAC filtering should give you a solid base. iOS devices hate MAC filtering; a quick Google search and you'll find how big an issue it's been since iOS 8. It's a good thing: why advertise your real MAC address?

Two additional things to consider:

  1. You have two SSIDs per frequency. Virtual interfaces act like an additional access point on that channel, so you create more management / overhead packets to prevent collisions. The more SSIDs you have on a channel, the bigger the performance degradation. It usually isn't much for two SSIDs on a single channel, but depending on neighbors, number of clients, etc., it can add up (more lag). Consider consolidating SSIDs on each frequency if appropriate. Name the 2.4 GHz and 5 GHz SSIDs differently (see point 2).

  2. There are different preferences out there for which frequency to connect to. Personally, on my iOS devices I have them forget the 2.4 GHz SSID and only remember one 5 GHz SSID. The speed on 2.4 GHz is terrible and more appropriate for smart devices; 5 GHz is so much quicker for iOS devices. Jumping between frequencies and between virtual interfaces can cause issues with roaming WiFi devices. Every jump introduces lag.

Let's see what the same testing looks like with all fully functioning equipment. You should be able to max out your ISP connection and have solid bufferbloat results.

We are pretty isolated from neighbors so interference from other radios isn't a concern. I only have 3-4 clients at any given time connected and my 2.4 GHz radio on the R7800 is disabled.

I have a new modem now and it is unaffected by the puma 6 test.

Did you mean with the puma 6 test? It is much better now/hardly any reds in the test results. I believe I am still affected by the original issue I posted though.

I'd set your transmission power to a fixed number. Mid 20s usually works well; 30 should be the max. A fixed number will give more consistent results than the default.

What types of wireless clients are having issues: all of them or only some? Zero issues with wired clients now that the modem is fixed? What does the dslreports speed test look like now (less lag)?

I have super simple SQM settings. Never really found great benefit from advanced settings - anything unique with your SQM setup?


root@OpenWrt:~# cat /etc/config/sqm

config queue 'eth0'
        option interface 'eth0.2'
        option enabled '1'
        option debug_logging '0'
        option verbosity '5'
        option upload '32500'
        option download '0'
        option qdisc 'fq_codel'
        option script 'simplest_tbf.qos'
        option qdisc_advanced '0'
        option linklayer 'ethernet'
        option overhead '22'


According to LuCI, the "maximum transmit power" is "driver default", which is 23 dBm, the maximum from the pulldown. I can force a setting of "23 dBm (199 mW)" from the pulldown... is there something I can inspect to see what the current value is? (The wording indicates a maximum, but the driver can reduce it.)

EDIT: Here is how to see the actual power I believe.

# iwinfo|grep Tx
          Tx-Power: 23 dBm  Link Quality: 38/70
          Tx-Power: 23 dBm  Link Quality: 54/70
          Tx-Power: 0 dBm  Link Quality: unknown/70
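
The plain grep drops the interface names, so it is not obvious which line belongs to which SSID. A small sketch that keeps them paired, assuming the usual `iwinfo` layout (interface name in the first column, details on indented lines below it); the function name `tx_power_by_iface` is made up:

```shell
# Read `iwinfo` output on stdin and print "<interface> <tx power in dBm>".
tx_power_by_iface() {
    awk '
        # A line that does not start with whitespace begins a new interface.
        /^[^ \t]/ { iface = $1 }
        {
            # The value is the field right after the "Tx-Power:" label.
            for (i = 1; i < NF; i++)
                if ($i == "Tx-Power:") print iface, $(i + 1)
        }
    '
}

# On the router:
#   iwinfo | tx_power_by_iface
```

Running this before and after forcing a value in LuCI should show whether the driver actually honors the setting.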

I see it on the following devices based on pinging said device from the OW router:
* iPhone 7
* iPhone 8 Plus
* iPad Air 2
* MacBook Air

For whatever reason, I do not see it on the following:
* Lenovo P1
* Raspberry Pi 4B (connected via wireless)
* iPhone 11

Regarding SQM, I have the following:

# cat /etc/config/sqm

config queue 'eth1'
	option qdisc_advanced '0'
	option interface 'eth0.2'
	option verbosity '5'
	option linklayer 'ethernet'
	option overhead '22'
	option upload '11500'
	option qdisc 'fq_codel'
	option debug_logging '0'
	option enabled '1'
	option download '225000'
	option script 'simplest_tbf.qos'

Select one of the manual transmit powers. You can pull your wireless settings via cat and see if it gives you steady results. SQM settings look good.

You have a good diversity of clients. Those wireless devices you are not seeing might be asleep. Lag with all clients or just some clients?

Lag only with a few clients: just the older iPhones, the iPad, and the MacBook. Other devices connected at the same time do not show the lag while the iPhones are lagging. This can be observed qualitatively by watching a Google hit set take 5-20+ seconds to appear, or measured directly by pinging the device from the OW router and seeing really long ping times and packet loss.

The first symptom is slow web browsing on the device when this problem is occurring. I can confirm it by pinging that device so I know it is not asleep.
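
To turn the ping check into numbers that can be compared between a good and a bad period, the summary lines of `ping` can be reduced to loss and average round-trip time. A minimal sketch, assuming the summary format printed by BusyBox or iputils `ping`; the function name `ping_summary` and the address in the usage comment are placeholders:

```shell
# Read `ping` output on stdin and print "<packet loss %> <avg rtt in ms>".
# A "?" is printed for a value that could not be found, e.g. the average
# round-trip time when every packet was lost.
ping_summary() {
    awk '
        /packet loss/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) { loss = $i; sub(/%/, "", loss) }
        }
        /min\/avg\/max/ {
            split($0, halves, "=")      # right-hand side holds the numbers
            split(halves[2], rtt, "/")  # min / avg / max [/ mdev]
            avg = rtt[2]
        }
        END {
            if (loss == "") loss = "?"
            if (avg == "")  avg = "?"
            print loss, avg
        }
    '
}

# On the router (192.168.1.50 stands in for the lagging client):
#   ping -c 20 192.168.1.50 | ping_summary
```

Logging this once a minute for a lagging client and for one that behaves would show when the problem starts and whether it coincides with anything in the kernel log.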

Interesting that it is only the apple devices.

  1. Any unique network configuration for these, unique DNS, or anything else changed from the default openwrt config?

  2. Are you running adblock or any other additional programs that could be interfering with particular websites?

  3. You could try doing a "reset network settings" on one of these older Apple devices and see if there are gremlins in the remembered client network settings.

  4. You could update your router to the latest hnyman master build (CT or ath10k; you have two different WiFi driver options, and I'd try both).

  5. Lastly, you could choose not to keep your configuration when flashing, so that you start with the default settings, if you think there might be a bug from prior configuration changes.

Beyond the above posts on simplifying/tweaking the configuration settings, trying a different WiFi driver, and considering resetting the clients and/or the router to the default configuration, I'm out of additional tips or tricks. Hope one of the above helps! :sunglasses:

My wifi settings for 5ghz (for comparison if anything is different):


root@OpenWrt:~# uname -a
Linux OpenWrt 5.4.42 #0 SMP Tue May 26 15:52:03 2020 armv7l GNU/Linux

root@OpenWrt:~# cat /etc/config/wireless

config wifi-device 'radio0'
        option type 'mac80211'
        option hwmode '11a'
        option path 'soc/1b500000.pci/pci0000:00/0000:00:00.0/0000:01:00.0'
        option htmode 'VHT80'
        option channel '161'
        option txpower '22'
        option legacy_rates '0'
        option country 'US'
        option beacon_int '101'

config wifi-iface 'default_radio0'
        option device 'radio0'
        option network 'lan'
        option mode 'ap'
        option ft_over_ds '1'
        option ssid '*****'
        option ft_psk_generate_local '1'
        option key '*****'
        option ieee80211r '1'
        option encryption 'psk2+ccmp'
        option ieee80211k '1'
        option bss_transition '1'
        option ieee80211v '1'


Thanks for all the suggestions.

  1. No unique settings within the idevices.
  2. Running pihole on a RPi but it is not serving DHCP, it is just blocking ads
  3. I tried the reset network config but it did not help
  4. I tried running the latest hnyman build but experienced the same issues with it (did not switch drivers, just used what came out-of-the-box, which I believe is CT).
  5. I might do a factory reset and rebuild my configuration from scratch just to rule it out.

The only non-standard thing I have is physical port #4 is VLANed to be on the guest zone I created. Beyond that, I have a pretty standard "LAN" and "Guest" zone setup.

I did a factory reset and set up a simple 5 GHz SSID. I did not experience the lag initially but it did occur eventually. I thought it was due to another client joining, but it seems inconsistent.

I have switched to the ath10k (not ath10k-ct) and have been using it for 5 days now without any issue.
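
For anyone comparing the two drivers: both variants load kernel modules with the same names, so the installed package list is a more reliable way to tell them apart. A small sketch, assuming the usual `name - version` output of `opkg list-installed` and the package names `kmod-ath10k` and `kmod-ath10k-ct`; the function name `ath10k_variant` is made up:

```shell
# Read `opkg list-installed` output on stdin and report which ath10k
# driver package is present: "ath10k-ct", "ath10k" or "unknown".
ath10k_variant() {
    awk '
        $1 ~ /^kmod-ath10k-ct/ { ct = 1 }        # also matches -ct-smallbuffers
        $1 == "kmod-ath10k"    { mainline = 1 }
        END {
            if (ct) { print "ath10k-ct" }
            else if (mainline) { print "ath10k" }
            else { print "unknown" }
        }
    '
}

# On the router:
#   opkg list-installed | ath10k_variant
```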