AQL and the ath10k is *lovely*

The iOS Wi-Fi issue - Wi-Fi symbol shows device connected, but no internet connectivity pending manual disconnect and reconnect - resurfaced.

Namely with snapshot:

22.03-SNAPSHOT - r19464-a4390ea283

no issues.

But with snapshot:

22.03-SNAPSHOT r19466-08e1812900

issues back.

Therefore I think at least from this commit:

Things were good, but from:

Things went bad again. I am blaming something in these airtime fairness commits:

  • aab535d
  • 08e1812

@ka2107 and @vochong what have your findings been?


@amteza I am not sure of the significance of this, but @richb-hanover-priv did some testing on 22.03-RC4 and found a big latency increase from 8ms to 28ms 90% using WiFi on RT3200 here:

Compare (wired):

with (over Wi-Fi):

It's frustrating because we have managed to get bufferbloat-related latency resolved even for variable rate LTE connections (by adjusting CAKE bandwidth on the fly based on RTT measurements) but then Wi-Fi introduces its own latency, undoing our work for Wi-Fi clients.
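
To illustrate the idea of adjusting CAKE bandwidth from RTT measurements, here is a minimal sketch; it is not the actual script referred to above. The function names, the 15 ms threshold, and the 10%/2% step sizes are assumptions chosen for illustration. The only real interface used is `tc qdisc change ... cake bandwidth`.

```shell
#!/bin/sh
# Hypothetical sketch: pick the next CAKE shaper rate from a measured RTT.
# Thresholds and step sizes are illustrative assumptions, not tuned values.

# next_rate CUR_KBIT RTT_MS BASE_RTT_MS MIN_KBIT MAX_KBIT
next_rate() {
    cur=$1; rtt=$2; base=$3; min=$4; max=$5
    if [ "$rtt" -gt $((base + 15)) ]; then
        # RTT well above baseline: assume queueing delay, back off by 10%.
        new=$((cur * 90 / 100))
    else
        # RTT near baseline: probe upward by 2%.
        new=$((cur * 102 / 100))
    fi
    # Clamp to the configured floor and ceiling.
    if [ "$new" -lt "$min" ]; then new=$min; fi
    if [ "$new" -gt "$max" ]; then new=$max; fi
    echo "$new"
}

# apply_rate IFACE KBIT: reshape an existing CAKE root qdisc (needs root).
apply_rate() {
    tc qdisc change dev "$1" root cake bandwidth "${2}kbit"
}
```

For example, `apply_rate wan "$(next_rate 20000 50 30 5000 40000)"` would lower a 20000 kbit/s shaper to 18000 kbit/s, because a 50 ms RTT exceeds the 30 ms baseline by more than the threshold. The interface name `wan` is a placeholder.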

So now I'm not sure where things work well for the RT3200. My big issue has been iOS connectivity issues which seem to have been worsened with the recent airtime fairness commits.

Hello,

My R7800 just crashed (uptime was about 3 days) when I woke up early this morning and used my phone to check messages. There was almost no traffic at that hour (4am) when the crash happened. Could someone please take a look at the crash dump below? Thanks!

NAME="OpenWrt"
VERSION="SNAPSHOT"
BUILD_ID="r19916+18-326e109f24"
OPENWRT_BOARD="ipq806x/generic"
OPENWRT_ARCH="arm_cortex-a15_neon-vfpv4"

<1>[211787.522720] 8<--- cut here ---
<1>[211787.522758] Unable to handle kernel paging request at virtual address c0b6348c
<1>[211787.524681] pgd = fafc0e29
<1>[211787.531961] [c0b6348c] *pgd=42a1941e(bad)
<0>[211787.534751] Internal error: Oops: 8000000d [#1] SMP ARM
<4>[211787.538911] Modules linked in: nss_ifb ecm ath10k_pci ath10k_core ath wireguard nft_fib_inet nf_flow_table_ipv6 nf_flow_table_ipv4 nf_flow_table_inet mac80211 libchacha20poly1305 ipt_REJECT curve25519_neon cfg80211 xt_time xt_tcpudp xt_tcpmss xt_statistic xt_state xt_quota xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_ecn xt_dscp xt_conntrack xt_comment xt_cgroup xt_addrtype xt_TCPMSS xt_REDIRECT xt_MASQUERADE xt_LOG xt_HL xt_DSCP xt_CT xt_CLASSIFY sch_cake ppp_async poly1305_arm nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_objref nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_counter nft_chain_nat nf_tables nf_reject_ipv4 nf_log_ipv6 nf_log_ipv4 nf_log_common nf_flow_table nf_conntrack_netlink libcurve25519_generic libcrc32c iptable_nat iptable_mangle iptable_filter ipt_ECN ip_tables crc_ccitt compat chacha_neon fuse sch_tbf
<4>[211787.539672]  sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_tcindex cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred act_gact qca_nss_qdisc qca_nss_pppoe pppoe pppox ppp_generic slhc ledtrig_usbport cryptodev xt_set ip_set_list_set ip_set_hash_netportnet ip_set_hash_netport ip_set_hash_netnet ip_set_hash_netiface ip_set_hash_net ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink ip6table_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6t_NPT ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 nfsv4 nfsv3 nfs nfs_ssc msdos bonding ifb ip6_udp_tunnel udp_tunnel sit qca_nss_drv qca_nss_gmac oid_registry tunnel4 ip_tunnel tun xfrm_user xfrm_ipcomp af_key xfrm_algo vfat fat lockd sunrpc grace hfsplus hfs cdrom dns_resolver nls_utf8 nls_iso8859_15 nls_iso8859_1 nls_cp850 nls_cp437 nls_cp1250 wp512
<4>[211787.609333]  twofish_generic twofish_common tgr192 tea serpent_generic khazad cast6_generic cast5_generic cast_common camellia_generic blowfish_generic blowfish_common anubis xts crypto_user algif_skcipher algif_rng algif_hash algif_aead af_alg sha1_generic seqiv md5 kpp echainiv ecb des_generic libdes cmac authenc uas usb_storage leds_gpio xhci_plat_hcd xhci_pci xhci_hcd dwc3 dwc3_qcom ohci_platform ohci_hcd phy_qcom_ipq806x_usb ahci fsl_mph_dr_of ehci_platform ehci_fsl sd_mod ahci_platform libahci_platform libahci libata scsi_mod ehci_hcd ramoops reed_solomon pstore gpio_button_hotplug f2fs ext4 mbcache jbd2 exfat dm_mirror dm_region_hash dm_log dm_crypt dm_mod dax crc32c_generic crc32_generic cbc encrypted_keys trusted tpm
<4>[211787.760086] CPU: 0 PID: 30299 Comm: kworker/0:1 Not tainted 5.10.120 #0
<4>[211787.782315] Hardware name: Generic DT based system
<4>[211787.789010] Workqueue: events dbs_work_handler
<4>[211787.793598] PC is at 0xc0b6348c
<4>[211787.798124] LR is at krait_mux_set_parent+0x60/0x64
<4>[211787.801583] pc : [<c0b6348c>]    lr : [<c0698678>]    psr: 60000033
<4>[211787.806533] sp : c7303d90  ip : 00000000  fp : c1ea9280
<4>[211787.812867] r10: c15b9b18  r9 : 00000000  r8 : c7303dd4
<4>[211787.818164] r7 : 00000002  r6 : ffffffff  r5 : 00000001  r4 : c15f8a58
<4>[211787.823460] r3 : c0b6348d  r2 : c0d9b31c  r1 : 20000013  r0 : 000346dc
<4>[211787.829800] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA Thumb  Segment none
<4>[211787.836396] Control: 10c5787d  Table: 481d406a  DAC: 00000051
<0>[211787.844034] Process kworker/0:1 (pid: 30299, stack limit = 0x724fb079)
<0>[211787.849676] Stack: (0xc7303d90 to 0xc7304000)
<0>[211787.856195] 3d80:                                     c15f8a64 c0d803a4 ffffffff c069a038
<0>[211787.860726] 3da0: 00000000 c0d803a4 ffffffff c0341ab4 c15b9b00 c0d803a4 c15f93c0 00000002
<0>[211787.868972] 3dc0: 2faf0800 c1ea9400 00000000 c0688a2c c159ae00 c15fa6c0 2faf0800 23c34600
<0>[211787.877218] 3de0: c159ae00 23c34600 c15f93c0 00000000 c1591180 c068ce98 c15f92a8 c1591180
<0>[211787.885464] 3e00: c0698b24 c14d13c0 2faf0800 c1ea9400 00000000 c068cedc c15f93c0 00000000
<0>[211787.893711] 3e20: 23c34600 c1591180 c1ea9480 c1ea9400 00000000 c068d108 23c34600 23c34600
<0>[211787.901957] 3e40: 00000000 ffffffff 23c34600 c0d82604 c15f93c0 c1ea0880 23c34600 dd98a010
<0>[211787.910205] 3e60: 2faf0800 c1ea9480 c1ea9400 c068d288 c1ca3c00 23c34600 dd98a010 2faf0800
<0>[211787.918451] 3e80: c1ea9480 c07b0f0c 00000000 c0dd2ca8 c1ea9438 c1ea94b8 00000000 23c34600
<0>[211787.926696] 3ea0: c7302000 c1ebc400 00000000 c0dd2c70 00000001 000927c0 00000000 00000000
<0>[211787.934944] 3ec0: c7302000 c07b6164 c1ebc400 000c3500 000927c0 000000a1 c1ebc400 c1ea9700
<0>[211787.943191] 3ee0: c1ea9780 c1ea9700 c1ea0f00 c1ea9780 00000000 c07b96fc c1ea9738 00000000
<0>[211787.951437] 3f00: c1ea9704 c0d90460 00000000 00000000 00000000 c07ba3e4 c1ea9738 c272cf80
<0>[211787.959682] 3f20: dd990980 dd993b00 00000000 c03388f4 00000008 dd990998 c272cf80 c272cf94
<0>[211787.967930] 3f40: dd990980 00000008 dd990998 c0d03d00 dd990b40 c0338bdc c0d9bab8 c0d0c164
<0>[211787.976176] 3f60: c272cf80 c8a831c0 c8150900 00000000 c7302000 c0338b68 c272cf80 c839fec4
<0>[211787.984422] 3f80: c8a831e4 c033eac0 00000000 c8150900 c033e964 00000000 00000000 00000000
<0>[211787.992669] 3fa0: 00000000 00000000 00000000 c0300148 00000000 00000000 00000000 00000000
<0>[211788.000916] 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<0>[211788.009162] 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
<0>[211788.017407] [<c0698678>] (krait_mux_set_parent) from [<c069a038>] (krait_notifier_cb+0x58/0xb8)
<0>[211788.025653] [<c069a038>] (krait_notifier_cb) from [<c0341ab4>] (srcu_notifier_call_chain+0x7c/0xf4)
<0>[211788.034592] [<c0341ab4>] (srcu_notifier_call_chain) from [<c0688a2c>] (__clk_notify+0x70/0x94)
<0>[211788.043704] [<c0688a2c>] (__clk_notify) from [<c068ce98>] (clk_change_rate+0xfc/0x2b8)
<0>[211788.052121] [<c068ce98>] (clk_change_rate) from [<c068cedc>] (clk_change_rate+0x140/0x2b8)
<0>[211788.060107] [<c068cedc>] (clk_change_rate) from [<c068d108>] (clk_core_set_rate_nolock+0xb4/0x1f8)
<0>[211788.068441] [<c068d108>] (clk_core_set_rate_nolock) from [<c068d288>] (clk_set_rate+0x3c/0x170)
<0>[211788.077471] [<c068d288>] (clk_set_rate) from [<c07b0f0c>] (dev_pm_opp_set_rate+0x348/0x674)
<0>[211788.086498] [<c07b0f0c>] (dev_pm_opp_set_rate) from [<c07b6164>] (__cpufreq_driver_target+0x1a0/0x5b4)
<0>[211788.094921] [<c07b6164>] (__cpufreq_driver_target) from [<c07b96fc>] (od_dbs_update+0xcc/0x1a0)
<0>[211788.104030] [<c07b96fc>] (od_dbs_update) from [<c07ba3e4>] (dbs_work_handler+0x38/0x74)
<0>[211788.113060] [<c07ba3e4>] (dbs_work_handler) from [<c03388f4>] (process_one_work+0x1fc/0x470)
<0>[211788.121133] [<c03388f4>] (process_one_work) from [<c0338bdc>] (worker_thread+0x74/0x5d4)
<0>[211788.129550] [<c0338bdc>] (worker_thread) from [<c033eac0>] (kthread+0x15c/0x160)
<0>[211788.137707] [<c033eac0>] (kthread) from [<c0300148>] (ret_from_fork+0x14/0x2c)
<0>[211788.145163] Exception stack(0xc7303fb0 to 0xc7303ff8)
<0>[211788.152288] 3fa0:                                     00000000 00000000 00000000 00000000
<0>[211788.157512] 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<0>[211788.165753] 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000
<0>[211788.174000] Code: 6962 5f6f 7274 6d69 (6200) 6f69 
<4>[211788.180841] ---[ end trace b770df971b364141 ]---
<0>[211788.193774] Kernel panic - not syncing: Fatal exception
<2>[211788.193816] CPU1: stopping
<4>[211788.198153] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G      D           5.10.120 #0
<4>[211788.200670] Hardware name: Generic DT based system
<4>[211788.208325] [<c030e46c>] (unwind_backtrace) from [<c030a204>] (show_stack+0x14/0x20)
<4>[211788.213098] [<c030a204>] (show_stack) from [<c0632ea8>] (dump_stack+0x94/0xa8)
<4>[211788.221081] [<c0632ea8>] (dump_stack) from [<c030d190>] (do_handle_IPI+0x140/0x184)
<4>[211788.228196] [<c030d190>] (do_handle_IPI) from [<c030d1f0>] (ipi_handler+0x1c/0x2c)
<4>[211788.236184] [<c030d1f0>] (ipi_handler) from [<c037184c>] (__handle_domain_irq+0x90/0xf4)
<4>[211788.243564] [<c037184c>] (__handle_domain_irq) from [<c064c154>] (gic_handle_irq+0x90/0xb8)
<4>[211788.251896] [<c064c154>] (gic_handle_irq) from [<c0300b8c>] (__irq_svc+0x6c/0x90)
<4>[211788.260393] Exception stack(0xc146df18 to 0xc146df60)
<4>[211788.267775] df00:                                                       00000000 0000c09e
<4>[211788.272918] df20: 1cd58000 dd99fd80 00000000 c809dbc0 c1c69040 00000000 dd99f030 0000c09e
<4>[211788.281163] df40: 00000000 0000c09e 365d2ec0 c146df68 c07bd41c c07bd43c 60000013 ffffffff
<4>[211788.289399] [<c0300b8c>] (__irq_svc) from [<c07bd43c>] (cpuidle_enter_state+0x180/0x380)
<4>[211788.297643] [<c07bd43c>] (cpuidle_enter_state) from [<c07bd68c>] (cpuidle_enter+0x3c/0x5c)
<4>[211788.305888] [<c07bd68c>] (cpuidle_enter) from [<c034e678>] (do_idle+0x208/0x2a4)
<4>[211788.314046] [<c034e678>] (do_idle) from [<c034e9d0>] (cpu_startup_entry+0x1c/0x20)
<4>[211788.321685] [<c034e9d0>] (cpu_startup_entry) from [<423015ac>] (0x423015ac)

Your crash is likely caused by the Krait CPU clock bug. Doesn’t look like it’s related to mac80211.

Is it possible to replace the files on the router responsible for cpu support (e.g. changing clocks, overclocking) and usb (changing the frequency) without uploading the entire image again?

My connection speed problems may be related less to AQL or DSA than to the board-2.bin file. Switching to the newest standard file (shipped with OpenWrt) gives better speeds up close (by as much as 1/3), but the range drops drastically and I cannot get coverage beyond one floor. The modified file (only for the ea6350v3), on the other hand, gives good range (up to 200 meters from the router), but the close-up speed is 1/3 lower, which I don't care much about anyway. This is strange, because on stable releases before 22.03 and on master with kernel 5.10 everything was fine with the modified board file: near and far speeds were 30 and 15 Mbps respectively. Maybe the 5.15 kernel reads or uses board-2.bin files differently than earlier OpenWrt versions did?
The router's USB could also be at fault (errors

[6313.592704] usb usb2-port1: Cannot enable. Maybe the USB cable is bad?
[ 6359.172602] usb usb2-port1: Cannot enable. Maybe the USB cable is bad?

), but this does not affect the speed or availability of the wired connection or the stability of the wireless. I would rather bet on a conflict between the USB frequency and the 2.4 GHz Wi-Fi frequency (although this supposedly happens only when the port runs in USB 3.0 mode, not in 2.0 mode as it does currently).

As for the AQL tests and the latest patches (all 8 patches): there is still an error in the log (and, interestingly, it is completely random, regardless of the load or the number of clients), but it no longer has any visible effects as it did before. Without patches 334 and 335, the pings are higher and the near/far speeds are similar to those with all patches applied.

EDIT: I changed the board file to the old optimized one (from the 19.07 notengobattery images), and...
20-22 Mbps/25 ms in the test room (previously 1.5-2 Mbps/28-30 ms); tested without patches 334 and 335.
We will see how long it keeps working.

What is the clock bug? The min 800MHz frequency?
FWIW, I always run my R7800 with the performance governor at max 1.7GHz, minimal temperature increase 2-3C, barely gets over 60C in the summer. I used to have an Asus router that ran at 80C for many years, I think 60C is fine. Why do we bother with freq scaling for a device that's always plugged in, no battery life concerns like a phone :man_shrugging:

excerpts from simultaneous netperf (AP -> wifi clients) using nmba and "59" client (the one that did 0.07 mbps BE upload in the tests above). Both clients in the same location as the test above. MCS values as reported on the AP (I checked several times during the ~5 min netperfs - they did not change).

59:

Interim result:   90.16 10^6bits/s over 1.001 seconds ending at 1656515800.668
Interim result:   88.43 10^6bits/s over 1.019 seconds ending at 1656515801.687
Interim result:   87.33 10^6bits/s over 1.013 seconds ending at 1656515802.700
Interim result:   88.50 10^6bits/s over 1.000 seconds ending at 1656515803.700
Interim result:   89.57 10^6bits/s over 1.005 seconds ending at 1656515804.705
Interim result:   88.42 10^6bits/s over 1.013 seconds ending at 1656515805.717
Interim result:   89.54 10^6bits/s over 1.005 seconds ending at 1656515806.722
Interim result:   89.47 10^6bits/s over 1.001 seconds ending at 1656515807.723
Interim result:   88.54 10^6bits/s over 1.013 seconds ending at 1656515808.736
Interim result:   87.97 10^6bits/s over 1.008 seconds ending at 1656515809.744
Interim result:   89.95 10^6bits/s over 1.003 seconds ending at 1656515810.747

nmba:

Interim result:   13.06 10^6bits/s over 2.077 seconds ending at 1656515800.050
Interim result:   13.27 10^6bits/s over 1.002 seconds ending at 1656515801.053
Interim result:   18.14 10^6bits/s over 1.002 seconds ending at 1656515802.054
Interim result:   16.41 10^6bits/s over 1.100 seconds ending at 1656515803.154
Interim result:   14.09 10^6bits/s over 1.163 seconds ending at 1656515804.317
Interim result:   14.41 10^6bits/s over 1.015 seconds ending at 1656515805.332
Interim result:   13.51 10^6bits/s over 1.066 seconds ending at 1656515806.398
Interim result:   12.35 10^6bits/s over 1.090 seconds ending at 1656515807.488
Interim result:   14.68 10^6bits/s over 1.015 seconds ending at 1656515808.503
Interim result:   19.06 10^6bits/s over 1.013 seconds ending at 1656515809.516
Interim result:   17.03 10^6bits/s over 1.121 seconds ending at 1656515810.637

I did not make a cut-and-paste error, nor did I mislabel the netperf results.
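
To compare runs like the two above without eyeballing them, the interim lines can be averaged. This is a minimal sketch; the function name `interim_mean` is hypothetical, and it assumes the throughput is the third whitespace-separated field, as in the output pasted above.

```shell
#!/bin/sh
# Average the throughput column of netperf "Interim result:" lines on stdin.
# Prints the mean in 10^6 bits/s with two decimals, or nothing if no
# matching lines were seen.
interim_mean() {
    awk '/^Interim result:/ { sum += $3; n++ }
         END { if (n > 0) printf "%.2f\n", sum / n }'
}
```

Piping each excerpt through `interim_mean` gives roughly 89 for client 59 and roughly 15 for nmba, in units of 10^6 bits/s.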

MCS from r7500v2 AP

r7500v2 # iw dev wlan0 station dump | grep "tx bitrate"
        nmba tx bitrate:     360.0 MBit/s VHT-MCS 8 40MHz short GI VHT-NSS 2
        59 tx bitrate:     180.0 MBit/s MCS 12 40MHz short GI

Both 59 and nmba are about 10 m from the AP, with no clear line of sight, and about 3 m from each other. There are gaps in the (wood frame and drywall) walls obstructing their line of sight.

I started the 59 wifi client netperf first (this used to matter in the past; at the moment I can start either client first and get reproducible results). A single-client netperf to nmba at this location can achieve 300+ mbps. When I start the nmba netperf first, I see 200-300+ mbps until I start the 59 netperf, at which point it drops to 10-20 mbps.

I'll bet that if I put tbf's on the wifi clients and limit their throughput to about 30 mbps and then repeat the "reversed" flent rtt_fair test, I'll get more meaningful results.
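
A minimal sketch of what such a tbf cap could look like. The helper name `tbf_cmd` is hypothetical, and the burst and latency values are untuned assumptions. Note that tbf shapes traffic leaving the interface it is attached to, so where it goes depends on the direction of the test. The helper only prints the command so it can be reviewed before running it as root.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a tc command that caps egress on
# an interface with a token bucket filter.
# Usage: tbf_cmd IFACE RATE_MBIT
# Burst (64 kilobytes) and latency (50 ms) are illustrative assumptions.
tbf_cmd() {
    iface=$1; rate_mbit=$2
    echo "tc qdisc replace dev $iface root tbf rate ${rate_mbit}mbit burst 64kb latency 50ms"
}
```

For the 30 mbps limit mentioned above, `tbf_cmd <wifi_if> 30` prints the command to apply, and `tc qdisc del dev <wifi_if> root` removes the cap again.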

I wish the pre-packaged netperf binaries had the "-w" and "-b" options included by default. It would make using flent for this kind of testing a lot easier than playing with qdisc's.

EDIT 0: I added the 56 wifi client back in and tried simultaneous netperfs. 56 is also in the same location as in the original reverse rtt_fair test above (~1 m from AP, clear line of sight).

In words: I started the 56 client netperf first, then added nmba and 59. Throughput on the 56 client went from ~165 mbps (before starting nmba and 59) to a complete stop.

MCS 56 client stream on its own:

r7500v2 # iw dev wlan0 station dump | grep "tx bitrate"
        nmba tx bitrate:     300.0 MBit/s VHT-MCS 7 40MHz short GI VHT-NSS 2
        59 tx bitrate:     180.0 MBit/s MCS 12 40MHz short GI
        56 tx bitrate:     270.0 MBit/s MCS 15 40MHz

MCS all three clients streaming:

r7500v2 # iw dev wlan0 station dump | grep "tx bitrate"
        nmba tx bitrate:     360.0 MBit/s VHT-MCS 8 40MHz short GI VHT-NSS 2
        59 tx bitrate:     150.0 MBit/s MCS 7 40MHz short GI
        56 tx bitrate:     30.0 MBit/s MCS 8 40MHz short GI

So something like this probably did happen during the reverse rtt_fair test. The only way I've been able to avoid it is to limit the total throughput to something the AP can handle (about 100 mbps by my estimation with this configuration).

ty for the suggestion to look at MCS.

EDIT 1: I turned the mac (nmba) off and added a third ubuntu wifi client (call it 135).

3 client (56, 59, & 135) simultaneous netperf:

56: ~30 mbps; tx bitrate:     150.0 MBit/s MCS 7
59: ~60 mbps; tx bitrate:     270.0 MBit/s MCS 14
135: ~70 mbps; tx bitrate:     300.0 MBit/s MCS 15

So ATF works for me? Or ATF works only if sans apple?
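
One rough way to check this is to estimate each client's airtime share as throughput divided by the reported PHY rate. This is only a ballpark sketch: it ignores protocol overhead, retries, and aggregation, and the function name `airtime_share` is hypothetical.

```shell
#!/bin/sh
# Rough airtime estimate: measured throughput / reported tx bitrate.
# Input lines on stdin: "<client> <throughput_mbps> <phy_rate_mbps>"
# Output: "<client> <share>" with the share rounded to two decimals.
airtime_share() {
    awk '{ printf "%s %.2f\n", $1, $2 / $3 }'
}
```

Feeding it the EDIT 1 numbers (`56 30 150`, `59 60 270`, `135 70 300`) gives about 0.20, 0.22, and 0.23, i.e. nearly equal shares, which is what airtime fairness should produce. By the same estimate, the earlier two-client run gives roughly 0.49 for 59 (about 89 of 180) against roughly 0.04 for nmba (about 15 of 360).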

I probably will be able to during the weekend. Here I'm using imagebuilder for it.

In my case the difference is not so dramatic, WiFi vs cable is ≈4-6 ms vs ≈0.2-1 ms in normal conditions. Under load if you check my Waveform it increases only by 5-10 ms under normal conditions on WiFi.

@quarky

never mind

I'll leave an edited version of this post up in the event it helps someone else.

For non apple devices, be sure to disable the wifi powersave feature (which I have been doing for my reported results above).

On ubuntu:

sudo iwconfig <wifi_if> power off

Apparently this is not possible on Apple devices. After a little googling and reading about others' experiences with ping, I will not use an Apple device for testing. I can't say whether what I observed above using a Mac results from this or from some other feature Apple/Broadcom have hidden in their software or hardware.

It does look like QCA9980 with ath10k-ct driver firmware does not support wifi power save.

r7500v2 # cat /sys/kernel/debug/ieee80211/phy*/netdev:wlan*/stations/*/peer_ps_state
2
2
...

the output 2 indicates disabled

non-ct ath10k may not support it in the future as well:

Something that @Ansuel is working on, I believe. IIRC, it has to do with the L2 cache and CPU clocks getting out of sync when switching to or from 384 MHz, which (somehow) corrupts the cache.

I set my R7800 to a min of 800MHz CPU clock and let it scale using the schedutil governor. Seems stable for my R7800 so far.
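
A minimal sketch of how those settings can be applied through the standard Linux cpufreq sysfs files. The function name and the `SYSFS_CPU` override are hypothetical; the override exists only so the logic can be tried against a scratch directory before touching the real files.

```shell
#!/bin/sh
# Set the cpufreq governor and minimum frequency (in kHz) for every policy.
# Usage: set_cpufreq GOVERNOR MIN_KHZ
# SYSFS_CPU is a hypothetical override for dry runs; the default is the
# standard cpufreq sysfs location.
set_cpufreq() {
    governor=$1; min_khz=$2
    root=${SYSFS_CPU:-/sys/devices/system/cpu/cpufreq}
    for policy in "$root"/policy*; do
        [ -d "$policy" ] || continue
        echo "$governor" > "$policy/scaling_governor"
        echo "$min_khz" > "$policy/scaling_min_freq"
    done
}
```

On the router this would be `set_cpufreq schedutil 800000`, run as root. Which governors a given build offers is listed in `scaling_available_governors` in the same directory. The setting does not survive a reboot unless it is added to a startup script such as /etc/rc.local.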

Funny, I've just done exactly what you did before reading your message. I switched to schedutil governor and min_freq = 800 MHz.

Before that, my CPU governor was ondemand and min_freq = 800 MHz, and I've kept getting random annoying crashes at the least expected moments.

In Intel-based machines whose CPU is not capable of supporting the Intel P-state (older CPUs), most Linux distros use the default governor "schedutil".

Intel's own Linux distro "Clear Linux" uses the "performance" governor as the default, so its benchmark numbers always look good. Intel cheater :slight_smile:

Hi Felix,

Do you have any plans to backport the changes to the 21.02 branch as well? Hopefully the next release, 21.02.4, will be reliable again in terms of Wi-Fi.

Thanks a lot!

I have not seen the issue occur after the mt76 update. I am currently running

OpenWrt 22.03-SNAPSHOT r19482-2b8021d614 / LuCI openwrt-22.03 branch git-22.167.28394-8a4486a

I also did not see this issue with

OpenWrt 22.03-SNAPSHOT r19455-f608779f92 / LuCI openwrt-22.03 branch git-22.167.28394-8a4486a

and

OpenWrt SNAPSHOT r19873-a703f9ed0b / LuCI Master git-22.167.28356-8effea5

However one time (not sure which OpenWrt version I was running at that time), I thought the issue occurred again but it turned out to be due to something else.

Websites couldn't load while I was still connected to WiFi on my Pixel 6, but I was able to log into my RT3200 router from my phone using the local IPv4 address (which means I was not disconnected from WiFi).

It turned out to be some issue with "Private DNS" (DNS over TLS) feature in my Pixel 6 (Android 12, Build SQ3A.220605.009.B1). I have disabled "Private DNS" in my phone for now and there have been no issues since then.

My /etc/config/wireless is still the same as in "802.11r Fast Transition how to understand that FT works? - #105 by ka2107" (I only changed option he_bss_color from '128' to '8' when moving from MASTER-SNAPSHOT to 22.03-SNAPSHOT).

Hi Felix @nbd, thank you for your patches. They have been working very well for me on the Belkin RT3200.

One question though, do these changes affect ath9k in any way? I have a TP-Link Archer A7 v5 in a remote location (in another country), currently running 22.03.0-rc4. Will it improve WiFi latency on the 2.4 GHz band (HT20, 802.11n only) if I flash the latest 22.03-SNAPSHOT on the Archer A7?

I would be comforted if everyone could re-demonstrate a rrul_be result like this, over 300 seconds, on all the wifi chipsets openwrt supports, in whatever the final patchset looks like.

Nuke it from orbit. It's the only way to be sure.

Hi Dave @dtaht, while I would love to run the RRUL test and provide you the results from different clients and APs, unfortunately Flent only runs on Linux. I have an Arch Linux installation running on a laptop with an Intel AC 9560 WiFi card, and I am able to run Flent on it. I will try to run the test on it and provide the results to you. However, most of the time my laptop is wired over Ethernet (Intel I219-V, or Realtek based USB) to my RT3200 (ISP: Comcast Xfinity, DOCSIS 3.1, Arris S33 Modem, 50/10 Mbps) and I almost never connect WiFi on it. This will also still be a result from only a single client and a single AP/router.

I tried setting up Flent on a 2017 MacBook Air (Router/AP: Belkin RT3200, ISP: AirTel, Country: India, GPON Fiber 40/40 Mbps, PPPoE) running macOS 12.4 Monterey, but I was not able to set it up due to a Python dependency issue. It may be my own lack of understanding, since I am not familiar with macOS. Flent also does not run on Windows.

For this remote location, I can provide Waveform results and maybe newer speedtest.net results with loaded latency. But those tests would be run over VNC, and I am not sure how that will affect the results. It would be nice to be able to run Flent on those systems though.

I also have a Netgear R7800 (ISP: AT&T Fiber; Symmetric Gigabit; AT&T 5268ac in IP Passthrough mode) and a TP-Link Archer A7 v5 (ISP: BSNL, Country: India, EPON Fiber 60/60 Mbps, PPPoE). However, I can only control these 2 devices from the WAN side, and I do not control any client devices there that I could VNC into to run WiFi tests.

Yeah, nah, not gonna happen to shiny and useful. I will redo with a Linux box.

The entropy level in master on 5.15.45 does not update; it stays low. Maybe this is the reason for the poor WLAN and transfer performance (especially at a distance), the jumping pings, and the log errors, which still pop up the same as before despite applying all the recent fixes. It probably has to do with the problem in this topic as well.
On stable 19.07 I had more than 3500 points (varying); now I have a constant low of 256, and despite turning on rng-tools or haveged this level does not change at all.

root@OpenWrt:~# /etc/init.d/haveged restart
root@OpenWrt:~# cat /proc/sys/kernel/random/entropy_avail
256
root@OpenWrt:~# /etc/init.d/haveged status
running
root@OpenWrt:~# cat /proc/sys/kernel/random/entropy_avail
256
root@OpenWrt:~#

Replacing the board-2.bin file with differently calibrated ones does not help much. Sometimes the results jump up, but usually they are very poor (in speed at a distance, and the pings can also jump like a monkey in a tree).

EDIT: OK, the change in entropy values is a 'new feature' and not a bug. But does it not have any impact on other system elements, such as transfers or pings?

EDIT2:
In the logs I see a few errors:

Fri Jul  1 11:51:29 2022 daemon.err haveged[6070]: haveged: command socket is listening at fd 3
Fri Jul  1 11:51:29 2022 daemon.info haveged[6070]: haveged starting up
Fri Jul  1 11:51:30 2022 daemon.err haveged[6070]: haveged: ver: 1.9.18; arch: generic; vend: ; build: (gcc 11.3.0 CV); collect: 128K
Fri Jul  1 11:51:30 2022 daemon.err haveged[6070]: haveged: cpu: (); data: 32K (P); inst: 32K (P); idx: 19/40; sz: 32744/67304
Fri Jul  1 11:51:30 2022 daemon.err haveged[6070]: haveged: fills: 0, generated: 0
Fri Jul  1 11:51:30 2022 daemon.err haveged[6070]: haveged: Stopping due to signal 15
Fri Jul  1 11:51:30 2022 daemon.err haveged[6070]:
Fri Jul  1 11:51:30 2022 daemon.err haveged[6070]: fills: 1, generated: 512 K bytes, RNDADDENTROPY: 256

I am sorry that getting flent up and running on OSX has become so darn hard. Neither @tohojo nor I have ready access to an OSX box. Could you file a bug here: https://github.com/tohojo/flent/issues with the errors you get when trying to get it built?

I tend to be concerned about a lack of entropy also. How is performance without encryption?
