AQL and the ath10k is *lovely*

That is all fine, but range on 5GHz is worse and the signal penetrates fewer obstacles, so you would need several devices rather than one to provide Internet at a distance of 100 meters from the device (not 100 m²). My test environment is two floors below. Inside a home you can consider 5GHz, but not in open air. With the current settings (no external antennas, no special equipment placement, and no frying the neighbors with radio waves :sweat_smile:) I get good transfers even up to 150m from the router, but in the test room it sucks. The other thing is that the other people at my place also use only 2.4GHz.

Another point is that the original firmwares are better, and that is what we should improve rather than build workarounds. Otherwise, the more modern the hardware, the harder it will be to match the results of old junk (on the ea6350v3 I get worse transfers than on the wr1043nd v3 and the earlier wr1043v1, which gave me the best results until the hardware got fried).

5GHz attenuation is not that severe with California's wooden frame houses and drywall, so one good router (like an r7800 in a proper location) can cover the whole house easily.

And since houses are close together, 2.4GHz interference ought to be quite bad. However, OEM firmware always performs much better with 2.4GHz than OpenWrt in spite of all this 2.4GHz interference.


It is worse if the house is a typical bunker with multiple reinforced ceilings and thick masonry walls. California has a hot climate (good temperatures, moderate humidity); Poland does not (freezing temperatures for half of the year, bad humidity for three quarters of it) :slight_smile:...

Bunker? It seems to have been built to prepare for wars, against that rogue nation in the 21st century.

:slight_smile: No, this is a typical house from the years 1980-1995, a so-called 'cube' (in Polish: "kostka").

Houses are still being built of brick and concrete now, but lower ones. In my case too, the neighbors will not be disturbing each other's airwaves with overly strong 2.4GHz for long, though it does happen (even here they transmit at full power and at 40MHz)...

Edit: OK, I tested the AQL patches, and... I deleted all of them except 330, which seems to be useful.

Patch 332-mac80211-fix-ieee80211_txq_may_transmit-regression causes log errors.
Patch 331-mac80211-improve-AQL-tx-time-estimation gives lower speeds in speed tests (especially for clients farther from the access point).
Patch 333-mac80211-rework-the-airtime-fairness-implementation might work after a rework; for now it does not compile without patches 331 and 332.

Patch 330-mac80211-fix-overflow-issues-in-airtime-fairness-cod may have potential for lower ping (first tests near the router).


I rebooted the router into the 2nd partition (with master 5.10.1xx) and it is better... (download, often ping too). Maybe the wireless problem (access times and Internet speed; iperf almost always shows better transfers) does not come from airtime. Maybe it is the Minstrel algorithm or something else.
Tests:
Today: 20-22Mbps with speedtest.net and 30-34Mbps with iperf on master 5.10.111, pings from 24 to 28ms.

16-19Mbps, pings 28-34ms on the 5.15.38 kernel.

Yesterday on 5.15.38 with all master patches: 8-12Mbps, pings about 40ms. Without the new patches (deleted and recompiled) the result is the same :thinking:

All on 2.4GHz... There is a big storm with hail and every other weather plague, so I am not working on the computer any more today. I have turned off all electrical devices.

I need to rebuild the environment from scratch because I messed something up (not every compiled image installs on the router, and when one does, it does not work completely; I wanted to switch from kmod-ath10k-ct to the regular driver)....

So after a lot of discussions with Toke (who built the airtime fairness scheduling code), I completely dropped my previous round of patches, reverted the code back to the earlier round-robin scheduler and added a number of improvements on top of it.
In my very basic and limited testing with mt76, latency and throughput are looking good so far.
You can find the changes in my staging tree at https://git.openwrt.org/?p=openwrt/staging/nbd.git;a=summary (single commit, top of the list).
Please give that a try and let me know if it works better for you guys.


Great news! Thanks a lot Felix.

Hi @nbd,

Thanks for working on this.

Setup:

OpenWrt:

  • OpenWrt 22.03
  • IPQ8064
  • 256MB RAM
  • ath10k-ct 5.15

Test setup:

  • I changed my setup to only include 20 Linux clients (no Mac or iOS clients).

Patches

I applied your new patches on top of the latest (as of 3 days ago) OpenWrt 22.03.

Results

They are the same :frowning:

  • high memory usage and then crash (Unable to handle kernel NULL pointer dereference) - report below
  • more than 3 or 4 clients, download: tcp throughput drops to around 100Mbps and udp throughput halves
  • more than 3 or 4 clients, upload: udp throughput drops to around 50Mbps
  • same results when adding (throughput drops) or removing (throughput increases) clients while running the tests

I checked the build_dir to see whether the patches were applied correctly to the mac80211 files, and it looks like they were. But I am going to do a "super clean" build just in case something didn't compile correctly.


[  143.584926] 8<--- cut here ---
[  143.587923] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[  143.591004] pgd = 99bcb531
[  143.599347] [00000004] *pgd=00000000
[  143.599679] netifd invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
[  143.601764] Internal error: Oops: 5 [#1] SMP ARM
[  143.605501] CPU: 0 PID: 678 Comm: netifd Not tainted 5.10.120 #0
[  143.613566] Modules linked in:
[  143.618428] Hardware name: Generic DT based system
[  143.618433]  nft_fib_inet
[  143.624429] [<c030e54c>] (unwind_backtrace) from [<c030a2d0>] (show_stack+0x14/0x20)
[  143.627280]  nf_flow_table_ipv6
[  143.632062] [<c030a2d0>] (show_stack) from [<c061d10c>] (dump_stack+0x94/0xa8)
[  143.634750]  nf_flow_table_ipv4
[  143.642569] [<c061d10c>] (dump_stack) from [<c041606c>] (dump_header+0x58/0x1a8)
[  143.645427]  nf_flow_table_inet
[  143.652719] [<c041606c>] (dump_header) from [<c041687c>] (oom_kill_process+0x1ec/0x1f0)
[  143.655757]  ath10k_pci
[  143.663396] [<c041687c>] (oom_kill_process) from [<c0417264>] (out_of_memory+0x194/0x358)
[  143.666260]  ath10k_core
[  143.674246] [<c0417264>] (out_of_memory) from [<c045aa9c>] (__alloc_pages_nodemask+0x9ac/0xed4)
[  143.676677]  ath
[  143.685007] [<c045aa9c>] (__alloc_pages_nodemask) from [<c045afdc>] (__get_free_pages+0x18/0x3c)
[  143.687615]  wireguard
[  143.696034] [<c045afdc>] (__get_free_pages) from [<c04f2c38>] (proc_pid_readlink+0xa0/0x1e8)
[  143.698119]  nft_reject_ipv6
[  143.706886] [<c04f2c38>] (proc_pid_readlink) from [<c0489544>] (vfs_readlink+0x100/0x110)
[  143.709057]  nft_reject_ipv4
[  143.717651] [<c0489544>] (vfs_readlink) from [<c047c440>] (do_readlinkat+0xb0/0x10c)
[  143.720515]  nft_reject_inet
[  143.728586] [<c047c440>] (do_readlinkat) from [<c0300060>] (ret_fast_syscall+0x0/0x54)
[  143.731538]  nft_reject
[  143.739258] Exception stack(0xc2a6dfa8 to 0xc2a6dff0)
[  143.742128]  nft_redir
[  143.749857] dfa0:                   00000011 b6f0f984 beacbd34 beacbd10 00000011 beacbd04
[  143.752195]  nft_quota
[  143.757410] dfc0: 00000011 b6f0f984 beacbd10 00000055 b6f11010 00046784 00045b90 beacbd6c
[  143.759664]  nft_objref
[  143.767908] dfe0: 00045e68 beacbd00 00017724 b6eedcec
[  143.770167]  nft_numgen
[  143.778493] Mem-Info:
[  143.780667]  nft_nat
[  143.785880] active_anon:35 inactive_anon:1244 isolated_anon:0
[  143.785880]  active_file:115 inactive_file:115 isolated_file:0
[  143.785880]  unevictable:0 dirty:0 writeback:0
[  143.785880]  slab_reclaimable:878 slab_unreclaimable:3507
[  143.785880]  mapped:188 shmem:15 pagetables:95 bounce:0
[  143.785880]  free:2777 free_pcp:92 free_cma:0
[  143.788128]  nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_counter nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack mac80211 libchacha20poly1305 curve25519_neon cfg80211 poly1305_arm nfnetlink nf_reject_ipv6
[  143.790567] Node 0 active_anon:140kB inactive_anon:4976kB active_file:460kB inactive_file:460kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:752kB dirty:0kB writeback:0kB shmem:60kB writeback_tmp:0kB kernel_stack:752kB all_unreclaimable? no
[  143.801229]  nf_reject_ipv4 nf_log_ipv6 nf_log_ipv4 nf_log_common nf_defrag_ipv6 nf_defrag_ipv4 libcurve25519_generic libcrc32c crc_ccitt compat chacha_neon ledtrig_usbport ledtrig_oneshot ip6_udp_tunnel udp_tunnel seqiv cmac leds_gpio
[  143.825258] Normal free:11108kB min:16384kB low:20480kB high:24576kB reserved_highatomic:0KB active_anon:140kB inactive_anon:4976kB active_file:588kB inactive_file:612kB unevictable:0kB writepending:0kB present:229376kB managed:216724kB mlocked:0kB pagetables:380kB bounce:0kB free_pcp:368kB local_pcp:328kB free_cma:0kB
[  143.847844]  xhci_plat_hcd xhci_pci xhci_hcd dwc3 dwc3_qcom ohci_platform ohci_hcd ledtrig_transient phy_qcom_ipq806x_usb ahci fsl_mph_dr_of ehci_platform ehci_fsl sd_mod ahci_platform libahci_platform libahci
[  143.870139] lowmem_reserve[]:
[  143.896889]  libata scsi_mod ehci_hcd gpio_button_hotplug crc32c_generic
[  143.896912] CPU: 1 PID: 77 Comm: kworker/1:4 Not tainted 5.10.120 #0
[  143.896915] Hardware name: Generic DT based system
[  143.896928] Workqueue: ubiblock0_1 ubiblock_do_work
[  143.896940] PC is at submit_descs+0x64/0x1a4
[  143.896951] LR is at vchan_tx_submit+0x74/0x88
[  143.919187]  0
[  143.937607] pc : [<c070775c>]    lr : [<c0687fb4>]    psr: 20000113
[  143.937611] sp : c1313c98  ip : c103231c  fp : 000001f4
[  143.937615] r10: ffffffff  r9 : c2afe000  r8 : c12c0f40
[  143.937620] r7 : 00000000  r6 : c12c0f40  r5 : c12c0f84  r4 : c1c87640
[  143.937624] r3 : 000079c5  r2 : c2f92f38  r1 : 20000113  r0 : fffffff4
[  143.937628] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[  143.937632] Control: 10c5787d  Table: 44a6806a  DAC: 00000051
[  143.937638] Process kworker/1:4 (pid: 77, stack limit = 0x3e2370e7)
[  143.937642] Stack: (0xc1313c98 to 0xc1314000)
[  143.937649] 3c80:                                                       c1369040 00000000
[  143.937666] 3ca0: 0000001b 00000004 c12c0f40 c0708f28 00000000 c11e1cc0 00000001 c11e1cc0
[  143.937683] 3cc0: c1369040 c2afd800 00000000 0000be1b 0000be1b c2afd800 c12c0f40 00000000
[  143.940565]  0
[  143.947344] 3ce0: c1313dc8 c1369040 c07093f8 00000800 00000000 00000000 00000000 c1313dc8
[  143.947360] 3d00: c136911c c06f8c44 c190f200 0011edd8 00000040 00000000 c2afd800 00000800
[  143.947376] 3d20: 0000be1b 00000000 00000000 c2afd800 c2afd800 00000800 00000000 0000be1b
[  143.947391] 3d40: 00000000 00000000 00000000 00000040 00000000 c0dda304 00000000 00000000
[  143.947406] 3d60: 00000000 00000000 00000001 c1369040 c1800c00 c1313dc8 00000000 01f0d000
[  143.953672]  0
[  143.958282] 3d80: 00000000 00000000 00000000 c06e1648 c1313dc8 cdde0d40 0000b6ab 00000000
[  143.958298] 3da0: 00000000 0000d000 000000f8 00001000 00000000 c0ac128c 01f0d000 c06e175c
[  143.958314] 3dc0: c1313dc8 00000c40 00000000 00001000 00000000 00000000 00000000 00000000
[  143.958332] 3de0: c2afd000 00000000 00000030 00000004 c1920000 c07199e0 00001000 c1313e1c
[  143.963045]
[  143.967570] 3e00: c2afd000 c1920000 c1300800 c2afd000 c0a4af74 c0ac1258 00000001 c0716d50
[  143.967585] 3e20: 00000030 00000000 c1920000 c1300800 00000030 c2afd000 000000f8 00000001
[  143.967602] 3e40: 00000030 c0717710 00001000 01e3372f c0a3fef4 9ef5325f 00000000 c1b405b8
[  143.967617] 3e60: c1b40578 00001000 00013000 00000005 c1920000 c1300800 00000030 c0717acc
[  143.971809] Normal:
[  143.973475] 3e80: 0000c000 00001000 00000000 00000000 00001c00 c1312000 00000000 00000000
[  143.979671] 175*4kB
[  143.984844] 3ea0: 00008400 c1300800 c1b40578 00016c00 c1b404c0 c1920000 00000001 c0716124
[  143.990102] (ME)
[  143.996651] 3ec0: 0000d000 00016c00 00000000 00000000 ff7f8e00 0001f000 00008400 00018800
[  144.003224] 147*8kB
[  144.010366] 3ee0: 00000031 c1b40578 c1b404c0 c12fee00 00000000 c0721ce0 00016c00 00000000
[  144.010381] 3f00: c2eeaa00 c1b406d0 c1313f44 c1b40568 c1075c00 cdde0900 ff7f8e00 00000000
[  144.010397] 3f20: 00000040 00000000 c1312000 c0338c00 00000008 cdde0918 c1075c00 c1075c14
[  144.010413] 3f40: cdde0900 00000008 cdde0918 c0d03d00 cdde0ac0 c0338eec c0daeb48 c0d0c02c
[  144.016085] (ME)
[  144.022169] 3f60: c1075c00 c107a940 c107a6c0 00000000 c1312000 c0338e78 c1075c00 c1311ec4
[  144.022184] 3f80: c107a964 c033edd4 00000000 c107a6c0 c033ec78 00000000 00000000 00000000
[  144.022200] 3fa0: 00000000 00000000 00000000 c0300148 00000000 00000000 00000000 00000000
[  144.022218] 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  144.026679] 71*16kB
[  144.034847] 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[  144.034859] [<c070775c>] (submit_descs) from [<c0708f28>] (read_page_ecc+0x170/0x580)
[  144.034874] [<c0708f28>] (read_page_ecc) from [<c06f8c44>] (nand_read_oob+0x1d4/0x790)
[  144.034887] [<c06f8c44>] (nand_read_oob) from [<c06e1648>] (mtd_read_oob+0x94/0x168)
[  144.034897] [<c06e1648>] (mtd_read_oob) from [<c06e175c>] (mtd_read+0x40/0x5c)
[  144.034907] [<c06e175c>] (mtd_read) from [<c07199e0>] (ubi_io_read+0xd4/0x368)
[  144.043053] (UME)
[  144.051159] [<c07199e0>] (ubi_io_read) from [<c0717710>] (ubi_eba_read_leb+0xa8/0x3e0)
[  144.051170] [<c0717710>] (ubi_eba_read_leb) from [<c0717acc>] (ubi_eba_read_leb_sg+0x84/0x194)
[  144.051179] [<c0717acc>] (ubi_eba_read_leb_sg) from [<c0716124>] (ubi_leb_read_sg+0x94/0xd4)
[  144.051188] [<c0716124>] (ubi_leb_read_sg) from [<c0721ce0>] (ubiblock_do_work+0x94/0x134)
[  144.051204] [<c0721ce0>] (ubiblock_do_work) from [<c0338c00>] (process_one_work+0x1fc/0x474)
[  144.051214] [<c0338c00>] (process_one_work) from [<c0338eec>] (worker_thread+0x74/0x5d4)
[  144.052722] 33*32kB
[  144.060974] [<c0338eec>] (worker_thread) from [<c033edd4>] (kthread+0x15c/0x160)
[  144.060984] [<c033edd4>] (kthread) from [<c0300148>] (ret_from_fork+0x14/0x2c)
[  144.060988] Exception stack(0xc1313fb0 to 0xc1313ff8)
[  144.060998] 3fa0:                                     00000000 00000000 00000000 00000000
[  144.061012] 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  144.061025] 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[  144.069170] (UME)
[  144.077295] Code: e1550004 03a00000 0a000005 e594001c (e5903010)
[  144.077409] ---[ end trace 0f6ecaa7d5167b70 ]---

Ok, tested with a clean build. Same result as: AQL and the ath10k is *lovely* - #463 by sjpacket


Hi Sjpacket,

Did you have a chance to try the 21.02.1 image in your setup? That version was the last OpenWrt version using the round-robin scheduler before it was replaced by the VTBA scheduler in later versions. Other than the crash, I'm wondering whether you also encounter the same throughput drop problem in your setup when using 21.02.1.

Can you also clarify your throughput drops when adding more clients? Do you mean a drop in the total aggregate throughput of all the additional clients, or a drop in the throughput of a single client when additional clients are associated to the wifi network (but without sending traffic concurrently)?


Hello vochong,

I didn't test 21.02.1 version yet. I will try that later today.

The test I am running is:

  • connect 1 client to the AP (only associate)
  • no traffic is passed
  • run iperf3 from only first client
  • start associating other clients without sending traffic in a concurrent manner
  • only one client is sending traffic, all others are idle.

So the throughput drop I am observing is of a single client when additional clients are associated to the wifi network and the throughput recovers (increases) back to normal as I am disassociating the clients.

Also, it's not just when clients associate while the test is running. When more than 3 or 4 clients are associated before the test starts, the throughput starts off about 50% lower.

This is not happening with OpenWrt 19.07.

Hi sjpacket,

To help mitigate the high latency and random crash problems, I disabled AQL on my R7800 (running an old ACWIFIdude's 22.03 image 5.10.113 built on May 8 13:53:12 2022). I did notice that single client throughput tended to drop only a bit with other associated clients (about 6) having slight concurrent download speeds (e.g. watching youtube @720p and browsing web). With AQL being enabled, the speed drop (for the same single client) was more obvious in the same condition. It's possible that AQL and ATF (aka VTBA scheduler) are not supposed to be in a marriage :slight_smile: Perhaps their coliving for the past year has proved their irremediable incompatibility, and should end in an amicable departure from each other. Felix was their matchmaker and will also be their divorce lawyer :slight_smile:

For example (approximately):
450 Mbps -> 390 Mbps (AQL disabled) vs. 260 Mbps (AQL enabled).
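
To put the approximate figures above in relative terms, the drop can be computed with a small helper. This is only a sketch: the function name `drop_pct` is made up for illustration, and the inputs are the rough numbers quoted above, not new measurements.

```shell
#!/bin/sh
# drop_pct BASELINE MEASURED
# Prints the relative throughput drop in percent, with one decimal place.
# Hypothetical helper for illustration, not part of OpenWrt.
drop_pct() {
    awk -v base="$1" -v now="$2" \
        'BEGIN { printf "%.1f\n", (base - now) / base * 100 }'
}

# Approximate figures from the post above (Mbps):
# drop_pct 450 390   # AQL disabled
# drop_pct 450 260   # AQL enabled
```

With those inputs the drop is roughly 13% with AQL disabled and roughly 42% with AQL enabled.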

Before switching to 21.02.1, can you please do this experiment with your current image:

  • Disable AQL using these commands:

echo 0 > /sys/kernel/debug/ieee80211/phy0/aql_enable
echo 0 > /sys/kernel/debug/ieee80211/phy1/aql_enable

  • Repeat the previous speed tests again.
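
If the router has more or fewer radios than phy0 and phy1, the same toggle can be applied in a loop. A minimal sketch, assuming debugfs is mounted at the path used in the commands above; the function name `set_aql` and its optional directory argument are made up for illustration.

```shell
#!/bin/sh
# set_aql VALUE [DEBUGFS_DIR]
# Writes VALUE (0 = disable, 1 = enable) to aql_enable for every phy found.
# DEBUGFS_DIR defaults to the path used in the commands above; the optional
# argument exists only so the helper can be tried on a scratch directory.
# Hypothetical helper for illustration, not part of OpenWrt.
set_aql() {
    value="$1"
    dir="${2:-/sys/kernel/debug/ieee80211}"
    for f in "$dir"/phy*/aql_enable; do
        # Skip the unexpanded glob when no phy exposes aql_enable.
        [ -e "$f" ] || continue
        echo "$value" > "$f"
        # Read the file back so the new state is visible per radio.
        echo "$f: $(cat "$f")"
    done
}

# Usage on the router (as root):
# set_aql 0    # disable AQL on all radios
# set_aql 1    # re-enable it after the test
```

The setting does not survive a reboot, so it has to be reapplied before each test run.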

While we are dealing with a deeper set of bugs here, I live, die and swear by the outputs of the rtt_fair test suite: https://www.cs.kau.se/tohojo/airtime-fairness/

Is it possible your clients could be running netperf and we get those sort of tests back?

It is the comparison plots that are the most helpful. I note that with most wifi (without these patches), performance drops precipitously in the first place.


If it's possible for us to schedule a live videoconference or a meeting via IRC, I am in the PST time zone and available most mornings except Tuesdays, when I am available 8:30-9:30 AM only.


Hi Dave,

According to the original paper (https://www.usenix.org/system/files/conference/atc17/atc17-hoiland-jorgensen.pdf), it seems that the validation of the ATF scheduler implementation was done with the ath9k driver instead of the ath10k driver. Most of the reported issues so far seem to be related to ath10k. If that's the case, the peculiarities in the behavior of the ath10k driver/firmware may dictate more extensive modifications in the ATF implementation in order to accommodate them.

"We have implemented our proposed queueing scheme in
the Linux kernel, modifying the mac80211 subsystem to
include the queueing structure itself, and modifying the
ath9k and ath10k drivers for Qualcomm Atheros 802.11n
and 802.11ac chipsets to use the new queueing structure.
The airtime fairness scheduler implementation is limited
to the ath9k driver, as the ath10k driver lacks the required
scheduling hooks."

According to Felix' latest patch (after his numerous communications with Toke):

The virtual time scheduler code has a number of issues:
  - queues slowed down by hardware/firmware powersave handling were not properly handled.
  - on ath10k in push-pull mode, tx queues that the driver tries to pull from were starved, causing excessive latency
  - delay between tx enqueue and reported airtime use were causing excessively bursty tx behavior

The bursty behavior may also be present on the round-robin scheduler, but there it is much easier to fix without introducing additional regressions

Signed-off-by: Felix Fietkau <nbd@nbd.name>

@sjpacket, do the patches make any difference in your test compared to leaving them out without any other changes to the tree? Does it make a difference if you only use the first patch (330-mac80211-switch-airtime-fairness-back-to-deficit-rou.patch)?

Hi @nbd,

I haven't tested with applying one patch at a time. Will try that this week.

What I have noticed is that I do not see a crash when I switch to mainline ath10k. I see a crash as soon as I switch to ath10k-ct.

Results:
OpenWrt 22.03 + ath10k-ct + no patches = crash (OOM)
OpenWrt 22.03 + ath10k-ct + latest patches = crash (NULL pointer dereference)
OpenWrt 22.03 + ath10k + no patches = no crash
OpenWrt 22.03 + ath10k + latest patches = no crash (I see a quick dip around 8 mins or so but throughput recovers fine)

There are a lot of variables and patches to test :laughing: and I also need to make sure I don't have any issues in my test setup.


btw upstream ath10k firmware doesn't have powersave


Thanks, @sjpacket. The NULL pointer dereference crash that you pointed out also seems to be caused by OOM. It happens in a completely unrelated place, so I don't think it's directly related to my patch, except for the fact that my patch might slightly change memory usage patterns.
How is latency/throughput with my patch on mainline ath10k?
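
One rough way to check a log for the OOM pattern mentioned above is to count OOM-killer invocations and compare the Normal zone's free memory with its min watermark. In the log posted earlier, that zone line reads free:11108kB against min:16384kB. The sketch below only parses those two kinds of lines; the function name `oom_summary` and its output format are made up for illustration.

```shell
#!/bin/sh
# oom_summary: reads a kernel log on stdin and prints
#   - one line per "Normal free:... min:..." zone report, with a flag
#     telling whether free memory is below the min watermark
#   - the number of times the OOM killer was invoked
# Hypothetical helper for illustration, not part of OpenWrt.
oom_summary() {
    awk '
        /invoked oom-killer/ { kills++ }
        /Normal free:[0-9]+kB min:[0-9]+kB/ {
            line = $0
            # Drop everything up to "free:", then read the leading number.
            sub(/.*Normal free:/, "", line)
            free_kb = line + 0
            # Drop the free value and "min:", then read the next number.
            sub(/^[0-9]+kB min:/, "", line)
            min_kb = line + 0
            printf "normal_free_kb=%d normal_min_kb=%d below_min=%s\n",
                   free_kb, min_kb, (free_kb < min_kb ? "yes" : "no")
        }
        END { printf "oom_kills=%d\n", kills }
    '
}

# Usage on the router:
# dmesg | oom_summary
```

This does not prove that an Oops was caused by memory pressure; it only shows whether the log contains the OOM signs at all.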


Hi @vochong,
regarding ATF and AQL: AQL is essential for making ATF work well. If AQL significantly reduces throughput, then that's an important issue which needs to be fixed. Please let me know how well the current version with my patches works for you with AQL and ATF enabled.
