AQL and the ath10k is *lovely*

Huge thanks for calling my attention to this issue! I definitely experience the same behavior as described there and added a post to that thread as a result. It looks like great progress is being made there, so I will be following it for sure. Thanks again!


The OP in that issue requested minimal contributions from others. I think your post is fine, though, given you have similar hardware and symptoms. That said, it's best to create a new issue, as @greearb suggested.

I may be experiencing similar issues, possibly also related to Apple devices, but on different hardware (running the ath10k-firmware-qca99x0-ct-full-htt firmware).

While pinging from an iPhone 7, I do see some lost packets while also running an iperf3 test from the phone to the AP. However, my ping RTTs are much better than you report. Pings during an iperf3 run from a 2019 MacBook Air to the AP seem fine...

A few other observations, in case they help...

I have to be careful not to run an iperf test when my spouse is connected (on the same 5 GHz band, Windows 10 client) and doing a video conference, as the audio will cut out. I don't recall this behavior earlier this year (Jan-Feb time frame).

I initially got very similar symptoms (sporadic 1000+ ms ping RTTs) after upgrading an Ubuntu client from 18.04 to 20.04, but this turned out to be client related (some wifi power saving setting). Testing with the iPhone 7 plugged in vs. on battery, I don't see a difference, but perhaps there are other Apple device power saving features I could try.

I'm not sure if this is "ath10k AQL" related, but the Windows client video/voice behavior makes me suspicious.

HTH

EDIT: another "symptom" worth mentioning. I can no longer reliably use dslreports to test bufferbloat from any wifi client of the AP (I have a separate DIY x86 router running SQM). If my AP's wifi network is quiet, I can get results from a wifi client that match a test done over the wire (on the same AP).

However, if the wifi network is "busy", bandwidth sporadically drops off (sometimes by as much as half) midway through the test, and the test reports a B or C for bufferbloat. A "busy" network does not seem to give these results when testing on the wire.

In the Jan-March time frame, I could get straight A's using fq_codel and simple.qos on the router, testing with a wifi client, when the network was "busy." I'm trying cake/piece_of_cake now, but I don't think this is router/SQM related - I'm pretty sure it's happening upstream on the AP, for wifi clients only.


@huaracheguarache, please test the mac80211 commit from my staging tree at https://git.openwrt.org/?p=openwrt/staging/nbd.git;a=summary
Hopefully it will resolve your AQL latency issue without hurting the high-throughput case.
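In case it helps, a rough sketch of pulling that tree in for a test build (the clone URL is inferred from the gitweb link above, and the usual OpenWrt build steps are assumed):

# clone the staging tree and confirm the mac80211 commit is present
git clone https://git.openwrt.org/openwrt/staging/nbd.git
cd nbd
git log --oneline -5 package/kernel/mac80211

# then build an image the usual way
./scripts/feeds update -a && ./scripts/feeds install -a
make menuconfig
make -j$(nproc)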


Ok, it seems like your latest patch has fixed the regression. Here are the results I got with no patches (I reverted the patch that caused the regression):

To test the behaviour with multiple stations downloading I ran an iperf3 test on my smartphone at around 150 seconds, hence the latency spike and drop in throughput. Here's also a close-up of the area where the ping is more stable:

And these are the results I got with the original patch and the fix:

It actually looks a bit better than the test without any of the patches. And a close-up:

Which looks pretty good! What I don't really understand, though, is why the ping climbs so high when I run a concurrent download on my smartphone. This seems to be an issue with AQL, unrelated to your patches, which needs to be looked at.


@dtaht Any idea what might be happening during the part with the high latency? I'm running a build with the codel target lowered to 10 ms and aql_threshold set to 6000.
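For anyone who wants to poke at the AQL side without rebuilding: mac80211 appears to expose a couple of knobs through debugfs. A sketch, assuming a mainline-ish mac80211 (the value is just the one mentioned above):

# read the global AQL threshold (airtime, in microseconds)
cat /sys/kernel/debug/ieee80211/phy0/aql_threshold

# try the 6000 value mentioned above
echo 6000 > /sys/kernel/debug/ieee80211/phy0/aql_threshold

# fq/codel state for the phy
cat /sys/kernel/debug/ieee80211/phy0/aqm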


FWIW, after about 24 hours of use, the recent mac80211 patches are an improvement (but the lag and packet loss I saw before these patches were never as bad as others observed).

Is there a way to disable/enable "airtime fairness" at runtime? Looking at this thread, it does not look like there is anything I can adjust/tune at run time.
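One thing I may try (untested, and I'm assuming the knob exists in the OpenWrt mac80211 backport): mainline mac80211 seems to have an airtime_flags debugfs file, where bit 0 enables TX airtime accounting and bit 1 enables RX accounting, so writing 0 should effectively turn the airtime scheduler off:

# current flags (3 = account both TX and RX airtime)
cat /sys/kernel/debug/ieee80211/phy0/airtime_flags

# disable airtime accounting; write 3 to restore the default
echo 0 > /sys/kernel/debug/ieee80211/phy0/airtime_flags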

If I run simultaneous iperf (or netperf) tests from two different wifi clients on the same band/SSID, I've observed the "faster client" (farther from the AP, but still the fastest client on the AP) dropping in speed from 200+ Mbps down to 5 Mbps during the test. The slower client (closer to the AP) stays relatively constant at ~160 Mbps. A simultaneous irtt run during the combined iperf/netperf test, from either client to the local server(s) (wired to the AP), shows good RTTs and no packet loss. At equal distance from the AP, the faster client can do ~500 Mbps while the slower can do ~180 Mbps.

I'm running the tests with the network in use, so results vary, but I do see one client "suppressing" others more than I can justify by my WAN connection (30 Mbps down, ~3 Mbps up).

EDIT: additional testing, running irtt simultaneously with netperf on the client being slowed down (netperf also run simultaneously from a second, unaffected client), shows significant RTTs (> 1 s).

Output below (irtt and netperf were started on the affected client at about the same time; netperf was started on the second client as indicated in the output below).

nmba [10] $ ./go/bin/irtt client XXX.XXX.45.26
[Connecting] connecting to XXX.XXX.45.26
[XXX.XXX.45.26:2112] [Connected] connection established
seq=0 rtt=3.03ms rd=332.8ms sd=-329.8ms ipdv=n/a
seq=1 rtt=3.26ms rd=332ms sd=-328.7ms ipdv=227µs
seq=2 rtt=2.07ms rd=331.7ms sd=-329.7ms ipdv=1.2ms
seq=4 rtt=192ms rd=499.3ms sd=-307.2ms ipdv=n/a
seq=5 rtt=91.57ms rd=344.2ms sd=-252.6ms ipdv=100.5ms
seq=6 rtt=143.8ms rd=344.3ms sd=-200.4ms ipdv=52.25ms
seq=7 rtt=26.02ms rd=348ms sd=-322ms ipdv=117.8ms
seq=8 rtt=10.05ms rd=338.5ms sd=-328.4ms ipdv=15.97ms
seq=9 rtt=21.68ms rd=342.9ms sd=-321.2ms ipdv=11.63ms
seq=10 rtt=22.79ms rd=337.8ms sd=-315ms ipdv=1.11ms
seq=11 rtt=16.98ms rd=336.9ms sd=-319.9ms ipdv=5.81ms
seq=12 rtt=10.09ms rd=337.9ms sd=-327.9ms ipdv=6.89ms
seq=13 rtt=5.18ms rd=334.5ms sd=-329.3ms ipdv=4.91ms
seq=14 rtt=9.67ms rd=339ms sd=-329.4ms ipdv=4.48ms
seq=15 rtt=10.95ms rd=334.1ms sd=-323.2ms ipdv=1.28ms
seq=16 rtt=2.72ms rd=332.7ms sd=-330ms ipdv=8.23ms
seq=17 rtt=9.02ms rd=335.9ms sd=-326.9ms ipdv=6.31ms
seq=18 rtt=4.52ms rd=333.4ms sd=-328.9ms ipdv=4.5ms
seq=19 rtt=5.39ms rd=335.1ms sd=-329.7ms ipdv=870µs
seq=20 rtt=12.17ms rd=334.7ms sd=-322.6ms ipdv=6.78ms
seq=21 rtt=170.7ms rd=337.2ms sd=-166.5ms ipdv=158.5ms
seq=22 rtt=11.5ms rd=337.1ms sd=-325.6ms ipdv=159.2ms
seq=23 rtt=436.1ms rd=334.4ms sd=101.7ms ipdv=424.6ms
seq=24 rtt=631ms rd=345ms sd=286ms ipdv=194.9ms
seq=25 rtt=10.69ms rd=335.1ms sd=-324.4ms ipdv=620.3ms
seq=26 rtt=518.5ms rd=340.6ms sd=177.9ms ipdv=507.8ms
seq=27 rtt=11.68ms rd=337.1ms sd=-325.4ms ipdv=506.8ms
seq=28 rtt=208.4ms rd=339.1ms sd=-130.7ms ipdv=196.7ms
seq=29 rtt=117.3ms rd=336.5ms sd=-219.2ms ipdv=91.08ms
seq=30 rtt=646.4ms rd=340.4ms sd=306ms ipdv=529.2ms
seq=31 rtt=72.32ms rd=349.6ms sd=-277.3ms ipdv=574.1ms
seq=32 rtt=43.87ms rd=336.8ms sd=-292.9ms ipdv=28.46ms
seq=33 rtt=7.72ms rd=338.4ms sd=-330.7ms ipdv=36.15ms
seq=34 rtt=416.4ms rd=343.5ms sd=72.82ms ipdv=408.6ms
seq=35 rtt=4.89ms rd=333.8ms sd=-328.9ms ipdv=411.5ms
seq=36 rtt=39.52ms rd=335.4ms sd=-295.9ms ipdv=34.62ms
seq=37 rtt=44.65ms rd=344.2ms sd=-299.6ms ipdv=5.13ms
seq=38 rtt=15.45ms rd=343.1ms sd=-327.6ms ipdv=29.19ms
seq=39 rtt=6.77ms rd=336.6ms sd=-329.9ms ipdv=8.69ms
seq=40 rtt=279.5ms rd=346.2ms sd=-66.77ms ipdv=272.7ms
seq=41 rtt=64.47ms rd=337.7ms sd=-273.2ms ipdv=215ms
seq=42 rtt=1.21s rd=335.7ms sd=875.6ms ipdv=1.15s
seq=43 rtt=397.4ms rd=333.4ms sd=64.04ms ipdv=813.9ms
seq=44 rtt=416.3ms rd=335.1ms sd=81.23ms ipdv=18.93ms
seq=45 rtt=428.1ms rd=335.4ms sd=92.66ms ipdv=11.76ms
seq=46 rtt=181.4ms rd=334.3ms sd=-152.9ms ipdv=246.7ms
seq=47 rtt=371.3ms rd=334.4ms sd=36.86ms ipdv=189.9ms
seq=48 rtt=81.52ms rd=339.7ms sd=-258.2ms ipdv=289.8ms
seq=49 rtt=330.7ms rd=335ms sd=-4.23ms ipdv=249.2ms
seq=51 rtt=524.7ms rd=333ms sd=191.7ms ipdv=n/a
seq=52 rtt=306.7ms rd=336.6ms sd=-29.89ms ipdv=218.1ms
seq=53 rtt=50.8ms rd=337.5ms sd=-286.7ms ipdv=255.9ms
seq=54 rtt=113.5ms rd=333.9ms sd=-220.4ms ipdv=62.68ms
seq=55 rtt=199.2ms rd=335.4ms sd=-136.1ms ipdv=85.74ms
seq=56 rtt=721.9ms rd=337.9ms sd=384ms ipdv=522.6ms
seq=57 rtt=6.14ms rd=336.2ms sd=-330.1ms ipdv=715.7ms
seq=58 rtt=13.66ms rd=334.1ms sd=-320.4ms ipdv=7.52ms
[XXX.XXX.45.26:2112] [WaitForPackets] waiting 3.63s for final packets
seq=59 rtt=322.7ms rd=335.3ms sd=-12.57ms ipdv=309.1ms

                          Min      Mean    Median      Max   Stddev
                          ---      ----    ------      ---   ------
                RTT    2.07ms   173.1ms   47.72ms    1.21s  239.8ms
         send delay  -330.7ms  -167.2ms  -294.4ms  875.6ms  240.4ms
      receive delay   331.7ms   340.3ms   336.6ms  499.3ms  21.64ms
                                                                   
      IPDV (jitter)     227µs   198.3ms   91.08ms    1.15s    250ms
          send IPDV      55µs   195.9ms   84.27ms    1.15s  250.3ms
       receive IPDV    10.4µs    6.47ms    2.63ms  155.1ms  20.66ms
                                                                   
     send call time    30.5µs     123µs              498µs   84.1µs
        timer error     102µs    1.29ms             3.78ms    977µs
  server proc. time    6.04µs    11.1µs             26.9µs   3.29µs

                duration: 1m3s (wait 3.63s)
   packets sent/received: 60/58 (3.33% loss)
 server packets received: 58/60 (3.33%/0.00% loss up/down)
     bytes sent/received: 3600/3480
       send/receive rate: 488 bps / 469 bps
           packet length: 60 bytes
             timer stats: 0/60 (0.00%) missed, 0.13% error

### netperf output below, started approx. simultaneously with the irtt run above ###

nmba [63] $ netperf -l 60 -D 1s -H XXX.XXX.45.26
MIGRATED TCP STREAM TEST from (null) (XXX.XXX.0.0) port 0 AF_INET to (null) () port 0 AF_INET : demo
Interim result:   67.70 10^6bits/s over 2.494 seconds ending at 1597509441.475
Interim result:  221.78 10^6bits/s over 1.012 seconds ending at 1597509442.486
Interim result:  250.37 10^6bits/s over 1.005 seconds ending at 1597509443.492
Interim result:  237.70 10^6bits/s over 1.041 seconds ending at 1597509444.533
Interim result:  260.36 10^6bits/s over 1.011 seconds ending at 1597509445.544
Interim result:  225.17 10^6bits/s over 1.160 seconds ending at 1597509446.703
Interim result:  241.81 10^6bits/s over 1.041 seconds ending at 1597509447.744
Interim result:  246.44 10^6bits/s over 1.013 seconds ending at 1597509448.757
Interim result:  182.91 10^6bits/s over 1.399 seconds ending at 1597509450.155
Interim result:  230.01 10^6bits/s over 1.012 seconds ending at 1597509451.167
Interim result:  167.48 10^6bits/s over 1.377 seconds ending at 1597509452.545
Interim result:  185.26 10^6bits/s over 1.002 seconds ending at 1597509453.547
Interim result:  198.51 10^6bits/s over 1.014 seconds ending at 1597509454.561
Interim result:  177.08 10^6bits/s over 1.113 seconds ending at 1597509455.674

### started netperf on second client approx. here ###
### no output was redacted here, just these comments inserted ###

Interim result:    7.45 10^6bits/s over 23.779 seconds ending at 1597509479.453
Interim result:    0.38 10^6bits/s over 19.532 seconds ending at 1597509498.986
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

131072 131072 131072    63.88      53.30   

### netperf output from second client ###

(p383.mkl) [16] $ netperf -l 60 -D 1s -H XXX.XXX.45.26
MIGRATED TCP STREAM TEST from XXX.XXX.0.0 (XXX.XXX.0.0) port 0 AF_INET to XXX.XXX.45.26 () port 0 AF_INET : demo
Interim result:   77.73 10^6bits/s over 2.054 seconds ending at 1597509457.651
Interim result:  145.66 10^6bits/s over 1.028 seconds ending at 1597509458.679
Interim result:  156.10 10^6bits/s over 1.019 seconds ending at 1597509459.697
Interim result:  148.71 10^6bits/s over 1.050 seconds ending at 1597509460.747
Interim result:  154.39 10^6bits/s over 1.009 seconds ending at 1597509461.755
Interim result:  158.68 10^6bits/s over 1.014 seconds ending at 1597509462.770
Interim result:  149.30 10^6bits/s over 1.062 seconds ending at 1597509463.832
Interim result:  159.60 10^6bits/s over 1.027 seconds ending at 1597509464.859
Interim result:  163.26 10^6bits/s over 1.053 seconds ending at 1597509465.911
Interim result:  164.67 10^6bits/s over 1.017 seconds ending at 1597509466.928
Interim result:  147.25 10^6bits/s over 1.118 seconds ending at 1597509468.046
Interim result:  162.72 10^6bits/s over 1.010 seconds ending at 1597509469.057
Interim result:  157.13 10^6bits/s over 1.036 seconds ending at 1597509470.093
Interim result:  150.34 10^6bits/s over 1.044 seconds ending at 1597509471.137
Interim result:  156.09 10^6bits/s over 1.039 seconds ending at 1597509472.176
Interim result:  171.78 10^6bits/s over 1.038 seconds ending at 1597509473.214
Interim result:  156.11 10^6bits/s over 1.101 seconds ending at 1597509474.315
Interim result:  144.44 10^6bits/s over 1.081 seconds ending at 1597509475.396
Interim result:  149.38 10^6bits/s over 1.057 seconds ending at 1597509476.453
Interim result:  167.97 10^6bits/s over 1.051 seconds ending at 1597509477.504
Interim result:  166.29 10^6bits/s over 1.041 seconds ending at 1597509478.545
Interim result:  144.20 10^6bits/s over 1.153 seconds ending at 1597509479.699
Interim result:  157.44 10^6bits/s over 1.037 seconds ending at 1597509480.736
Interim result:  163.50 10^6bits/s over 1.013 seconds ending at 1597509481.749
Interim result:  158.93 10^6bits/s over 1.029 seconds ending at 1597509482.779
Interim result:  158.69 10^6bits/s over 1.061 seconds ending at 1597509483.840
Interim result:  168.49 10^6bits/s over 1.009 seconds ending at 1597509484.849
Interim result:  164.34 10^6bits/s over 1.056 seconds ending at 1597509485.905
Interim result:  168.65 10^6bits/s over 1.049 seconds ending at 1597509486.954
Interim result:  170.73 10^6bits/s over 1.030 seconds ending at 1597509487.984
Interim result:  159.83 10^6bits/s over 1.068 seconds ending at 1597509489.051
Interim result:  167.36 10^6bits/s over 1.053 seconds ending at 1597509490.104
Interim result:  169.60 10^6bits/s over 1.056 seconds ending at 1597509491.161
Interim result:  168.03 10^6bits/s over 1.009 seconds ending at 1597509492.170
Interim result:  163.91 10^6bits/s over 1.025 seconds ending at 1597509493.195
Interim result:  164.34 10^6bits/s over 1.068 seconds ending at 1597509494.263
Interim result:  169.86 10^6bits/s over 1.056 seconds ending at 1597509495.319
Interim result:  170.20 10^6bits/s over 1.045 seconds ending at 1597509496.364
Interim result:  161.42 10^6bits/s over 1.070 seconds ending at 1597509497.434
Interim result:  175.67 10^6bits/s over 1.065 seconds ending at 1597509498.499
Interim result:  174.19 10^6bits/s over 1.008 seconds ending at 1597509499.507
Interim result:  166.29 10^6bits/s over 1.048 seconds ending at 1597509500.554
Interim result:  171.50 10^6bits/s over 1.025 seconds ending at 1597509501.579
Interim result:  179.17 10^6bits/s over 1.013 seconds ending at 1597509502.592
Interim result:  166.87 10^6bits/s over 1.074 seconds ending at 1597509503.666
Interim result:  160.98 10^6bits/s over 1.056 seconds ending at 1597509504.722
Interim result:  170.06 10^6bits/s over 1.041 seconds ending at 1597509505.763
Interim result:  162.35 10^6bits/s over 1.087 seconds ending at 1597509506.850
Interim result:  168.24 10^6bits/s over 1.064 seconds ending at 1597509507.914
Interim result:  166.26 10^6bits/s over 1.048 seconds ending at 1597509508.962
Interim result:  168.52 10^6bits/s over 1.046 seconds ending at 1597509510.008
Interim result:  170.65 10^6bits/s over 1.038 seconds ending at 1597509511.046
Interim result:  172.41 10^6bits/s over 1.034 seconds ending at 1597509512.080
Interim result:  167.24 10^6bits/s over 1.031 seconds ending at 1597509513.110
Interim result:  165.13 10^6bits/s over 1.013 seconds ending at 1597509514.123
Interim result:  171.17 10^6bits/s over 1.048 seconds ending at 1597509515.172
Interim result:  189.94 10^6bits/s over 0.426 seconds ending at 1597509515.597
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

131072  16384  16384    60.19     159.52

I also want to know if there's a way to disable AQL for testing. With SQM I don't have any bufferbloat over wired connections, but on 5 GHz wifi I get an initial latency spike during the dslreports test.

Edit: the router is a hAP ac2 (IPQ4018) with the ath10k-ct-smallbuffers driver. I also have native IPv6 from my ISP, but the tests are done over IPv4.
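One crude runtime approach might be to raise the AQL limits so high they never bite (untested; this assumes the aql_txq_limit debugfs file from mainline mac80211 is present, and that it takes "AC low_limit high_limit" on write, with best effort being AC 2):

# show the per-AC AQL queue limits
cat /sys/kernel/debug/ieee80211/phy0/aql_txq_limit

# effectively neuter AQL on the best-effort queue
echo "2 10000000 10000000" > /sys/kernel/debug/ieee80211/phy0/aql_txq_limit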

Repeating the tests in my prior post after this commit.

There seems to be an improvement; however, I still see results like those below when running simultaneous netperf from two clients on the same 5 GHz band.

nmba [6] $ netperf -l 60 -D 1s -H XXX.XXX.45.26
MIGRATED TCP STREAM TEST from (null) (0.0.0.0) port 0 AF_INET to (null) () port 0 AF_INET : demo
Interim result:  178.96 10^6bits/s over 1.394 seconds ending at 1599148232.795
Interim result:  166.30 10^6bits/s over 1.078 seconds ending at 1599148233.874
Interim result:  192.22 10^6bits/s over 1.004 seconds ending at 1599148234.877
Interim result:  192.60 10^6bits/s over 1.024 seconds ending at 1599148235.901
Interim result:  198.60 10^6bits/s over 1.024 seconds ending at 1599148236.925
Interim result:  193.27 10^6bits/s over 1.031 seconds ending at 1599148237.956
Interim result:  200.07 10^6bits/s over 1.022 seconds ending at 1599148238.978
Interim result:  186.72 10^6bits/s over 1.067 seconds ending at 1599148240.045
Interim result:  183.12 10^6bits/s over 1.019 seconds ending at 1599148241.064
Interim result:  198.59 10^6bits/s over 1.008 seconds ending at 1599148242.073
Interim result:  201.03 10^6bits/s over 1.007 seconds ending at 1599148243.079
#
# netperf started on second client here
#
Interim result:   44.47 10^6bits/s over 4.504 seconds ending at 1599148247.583
Interim result:   35.79 10^6bits/s over 1.260 seconds ending at 1599148248.843
Interim result:   36.44 10^6bits/s over 1.007 seconds ending at 1599148249.850
Interim result:   40.88 10^6bits/s over 1.052 seconds ending at 1599148250.902
Interim result:   25.45 10^6bits/s over 1.607 seconds ending at 1599148252.509
Interim result:    7.18 10^6bits/s over 3.506 seconds ending at 1599148256.015
Interim result:   31.02 10^6bits/s over 1.115 seconds ending at 1599148257.130
Interim result:   30.40 10^6bits/s over 1.104 seconds ending at 1599148258.234
Interim result:   23.63 10^6bits/s over 1.287 seconds ending at 1599148259.520
Interim result:   35.42 10^6bits/s over 1.036 seconds ending at 1599148260.557
Interim result:   57.03 10^6bits/s over 1.011 seconds ending at 1599148261.568
Interim result:   40.07 10^6bits/s over 1.413 seconds ending at 1599148262.981
Interim result:   21.04 10^6bits/s over 1.894 seconds ending at 1599148264.874
Interim result:   23.64 10^6bits/s over 1.064 seconds ending at 1599148265.939
Interim result:   34.36 10^6bits/s over 1.007 seconds ending at 1599148266.946
Interim result:   21.80 10^6bits/s over 1.587 seconds ending at 1599148268.533
Interim result:   20.48 10^6bits/s over 1.075 seconds ending at 1599148269.608
Interim result:   53.36 10^6bits/s over 1.120 seconds ending at 1599148270.728
Interim result:   54.03 10^6bits/s over 1.339 seconds ending at 1599148272.067
Interim result:   39.15 10^6bits/s over 1.366 seconds ending at 1599148273.433
Interim result:    3.58 10^6bits/s over 11.134 seconds ending at 1599148284.567
Interim result:  126.80 10^6bits/s over 1.009 seconds ending at 1599148285.576
Interim result:  167.09 10^6bits/s over 1.010 seconds ending at 1599148286.586
Interim result:  179.32 10^6bits/s over 1.017 seconds ending at 1599148287.604
Interim result:   87.72 10^6bits/s over 2.032 seconds ending at 1599148289.636
Interim result:   92.16 10^6bits/s over 1.001 seconds ending at 1599148290.637
Interim result:  168.93 10^6bits/s over 0.770 seconds ending at 1599148291.407
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

131072 131072 131072    60.04      68.58 

Irtt results run simultaneously with the two netperf sessions look ok...

                          Min      Mean    Median      Max   Stddev
                          ---      ----    ------      ---   ------
                RTT    1.66ms   26.55ms   13.48ms  113.2ms  28.13ms
         send delay  -92.54ms  -72.97ms  -85.33ms  18.14ms  27.77ms
      receive delay   93.43ms   99.52ms    97.5ms  126.3ms   6.47ms
                                                                   
      IPDV (jitter)     139µs   22.24ms   11.43ms  107.3ms  27.16ms
          send IPDV    66.3µs    19.8ms    8.17ms  108.7ms   27.6ms
       receive IPDV     101µs    4.31ms    1.86ms  27.81ms   5.91ms
                                                                   
     send call time    26.3µs     114µs             1.18ms    188µs
        timer error    29.9µs    1.13ms             3.51ms    777µs
  server proc. time    7.33µs    10.1µs             25.7µs   2.32µs

                duration: 59.3s (wait 339.7ms)
   packets sent/received: 60/59 (1.67% loss)
 server packets received: 59/60 (1.67%/0.00% loss up/down)
     bytes sent/received: 3600/3540
       send/receive rate: 488 bps / 479 bps
           packet length: 60 bytes
             timer stats: 0/60 (0.00%) missed, 0.11% error

Multiple-client netperf/irtt tests on the 2.4 GHz band show similar features, but less pronounced.

Maybe this helps?

The unit of the return value of ieee80211_get_rate_duration is nanoseconds, not milliseconds.

Regardless, I appreciate the effort.

FWIW, I get similar results independent of the CT firmware version. The results above are with
firmware-5-ct-full-htt-mgt-community-12.bin-lede.019
which is the latest in master.

I just tested again with
firmware-5-ct-htt-mgt-community-qcache.bin
dated 08/27/2020 (from here - I believe a beta qcache-(re)enabled firmware) and get similar results, so I don't think this is related to the "qcache" observations reported on @greearb's GitHub site.

EDIT: my wifi chipset is the "9980", i.e. not the one in the r7800...

EDIT 1: after about 6 hours of uptime, the beta "qcache" firmware crashed. Moving on...

TBH, wifi feels faster and latency no longer fluctuates erratically on the fast.com test; I haven't done extensive testing, though.

Have you tried the same tests on a non-CT build?

No.

However, I was just waiting for that question... :grinning:

If you can suggest a non-CT firmware for the 9980 (newer than the kvalo one here) that I could use with a recent ath10k driver on OpenWrt, please let me know.

Note:
the kvalo firmware for the 9980 is 5 years old - last time I tried it, I had no end of issues (but there might have been other contributing factors)

I've "binwalked" 3 recent stock firmwares and the only folder that has something ath10k firmware like is
/lib/firmware/AR900B/hw.2
which looks like:

athwlan.bin
athwlan.codeswap.bin
boarddata_0.bin
boarddata_1.bin
boardData_AR900B_CUS238_5GMipiHigh_v2_004.bin
... (a bunch more boardData_* files)
otp.bin
utf.bin
utf.codeswap.bin
waltest.codeswap.bin

I seem to recall asking about this several years ago, and I suspect I'd have to "stitch" some of the .bins together to get a working firmware. Given my lack of knowledge about the firmware, this would likely be a lengthy trial-and-error process that may yield no working firmware.

BEGIN EDIT
OT but for anyone else who might find this post later...
I had to use the "wayback machine" to call up

https://wireless.kernel.org/en/users/Drivers/ath10k/firmware

which has this quote:

Firmware API 2

Embedding both firmware and otp images into same file firmware-2.bin. Firmware meta data provided through FW IE. Added in commit 1a222435a dated Sep 27 2013, for Linux 3.13.

END EDIT

Lastly, I did see a comment on the dd-wrt forums about an updated firmware for the 9980 as recently as a few months ago - I haven't checked that yet, mostly because I'd rather stick with firmware for which I can get support.

I'm not familiar with the 9980, personally. But do you have the option to just select Kernel modules > Wireless Drivers > kmod-ath10k along with Firmware > ath10k-firmware-qca99x0 (again, the non-CT variant) for your build? Since the issue I posted in the CT firmware GitHub, I have moved back to the non-CT ath10k and have been having great results with it at the present time (running master snapshot builds).


Apologies for the long conversation, but this issue is starting to be a problem for me, and I appreciate the opportunity to talk about it.

I'll look again, but last time I checked, that uses the "kvalo" firmware. As I have not tried it in some time, I can try it to see if it will work long enough for me to see a difference.

Even if it works, I think it's possible the "issue" originates outside the ath10k-ct driver/firmware (i.e. it might originate in mac80211 code that only presents when it tries to use airtime fairness with a firmware that supports it - like CT).

Running a non-CT driver and firmware, do you see any output if you run:

cat /sys/kernel/debug/ieee80211/phy0/netdev\:wlan0/stations/*/airtime
cat /sys/kernel/debug/ieee80211/phy1/netdev\:wlan1/stations/*/airtime

? ref. here

I do see that output because AQL is implemented in the mac80211 stack regardless of the firmware, right? (I could be wrong about that...)

root@OpenWrt:~# cat /sys/kernel/debug/ieee80211/phy0/netdev\:wlan0/stations/*/airtime
RX: 0 us
TX: 75713662 us
Weight: 256
Deficit: VO: 5 us VI: -215 us BE: 213 us BK: 127 us
RX: 0 us
TX: 21855523 us
Weight: 256
Deficit: VO: -1457 us VI: 256 us BE: -22 us BK: -92 us
RX: 0 us
TX: 1402354 us
Weight: 256
Deficit: VO: 134 us VI: -354 us BE: -63 us BK: -727 us
RX: 0 us
TX: 8775706 us
Weight: 256
Deficit: VO: 92 us VI: -55 us BE: -229 us BK: -90 us
RX: 0 us
TX: 65495711 us
Weight: 256
Deficit: VO: -15 us VI: 59 us BE: -345 us BK: -370 us
RX: 0 us
TX: 24782803 us
Weight: 256
Deficit: VO: -160 us VI: 129 us BE: -996 us BK: 48 us
RX: 0 us
TX: 62467683 us
Weight: 256
Deficit: VO: -18 us VI: 118 us BE: -63 us BK: 120 us
RX: 0 us
TX: 2454780 us
Weight: 256
Deficit: VO: 121 us VI: -58 us BE: 128 us BK: 69 us
RX: 0 us
TX: 247745338 us
Weight: 256
Deficit: VO: -202 us VI: -38 us BE: -354 us BK: -488 us
root@OpenWrt:~# cat /sys/kernel/debug/ieee80211/phy1/netdev\:wlan1/stations/*/airtime
RX: 0 us
TX: 1167 us
Weight: 256
Deficit: VO: -48 us VI: 256 us BE: -95 us BK: 256 us

Thanks for that.

Based on prior posts (most of them in this thread, above), I got the impression that support is needed in both the ath10k(-ct) driver/firmware and mac80211.

After a brief scan of the initial commit, most patches are to files in net/mac80211. There is only one line added to drivers/net/wireless/ath/ath10k/mac.c:

+	wiphy_ext_feature_set(ar->hw->wiphy, NL80211_EXT_FEATURE_AQL);

which I think is the clue I needed to "disable AQL" for a test.
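For anyone repeating this, a sketch of locating that line in an OpenWrt tree before commenting it out (the paths are assumptions and vary by target, kernel version, and whether you build ath10k or ath10k-ct):

# mac80211/ath10k sources end up in the extracted backports dir...
grep -rn "NL80211_EXT_FEATURE_AQL" build_dir/target-*/linux-*/backports-*/drivers/net/wireless/ath/ath10k/mac.c

# ...or, for ath10k-ct, in its own package build dir
grep -rn "NL80211_EXT_FEATURE_AQL" build_dir/target-*/linux-*/ath10k-ct-*/ath10k*/mac.c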

The firmware is a black box, so I have no clue whether it has/needs functionality to support AQL.

I have a few things to try and play with, just not enough time to test due to "remote learning" starting up next week for my children. I may have to revert to something stable after that if this continues.

FWIW, it looks like ath10k-ct 5.8 is coming (mailing list here), but if I interpret @greearb's comment

If you are trying the 5.8 version, then it is almost completely untested and may be full of bugs.

then "master" may continue to be buggy for a while yet.


I think you are indeed correct about -ct firmware + driver being necessary to support AQL.

I switched back to the -ct firmware + driver on my build. But I did some testing I had not done before, and it appears the -ct-htt variant of the firmware is the one causing me issues at the moment. When I switch away from the -ct-htt variant to the plain -ct variant, the odd latency issues I was experiencing seem to go away. I had always been using the -ct-htt variant prior to now.

I'll run with the -ct variant for a couple days and see how things look, but thankfully you got me thinking on this again. :slight_smile:


That's a great tip for me... I've always used the -htt variant. I'll try the non-htt one and see if I get an improvement.

Based on what I've read, I think AQL should be functional with a non-CT driver/firmware (at least for the r7800), but I'm not the right person to confirm that.

EDIT: I just loaded the non-htt variant and re-tested (as I have done above), and I still see the same undesirable (AQL) behaviour. irtt results seem little changed. I may have made some progress finding a non-CT firmware to try, but won't be able to test that for a bit. Regardless, I really appreciate the tip.


Yeah, something just doesn’t feel right with the -ct firmware right now. I take a substantial throughput hit over the standard ath10k, even with the non HTT variant. I’ll probably be switching back to my non-CT build in a little while.

Sorry this non-HTT route didn’t work out better for the both of us :frowning:

So I built and loaded the ath10k driver after commenting out the line
wiphy_ext_feature_set(ar->hw->wiphy, NL80211_EXT_FEATURE_AQL);
in mac.c

Running with that, there are no aql files in
/sys/kernel/debug/ieee80211/phy0/netdev\:wlan0/stations/*/aql when stations are connected to phy0; however,
cat /sys/kernel/debug/ieee80211/phy0/netdev\:wlan0/stations/*/airtime
still returns an output.

Repeating the tests above "sans AQL" gives similar results, so it seems my observations are more airtime-fairness related than AQL related.

What makes a bigger difference is client/AP location (with AQL enabled in the ath10k driver in this case). Airtime fairness seems to behave more as I'd expect when all clients have line of sight to the AP, i.e. the faster clients stay fast while slower clients take a hit. Even then, I'm still surprised at how much of a bandwidth hit the slow clients take when I start simultaneous netperf sessions from 2-3 clients. It still seems like some clients come to a complete stop briefly. This may happen more frequently with increased router uptime - I'll need to test over an extended period.

As an aside, most of the testing I've done is with 80 MHz wide channels on the 5 GHz band (channel 36). I tried 40 MHz width and generally see similar behavior, just not as pronounced.

Lastly, I'd really like to adjust the airtime station weights with a command like

r7500v2 # iw dev wlan0 station set <MAC address> airtime_weight 26
command failed: Not supported (-95)

but I only get the output above. Perhaps this hostapd commit will enable that (but I'm likely just misunderstanding).
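In the meantime, a quick way to sanity-check what the stack claims to support (assuming a reasonably recent iw; AQL and airtime fairness show up under the phy's extended features when advertised):

# list the phy's extended features and filter for AQL / airtime fairness
iw phy phy0 info | grep -i -E "aql|airtime"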

There hasn't been much video conferencing going on in my household since the latest mac80211 updates, so I'll need to give this some time to see if it's really improved now or not. That, and I'd still like to try a non-CT firmware just to see if it makes a difference.