AQL and the ath10k is *lovely*

FWIW, I've been running a April 2 or 4 snapshot on my TP-Link C7 (its an AP only in my system) since, well that day... with little issue. Seems the occasional 2.4ghz radio dropout thats been plaguing C7 users in the past few stable revisions hasn't been happening as much, only a couple times, vs every 2-5days it seemed with 19.07.2

This wont have Dave's patch changes, but was a few days past release of the basic ath10k radio stuff, and it's nice to see my 5ghz radio w only 20-30ms latency finally!

Are these AQL tweaks included in all snapshot builds using ath10k-ct devices or do we need to add them? I use @hnyman community build for the R7800.

Are you referring to the patch here?

I guess so, just wondering if these improvements will be in the snapshot builds I'm using or do I need to manually add them somehow.

The first patches at the top are in the snapshots, yes.

In my case I moved onto the 5.4 build, as it was easier to backport stuff to it. But I have no major results to report on that besides progress on the bug. If you are doing your own builds, I would certainly like to see 5.4 tested harder.

iff --git a/target/linux/ath79/Makefile b/target/linux/ath79/Makefile
index 9b203cf48e..a955602ba9 100644
--- a/target/linux/ath79/Makefile
+++ b/target/linux/ath79/Makefile
@@ -8,7 +8,7 @@ SUBTARGETS:=generic mikrotik nand tiny

FEATURES:=ramdisk

-KERNEL_PATCHVER:=4.19
+KERNEL_PATCHVER:=5.4
KERNEL_TESTING_PATCHVER:=5.4

include $(INCLUDE_DIR)/target.mk

What is the point of adding DQL to ath10k if AQL already exists? From: https://patchwork.kernel.org/patch/11512397/#23316961

@dtaht @tohojo

It also doesn't seem to work for me on ath10k "classic"/non-ct. I get the same error as another reported in Patchworks with ath10k_pcie saying:

failed to increase tx pending count: -16, dropping

Similar hardware since I have an 2x Archer A7v5s (QCA988X; ac Wave 1).

Linking it here in case anyone else wants to try.

diff --git a/drivers/net/wireless/ath/ath10k/core.c b/drivers/net/wireless/ath/ath10k/core.c
index f26cc6989dad..72771ff38a94 100644
--- a/drivers/net/wireless/ath/ath10k/core.c
+++ b/drivers/net/wireless/ath/ath10k/core.c
@@ -2497,6 +2497,10 @@ static int ath10k_core_init_firmware_features(struct ath10k *ar)
 		return -EINVAL;
 	}
 
+	dql_init(&ar->htt.dql, HZ);
+	ar->htt.dql.max_limit = ar->htt.max_num_pending_tx;
+	ar->htt.dql.min_limit = 8;
+
 	if (ar->hw_params.num_peers)
 		ar->max_num_peers = ar->hw_params.num_peers;
 	else
diff --git a/drivers/net/wireless/ath/ath10k/htt.h b/drivers/net/wireless/ath/ath10k/htt.h
index 4a12564fc30e..19024d063896 100644
--- a/drivers/net/wireless/ath/ath10k/htt.h
+++ b/drivers/net/wireless/ath/ath10k/htt.h
@@ -13,6 +13,7 @@
 #include <linux/dmapool.h>
 #include <linux/hashtable.h>
 #include <linux/kfifo.h>
+#include <linux/dynamic_queue_limits.h>
 #include <net/mac80211.h>
 
 #include "htc.h"
@@ -1965,8 +1966,8 @@ struct ath10k_htt {
 	/* Protects access to pending_tx, num_pending_tx */
 	spinlock_t tx_lock;
 	int max_num_pending_tx;
-	int num_pending_tx;
 	int num_pending_mgmt_tx;
+	struct dql dql;
 	struct idr pending_tx;
 	wait_queue_head_t empty_tx_wq;
 
diff --git a/drivers/net/wireless/ath/ath10k/htt_tx.c b/drivers/net/wireless/ath/ath10k/htt_tx.c
index e9d12ea708b6..911a79470bdf 100644
--- a/drivers/net/wireless/ath/ath10k/htt_tx.c
+++ b/drivers/net/wireless/ath/ath10k/htt_tx.c
@@ -144,8 +144,8 @@ void ath10k_htt_tx_dec_pending(struct ath10k_htt *htt)
 {
 	lockdep_assert_held(&htt->tx_lock);
 
-	htt->num_pending_tx--;
-	if (htt->num_pending_tx == htt->max_num_pending_tx - 1)
+	dql_completed(&htt->dql, 1);
+	if (dql_avail(&htt->dql) > 0)
 		ath10k_mac_tx_unlock(htt->ar, ATH10K_TX_PAUSE_Q_FULL);
 }
 
@@ -153,11 +153,11 @@ int ath10k_htt_tx_inc_pending(struct ath10k_htt *htt)
 {
 	lockdep_assert_held(&htt->tx_lock);
 
-	if (htt->num_pending_tx >= htt->max_num_pending_tx)
+	if (dql_avail(&htt->dql) <= 0)
 		return -EBUSY;
 
-	htt->num_pending_tx++;
-	if (htt->num_pending_tx == htt->max_num_pending_tx)
+	dql_queued(&htt->dql, 1);
+	if (dql_avail(&htt->dql) <= 0)
 		ath10k_mac_tx_lock(htt->ar, ATH10K_TX_PAUSE_Q_FULL);
 
 	return 0;
diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
index 2d03b8dd3b8c..1fe251742b0a 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -3998,7 +3998,7 @@ static bool ath10k_mac_tx_can_push(struct ieee80211_hw *hw,
 	if (ar->htt.tx_q_state.mode == HTT_TX_MODE_SWITCH_PUSH)
 		return true;
 
-	if (ar->htt.num_pending_tx < ar->htt.tx_q_state.num_push_allowed)
+	if (dql_avail(&ar->htt.dql) < ar->htt.tx_q_state.num_push_allowed)
 		return true;
 
 	if (artxq->num_fw_queued < artxq->num_push_allowed)
@@ -4159,7 +4159,7 @@ void ath10k_mac_tx_push_pending(struct ath10k *ar)
 	if (ar->htt.tx_q_state.mode != HTT_TX_MODE_SWITCH_PUSH)
 		return;
 
-	if (ar->htt.num_pending_tx >= (ar->htt.max_num_pending_tx / 2))
+	if (dql_avail(&ar->htt.dql) < (ar->htt.dql.limit / 2))
 		return;
 
 	rcu_read_lock();
@@ -7160,7 +7160,7 @@ void ath10k_mac_wait_tx_complete(struct ath10k *ar)
 			bool empty;
 
 			spin_lock_bh(&ar->htt.tx_lock);
-			empty = (ar->htt.num_pending_tx == 0);
+			empty = (ar->htt.dql.num_completed == ar->htt.dql.num_queued);
 			spin_unlock_bh(&ar->htt.tx_lock);
 
 			skip = (ar->state == ATH10K_STATE_WEDGED) ||
diff --git a/drivers/net/wireless/ath/ath10k/txrx.c b/drivers/net/wireless/ath/ath10k/txrx.c
index 39abf8b12903..fe7cd53c2bf9 100644
--- a/drivers/net/wireless/ath/ath10k/txrx.c
+++ b/drivers/net/wireless/ath/ath10k/txrx.c
@@ -80,7 +80,7 @@ int ath10k_txrx_tx_unref(struct ath10k_htt *htt,
 
 	ath10k_htt_tx_free_msdu_id(htt, tx_done->msdu_id);
 	ath10k_htt_tx_dec_pending(htt);
-	if (htt->num_pending_tx == 0)
+	if (htt->dql.num_completed == htt->dql.num_queued)
 		wake_up(&htt->empty_tx_wq);
 	spin_unlock_bh(&htt->tx_lock);

You don't need this patch, simply 'make menuconfig -> Global build settings -> Use the testing kernel version'

I am currently running this patch that Toke proposed. It is running well for me on my R7800 with ath10k-ct firmware and ath10k-ct-smallbuffers driver. I also noted this in that same thread.

My question to you (and others): are you willing to switch to the ath10k-ct and try it out?

Here’s an example of the difference I am seeing pre-Toke patch and after:


Notice the pre-Toke patch latency spikes that are absent with the patch.

What did you do to adopt it to CT? Did you put it in the package/kernel/ath10k-ct/patches folder and then change the file references?

I mean I don't really mind trying CT v. Non-CT, I just find that Non-CT works better for Wave 1 devices in terms of throughput and stability.

Especially since Ben focuses more on Wave 2, 10.4 firmware devices.

1 Like

The information I shared here contained incorrect paths. I have removed it for now so as to eliminate any confusion it may cause.

1 Like

I am beginning to suspect that the osx box I have been using for testing is actually tossing at least some of it's data into the VI queue, and that is the source of the rrul_be asymmetry. I don't know if I can take actual caps of the 802.11ac air with an ath10k with three antennas at the moment.

As this "optimization" only grants the osx box more bandwidth for itself, and most (but not all) of the
current airtime scheduler is optimized for switching between stations fairly, I think if I add another non-apple box to this test, that I'll see a result closer to what I would expect.

yea, those spikes look very much like what overfilling the tx ring looks like, and the resulting collapse.

I wrote up what that used to look like on ethernet in my "Backpressure" comment over here. https://www.reddit.com/r/HomeNetworking/comments/g4upyw/how_to_determine_if_a_wireless_router_or_access/

There are several very good papers also showing how BQL like techniques are vastly better than merely having a high/low watermark on the tx ring. Still, watermarks are good start here. Keep the patches coming! :slight_smile:

I believe @tohojo was thinking this dql patch wasn't really the direction to go in the end. Maybe he can weigh in here with thoughts around that.

Regardless, I have been continuing to use the dql patch in the absence of anything better at this point and don't have any real complaints with it at this point. :slight_smile:

I'd like every ethernet driver with a high/low tx ring watermark replaced with a bql implementation. I don't know if the devices under test have that? It's EASY to do if you have the device in front of you.

http://flent-fremont.bufferbloat.net/~d/broadcom_aug9.pdf

as for wifi... well... agggh. gettin there! :slight_smile: Somebody send an ath10k to toke as a christmas gift.

Ok. sure. I know how building OpenWrt/Quilt works. But two questions - how does this patch work when:

  1. You're putting it in the /target/linux/generic/patches-5.4 when the mac80211 subsystem that OpenWrt uses is in the /package/kernel/mac80211/ and it has it's own existing patch folder under subsys or ath or w/e. The WiFi stack that you applied the patch to doesn't even get used - unless something has changed from master v. the 19.07 branch that I'm on re: the switch to kernel 5.4. Was backports deprecated and it's not being used anymore?
  2. The patches for ath10k-ct driver should exist in package/kernel/ath10k-ct/patches. Yet, you're applying the patch to ath10k classic.

Somehow though, you're still getting better results with the patch but with OpenWrt's default config, the WiFi stack you're applying the patches to doesn't even get used.

1 Like

Which make/model would be best? I'll seriously send it :wink:

1 Like

+1 for Netgear R7800 if you can snag one :wink:

I did not mean to come across as insulting by including the build/Quilt "basics". I was hoping that might help others who might want to try the patch.

But back to your questions/points, you are correct in that it doesn't seem to add up regarding the location of the patch and the resulting build. Let me try to move the patch to package/kernel/ath10k-ct/patches and check the result.

Obviously, I will update my original post accordingly if needed.

EDIT
So I recreated the patch in package/kernel/ath10k-ct/patches and recompiled. It did build and I flashed the image to my R7800. However, I am now seeing this kernel warning repeated constantly in the logs:
ath10k_pci 0001:01:00.0: failed to increase tx pending count: -16, dropping

Clients are unable to connect, so I had to revert back to my previous image. At this point I am not sure what to make of this whole thing. Based on my testing, it is apparent something changed right at the time I originally tested the dql patch. But based on your experience, which I'm sure is far beyond mine, the dql patch would not have been in play.

Again, not sure what to make of it all at the moment. The data points to a change, but apparently it was not dql--and no idea what else at this point.

EDIT 2
I removed the dql patch again and checked out htt_tx.c here /openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_musl_eabi/linux-ipq806x_generic/ath10k-ct-smallbuffers/ath10k-ct-2020-04-29-3637be6f/ath10k-5.4. I do see this:

void ath10k_htt_tx_dec_pending(struct ath10k_htt *htt)
{
        lockdep_assert_held(&htt->tx_lock);

        htt->num_pending_tx--;
        if ((htt->num_pending_tx <= (htt->max_num_pending_tx / 4)) && htt->needs_unlock) {
                htt->needs_unlock = false;
                ath10k_mac_tx_unlock(htt->ar, ATH10K_TX_PAUSE_Q_FULL);
        }
}

That is the change Ben introduced here: https://marc.info/?l=linux-wireless&m=158808593808807&w=2

I'm thinking now the results I shared from testing were instead due to Ben's change for the "Restart xmit queues below low-water mark." It fits with the timing. If that is the answer, I certainly apologize for attributing the data to the wrong change. Certainly unintentional on my part. :-1:

I may pick up 2! I thought the nighthawk series was stuck with broadcomm :face_vomiting:

Thankfully the R7800 has an awesome QCA9884 instead of BCM! There is also a Zyxel 'equivalent' to the R7800 that I haven't tried, but would love to if I can find one for cheap at some point. Here's a thread about it: Netgear R7800 vs Zyxel NBG6817, is it worth the extra money?

The R7800's are popular on eBay right now, but can be had for around $125 these days.