[OpenWrt 23.05.2] mac80211 stops pushing data and packet rate drops to 0 with large buffer sizes

I am facing an issue where my wireless driver is unable to push data with large buffer sizes, and the packet rate drops to 0.

After more investigation, I see that there is an issue with the “wake_tx_queue” ieee80211_ops callback, ieee80211_handle_wake_tx_queue, in OpenWrt.

Originally, mac80211 v6.1.24 brought in the model where the driver pulls data frames from the mac80211 queue instead of letting mac80211 push them via drv_tx(), provided the driver indicates that it uses this model by implementing the .wake_tx_queue driver operation. This makes “.wake_tx_queue” an optional callback in v6.1.24.

However, since OpenWrt pulls in patches from newer versions, it brought in patches from v6.2 that revert this model and make “.wake_tx_queue” a mandatory callback again in OpenWrt.

openwrt/openwrt.git (git.openwrt.org): package/kernel/mac80211/patches/subsys/306-01-v6.2-wifi-mac80211-add-internal-handler-for-wake_tx_queue.patch

To mitigate this situation, we assign “ieee80211_handle_wake_tx_queue” as the wake_tx_queue callback in our wireless driver.
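
The assignment can be sketched roughly as below. This is a userspace mock so that it compiles on its own, not our actual driver code: `struct ieee80211_ops` here is a cut-down stand-in with only two callbacks and simplified signatures, `mydrv_tx`, `mydrv_ops` and `mock_alloc_hw` are hypothetical names, and the NULL check only imitates the mandatory-callback behaviour described above. In a real driver the handler is provided by mac80211 and the ops table has many more entries.

```c
#include <stddef.h>

/* Cut-down stand-ins for the kernel types; the real definitions are far larger. */
struct ieee80211_hw { int woken; };
struct ieee80211_txq { int ac; };

struct ieee80211_ops {
	void (*tx)(struct ieee80211_hw *hw);
	void (*wake_tx_queue)(struct ieee80211_hw *hw, struct ieee80211_txq *txq);
};

/* Stand-in for mac80211's generic handler; here it only counts calls. */
static void ieee80211_handle_wake_tx_queue(struct ieee80211_hw *hw,
					   struct ieee80211_txq *txq)
{
	(void)txq;
	hw->woken++;
}

/* Hypothetical push-style transmit callback of the driver. */
static void mydrv_tx(struct ieee80211_hw *hw)
{
	(void)hw;
}

/* The driver's ops table: no custom pull logic, so use the generic handler. */
static const struct ieee80211_ops mydrv_ops = {
	.tx = mydrv_tx,
	.wake_tx_queue = ieee80211_handle_wake_tx_queue,
};

/* Hypothetical registration check: 0 if accepted, -1 if a mandatory callback is missing. */
static int mock_alloc_hw(const struct ieee80211_ops *ops)
{
	if (!ops || !ops->tx || !ops->wake_tx_queue)
		return -1;
	return 0;
}
```

In this mock, an ops table that leaves `.wake_tx_queue` unset is rejected by `mock_alloc_hw()`, while `mydrv_ops` is accepted.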

I also see that there was a regression in mac80211, which was fixed by the following patch:

https://patchwork.kernel.org/project/linux-wireless/patch/20221230121850.218810-1-alexander@wetzel-home.de/

This fix was cherry-picked into mac80211 6.1.24 in a form suited to that version, but OpenWrt's patches brought the “wake_tx_push_queue” function back in without the changes from the above patch that fix the issue.

So if I bring that in as follows, everything is back to normal:

--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@ -292,22 +292,12 @@ static void wake_tx_push_queue(struct ie
                               struct ieee80211_sub_if_data *sdata,
                               struct ieee80211_txq *queue)
 {
-       int q = sdata->vif.hw_queue[queue->ac];
        struct ieee80211_tx_control control = {
                .sta = queue->sta,
        };
        struct sk_buff *skb;
-       unsigned long flags;
-       bool q_stopped;

        while (1) {
-               spin_lock_irqsave(&local->queue_stop_reason_lock, flags);
-               q_stopped = local->queue_stop_reasons[q];
-               spin_unlock_irqrestore(&local->queue_stop_reason_lock, flags);
-
-               if (q_stopped)
-                       break;
-
                skb = ieee80211_tx_dequeue(&local->hw, queue);
                if (!skb)
                        break;
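
To illustrate why removing the check could matter, here is a toy userspace model, not kernel code. It rests on an assumption that comes from my reading of the linked patch and is not shown in the diff: that ieee80211_tx_dequeue() itself refuses to hand out frames while the hardware queue is stopped and marks the txq for resumption, and that the wake path only re-runs marked txqs. All `toy_*` names are hypothetical.

```c
#include <stdbool.h>

/* Toy userspace model; every name here is a hypothetical stand-in. */
struct toy_txq {
	int backlog;      /* frames waiting in the software queue */
	int sent;         /* frames handed to the driver */
	bool hw_stopped;  /* models a non-zero queue_stop_reasons[q] */
	bool dirty;       /* "run me again on wake" marker */
};

/* Assumed dequeue behaviour: refuse while stopped, but remember to resume. */
static bool toy_dequeue(struct toy_txq *q)
{
	if (q->hw_stopped) {
		q->dirty = true;
		return false;
	}
	if (q->backlog == 0)
		return false;
	q->backlog--;
	return true;
}

/* Push loop with its own stop check, as before the diff. */
static void toy_push_with_check(struct toy_txq *q)
{
	while (1) {
		if (q->hw_stopped)
			break;  /* leaves without marking the queue dirty */
		if (!toy_dequeue(q))
			break;
		q->sent++;
	}
}

/* Push loop after the diff: the stop decision is left to toy_dequeue(). */
static void toy_push_without_check(struct toy_txq *q)
{
	while (toy_dequeue(q))
		q->sent++;
}

/* Assumed wake path: only queues marked dirty get another run. */
static void toy_wake(struct toy_txq *q, void (*push)(struct toy_txq *))
{
	q->hw_stopped = false;
	if (q->dirty) {
		q->dirty = false;
		push(q);
	}
}

/* Queue 5 frames while stopped, attempt a push, then wake; return what is left. */
static int toy_leftover(void (*push)(struct toy_txq *))
{
	struct toy_txq q = { .backlog = 5, .hw_stopped = true };

	push(&q);
	toy_wake(&q, push);
	return q.backlog;
}
```

In this model, `toy_leftover(toy_push_with_check)` leaves all 5 frames stuck in the queue after the wake, while `toy_leftover(toy_push_without_check)` drains it to 0.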

Thanks for posting this. It could explain the trouble my friends are having at the local hackerspace during open days, when wireless traffic there is high.
I think I'll try this out. Is there a chance you could submit this patch on GitHub or to the mailing list?
cc @hauke and @robimarko