I am opening a new topic because I posted about this roughly 2 months ago, but that topic was auto-closed after 10 days of inactivity.
So I am still getting these errors on my ath11k ipq807x devices.
I own 2 of these, one ipq8072 and one ipq8074. They both exhibit this behavior.
ath11k c000000.wifi: failed to flush transmit queue, data pkts pending X
This happens just after an STA has been disconnected unexpectedly. That is to say: if you are doing a transfer on a device and you walk away from your AP while the transfer is in progress, and you keep walking until you are too far away and unexpectedly lose connection, chances are you will see this in your logs.
So, now some further details:
If you see data pkts pending 1, chances are this was just the STA poll packet that was sent out to check whether you are still connected to the device or not.
If you see something over 1, then the ungraceful disconnect probably happened while a "real transfer" was taking place: eg you were pushing / pulling traffic to / from the AP while you disconnected ungracefully for whatever reason (signal loss? device battery died?).
Why this matters:
I have noticed the following behavior: if you see data pkts pending 1, your AP will most likely recover from this. Transfers to / from all other connected STAs will block for a second or two (e.g. ping loss, etc.)... But your AP will recover, and besides having a 2-3 second pause to / from ALL other connected devices, you will continue on without any issues.
Now once every month or so, I see this error with something like 100 or 200 data pkts pending. At this point, the AP will NOT be able to recover. The behavior I experience is that all other devices connected to the AP will exhibit INCREDIBLY high ping times (1000ms - 10000ms), including outright ping loss... And this will not clear until a full reboot. I have experienced this twice so far, once at about 25 days of uptime and once at about 35 days of uptime. There are no other messages in the logs. Everything else looks perfectly fine. I will add that it's quite possible that, had I waited for "some time", the XXX packets would have flushed and things would have recovered. But I only spent about 5-10 minutes before giving her the ole reboot. Restarting hostapd did not help.
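For anyone who wants to watch for this in their own logs, here is a rough Python sketch that pulls the pending count out of each matching line and sorts it into the three cases described above. The regex is based only on the log line quoted in this post. The function names, the category labels and the cutoff of 100 are my own assumptions for illustration, not anything defined by ath11k or hostapd.

```python
import re

# Matches the message quoted in this post and captures the pending count.
# \S+ stands in for the device name (c000000.wifi on my devices).
FLUSH_RE = re.compile(
    r"ath11k \S+: failed to flush transmit queue, data pkts pending (\d+)"
)


def pending_counts(log_text):
    """Return the pending-packet count from every matching log line."""
    return [int(m.group(1)) for m in FLUSH_RE.finditer(log_text)]


def classify(count, stall_threshold=100):
    """Label a pending count using the rough cases from this post.

    stall_threshold is an assumed cutoff (the bad cases I saw were
    around 100 to 200 packets); adjust it to taste.
    """
    if count <= 1:
        return "probe"      # likely just the STA poll packet; AP recovers
    if count < stall_threshold:
        return "transfer"   # disconnect during a real transfer
    return "stall"          # the case that needed a reboot for me


# Hypothetical sample lines in the same format as the message above.
sample = (
    "ath11k c000000.wifi: failed to flush transmit queue, data pkts pending 1\n"
    "ath11k c000000.wifi: failed to flush transmit queue, data pkts pending 150\n"
)

for n in pending_counts(sample):
    print(n, classify(n))
```

To use it on real data, replace `sample` with the text of your kernel log (for example, saved output from `dmesg` or `logread`).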
I have just gone ahead and compiled 2 builds from the OpenWrt snapshot (r23400 to be exact), without the 2 related patches that are quite likely the culprit:
Both my APs are now 100% up and running and configured the way they are meant to be. I do not plan on rebooting them manually from the state they are in now.
I will post every week or so to let you all know whether removing these patches fixes this or not.
Last but not least, I am well aware of the potential security implications of excluding these 2 patches.