AQL and the ath10k is *lovely*

It looks like your search ended in 2019, but work has continued into 2020: https://lore.kernel.org/linux-wireless/?q=Toke+Høiland-Jørgensen

Has anyone else observed this same issue (https://lists.bufferbloat.net/pipermail/make-wifi-fast/2020-September/002933.html) where Airtime Fairness doesn't play well with AQL?


I note that I kind of dropped out of openwrt dev for the past year. What's the status of the AQL patches in mainline presently? And this patch?


@tohojo Hi! What's the latest news on this?

I did rebase the patch fairly recently and sent it off to Felix for testing before posting on the list. He hasn't gotten back to me, but I assume it's on his backlog.

I do plan to do a bit more testing myself, but need to set up an environment for that (I'm awfully short on clients here). If you wanna volunteer for testing I can send you the patch, otherwise y'all will just have to wait :wink:


Sure, I can probably try it out, and I imagine others here on the forum would like to as well. Could you post the patch here?

What kind of tests would you recommend doing?


I'd also like to try this out.

https://github.com/tohojo/openwrt/blob/virtual-airtime/package/kernel/mac80211/patches/subsys/346-mac80211-Switch-to-a-virtual-time-based-airtime-sche.patch

Only compile-tested on OpenWrt

Airtime fairness-related ones? :wink:

Generally, testing if fairness is worse/better than without the patch, and checking that it doesn't break anything (such as stations stalling entirely, that sort of thing)
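For anyone wanting a number to compare "with patch" against "without patch", here is a minimal sketch. It assumes you have already collected a per-station airtime (or throughput) figure for each run by whatever means you prefer. Jain's fairness index is my choice of metric here, not something the patch prescribes, and the station names, sample values and the `min_share` threshold are made up for illustration.

```python
def jain_index(shares):
    """Jain's fairness index: 1.0 = perfectly equal, 1/n = one station gets everything."""
    shares = list(shares)
    sum_sq = sum(x * x for x in shares)
    if not shares or sum_sq == 0:
        return 0.0
    total = sum(shares)
    return (total * total) / (len(shares) * sum_sq)


def stalled_stations(airtime_by_station, min_share=0.01):
    """Return stations whose share of the total is below min_share (hypothetical threshold)."""
    total = sum(airtime_by_station.values())
    if total == 0:
        return sorted(airtime_by_station)
    return sorted(sta for sta, t in airtime_by_station.items() if t / total < min_share)


# Made-up airtime samples (microseconds) for two runs.
without_patch = {"laptop": 400_000, "phone": 350_000, "tablet": 250_000}
with_patch = {"laptop": 990_000, "phone": 5_000, "tablet": 5_000}

for name, run in (("without patch", without_patch), ("with patch", with_patch)):
    print(name, "fairness:", round(jain_index(run.values()), 3),
          "stalled:", stalled_stations(run))
```

An index close to 1.0 with no stalled stations is the good case; a sharp drop in the index or a non-empty stalled list is the kind of breakage worth reporting.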


@tohojo Just to double check, is this the right way to do it: I copied the patch to /package/kernel/mac80211/patches/subsys/ and compiled an image.

Running an 8-stream TCP download with flent totally stalled out my laptop and my smartphone on the network. Here's the graph from the test:

I am also seeing the same thing on an access point equipped with a QCA 9880 (AC Wave 1) chipset.

Right, that was the kind of failure mode I was afraid of (and why I haven't posted the patch again yet). Will try to get some stations set up and do some testing myself...


This conversation is kind of about 3 different things: ATF, reducing the fq_codel target more generally, and interactions with various bits of ath10k firmware. I'd like to get cracking on this again in the coming month, and to start with I'd like us to standardize on a 10ms target universally. Beyond that, I don't know which ath10k firmware is most common; my take is that the smallbuffers one is too small for higher rates. What does the -htt one do?

As for ATF, goferit toke... :slight_smile:

Longer term I do wish we had a good way of rate limiting multicast and, even more importantly, limiting retries on this firmware at lower rates. Any clues to be had here?

The HTT-MGT firmware transports management frames over the normal HTT tx path instead of over wmi.

What are the tradeoffs of that? What do other platforms like ath9k or mt76 do?

The right place to rate limit multicast is probably in the firewall / bridge firewall. The sooner we transition to nftables the better. Using the firewall lets you choose which multicast to drop in a much more fine-grained way than a purely queue-based solution. For example, it'd be a major problem to drop IPv6 router adverts or neighbor discovery packets, but it wouldn't be a problem to drop the occasional mDNS, Chromecast discovery or Dropbox discovery packet. Also, I think OpenWrt should move to a situation where the default is to eliminate 6Mbps modulation and below, with a checkbox in LuCI, "enable 6Mbps rates (increases range but slows network)", which turns 6Mbps back on, and another, "enable really slow / legacy rates (only needed if working with pre 2005 hardware)", which turns on the 802.11b rates like 1Mbps etc.

If by default you're doing 12Mbps, first of all it works better with the higher density deployments that are becoming the norm (2 or 3 APs per single-family home). Second, it makes multicast more viable, as you can multicast a full HD TV stream and still not choke the AP. Finally, it improves roaming, as devices will disconnect more quickly rather than shifting to stupid low-speed rates.
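To make the "choose which multicast to drop" idea concrete, here is a small Python model of the policy. It is only a sketch of the decision logic, not an nftables ruleset. The `McastPacket` type and `may_rate_limit` function are hypothetical names, and the list of droppable ports (mDNS on UDP 5353, SSDP on UDP 1900, Dropbox LAN sync on UDP 17500) is my assumption about what counts as "occasional discovery" traffic.

```python
from dataclasses import dataclass
from typing import Optional

# ICMPv6 types that must never be dropped:
# 133 router solicitation, 134 router advertisement,
# 135 neighbor solicitation, 136 neighbor advertisement.
ESSENTIAL_ICMPV6_TYPES = {133, 134, 135, 136}

# Discovery chatter that can tolerate the occasional drop (assumed list).
DISCOVERY_UDP_PORTS = {5353: "mDNS", 1900: "SSDP", 17500: "Dropbox LAN sync"}


@dataclass
class McastPacket:
    """Hypothetical, already-parsed view of a multicast packet."""
    l4_proto: str                      # e.g. "icmpv6" or "udp"
    icmpv6_type: Optional[int] = None
    udp_dport: Optional[int] = None


def may_rate_limit(pkt: McastPacket) -> bool:
    """True if the packet belongs to a class that is safe to rate limit."""
    if pkt.l4_proto == "icmpv6" and pkt.icmpv6_type in ESSENTIAL_ICMPV6_TYPES:
        return False
    if pkt.l4_proto == "udp" and pkt.udp_dport in DISCOVERY_UDP_PORTS:
        return True
    return False  # conservative default: leave unknown multicast alone


print(may_rate_limit(McastPacket("icmpv6", icmpv6_type=134)))  # router advert
print(may_rate_limit(McastPacket("udp", udp_dport=5353)))      # mDNS
```

The default of not touching unknown traffic is deliberate: a rule that only limits traffic it recognizes is much less likely to break DHCP or routing protocols by accident.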


Doesn't OpenWrt convert multicast to unicast over WiFi by default?
https://forum.openwrt.org/t/multicast-to-unicast/44318 indicates that it at least should be possible, no?
What am I missing here?

I don't think it's on by default. But yes, it is in theory possible. I'm not sure whether it only works for IPv4 or also works for IPv6.

I have always thought it a bad idea to convert most forms of multicast to unicast at layer 2 arbitrarily. Rate limiting multicast to no more than an X ms burst and then serving up the unicast backlog in turn would be better. Applying fq_codel to multicast would drop bursts like you can get from an mdns-scan more often than DNS or DHCP. fq_codel also does head drop, and mDNS tends to send a lot of repetitive info.
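As a rough illustration of the "no more than an X ms burst" idea, here is a sketch that drains a multicast queue until an airtime budget is used up and leaves the rest queued for the next round. The function names, the 5 ms budget and the 6 Mbps rate are assumptions for the example, and the airtime estimate ignores preambles, interframe spacing and other per-frame overhead.

```python
from collections import deque


def frame_airtime_us(size_bytes: int, rate_mbps: float) -> float:
    """Payload airtime in microseconds: bits / (Mbit/s) = microseconds."""
    return size_bytes * 8 / rate_mbps


def drain_burst(queue: deque, rate_mbps: float, budget_ms: float):
    """Pop frames from the head of the queue until the airtime budget is spent.

    Returns (sent_frames, airtime_used_us); whatever is left stays queued
    so the unicast backlog can be served before the next multicast burst.
    """
    budget_us = budget_ms * 1000
    sent, used = [], 0.0
    while queue:
        cost = frame_airtime_us(queue[0], rate_mbps)
        if used + cost > budget_us:
            break
        used += cost
        sent.append(queue.popleft())
    return sent, used


mcast_queue = deque([1500] * 10)       # ten 1500-byte frames waiting
sent, used = drain_burst(mcast_queue, rate_mbps=6, budget_ms=5)
print(len(sent), "frames sent,", used, "us used,", len(mcast_queue), "still queued")
```

At 6 Mbps each 1500-byte frame costs about 2 ms by this estimate, so very few frames fit in a short burst, which is another argument for raising the multicast rate.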

That said, mDNS does not do any congestion control, and it seems possible that with a significant enough flood it will end up clamped out entirely. Which is why we test. But the present situation with wifi multicast is frequently intolerable.

While I'm at it, reducing the max size of the txop under contention is a very good way to restore interactive performance with multiple stations present. There's no way (or there wasn't when I looked years ago) to do this dynamically as a function of load. I ran with a BE txop size of 2-3ms back when I was prototyping higher density solutions. It can be reduced both in the advertisement to the stations and on the AP via hostapd.

Data point: two summers ago my wife was teaching at the Cold Spring Harbor Laboratories (a biology research institution). My kids and I went along for the fishing and sailing and BBQs etc. While I was there, my phone, which would normally make it through a whole day on a single charge, was lasting more like 4 hours. I fired up my laptop and wiresharked the WiFi. Sure enough, the entire campus, which had something like a few thousand employees and researchers, was on a single broadcast domain, and every device on the campus was hammering the LAN with multicast packets announcing printers, SMB shares, mDNS entries, smart speakers, smart TVs etc. If I remember correctly it was on the order of hundreds of packets per second, like 150 to 300. I'm fairly sure that what was going on was that the phone was never going into a proper sleep mode, as it constantly had to receive and process multicast packets.

I'm really not sure what a good solution is for that. I know that for my part, if it ever happens again (and COVID has shut down the courses at the lab, so I'm not sure it will), I'll be bringing a travel router and connecting to my own SSID in the apartment. I think a lot could have been done by rate limiting multicast on the ports where the APs were plugged in. But what are you going to do? Rate limit it to 20 pps when 300 are trying to exit the port? That seems like it's going to piss off people who are now unable to find their SMB share or printer. I think the better solution would probably have been to ensure that 12Mbps was the slowest allowable rate; then the 240kbps would be only 2% of the airtime, whereas in reality they were probably using 1Mbps as the lowest rate and the radio had to be on roughly 25% of the time.
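The airtime arithmetic behind those percentages can be checked quickly. This sketch simply divides the offered multicast load by the lowest allowed rate, ignoring preambles, interframe spacing and other overhead, so the real figures would be somewhat higher. The 240kbps load is the figure from the post (it would correspond to, say, 300 packets/s of about 100 bytes each, which is my assumption about the packet size).

```python
def airtime_fraction(offered_bps: float, phy_rate_bps: float) -> float:
    """Fraction of each second the medium is busy carrying this load."""
    return offered_bps / phy_rate_bps


MULTICAST_LOAD_BPS = 240_000  # 240 kbps of multicast, as quoted above

for rate_mbps in (1, 6, 12):
    frac = airtime_fraction(MULTICAST_LOAD_BPS, rate_mbps * 1_000_000)
    print(f"lowest rate {rate_mbps:>2} Mbps: {frac:.0%} of airtime")
```

By this estimate the load costs 24% of the airtime at 1Mbps, 4% at 6Mbps and 2% at 12Mbps, which lines up with the rough 25% versus 2% comparison.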