Wifi buffer bloat

Hi all,

Is there anything I can do to mitigate WiFi bufferbloat? I verified that the WiFi link is the bottleneck, not the upstream connection.
Ideally it would be something automatic that monitors latency and adjusts, but even a manual limit per client would already help.

Would it help to shorten the TX queue of the wlan interface?

Any suggestions?

Thx,

Ramon

Sure you can, Ramón. This is my current computer, connected over WiFi, where the bottleneck sits just like in your environment:

https://www.waveform.com/tools/bufferbloat?test-id=c4af5488-7d2c-4f2b-bcd5-f963b55d0fe7

SQM is enabled, and the computer is 10 metres from the router with 2 drywall partitions in between, on a heavily used network: a Teams call, Netflix, YouTube, a FaceTime video call and a GeForce NOW gaming session all running. Not a single hiccup.

The question is: what did you enable SQM on? And how does that take into account the individual speeds of the WiFi clients?

Thx

fq_codel in the AP takes care of that; see the main relevant configuration options below.

/etc/config/sqm (note: this is on the router, which has a DOCSIS 3.1 connection, 1000/50 Mbps)

config queue 'eth1'
	option interface 'eth1'
	option qdisc 'cake'
	option linklayer 'none'
	option upload '47000'
	option debug_logging '0'
	option verbosity '5'
	option qdisc_advanced '1'
	option qdisc_really_really_advanced '1'
	option ingress_ecn 'ECN'
	option egress_ecn 'ECN'
	option enabled '1'
	option script 'layer_cake_ct.qos'
	option iqdisc_opts 'nat dual-dsthost ingress docsis noatm rtt 80ms diffserv4'
	option eqdisc_opts 'nat ack-filter dual-srchost docsis noatm rtt 80ms diffserv4'
	option download '800000'
	option squash_dscp '0'
	option squash_ingress '0'

Note: I use DSCP marking in my router only, which is why the squash_* parameters and layer_cake_ct.qos are set; for more info, search for dscpclassify.
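For anyone adapting the download/upload values above to their own line: SQM takes both rates in kbit/s. The numbers posted here work out to 80% of the 1000 Mbps downlink and 94% of the 50 Mbps uplink. A minimal sketch of that arithmetic follows; sqm_rate_kbit is a hypothetical helper name, and the percentages are simply derived from the posted values, not a general recommendation.

```shell
#!/bin/sh
# Hypothetical helper (not part of SQM): compute a shaper rate in kbit/s
# from a link speed in Mbit/s and the percentage of it to shape to.
sqm_rate_kbit() {
    link_mbit=$1
    percent=$2
    echo $(( link_mbit * 1000 * percent / 100 ))
}

# Reproduces the values in the config above (1000/50 Mbps DOCSIS line).
echo "download: $(sqm_rate_kbit 1000 80)"   # download: 800000
echo "upload:   $(sqm_rate_kbit 50 94)"     # upload:   47000
```

The right percentage depends on the link, so it is best found by testing under load rather than taken from here.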

/etc/config/wireless (on the access point)


config wifi-iface 'wifinet1'
	option device 'radio1'
	option mode 'ap'
	option ssid 'Chaos'
	option encryption 'psk2+ccmp'
	option key 'mi.clave'
	option network 'lan lan6'
	option ieee80211r '1'
	option nasid '000C432660B1'
	option mobility_domain 'b347'
	option ft_psk_generate_local '1'
	option ft_over_ds '0'
	option reassociation_deadline '20000'
	option wds '1'
	option ieee80211k '1'
	option bss_transition '1'
	option wnm_sleep_mode '1'
	option time_advertisement '2'
	option time_zone 'AEST-10AEDT,M10.1.0,M4.1.0/3'
	option disassoc_low_ack '0'
	option dtim_period '4'
	option iw_qos_map_set '0,63,255,255,255,255,255,255,255,255,255,255,255,255,255,255'

Note: I don't use QoS marking in my WiFi network; everything goes to WMM AC_BE (best effort), as you can see above.

And the last relevant parameters for improving latency under load are:

# Change AQL parameters, see https://forum.openwrt.org/t/aql-and-the-ath10k-is-lovely/59002/304?page=16
for ac in 0 1 2 3; do echo $ac 1500 1500 > /sys/kernel/debug/ieee80211/phy0/aql_txq_limit; done
for ac in 0 1 2 3; do echo $ac 1500 1500 > /sys/kernel/debug/ieee80211/phy1/aql_txq_limit; done

# Disable AQL threshold (testing, can it be zero?)
echo 1500 > /sys/kernel/debug/ieee80211/phy0/aql_threshold
echo 1500 > /sys/kernel/debug/ieee80211/phy1/aql_threshold
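The per-phy loops above can also be folded into one helper that walks whatever phys are present, which avoids hardcoding phy0 and phy1. This is only a sketch: set_aql_limits is a hypothetical name, the debugfs path and the "AC low high" write format are taken from the commands above, and it assumes debugfs is mounted and the driver exposes aql_txq_limit.

```shell
#!/bin/sh
# Hypothetical helper: write the same low/high AQL limit for all four
# access categories (0-3) on every phy found under the given directory.
# $1: debugfs ieee80211 directory (defaults to the path used above)
# $2: limit value written as both low and high limit (defaults to 1500)
set_aql_limits() {
    base=${1:-/sys/kernel/debug/ieee80211}
    limit=${2:-1500}
    for phy in "$base"/phy*; do
        # Skip phys without a writable aql_txq_limit (or an unmatched glob).
        [ -w "$phy/aql_txq_limit" ] || continue
        for ac in 0 1 2 3; do
            echo "$ac $limit $limit" > "$phy/aql_txq_limit"
        done
    done
}

# Usage on the AP (as root):
#   set_aql_limits
```

Values written to debugfs do not survive a reboot, so a call like this would have to be run again at boot, for example from /etc/rc.local.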

And finally, a different test: no traffic on the network, sitting 3 m from the router (by the way, this is a Ubiquiti NanoHD WiFi 5 AP running the r24727 snapshot):

I hope this helps.

Interesting. So do I need to have SQM running on eth1 on the router (I have an R4S on 1/1 Gbit fiber, so I'm not very likely to actually fill the link)?
Or is it enough to just tweak the AQL parameters?
What does this do exactly? option iw_qos_map_set '0,63,255,255,255,255,255,255,255,255,255,255,255,255,255,255'

Thx!

It makes sure all DSCPs are mapped to UP 0, i.e. AC_BE.
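To make that concrete: in hostapd's qos_map_set format, the last 16 values are eight low,high DSCP ranges, one per User Priority 0 to 7, and a 255,255 pair marks a priority as unused. A small sketch that decodes such a string follows; decode_qos_map is a hypothetical helper, and it assumes the value has no leading DSCP exception pairs, as is the case for the one posted above.

```shell
#!/bin/sh
# Hypothetical helper: decode a 16-value qos_map_set string into one line
# per User Priority. Assumes no DSCP exception pairs precede the ranges.
decode_qos_map() {
    up=0
    old_ifs=$IFS
    IFS=,
    set -- $1            # split the comma-separated list into $1..$16
    IFS=$old_ifs
    [ "$#" -eq 16 ] || { echo "expected 16 values, got $#" >&2; return 1; }
    while [ "$#" -ge 2 ]; do
        if [ "$1" -eq 255 ] && [ "$2" -eq 255 ]; then
            echo "UP$up: unused"
        else
            echo "UP$up: DSCP $1-$2"
        fi
        up=$((up + 1))
        shift 2
    done
}

decode_qos_map '0,63,255,255,255,255,255,255,255,255,255,255,255,255,255,255'
```

For the value from the config this prints "UP0: DSCP 0-63" followed by "unused" for UP1 through UP7, so every DSCP lands in the best-effort queue.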

If you want to keep bufferbloat under control, you must run OpenWrt on your AP. In my case the router's upload bandwidth is another choke point because my connection is highly asymmetric; that's why running SQM on the router helps.

BTW, eth1 in my router (RPi4) is my WAN connection.

Tweaking the AQL parameters on the AP does not really help in my case. If I run cake/piece_of_cake on my router and limit the up/down speed, my bufferbloat goes to ~0; without it I get something like 8/+4/+17... Maybe the R7800 is just not that good, or I have too much packet loss or something.

Is this considered general "good practice" at this point? Or is it still a case-by-case modification where "use it where it fits" applies?

Good question. Scheduling tx opportunities should be easier if there is no additional prioritization requirement to take care of, I guess. I am still mostly running with OpenWrt default qos_map (I just pushed the NQB dscp number to AC_BE instead of AC_VI).

I'm sorry it does not help. In my case, even with SQM disabled, the AP keeps my traffic under control with low latency. It might be down to my ISP, the OpenWrt software on my AP, or the OpenWrt driver it uses (mt76).

Yeah, I would say "user preference" here.

Obviously not your fault. I do appreciate the insights you gave me with the config you posted; I learned a lot.
I'm guessing it's the wireless driver difference.

Anyway, thx again!
