QoSmate: (Yet Another) Quality of Service Tool for OpenWrt

Hudra · November 29, 2024, 6:54pm

I've spent the afternoon running various tests, and I believe I've found a solution to address the RED issues, I suggest making the following changes:

Switch from rate to gamerate:

Instead of using the full bandwidth rate, use gamerate, which is typically 15% of the total bandwidth.
This results in smaller, more manageable values.
Since the realtime class only uses a fraction of the total bandwidth, gamerate provides a more accurate representation.

Implement RED's documented burst formula:

burst = (min + min + max) / (3 * avpkt)

This formula is taken directly from RED documentation.
It works reliably across all tested bandwidth ranges (1,000 to 1,000,000 kbit/s).

These changes should lead to a RED configuration that avoids warnings or errors.

Testing showed consistent behavior across different bandwidth scenarios:

Low bandwidth (1,000 kbit/s): burst = 3
Medium bandwidth (90,000 kbit/s): burst = 96
High bandwidth (1,000,000 kbit/s): burst = 1,044

Here is the code:

# Calculate REDMIN and REDMAX based on gamerate and MAXDEL
REDMIN=$((gamerate * MAXDEL / 3 / 8))
REDMAX=$((gamerate * MAXDEL / 8))

# Calculate BURST: (min + min + max)/(3 * avpkt) as per RED documentation
BURST=$(( (REDMIN + REDMIN + REDMAX) / (3 * 500) ))

# Ensure BURST is at least 2 packets
if [ $BURST -lt 2 ]; then
    BURST=2
fi

    "red")
	tc qdisc add dev "$DEV" parent 1:11 handle 10: red limit 150000 min $REDMIN max $REDMAX avpkt 500 bandwidth ${RATE}kbit burst $BURST probability 1.0
	## send game packets to 10:, they're all treated the same

@choppyc Could you run another test?

Hudra · November 29, 2024, 7:00pm

Hmm, that's really strange. I've run some tests, and the bandwidth is being properly limited on my end. Have you been experiencing this issue for a while, or did it just start recently? Or are you testing something specific?

brada4 · November 29, 2024, 7:24pm

I tried to imitate your fritzbox first with c7 (same soc gige) now with this c50v4 100mbps.
I halved bw - bad
halved again - wonderful, but total rates way off.

I install default 23.05.5 image clearing config, then qosmate using copy boxes in git. No creativity.

Up is 1000/500 then tuf4200 doing very sophisticated qos prioritising <500B packets in pfifo_fast , 2ms+5ms in waveform (5+20 with fq_codel) which makes me wonder what it really measures.

Once looked like poor box is controlling something i changed all posdible qdiscs to oversized bfifo to check if we are really in control of bufferbloat. Nice we are.

Im kind of small time firewall rule typist. Can shape teaffic without qdisc etc.

choppyc · November 29, 2024, 8:31pm

@hudra - Thats sorted it, tested up to 50ms delay in steps of 5 ms, no errors

dlakelan · November 29, 2024, 9:00pm

Certainly noone really thoroughly tested red qdisc. And there's not really a strong reason to believe it offers strong advantages.

Bfifo is imho the best qdisc for use on the game class, or fq_codel if you've got a number of different machines that play games on your network. bfifo let's us control the delay precisely. Fq_codel lets us share the game channel across multiple machines more evenly.

Hudra · November 30, 2024, 9:56am

Great, then I'll push the changes to the repo. Thanks for testing!

I completely agree, and I've always used bfifo myself. However, I have to admit that, coincidentally I've had quite good experiences with red in the last few weeks while gaming. Additionally, since I often get requests, I've been experimenting with alternative gameqdisc settings and different queuing disciplines for the real-time class. Among these, red has performed the best in some cases... at least in the games I play (COD MP and Warzone).

The problem is that my observations are mostly based on a "feels good" impression rather than any reproducible tests. I don't want to promote any "snake oil" that everyone then jumps on, so I've kept quiet about it for now.

Unfortunately, online gaming involves so many moving parts that it's challenging to conduct proper tests.

moeller0 · November 30, 2024, 10:17am

Honestly for a true and exclusive game class, for real time-ish world state update traffic only, the qdisc will not matter... Such game traffic is inelastic and your really want to reserve enough capacity that no meaningful queuing ever occurs in the first place, if it does and reaches tail dropping due to overflowing queue you already 'lost' anyway.

noob · November 30, 2024, 11:29am

i agree completely your comment moeller0 !

just for info i don't use qosmate but the idea of this qos is good i think

brada4 · November 30, 2024, 12:56pm

realtime should not suppress rest of traffic eg game load or game save....

moeller0 · November 30, 2024, 1:15pm

Then do not steer higher volume traffic into that gaming class... but even if, I would assume that loading or saving a game does not happen at the same time as actually playing the game...
But I happily admit, that my on-line gaming experience is non existing, so I might operate on wildly inaccurate assumptions.

AlanDias17 · November 30, 2024, 1:27pm

@Hudra Hey I just noticed something odd into Rules > CAKE Mapping (diffserv4):

AF2x supposed to be in Video tin right?

AlanDias17 · November 30, 2024, 1:33pm

In custom nftables rules is the following method correct to divide UDP traffic to video and voice tins and TCP traffic to Best Effort and bulk to Bulk tin. Like here:

#Pkt Length DSCP

define httpports = {80, 443, 8080}

    chain forward {
        type filter hook forward priority -10; policy accept;

        ## WASH DSCP marks to CS1 for both IPv4 and IPv6 ##
        ip dscp set cs1 counter comment "Wash all ISP DSCP marks to CS1 (IPv4)"
        ip6 dscp set cs1 counter comment "Wash all ISP DSCP marks to CS1 (IPv6)"

        ## HTTP/HTTPS and QUIC (IPv4) ##
        ip protocol {tcp, udp} th sport $httpports counter ip dscp set cs0

        ip protocol tcp th sport $httpports meta length <= 492 counter ip dscp set af13 comment "Best Effort"
        ip protocol tcp th sport $httpports meta length > 492 counter ip dscp set af11 comment "Best Effort"
        ip protocol tcp th sport $httpports meta length > 492 limit rate over 200/second counter ip dscp set cs1 comment "Bulk"

        ip protocol udp th sport $httpports meta length <= 492 counter ip dscp set cs6 comment "Voice"
        ip protocol udp th sport $httpports meta length > 492 counter ip dscp set af43 comment "Video"

        ## HTTP/HTTPS and QUIC (IPv6) ##
        ip6 nexthdr tcp th sport $httpports meta length <= 492 counter ip6 dscp set af13 comment "Best Effort"
        ip6 nexthdr tcp th sport $httpports meta length > 492 counter ip6 dscp set af11 comment "Best Effort"
        ip6 nexthdr tcp th sport $httpports meta length > 492 limit rate over 200/second counter ip6 dscp set cs1 comment "Bulk"

        ip6 nexthdr udp th sport $httpports meta length <= 492 counter ip6 dscp set cs6 comment "Voice"
        ip6 nexthdr udp th sport $httpports meta length > 492 counter ip6 dscp set af43 comment "Video"

        ## Classification by packet length (IPv4) ##
        ip protocol tcp tcp sport != $httpports meta length <= 492 counter ip dscp set cs0 comment "Best Effort"
        ip protocol tcp tcp sport != $httpports meta length > 492 counter ip dscp set cs0 comment "Best Effort"

        ip protocol udp udp sport != $httpports meta length <= 492 counter ip dscp set cs5 comment "Voice"
        ip protocol udp udp sport != $httpports meta length > 492 counter ip dscp set af33 comment "Video"

        ## Classification by packet length (IPv6) ##   
        ip6 nexthdr tcp tcp sport != $httpports meta length <= 492 counter ip6 dscp set cs0 comment "Best Effort"
        ip6 nexthdr tcp tcp sport != $httpports meta length > 492 counter ip6 dscp set cs0 comment "Best Effort"

        ip6 nexthdr udp udp sport != $httpports meta length <= 492 counter ip6 dscp set cs5 comment "Voice"
        ip6 nexthdr udp udp sport != $httpports meta length > 492 counter ip6 dscp set af33 comment "Video"
}

This will minimize both pk_delay & avg_delay around 3-5ms in Video and Voice tins. While Best Effort at 5-30ms and Bulk can go up to 45ms.
Or would you recommend a better approach please?

brada4 · November 30, 2024, 1:35pm

Can you point to the source of your pixelation?
cake source code should give you enough references

brada4 · November 30, 2024, 1:36pm

Please upgrade qosmate(..sh) and rebase

AlanDias17 · November 30, 2024, 1:40pm

Yes, I already made sure it's updated to 0.5.37. Also I couldn't find a source image online but I'm sure it's somewhere in this forum and Elan had sent me back then. Here it is:

Last-times/CAKE-QoS-Script-OpenWrt: CAKE QoS Script (OpenWrt)

Hudra · November 30, 2024, 1:40pm

Based on my experience with COD/Warzone, there are some important considerations regarding queueing disciplines for gaming traffic:

When loading into a Warzone match, the game likely performs network tests and temporarily requires more bandwidth than during actual gameplay. With bfifo and a very small queue limit (controlled via maxdel in QoSmate), this can lead to excessive packet drops that might prevent joining the game entirely. This aligns with moeller0's point about capacity reservation.

While real-time gaming traffic should indeed be segregated, modern games like Warzone typically separate different types of traffic (world state updates vs texture streaming, which is normally done via tcp). This natural separation should be reflected in our traffic classification.
Regarding the original concept behind different real-time qdiscs: I think the idea was to experiment with how games respond to different packet handling strategies. For instance, if a game server employs a jitter buffer, different qdiscs with varying drop/delay characteristics might affect the network probe differently.

brada4 · November 30, 2024, 1:43pm

Justification failed?

https://www.rfc-editor.org/rfc/rfc8325#section-4.3

moeller0 · November 30, 2024, 1:47pm

AlanDias17:

 ## HTTP/HTTPS and QUIC (IPv6) ##
        ip6 nexthdr tcp th sport $httpports meta length <= 492 counter ip6 dscp set af13 comment "Best Effort"
        ip6 nexthdr tcp th sport $httpports meta length > 492 counter ip6 dscp set af11 comment "Best Effort"
        ip6 nexthdr tcp th sport $httpports meta length > 492 limit rate over 200/second counter ip6 dscp set cs1 comment "Bulk"

Respectfully this whole section looks a bit confused, why is 492 bytes somehow considered magic and merits a change in priority? This looks a bit like ACK prioritisation in a rounds about fashion... IMHO ACKs should be treated to exactly the same priority as the forward traffic...

AlanDias17:

## WASH DSCP marks to CS1 for both IPv4 and IPv6 ##
        ip dscp set cs1 counter comment "Wash all ISP DSCP marks to CS1 (IPv4)"
        ip6 dscp set cs1 counter comment "Wash all ISP DSCP marks to CS1 (IPv6)"

And I also struggle a bit with this... a lot of gear (including WiFi) will interpret CS1 as background priority and treat it accordingly... if this is intended behaviour, I would not call this "wash" as wash implies a re-marking to the default which would be CS0/DSCP 0.

So here is the thing about these counterspk_delay & avg_delay they do not tell you about the queuing delay experiences by sparse traffic flows but will be dominated by the most self-congesting flows. You really need to measure the delay increase for the traffic of interest...

brada4 · November 30, 2024, 1:51pm

AlanDias17:

ip6 nexthdr tcp th sport $httpports meta length > 492 counter ip6 dscp set af11 comment "Best Effort"
        ip6 nexthdr tcp th sport $httpports meta length > 492 limit rate over 200/second counter ip6 dscp set cs1 comment "Bulk"

This is super inefficient ip6tables-translate.

moeller0 · November 30, 2024, 1:53pm

Please do not assume that RFC8325 is all that relevant in the real world... it is more IETF guys trying to shoehorn IETF defined PHBs into the default 4 priority classes WiFi WMM offers (spoiler alert these PHBs do only align so lala with what WiFi offers). While a laudable goal, nobody bothered to actually test how these recommendations actually stack up in real life...

Fun factoid: the EDCA parameters of the 4 default WiFI classes were picked by engineers without much research and data, on the assumption that people in the real world would do the heavy lifting and adjust those to their needs, which as far as I can tell never happened (it would also wreck havoc on the fairness of users of the same channel and hence something to address only carefully).