Help prioritizing games with alternative qdisc design

Check whether you have them. They are not "standard"; it depends on the device. Some have them, some don't.
Run:
ls -lha /lib/modules/$(uname -r)/ | grep sch

To check.
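
If you would rather get a yes/no answer per scheduler than read the ls output, something like this might do. It is only a sketch: `has_sched_module` is a made-up helper name, and it assumes the flat OpenWrt layout where modules sit directly in /lib/modules/<kernel version>/ as sch_<name>.ko. A scheduler compiled into the kernel would show up as "missing" here.

```shell
#!/bin/sh
# Sketch only: has_sched_module is a hypothetical helper, not part of the script.
# Assumes modules are stored as <moddir>/sch_<name>.ko (flat OpenWrt layout).
has_sched_module() {
    name=$1
    moddir=${2:-/lib/modules/$(uname -r)}
    [ -e "$moddir/sch_${name}.ko" ]
}

# Report the schedulers this thread cares about.
for q in hfsc red drr qfq; do
    if has_sched_module "$q"; then
        echo "sch_$q: present"
    else
        echo "sch_$q: missing"
    fi
done
```

The optional second argument lets you point the check at another module directory.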

Gaming is a billion dollar industry to say the least and ESports has become mainstream. I like the idea of REAL testing of what works vs assumptions based on voip. Some titles have great services with rock solid servers but others like COD reel folks in by using dedi servers the first year then switch to hybrid followed by p2p. I'm to the point where I'd like to rent servers and charge $5 a month for each person to play or something along those lines to ensure a better experience and consistent results.

Players are mainly concerned with consistent and speedy delivery of game information, for the best experience possible under their control. This seems like a great start if we can confirm the results.

If you have qfq on your router, and you want to test it, it will require some more development, so let me know.

I'm using the 19.07.4 x86 image and I'm almost positive it doesn't have anything extra like qfq

If it has qfq it will probably be in kmod-sched or something like that, so check whether you have that installed.

I spent a little time this afternoon and added in a more "full" tiered queueing setup... Now it has 1 realtime queue, and 4 non-realtime link-share queues, and the ability to classify with some DSCP values.

I'm not sure if it fully works. Anyone who wants to try it, feel free.

It still has UPRATE and DOWNRATE and GAMEUP and GAMEDOWN... it will tag packets from the GAMINGIP as CS7... any other packets you want to classify into realtime you can tag EF, CS5, CS6, or CS7... if qfq were available we could sub-stratify the realtime stuff (so for example your game play would be given priority over say in-game VOIP) but for right now all realtime stuff is considered the same.

Then CS4, AF41 and AF42 will be considered as a "fast" priority non-realtime, so maybe you want to use this for video conference stuff for example, which won't interfere with games, but will also go ahead of general web browsing.

CS2 will get downgraded from normal priority, and generally accept longer lag
CS1 is ideal for torrents and things that can stall out almost completely for short periods in order to give other classes time to go ahead.

By default things will go into the default class, which is fine.
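
To make the tiers easier to see at a glance, here is the DSCP-to-class mapping described above written as a lookup. This is just an illustration: `dscp_to_class` is a hypothetical helper, and the class IDs are the HFSC classes the script below creates (1:11 realtime, 1:12 fast, 1:13 normal, 1:14 low priority, 1:15 bulk).

```shell
#!/bin/sh
# Sketch only: dscp_to_class is a hypothetical helper that mirrors the
# classification rules described in this post.
dscp_to_class() {
    case $1 in
        EF|CS5|CS6|CS7) echo "1:11" ;;  # realtime (games, tagged traffic)
        CS4|AF41|AF42)  echo "1:12" ;;  # fast non-realtime (e.g. video calls)
        CS2)            echo "1:14" ;;  # low priority, tolerates longer lag
        CS1)            echo "1:15" ;;  # bulk (torrents)
        *)              echo "1:13" ;;  # everything else: normal
    esac
}

for d in CS7 EF AF41 CS0 CS2 CS1; do
    echo "$d -> $(dscp_to_class "$d")"
done
```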

#!/bin/sh

## "atm" for old-school DSL or change to "DOCSIS" for cable modem, or
## "other" or anything else, for everything else

LINKTYPE="ethernet"

WAN=veth0 # change this to your WAN device name
UPRATE=18000 #change this to your kbps upload speed
LAN=veth1
DOWNRATE=65000 #change this to about 80% of your download speed (in kbps)

## how many kbps of UDP upload and download do you need for your games
## across all gaming machines? 

GAMEUP=800
GAMEDOWN=1600


if [ $((DOWNRATE*10/UPRATE > 100)) -eq 1 ]; then
    echo "We limit the downrate to at most 10x the upstream rate to ensure no upstream ACK floods occur which can cause game packet drops"
    DOWNRATE=$((10*UPRATE))
fi



## set this to "red" or if you want to differentiate between game
## packets into 3 different classes you can use either "drr" or "qfq"
## be aware not all machines will have drr or qfq available
## also qfq or drr require setting up tc filters!

gameqdisc="red"

GAMINGIP="192.168.1.111" ## change this



cat <<EOF

This script prioritizes the UDP packets from / to a set of gaming
machines into a real-time HFSC queue with guaranteed total bandwidth 

Based on your settings:

Game upload guarantee = $GAMEUP kbps
Game download guarantee = $GAMEDOWN kbps

Download direction only works if you install this on a *wired* router
and there is a separate AP wired into your network, because otherwise
there are multiple parallel queues for traffic to leave your router
heading to the LAN.

Based on your link total bandwidth, the **minimum** amount of jitter
you should expect in your network is about:

UP = $(((1500*8)*3/UPRATE)) ms

DOWN = $(((1500*8)*3/DOWNRATE)) ms

In order to get lower minimum jitter you must upgrade the speed of
your link, no queuing system can help.

Please note for your display rate that:

at 30Hz, one on screen frame lasts:   33.3 ms
at 60Hz, one on screen frame lasts:   16.6 ms
at 144Hz, one on screen frame lasts:   6.9 ms

This means the typical gamer is sensitive to as little as on the order
of 5ms of jitter. To get 5ms minimum jitter you should have bandwidth
in each direction of at least:

$((1500*8*3/5)) kbps

The queue system can ONLY control bandwidth and jitter in the link
between your router and the VERY FIRST device in the ISP
network. Typically you will have 5 to 10 devices between your router
and your gaming server, any of those can have variable delay and ruin
your gaming, and there is NOTHING that your router can do about it.

EOF




setqdisc () {
DEV=$1
RATE=$2
OH=37
MTU=1500
highrate=$((RATE*90/100))
lowrate=$((RATE*10/100))
gamerate=$3
useqdisc=$4


tc qdisc del dev "$DEV" root > /dev/null 2>&1

case $LINKTYPE in
    "atm")
	tc qdisc replace dev "$DEV" handle 1: root stab mtu 2047 tsize 512 mpu 68 overhead ${OH} linklayer atm hfsc default 13
	;;
    "DOCSIS")
	tc qdisc replace dev "$DEV" stab overhead 25 linklayer ethernet handle 1: root hfsc default 13
	;;
    *)
	tc qdisc replace dev "$DEV" stab overhead 40 linklayer ethernet handle 1: root hfsc default 13
	;;
esac
     

DUR=$((5*1500*8/RATE))
if [ $DUR -lt 25 ]; then
    DUR=25
fi



#limit the link overall:
tc class add dev "$DEV" parent 1: classid 1:1 hfsc ls m2 "${RATE}kbit" ul m2 "${RATE}kbit"

# high prio realtime class
tc class add dev "$DEV" parent 1:1 classid 1:11 hfsc rt m1 "$((RATE*90/100))kbit" d "${DUR}ms" m2 "${gamerate}kbit"

# fast non-realtime
tc class add dev "$DEV" parent 1:1 classid 1:12 hfsc ls m1 "$((RATE*75/100))kbit" d "${DUR}ms" m2 "$((RATE*30/100))kbit"

# normal
tc class add dev "$DEV" parent 1:1 classid 1:13 hfsc ls m1 "$((RATE*20/100))kbit" d "${DUR}ms" m2 "$((RATE*50/100))kbit"

# low prio
tc class add dev "$DEV" parent 1:1 classid 1:14 hfsc ls m1 "$((RATE*4/100))kbit" d "${DUR}ms" m2 "$((RATE*15/100))kbit"

# bulk
tc class add dev "$DEV" parent 1:1 classid 1:15 hfsc ls m1 "$((RATE*1/100))kbit" d "${DUR}ms" m2 "$((RATE*5/100))kbit"



## the game qdisc is selected by gameqdisc above: "drr" or "qfq"
## differentiate between game packets, anything else (e.g. "red")
## treats all game packets equally

REDMIN=$((gamerate*30/8)) #30 ms of data

if [ $REDMIN -lt 3000 ]; then
    REDMIN=3000
fi
REDMAX=$((REDMIN * 4)) #120ms of data (4 x 30ms)


case $useqdisc in
    "drr")
	tc qdisc add dev "$DEV" parent 1:11 handle 2:0 drr
	tc class add dev "$DEV" parent 2:0 classid 2:1 drr quantum 8000
	tc qdisc add dev "$DEV" parent 2:1 handle 10: red limit 150000 min $REDMIN max $REDMAX avpkt 500 bandwidth ${RATE}kbit probability 1.0
	tc class add dev "$DEV" parent 2:0 classid 2:2 drr quantum 4000
	tc qdisc add dev "$DEV" parent 2:2 handle 20: red limit 150000 min $REDMIN max $REDMAX avpkt 500 bandwidth ${RATE}kbit probability 1.0
	tc class add dev "$DEV" parent 2:0 classid 2:3 drr quantum 1000
	tc qdisc add dev "$DEV" parent 2:3 handle 30: red limit 150000  min $REDMIN max $REDMAX avpkt 500 bandwidth ${RATE}kbit probability 1.0
	## with this send high priority game packets to 10:, medium to 20:, normal to 30:
	## games will not starve but be given relative importance based on the quantum parameter
    ;;

    "qfq")
	tc qdisc add dev "$DEV" parent 1:11 handle 2:0 qfq
	tc class add dev "$DEV" parent 2:0 classid 2:1 qfq weight 8000
	tc qdisc add dev "$DEV" parent 2:1 handle 10: red limit 150000  min $REDMIN max $REDMAX avpkt 500 bandwidth ${RATE}kbit probability 1.0
	tc class add dev "$DEV" parent 2:0 classid 2:2 qfq weight 4000
	tc qdisc add dev "$DEV" parent 2:2 handle 20: red limit 150000 min $REDMIN max $REDMAX avpkt 500 bandwidth ${RATE}kbit probability 1.0
	tc class add dev "$DEV" parent 2:0 classid 2:3 qfq weight 1000
	tc qdisc add dev "$DEV" parent 2:3 handle 30: red limit 150000  min $REDMIN max $REDMAX avpkt 500 bandwidth ${RATE}kbit probability 1.0
	## with this send high priority game packets to 10:, medium to 20:, normal to 30:
	## games will not starve but be given relative importance based on the weight parameter

    ;;

    *)

	tc qdisc add dev "$DEV" parent 1:11 handle 10: red limit 150000 min $REDMIN max $REDMAX avpkt 500 bandwidth ${RATE}kbit  probability 1.0
	## send game packets to 10:, they're all treated the same
	
    ;;
esac

INTVL=$((100+2*1500*8/RATE))
TARG=$((2*1500*8/RATE+5))

if [ $((MTU * 8 * 10 / RATE > 50)) -eq 1 ]; then ## if one MTU packet takes more than 5ms
    echo "adding PIE qdisc for non-game traffic due to slow link"
    for i in 12 13 14 15; do 
	tc qdisc add dev "$DEV" parent "1:$i" pie limit  "$((RATE * 200 / (MTU * 8)))" target "${TARG}ms" ecn tupdate "$((TARG*3))ms" bytemode
    done
else ## we can have queues with multiple packets without major delays, fair queuing is more meaningful
    echo "adding fq_codel qdisc for non-game traffic due to fast link"

    for i in 12 13 14 15; do 
	tc qdisc add dev "$DEV" parent "1:$i" fq_codel memory_limit $((RATE*200/8)) interval "${INTVL}ms" target "${TARG}ms" quantum $((MTU * 2))
    done
fi

}


setqdisc $WAN $UPRATE $GAMEUP $gameqdisc

## comment this out to skip shaping the download direction via output of LAN
setqdisc $LAN $DOWNRATE $GAMEDOWN $gameqdisc

## we want to classify packets, so use these rules

cat <<EOF

We are going to add classification rules via iptables to the
POSTROUTING chain. You should actually read and ensure that these
rules make sense in your firewall before running this script. 

Continue? (type y or n and then RETURN/ENTER)
EOF

read -r cont

if [ "$cont" = "y" ]; then

    iptables -t mangle -F POSTROUTING

    iptables -t mangle -A POSTROUTING -p udp -s ${GAMINGIP} -j DSCP --set-dscp-class CS7
    iptables -t mangle -A POSTROUTING -p udp -d ${GAMINGIP} -j DSCP --set-dscp-class CS7

    iptables -t mangle -A POSTROUTING -j CLASSIFY --set-class 1:13 # default everything to 1:13,  the "normal" qdisc

    
    ## these dscp values go to realtime: EF, CS5, CS6, CS7
    iptables -t mangle -A POSTROUTING -m dscp --dscp-class EF -j CLASSIFY --set-class 1:11
    iptables -t mangle -A POSTROUTING -m dscp --dscp-class CS5 -j CLASSIFY --set-class 1:11
    iptables -t mangle -A POSTROUTING -m dscp --dscp-class CS6 -j CLASSIFY --set-class 1:11
    iptables -t mangle -A POSTROUTING -m dscp --dscp-class CS7 -j CLASSIFY --set-class 1:11

    iptables -t mangle -A POSTROUTING -m dscp --dscp-class CS4 -j CLASSIFY --set-class 1:12
    iptables -t mangle -A POSTROUTING -m dscp --dscp-class AF41 -j CLASSIFY --set-class 1:12
    iptables -t mangle -A POSTROUTING -m dscp --dscp-class AF42 -j CLASSIFY --set-class 1:12

    iptables -t mangle -A POSTROUTING -m dscp --dscp-class CS2 -j CLASSIFY --set-class 1:14
    iptables -t mangle -A POSTROUTING -m dscp --dscp-class CS1 -j CLASSIFY --set-class 1:15
    
    if [ "$gameqdisc" = "red" ]; then
	echo "Everything is taken care of for RED qdisc"
    else
	echo "YOU MUST PLACE CLASSIFIERS FOR YOUR GAME TRAFFIC HERE"
	echo "SEND GAME TRAFFIC TO 2:1 (high) or 2:2 (medium) or 2:3 (normal)"
	echo "Requires use of tc filters! -j CLASSIFY won't work!"
    fi

    if [ $((DOWNRATE*10/UPRATE > 45)) -eq 1 ]; then
	## we need to trim acks in the upstream direction, we let
	## through a certain number based on download rate and 540
	## byte MSS, then drop 90% of the rest:
	ACKRATE=$((DOWNRATE*1000/8/540*150/100))
	iptables -A forwarding_rule -p tcp -m tcp --tcp-flags ACK ACK -o $WAN -m length --length 0:100 -m limit --limit ${ACKRATE}/second --limit-burst ${ACKRATE} -j ACCEPT
	iptables -A forwarding_rule -p tcp -m tcp --tcp-flags ACK ACK -o $WAN  -m length --length 0:100 -m statistic --mode random --probability .90 -j DROP
    fi

    iptables -t mangle -F FORWARD # to flush the openwrt default MSS clamping rule
    if [ $UPRATE -lt 3000 ]; then
	iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -o $LAN -j TCPMSS --set-mss 540
    fi
    if [ $DOWNRATE -lt 3000 ]; then
	## need to clamp MSS to 540 bytes in both directions to reduce
	## the latency increase caused by 1 packet ahead of us in the
	## queue since rates are too low to send 1500 byte packets at acceptable delay
	iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -o $WAN -j TCPMSS --set-mss 540
    fi


else
    cat <<EOF
Check the rules and come back when you're ready.
EOF
fi

echo "DONE!"


if [ "$gameqdisc" = "red" ]; then
   echo "Can not output tc -s qdisc because it crashes on OpenWrt when using RED qdisc, but things are working!"
else
   tc -s qdisc
fi
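
For anyone wondering where the jitter numbers in the banner come from: bits divided by kbps gives milliseconds, and the script assumes up to three full 1500-byte packets can be ahead of yours. A minimal sketch of the same arithmetic follows; the function names are made up for illustration, and the integer division truncates just like in the script.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers reproducing the banner arithmetic.

# Time in ms to send three 1500-byte packets at RATE kbps.
min_jitter_ms() {
    echo $((1500 * 8 * 3 / $1))
}

# Smallest rate in kbps that sends three 1500-byte packets within MS ms.
min_rate_kbps() {
    echo $((1500 * 8 * 3 / $1))
}

min_jitter_ms 18000   # prints 2  (the default UPRATE)
min_jitter_ms 830     # prints 43 (a very slow uplink)
min_rate_kbps 5       # prints 7200
```

So on a sub-1Mbps uplink the floor is tens of milliseconds no matter how the queues are arranged.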



Not sure whether you have looked already, but cake's diffserv4 aims at something similar and also maps reasonably well onto WiFi WMM ACs. Basically you only get one priority class lower than Best Effort, and a few of the DSCPs might map differently, especially CS2.
And have a look at https://tools.ietf.org/html/rfc8622 for another contender DSCP for the least effort priority tier.

This might be one of the reasons why ACKs can be problematic, btw. Maybe setting the quantum to say 100 might avoid the release of too many ACKs to the next level? In SQM we try to scale all burst and quantum numbers such that the time to service them is bound to something acceptable, maybe also an idea for this script?
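
To illustrate what scaling a quantum by service time could look like, here is a rough sketch. It is not SQM's actual code: `scaled_quantum` is a hypothetical helper, and the 300 and 1514 byte clamps are assumptions picked for the example.

```shell
#!/bin/sh
# Sketch only: scaled_quantum is a hypothetical helper, not taken from SQM.
# It returns the number of bytes that can be sent in MS milliseconds at
# RATE kbps, clamped to an assumed range of 300..1514 bytes.
scaled_quantum() {
    rate=$1
    ms=$2
    q=$((rate * ms / 8))   # kbps * ms = bits, / 8 = bytes
    if [ "$q" -lt 300 ]; then
        q=300
    fi
    if [ "$q" -gt 1514 ]; then
        q=1514
    fi
    echo "$q"
}

scaled_quantum 830 1     # prints 300  (103 bytes, raised to the floor)
scaled_quantum 5000 1    # prints 625
scaled_quantum 18000 1   # prints 1514 (2250 bytes, capped)
```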

Hello, this is my result with the new script:

root@OpenWrt:~# tc -s class show dev eth0.2
class hfsc 1:11 parent 1:1 leaf 10: rt m1 16200Kbit d 25.0ms m2 800Kbit
 Sent 8858108 bytes 39851 pkt (dropped 802, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 period 0 work 8858108 bytes rtwork 8858108 bytes level 0

class hfsc 1: root
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 period 0 level 2

class hfsc 1:1 parent 1: ls m1 0bit d 0us m2 18Mbit ul m1 0bit d 0us m2 18Mbit
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 period 880748 work 174324705 bytes level 1

class hfsc 1:13 parent 1:1 leaf 800f: ls m1 3600Kbit d 25.0ms m2 9Mbit
 Sent 165466597 bytes 1086707 pkt (dropped 1, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 period 880748 work 165466597 bytes level 0

class hfsc 1:12 parent 1:1 leaf 800e: ls m1 13500Kbit d 25.0ms m2 5400Kbit
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 period 0 level 0

class hfsc 1:15 parent 1:1 leaf 8011: ls m1 180Kbit d 25.0ms m2 900Kbit
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 period 0 level 0

class hfsc 1:14 parent 1:1 leaf 8010: ls m1 720Kbit d 25.0ms m2 2700Kbit
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 period 0 level 0

class fq_codel 800f:262 parent 800f:
 (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  deficit 1918 count 0 lastcount 0 ldelay 420us
class red 10:1 parent 10:

root@OpenWrt:~#
root@OpenWrt:~# tc -s qdisc
qdisc noqueue 0: dev lo root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms memory_limit 4Mb ecn
 Sent 2884665331 bytes 3311737 pkt (dropped 0, overlimits 0 requeues 294)
 backlog 0b 0p requeues 294
  maxpacket 7570 drop_overlimit 0 new_flow_count 37293 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev br-lan root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc hfsc 1: dev eth0.1 root refcnt 2 default 13
 Sent 2523763254 bytes 1747070 pkt (dropped 9140, overlimits 843403 requeues 0)
 backlog 0b 0p requeues 0
Segmentation fault
root@OpenWrt:~#
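
As a sanity check, the m1/m2 values in the class listing above can be reproduced from the percentages in the script. A small sketch, where `class_rates` is a made-up helper and 18000 / 800 are the default UPRATE and GAMEUP from the script:

```shell
#!/bin/sh
# Sketch only: class_rates is a hypothetical helper that prints the rates
# (in kbit) the script would hand to tc for a given link and game rate.
class_rates() {
    rate=$1
    gamerate=$2
    dur=$((5 * 1500 * 8 / rate))   # time for five 1500-byte packets, in ms
    if [ "$dur" -lt 25 ]; then
        dur=25
    fi
    echo "d=${dur}ms"
    echo "1:11 rt m1=$((rate * 90 / 100)) m2=${gamerate}"
    echo "1:12 ls m1=$((rate * 75 / 100)) m2=$((rate * 30 / 100))"
    echo "1:13 ls m1=$((rate * 20 / 100)) m2=$((rate * 50 / 100))"
    echo "1:14 ls m1=$((rate * 4 / 100)) m2=$((rate * 15 / 100))"
    echo "1:15 ls m1=$((rate * 1 / 100)) m2=$((rate * 5 / 100))"
}

class_rates 18000 800
```

For 18000 kbps this gives 16200/800 for 1:11, 13500/5400 for 1:12, 3600/9000 for 1:13, 720/2700 for 1:14 and 180/900 for 1:15, which matches what tc reports.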


Oy, DSCP is such a mess... The concept is fine but the standardization is just bonkers.

Practically speaking, for what Linux does by default to DSCP->WMM see https://wireless.wiki.kernel.org/en/developers/documentation/mac80211/queues?s[]=dscp It seems to take CS1 and CS2 to bulk, CS0 and CS3 are best effort, CS4 and CS5 are video, and CS6 and CS7 are voice. This script follows a similar path, but offers two downgraded tiers that merge into the WMM bulk class. You can use the CS2 tier for, say, filesystem traffic or downloads running for a few minutes, and the CS1 tier for torrents, backups, or downloads running beyond a few minutes.

The IETF recommended 000001 tag maps to best effort in the WMM and that seems unfortunate.
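
A quick way to see why: the default Linux mapping only looks at the top three bits of the DSCP (the old precedence field). A sketch of that lookup, where `dscp_to_wmm_ac` is a hypothetical helper taking the numeric DSCP value (0 to 63):

```shell
#!/bin/sh
# Sketch only: dscp_to_wmm_ac is a hypothetical helper illustrating the
# default precedence-based mapping (top 3 bits of the 6-bit DSCP).
dscp_to_wmm_ac() {
    prec=$(($1 / 8))
    case $prec in
        1|2) echo "AC_BK" ;;  # CS1, CS2: background/bulk
        0|3) echo "AC_BE" ;;  # CS0, CS3: best effort
        4|5) echo "AC_VI" ;;  # CS4, CS5: video
        6|7) echo "AC_VO" ;;  # CS6, CS7: voice
    esac
}

dscp_to_wmm_ac 1    # LE (000001): prints AC_BE
dscp_to_wmm_ac 8    # CS1: prints AC_BK
dscp_to_wmm_ac 16   # CS2: prints AC_BK
dscp_to_wmm_ac 46   # EF: prints AC_VI
dscp_to_wmm_ac 56   # CS7: prints AC_VO
```

DSCP 1 has precedence 0, so it lands in best effort rather than bulk.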

fq_codel quantum I think is used like deficit in DRR, so it will move from queue to queue, but ultimately it's the higher level qdisc that determines the number of packets. HFSC will dequeue 1 at a time I believe unlike HTB which will dump multiple packets based on its quantum.

Well, the idea is that LE should become an end-to-end DSCP and not just a PHB, so this was selected as it has a low probability of being bleached to DSCP 0 during transit. Makes sense IMHO. The ball was dropped early on, when the decision was made to keep the default best-effort PHB at 0 instead of moving it up to 1, but there are also good reasons why that was never a realistic option...

+1.

But regarding linux, 4 priority tiers are actually enough for quite a lot of situations, so WMM has most normal cases covered. WMM's fault is not introducing rate limits for AC_VO and AC_VI, but I digress...

Agreed, as long as the mappings work. Cake got this scrambled relative to Linux standard behavior. Diffserv4 treats the classes as follows:

diffserv4
    Provides a general-purpose Diffserv implementation with four tins:
        Bulk (CS1), 6.25% threshold, generally low priority.
        Best Effort (general), 100% threshold.
        Video (AF4x, AF3x, CS3, AF2x, CS2, TOS4, TOS1), 50% threshold.
        Voice (CS7, CS6, EF, VA, CS5, CS4), 25% threshold.

So cake's handling of CS3, CS2 and related classes is different from the WMM handling: CS2 is bulk in WMM and CS3 is BE, but both are "video" to cake.

Yes, I believe this is partially because cake's definitions are not tailored for a perfect fit with the arguably wrong default AC mappings. https://tools.ietf.org/html/rfc8325 for example recommends putting CS2 into AC_BE...
Realistically though, that ship has sailed and WMM default definitions will constrain any non compatible marking scheme severely.

As you said DSCPs are a mess :wink:

Ok, I think this thread is pretty long as is... I'm going to post a link to the github version of the script which will always be the most up to date one... @knomax can you accept this post as the answer to the thread and we can do a couple rounds of debugging and then voila it's all here for posterity?

To download the script, go to the above page, click the "raw" button, and then save the page as your script... voila

If you want to put the file directly to your router... log into the router and do:

cd /tmp
wget https://github.com/dlakelan/routerperf/raw/master/SimpleHFSCgamerscript.sh
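
After downloading, the settings near the top of the script (WAN, LAN, UPRATE, DOWNRATE, GAMINGIP and so on) still have to be changed to match your own router. You can simply open the file in an editor; the sketch below does the same thing non-interactively. `set_script_var` is a hypothetical helper, it assumes the downloaded file keeps the NAME=value layout of the script posted above, and the values shown are examples only.

```shell
#!/bin/sh
# Sketch only: set_script_var is a hypothetical helper, not part of the script.
# It rewrites a NAME=value assignment at the start of a line and keeps any
# trailing comment on that line.
set_script_var() {
    file=$1
    name=$2
    value=$3
    tmp="$file.tmp.$$"
    # Write through a temp file, then copy back so file permissions are kept.
    sed "s|^${name}=[^ #]*|${name}=${value}|" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    rm -f "$tmp"
}

SCRIPT=/tmp/SimpleHFSCgamerscript.sh
if [ -f "$SCRIPT" ]; then
    # Example values only: substitute your own devices, rates and gaming IP.
    set_script_var "$SCRIPT" WAN eth0.2
    set_script_var "$SCRIPT" LAN eth0.1
    set_script_var "$SCRIPT" UPRATE 18000
    set_script_var "$SCRIPT" DOWNRATE 65000
    set_script_var "$SCRIPT" GAMINGIP '"192.168.1.111"'
fi
```

Then run it with `sh /tmp/SimpleHFSCgamerscript.sh`. Keep in mind that /tmp does not survive a reboot.
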

Yes, of course. I'll start now and will post instructions for this newer script.

Thanks! I really appreciate all the work that @knomax did in private messages to test versions of this script on a very constrained and non-ideal line, including many hours of router reboots and packet captures. His games look smooth as butter now even though he has only 830kbps upload! It's quite amazing how much better this is than when we started! Thanks for the late nights testing... I am in California and he is somewhere far away (Greece?) so this international effort should be celebrated!

Thanks @dlakelan, I just "threw out" an idea and you took it to "another" level. Without you this would have just stayed an "idea".

Great work! This week is too busy for any testing as I have late nights of scheduled maintenance operations. I look forward to testing it this weekend possibly.


Who is this for?

Wrong thread, lol. Multiple tabs open.

So all we do is install this script into the /tmp folder?

We still need to modify it to work with our own interface, such as eth0.1?