SQM, Flow Offloading, VLAN Tagging and Gaming

Hey guys,

I have been reading along passively for some time, and I have now run into some problems/weird behaviour.

Background: I play CS:GO at a very competitive level, and the only purpose of the router/internet connection is to give me the maximum advantage in this game.

Setup: ISP is Deutsche Telekom with VDSL2, 23.2 Mbit/s down / 4.8 Mbit/s up. The ISP requires VLAN 7 tagging. My router is a Linksys WRT32X.

First, I am not sure if my VLAN tagging is optimal. I set it up as below (Picture will come in 20 min).
image

Second, I played around A LOT with SQM and Flow Offloading.
My experience so far: it works best with NO SQM and NO flow offloading. But that doesn't seem right, does it? Hardware acceleration doesn't work well at all. Cake with either piece_of_cake or layer_cake doesn't work well. Hitreg is bad and it feels a little bit delayed, but lag-free. A weird finding in Wireshark: I set the DSCP for csgo.exe to 46, but only a fraction of the UDP packets carry the DSCP marking. A lot of them are just unmarked. When I turn on flow offloading the response often seems to be faster, but it also causes a little lag sometimes. Furthermore, I would really like to use SQM, as gaming becomes impossible as soon as another device in the network does anything more than normal surfing (girlfriend netflixing, instagramming etc.).

So my question is probably: isn't there a way to configure SQM and/or offloading so that it suits this game best? I am happy for any advice from the more experienced and will try it out immediately. I have tried setting the overhead to 8, 27 and 31 bytes, enabling ECN on egress, and switching the DSCP settings for egress (however, I am pretty sure the upload side is the problem).

Additional info: I have two NICs in my computer, one Intel I219-V and one Killer E2500. I tested with both.

First, is the device you're running the game on connected by wire? Don't try to play competitive games from wifi.

Second, I'm not sure how you're DSCP tagging, but if you want the DSCP tags to be used I think you need layer cake.

In order for SQM to work well you will need to shave off some of the raw bandwidth. Start with, say, 96% of your unregulated bandwidth and drop it in 2% increments until you get something that works stably, or start at, say, 80% and increase it in increments... you should definitely be able to get good results somewhere between 80% and 95%, depending on specifics.
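To make the stepping concrete, here is a minimal sketch that just prints candidate shaper rates from 96% down to 80% in 2% steps. The 4800 kbit/s input is only an example figure (the nominal 4.8 Mbit/s upload from the first post); substitute whatever you measure without SQM.

```shell
#!/bin/sh
# Print candidate shaper rates from 96% down to 80% of a measured line rate,
# in 2% steps. The argument is the unshaped rate in kbit/s.
shaper_candidates() {
    measured_kbps=$1
    pct=96
    while [ "$pct" -ge 80 ]; do
        # Integer arithmetic is precise enough for a starting value.
        echo "$pct% -> $(( measured_kbps * pct / 100 )) kbit/s"
        pct=$(( pct - 2 ))
    done
}

shaper_candidates 4800   # example input only, not a recommendation
```

Test each candidate under load (for example with a speed test running) and keep the highest rate at which latency stays stable.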

Third, I don't recommend flow offloading in your case. The WRT 32X should have enough power to handle at least 5-10x your connection without it and it adds complexity in debugging.

Edit:

Fourth, you can put two SQM instances on your router, one for egress on WAN and one for egress on LAN and then manipulate DSCP tags in your router using iptables. Unless you use this method iptables won't run until after the queuing so it doesn't help.

I would definitely try to get DSCP working, it will let you prioritize your game traffic above netflix etc. Buffered video playback has nowhere near the latency requirements of gaming. A couple of seconds of delay can still be fine for netflix, though if you can keep it below 100ms you'll likely get the best results, but for VOIP or games 20ms of variability in delay can be enough to be a problem.

Thank you for your reply.

First: Yes, of course it's wired.

Second: I tried layer cake for that reason but it was not as good as without SQM. And I also figured out the bandwidth and I am aware that I have to cut off some.

Third: Ok. Got that. However, the hitreg seems to be better with it enabled and I have less micro lag.

Fourth: Unfortunately I don't know how to do that iptables thing and manipulate the DSCP on the router. Can you explain it to me?

I would love to get DSCP fully running, as I guess it would be an improvement.

Edit: Here are the results with SQM cake layer. They look awesome, but hitreg is a lot worse than without SQM.

Your dslreports results are very good. It's hard to know whether the hitreg / micro lag is real unless you experience it consistently in the two conditions. If it's just an occasional feeling, then variation in the traffic at the server etc. is also a factor that would confuse things.

I have a post where I give examples of how to add custom firewall commands to mark DSCP; you can put this kind of thing into your /etc/firewall.user. But marking DSCP properly requires being able to decide which packets to prioritize. You could, for example, ensure that your gaming rig gets a static IP, and then just tag all the UDP packets to or from that rig.
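As a rough sketch of the "static IP, tag all UDP" idea: the function below only prints two mangle-table rules so you can review them before pasting them into /etc/firewall.user. The address 192.168.1.100 and the class CS5 are placeholders I picked for illustration, not values from your setup, and the FORWARD chain is my assumption about where to hook in for routed traffic.

```shell
#!/bin/sh
# Print (do not execute) iptables rules that mark all UDP traffic to and from
# one host with a DSCP class. Both arguments are placeholders:
#   $1 = static IP of the gaming rig (hypothetical example below)
#   $2 = DSCP class name understood by the DSCP target, e.g. CS5 or EF
print_dscp_rules() {
    gaming_ip=$1
    dscp_class=$2
    # Packets the rig sends (LAN -> WAN)
    echo "iptables -t mangle -A FORWARD -p udp -s $gaming_ip -j DSCP --set-dscp-class $dscp_class"
    # Packets addressed to the rig (WAN -> LAN)
    echo "iptables -t mangle -A FORWARD -p udp -d $gaming_ip -j DSCP --set-dscp-class $dscp_class"
}

print_dscp_rules 192.168.1.100 CS5
```

Matching on the game's port range instead of all UDP would be tighter, but that needs port numbers I can't verify here.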

As I said, instead of doing both an uplink and a downlink on WAN, just do an uplink on wan and an uplink on LAN (with the speed of your ISP downlink). Then packets coming in WAN will run through the iptables and be queued / shaped as they leave LAN. Note that this ignores traffic going to wifi. So you might do a simple piece of cake downlink on WAN in addition to the uplink on LAN so that total bandwidth is limited not just wired LAN bandwidth.
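For orientation, the two-instance layout could look roughly like the /etc/config/sqm fragment printed by the sketch below. Treat everything passed in as a placeholder: the interface names (pppoe-wan, eth0) and both rates are made-up examples, and the section names are my own. Setting download to 0 is meant to leave ingress unshaped on that interface.

```shell
#!/bin/sh
# Print a /etc/config/sqm fragment with two egress-only instances.
# Arguments (all placeholders to be replaced with your own values):
#   $1 = WAN interface      $2 = ISP upload rate in kbit/s
#   $3 = wired LAN interface $4 = ISP download rate in kbit/s
# "upload" on the LAN interface is egress toward the LAN clients, which is why
# it takes the ISP download rate.
print_sqm_config() {
    wan_if=$1; wan_up_kbps=$2
    lan_if=$3; lan_down_kbps=$4
    cat <<EOF
config queue 'wan_egress'
    option enabled '1'
    option interface '$wan_if'
    option upload '$wan_up_kbps'
    option download '0'
    option qdisc 'cake'
    option script 'layer_cake.qos'

config queue 'lan_egress'
    option enabled '1'
    option interface '$lan_if'
    option upload '$lan_down_kbps'
    option download '0'
    option qdisc 'cake'
    option script 'layer_cake.qos'
EOF
}

print_sqm_config pppoe-wan 4500 eth0 21000   # hypothetical names and rates
```

Compare the output against what the GUI wrote into your own /etc/config/sqm rather than pasting it blindly.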

Shapers work by delaying packets very slightly so as to meet the bandwidth requirements, so you will experience slightly more delay during idle than you would without a shaper, but substantially less delay under load. Basically, things get more consistent.

One issue is that your uplink is about 3.7 Mbps and the largest packet you can send is probably 1500 bytes. At 3.7 Mbps a 1500-byte packet takes 3.2 ms to send down the wire. So under load the fastest your system will be able to react to a keypress is 3.2 ms, and there is nothing you can do to reduce that (except ensure that your game is the only thing sending packets). Game packets are probably only a few hundred bytes, and a 200-byte packet takes 0.4 ms to send. I don't know what the reaction time is for a twitch gamer, but looking online suggests it's about 100 ms (more like 200 ms for a non-gamer). A bigger issue, then, is probably packet drop rather than packet delay. If the game sends a game state every, say, 100 ms and you lose a packet in flight, it will appear that you have a 200 ms delay. Even if the game sends state every, say, 20 ms, the loss of a packet will potentially be noticeable, as your delay increases from 20 ms to 40 ms.
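If you want to redo the per-packet arithmetic for other rates or packet sizes, a small helper like this works (a sketch; it ignores link-layer overhead, so real numbers are slightly higher):

```shell
#!/bin/sh
# Time in milliseconds to put one packet on the wire at a given link rate.
#   $1 = packet size in bytes, $2 = link rate in Mbit/s
serialization_ms() {
    packet_bytes=$1
    rate_mbps=$2
    awk -v b="$packet_bytes" -v r="$rate_mbps" \
        'BEGIN { printf "%.2f\n", (b * 8) / (r * 1000000) * 1000 }'
}

serialization_ms 1500 3.7   # full-size packet
serialization_ms 200 3.7    # small game packet
```

The two calls reproduce the roughly 3.2 ms and 0.4 ms figures above.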

Using layer cake and DSCP should prioritize your game packets so that they don't get delayed and, more importantly, dropped. Simpler systems like piece of cake will not know to avoid dropping your game packets, as they try to give fair bandwidth usage to all the different machines.

Thanks a lot again for your reply.

I entered the iptables as seen below (is that correct?):

image

The SQM is now setup as this (is that correct?) - with the second sqm enabled (upload lan) it feels not as good, but maybe I set it up wrong :

image

It is definitely working better with the DSCP markings and they show up correctly in Wireshark. However, there are still a lot of packets with default markings even though they are in the specified port range, as you can see below:

image

Another question: should I still keep the policy-based QoS setting with csgo.exe on 46 (EF)?

To your other points: yes, you are totally right that a dropped packet is the problem and not as much a delayed packet.

Thank you a lot in advance

It looks like you are capturing on your LAN computer. All the packets leaving your computer are unmarked, as you would expect if your efforts to get DSCP onto your packets using a policy weren't working. Packets arriving at your computer are marked because they went through iptables on the router before arriving where you capture.

With things set up as you have them, the packets leaving out your WAN port should have DSCP on them because they are received in your router at the LAN and then sent through iptables before being queued for output on the WAN.

You should enable layer_cake on both your SQM instances. See how that works.

Thanks for the fast response. Yeah, I am capturing on my LAN computer. OK, packets leaving the computer are unmarked: understood, makes sense. So should I mark them? Should it be CS5 or EF?

layer_cake is enabled on both instances.

So did I set everything up correctly?

Seems like it is set up OK, more or less. I know nothing about Windows QoS policy because I'm an all-Linux-all-the-time guy, but if you can force Windows to put CS5 on the appropriate packets leaving your computer, it could help. CS5 = decimal 40.
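In case the numbers are confusing: a class selector CSn is just n shifted left by three bits, and the DSCP occupies the upper six bits of the old ToS byte. The sketch below derives the values and prints capture filters you could use to check the markings; the filter strings are my suggestion, so double-check them against your Wireshark/tcpdump version.

```shell
#!/bin/sh
# A class selector CSn is n shifted left by three bits.
cs_to_dscp() { echo $(( $1 << 3 )); }
# The DSCP sits in the upper six bits of the ToS / DS byte.
dscp_to_tos() { echo $(( $1 << 2 )); }

cs5=$(cs_to_dscp 5)
echo "CS5: DSCP $cs5, ToS byte $(dscp_to_tos "$cs5")"
echo "EF:  DSCP 46, ToS byte $(dscp_to_tos 46)"

# Filters for checking the markings (suggested, not taken from the thread)
echo "Wireshark display filter: ip.dsfield.dscp == $cs5"
echo "tcpdump capture filter:   udp and ip[1] & 0xfc == $(dscp_to_tos "$cs5")"
```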

EDIT: it really won't help that much, because the router will be putting the tag on there before sending it out WAN anyway. it would only really help if you have bottlenecks in your LAN and/or a smart switch with QoS on your LAN

EDIT2: also please note that the output of your WiFi is unaffected by this SQM, so someone suddenly hitting netflix via wifi will still cause problems. There is a way to help with that using a veth pair, but first just test this to see if it's helping you by testing it only with wired clients (like, have someone queue up a netflix stream on a wired machine while you test out your game, and keep wifi out of it).

I just tested it for 5 mins, but also marking the outgoing packets with CS5 works like a charm!

SQM also works very well with netflixing girlfriends on the LAN, but I haven't tried it with a wifi device yet. Will do so soon. Could I just put another SQM on wifi? Or is that too much?

Thank you so much, it has really helped a lot!

Well you could but it won't necessarily work so well, the issue is that you need to keep the total download below your 22Mbps or whatever it is. To do that you can do the following:

  1. create a veth0 - veth1 pair (requires you to install the veth packages on your router and maybe the full "ip" suite)
  2. put veth1 into your br-lan bridge
  3. alter your routing table so that all packets coming in on WAN get forwarded via veth0, which will send them down a virtual wire to veth1 and inject them into the bridge where they'll get distributed between wired and wireless
  4. place your layer_cake SQM on the veth0 device uplink/egress

You can create the veth pair as follows in your firewall.user script (untested)

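# create the pair with explicit names instead of relying on the defaults
ip link add veth0 type veth peer name veth1
# bring both ends up
ip link set dev veth0 up
ip link set dev veth1 up
# attach veth1 to the LAN bridge
ip link set dev veth1 master br-lan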

Then you'll need to add a static route so that all packets coming in WAN destined for LAN are sent to veth0 rather than br-lan, and put an SQM on veth0.

Note that this is a little tricky: you could lose your connection to your router if you route to veth0 and veth1 isn't properly in the bridge or something like that. Save a backup of your config before you try this stuff, and be prepared to revert via failsafe.

thank you again and again =)

It sounds a little bit complicated but I will try it on the weekend.

earlier you wrote:

Can I try it with this option in the meantime? If yes, how do I set it up? A third SQM instance, cake, piece of cake, that's easy. But which interface do I assign it to? wlan1? wlan0? And what bandwidth should I use? The wifi doesn't need that much and the main priority should still be the LAN computer.

Also one more question regarding all of my SQM instances: Do you think overhead with 31 is correct? Connection is VDSL2 , PPPoE, VLAN 7 Tagging.

In your existing WAN instance you can specify a download rate, but beware that this will not have the nice properties of the DSCP tagging (so, for instance, it might drop your game packets).

Ok thanks.

I will try the veth thing as nothing will stop my nice new DSCP flaggings =)

Shout out to all other gamers: try to set it up as described in this thread. I'm still very happy with the result!

So one thing I am unsure about is whether you are using sqm/cake's IP isolation modes or not, as these should help to isolate different machines from each other (say, your gaming from your SO's streaming). Could you post the contents of "/etc/config/sqm" here?

Well, on a Telekom VDSL2 link the actual overhead on top of the pppoe-wan device is 34 bytes: 8 bytes for PPPoE, 22 bytes for the ethernet frame (src-mac(6), dst-mac(6), ethertype(2), frame-check-sequence(4), VLAN(4)) and 4 bytes for the PTM overhead.
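Spelled out as arithmetic (just the byte counts listed above; the 1492-byte PPPoE MTU in the last line is my assumption for a standard PPPoE link):

```shell
#!/bin/sh
# Per-packet overhead on top of the pppoe-wan device, in bytes.
pppoe=8
ethernet=$(( 6 + 6 + 2 + 4 + 4 ))   # src-mac, dst-mac, ethertype, FCS, VLAN tag
ptm=4
overhead=$(( pppoe + ethernet + ptm ))
echo "overhead: $overhead bytes"

# Assumed PPPoE MTU of 1492: a full-size packet plus overhead on the wire.
echo "on-the-wire frame size: $(( 1492 + overhead )) bytes"
```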

DTAG actually uses a traffic shaper at the BNG/BRAS level, so the uplink's 4.8 Mbps is not the relevant limit for your uplink shaper. Luckily, for some time now they have reported their estimated achievable TCP/IPv4/HTTP goodput values as part of the PPPoE handshake (in /etc/ppp/options uncomment "debug", then the PPPoE ACK message will be displayed in the log; "logread | grep -e pppd" will give you the output, look for SRD (download) and SRU (upload)). To get from there to the real configurable gross upload rate, calculate:
SRU * (1526)/(1500-8-20-20)
The same goes for download: just replace SRU with SRD, and keep in mind that ingress/download shaping requires a somewhat larger bandwidth sacrifice than egress/upload shaping. So for egress you might be able to set 100% of the value calculated above; for ingress I guess something like 90% of that number should work more reliably.
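The formula can be wrapped in a small helper, sketched here. The inputs 5000 and 25000 are made-up goodput values in kbit/s purely for illustration; plug in the SRU and SRD numbers from your own log. The optional second argument is the safety factor (1 for egress, something like 0.9 for ingress).

```shell
#!/bin/sh
# Convert a reported TCP/IPv4 goodput (kbit/s) into a gross shaper rate.
#   $1 = SRU or SRD value from the PPPoE log
#   $2 = optional safety factor, defaults to 1
gross_rate_kbps() {
    goodput_kbps=$1
    safety=${2:-1}
    awk -v g="$goodput_kbps" -v s="$safety" 'BEGIN {
        frame   = 1526                 # on-the-wire frame size in bytes
        payload = 1500 - 8 - 20 - 20   # minus PPPoE, IPv4 and TCP headers
        printf "%d\n", g * frame / payload * s   # %d truncates to whole kbit/s
    }'
}

gross_rate_kbps 5000        # hypothetical SRU, egress at 100%
gross_rate_kbps 25000 0.9   # hypothetical SRD, ingress at 90%
```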

Getting the shaping bandwidth and overhead right is really important for sqm to work as intended. After getting that right, you might want to retest even without the DSCP magic (this is for testing only; by all means use DSCPs, there is a reason why sqm supports them). The next step would be internal IP isolation (see https://openwrt.org/docs/guide-user/network/traffic-shaping/sqm-details especially the section titled "Making cake sing and dance, on a tight rope without a safety net").

Good luck!

Thanks for your reply.

here is config sqm:

image

and: How do I uncomment debug? What is the SSH command? Thanks in advance

And of course I will soon look into IP isolation and come back with probably a ton of questions =)

I guess there is still some room for experiments to see whether sqm can work a tad better even without using DSCP marking (again, I still recommend using DSCPs for use cases such as yours, but I feel the base configuration can be improved a bit, which will partly also help layer_cake's DSCP treatment).

SSH is basically the protocol used by the program you used to generate the above screenshot.
SSH is short for Secure SHell, encrypted terminal access to your router, basically a way to issue commands on the router's command line. See https://openwrt.org/docs/guide-quick-start/sshadministration for operating-system-specific details.

You uncomment debug by editing the file /etc/ppp/options. One way to do this should be to issue the following command:

sed -i 's/#debug/debug/g' /etc/ppp/options

This replaces the commented form #debug with the "active" form debug.
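If you would like to see what the substitution does before touching the real file, you can try it on a scratch copy first. The sample lines below are invented for the demonstration and are not the actual contents of /etc/ppp/options.

```shell
#!/bin/sh
# Apply the same sed expression, but write to stdout instead of editing in place.
uncomment_debug() {
    sed 's/#debug/debug/g' "$1"
}

# Scratch file with made-up contents standing in for /etc/ppp/options.
scratch=$(mktemp)
printf '%s\n' '#debug' 'logfile /dev/null' > "$scratch"

uncomment_debug "$scratch"   # first line should now read "debug"

rm -f "$scratch"
```

Once the output looks right, the in-place command above does the same thing to the real file.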

Sure, take your time, please keep your questions in the open (just keep posting in this thread).

SRU=4757#SRD=23619#

So 4996 for upload (maybe I just put 4900?) and 24806 - 10%= 22325 for download

Is that correct?

Something I noticed: my MTU is now 1484. Is that correct or shall I increase it? In the WAN section I typed in 1492.

I noticed I had a small error in my formula above (since corrected): the on-the-wire frame size is 1526, not 1525.
4757 * (1526)/(1500-8-20-20) = 4999.44 Kbps
23619 * (1526)/(1500-8-20-20) = 24822.72 Kbps
for safety reasons I would actually reduce these a bit to:
4757 * (1526)/(1500-8-20-20) * 0.99 = 4949 -> 4950 Kbps
23619 * (1526)/(1500-8-20-20) * 0.9 = 22340 -> 22300 Kbps

But your values should work as well.... (These are starting values anyway, you really should test how the link performs under load, with say the dslreports speedtest).

The next step after getting overhead and shaper bandwidth configured will be to fine-tune the configuration, but first things first :wink:

So on pppoe-wan OpenWrt will automatically set the MTU to effectively 1492, so you should be able to undo your change without any loss of connectivity...
