Using DSCP for QoS

Mmmh, will "ip tos set 192" leave the two ECN bits alone? I guess I will need to do some research...

Ty for the links.

Hmm seems like it does?
The ToS header field carries an 8-bit value.
DSCP uses the upper 6 bits; the lower 2 bits are for ECN.
tc pedit assumes an 8-bit value.
So using 192 will result in 11000000:
6-bit DSCP = 110000 = 48 = CS6
and both ECN bits set to 0.
If we want to set both ECN bits to 1,
195 should work,
which is 11000011 in binary.
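
To make the arithmetic above concrete, here is a minimal Python sketch that splits a ToS byte into its DSCP and ECN parts. The helper name split_tos is made up for illustration; it only does the bit arithmetic and says nothing about how tc pedit itself behaves.

```python
def split_tos(tos):
    """Split an 8-bit ToS/Traffic Class value into (dscp, ecn)."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("ToS must fit in 8 bits")
    dscp = tos >> 2    # upper 6 bits
    ecn = tos & 0b11   # lower 2 bits
    return dscp, ecn


for tos in (192, 195):
    dscp, ecn = split_tos(tos)
    print(f"{tos} = {tos:08b} -> DSCP {dscp} ({dscp:06b}), ECN {ecn:02b}")
```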

Yes that is my point, we want to leave the ECN bits alone as their value changes depending on congestion along the link, so I believe the mask should be 0xfc to leave the two ECN bits alone. At least that is the mask we use in sqm-scripts to extract the 6 dscp-bits out of the 8 TOS/priority bits.
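
As a rough illustration of what the 0xfc mask buys, the following Python sketch replaces the six DSCP bits of a ToS byte while keeping whatever ECN bits were already there. The function name rewrite_dscp is hypothetical, and this only models the bit operation, not the tc command line needed to achieve it.

```python
DSCP_MASK = 0xFC  # upper six bits of the ToS byte
ECN_MASK = 0x03   # lower two bits of the ToS byte


def rewrite_dscp(tos, new_dscp):
    """Return tos with its DSCP replaced by new_dscp and its ECN bits untouched."""
    if not 0 <= new_dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    return ((new_dscp << 2) & DSCP_MASK) | (tos & ECN_MASK)


# A CS0 packet that already carries a CE mark (ECN = 11) is re-marked to CS6:
print(rewrite_dscp(0b00000011, 48))  # DSCP becomes 48, ECN stays 11
```

Writing the plain value 192 into the whole byte would instead clear the ECN bits, which is exactly what the mask is meant to avoid.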

Also something I found:
https://patchwork.kernel.org/patch/3212651/
This indicates it should be possible to teach/configure the kernel to correctly map EF to AC_VO. Assuming that one actually wants to use AC_VO in the first place, as this class does no aggregation but gets access to air-time with a higher likelihood than other ACs, thereby reducing the total usable bandwidth considerably (each wifi transmission has a relatively high overhead that can be amortized better if more user data is sent behind it, which is what aggregation does).

Best Regards

A mask value of 0xfc (11111100) seems to be fine, I guess.

Don't most systems, like Cisco or SonicWall, allow remapping the WMM classes?

I just had a look at hostapd and find the following in hostapd.conf:

# QoS Map Set configuration
#
# Comma delimited QoS Map Set in decimal values
# (see IEEE Std 802.11-2012, 8.4.2.97)
# 
# format:
# [<DSCP Exceptions[DSCP,UP]>,]<UP 0 range[low,high]>,...<UP 7 range[low,high]>
# 
# There can be up to 21 optional DSCP Exceptions which are pairs of DSCP Value
# (0..63 or 255) and User Priority (0..7). This is followed by eight DSCP Range
# descriptions with DSCP Low Value and DSCP High Value pairs (0..63 or 255) for
# each UP starting from 0. If both low and high value are set to 255, the
# corresponding UP is not used.
# 
# default: not set
#qos_map_set=53,2,22,6,8,15,0,7,255,255,16,31,32,39,255,255,40,47,255,255

So mapping to EF might still be okay assuming one also defines a proper qos_map. Now I need to figure out how to configure hostapd under openwrt :wink:
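
The format described in the hostapd.conf excerpt above can be decoded mechanically: the last eight pairs are the UP 0..7 ranges, and anything before them is a DSCP exception. Below is a minimal Python sketch of that reading; parse_qos_map_set is a made-up helper name, not part of hostapd.

```python
def parse_qos_map_set(value):
    """Split a qos_map_set string into (exceptions, ranges).

    exceptions: list of (dscp, up) pairs
    ranges:     list of eight (low, high) pairs, where the index is the UP
    """
    nums = [int(x) for x in value.split(",")]
    if len(nums) < 16 or len(nums) % 2:
        raise ValueError("need an even count of at least 16 values")
    pairs = list(zip(nums[0::2], nums[1::2]))
    exceptions, ranges = pairs[:-8], pairs[-8:]
    if len(exceptions) > 21:
        raise ValueError("at most 21 DSCP exceptions are allowed")
    return exceptions, ranges


# The commented-out example value from hostapd.conf:
example = "53,2,22,6,8,15,0,7,255,255,16,31,32,39,255,255,40,47,255,255"
exceptions, ranges = parse_qos_map_set(example)
print("exceptions:", exceptions)
for up, (low, high) in enumerate(ranges):
    print(f"UP {up}: {'unused' if (low, high) == (255, 255) else (low, high)}")
```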

Hmm...

default: not set

So by default hostapd doesn't make use of dscp to wmm class mappings?

No, I believe that simply means that the kernel uses whatever default mapping it has.

@nbd, do you have any insight in openwrt's default dscp to WMM user priorities mapping?

So how to create a proper dscp to wmm mapping?

DSCP classes start with a decimal value of 8.
But it's possible to have ToS values below that.
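
For reference, the common DSCP names reduce to simple arithmetic on the 6-bit value. The Python sketch below uses hypothetical helper names (cs, af) and covers only the Class Selector, Assured Forwarding and EF codepoints; DSCP values 1 to 7 fall outside these named classes but are still valid 6-bit values.

```python
def cs(n):
    """Class Selector CSn (n in 0..7): the DSCP value is n * 8."""
    if not 0 <= n <= 7:
        raise ValueError("class selector must be in 0..7")
    return n << 3


def af(cls, drop):
    """Assured Forwarding AFxy: class x in 1..4, drop precedence y in 1..3."""
    if not (1 <= cls <= 4 and 1 <= drop <= 3):
        raise ValueError("AF class must be 1..4 and drop precedence 1..3")
    return (cls << 3) | (drop << 1)


EF = 46  # Expedited Forwarding, binary 101110

print("CS1 =", cs(1), " CS6 =", cs(6))
print("AF11 =", af(1, 1), " AF13 =", af(1, 3), " AF41 =", af(4, 1))
print("EF =", EF)
```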

qos_map_set=0,7,8,15,16,23,24,31,32,39,40,47,48,55,56,255 ??

If i want to add dscp ef to wmm voice class:

qos_map_set=46,6,0,7,8,15,16,23,24,31,32,39,40,47,48,55,56,255

is this correct?

I believe adding "46,6" should work, but I am not sure whether all the values actually make sense. UP 1 and 2 map to BK, but only DSCP CS1 should be signifying background/BK (well, there is a current RFC for using another bit pattern out of CS0 for BK, but I digress). So I probably would change to:

qos_map_set=46,6,8,1,0,7,255,255,255,255,8,31,32,39,40,47,48,55,56,255

but I have no clue whether that actually is sane... I guess I will need to figure out how to diagnose/measure the ACs of over-the-air packets first, because this will require looking at real data...

Sidenote: personally I am not even sure whether playing games with WMM is actually worth it at all, but I agree if one does it it would be nice to do it right :wink:

I think WMM voice class does result in much better audio quality. I think that is "worth it" for my use case at least. Thanks for this discussion on hostapd, I may work on that as well.

I think 8,1 will not work. (or typo?)

DSCP Low Value and DSCP High Value pairs (0..63 or 255)

If you only want to use DSCP CS1 (8), it should be 8,255, I guess.
The question is: what does 255 mean? Max value? (8-bit, would make sense)

//edit
or simply 8,8 lmao

And the default qos_map from hostapd is actually quite good.
But I want a 1:1 mapping with an EF (and maybe VA?) override:

Maybe something like this:
46,6,8,15,0,7,16,23,24,31,32,39,40,47,48,55,56,63

And is there an alternative to ndpi?
Is l7 matching even future proof? more and more applications use encryption...
What is the best way to classify traffic?

probably not. yes, most stuff already is tls on tcp port 443, quic is also here...
design your network for unresponsive floods of encrypted udp traffic ;D
interesting in this regard https://ripe76.ripe.net/archives/video/23/

probably flows, up until the amount of flow-state overtaxes memory ^^


Why? The last 8 pairs will be interpreted as range limits for the 8 user priorities, but 8,1, being the 9th pair from the end, is of the record type "DSCP decimal value, user priority". The idea is to directly map CS1 to AC_BK (which is UP 1 or UP 2), and since this is to be the only value in UP 1, I set the range for UP 1 to "255,255".

"If both low and high value are set to 255, the corresponding UP is not used."

So for the 8 range definitions this is relatively clear...

Here is what IEEE Std 802.11-2012, 8.4.2.97 has to say about qos maps:

8.4.2.97 QoS Map Set element
The QoS Map Set element is transmitted from an AP to a non-AP STA in a (Re)association Response frame or a QoS Map Configure frame and provides the mapping of higher layer quality-of-service constructs to User Priorities defined by transmission of Data frames in this standard. This element maps the higher layer priority from the DSCP field used with the Internet Protocol to User Priority as defined by this standard. The QoS Map Set element is shown in Figure 8-357.
[Figure 8-357—QoS Map Set element description. Fields, with sizes in octets: Element ID (1), Length (1), DSCP Exception #1 ... DSCP Exception #n (optional, 2 each), UP 0 DSCP Range ... UP 7 DSCP Range (2 each)]
The Length field is set to 16+2×n, where n is the number of Exception fields in the QoS Map set.
DSCP Exception fields are optionally included in the QoS Map Set. If included, the QoS Map Set has a maximum of 21 DSCP Exception fields. The format of the exception field is shown in Figure 8-358.
[Figure 8-358—DSCP Exception format. Fields, with sizes in octets: DSCP Value (1), User Priority (1)]
The DSCP value in the DSCP Exception field is in the range 0 to 63 inclusive, or 255; the User Priority value is between 0 and 7, inclusive.
— When a non-AP STA begins transmission of a Data frame containing the Internet Protocol, it matches the DSCP field in the IP header to the corresponding DSCP value contained in this element. The non-AP STA will first attempt to match the DSCP value to a DSCP exception field and uses the UP from the corresponding UP in the same DSCP exception field if successful; if no match is found then the non-AP STA attempts to match the DSCP field to a UP n DSCP Range field, and uses the n as the UP if successful; and otherwise uses a UP of 0.
— Each DSCP Exception field has a unique DSCP Value.
[Figure 8-359—DSCP Range description. Octets: 1, 1]
The QoS Map Set has a DSCP Range field corresponding to each of the 8 user priorities. The format of the range field is shown in Figure 8-359. The DSCP Range value is between 0 and 63 inclusive, or 255.
— The DSCP range for each user priority is nonoverlapping.
— The DSCP High Value is greater than or equal to the DSCP Low Value.
— If the DSCP Range high value and low value are both equal to 255, then the corresponding UP is not used.
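
The matching order in the quoted text (exceptions first, then the eight ranges, otherwise UP 0) can be sketched in a few lines of Python. This is only an illustration of the rule as quoted; dscp_to_up is a made-up name, and the exceptions and ranges are the ones from the "46,6,8,1,..." map proposed earlier in the thread.

```python
def dscp_to_up(dscp, exceptions, ranges):
    """Map a DSCP (0..63) to a User Priority following the quoted lookup order."""
    # 1. An exact match in the exception list wins.
    for exc_dscp, up in exceptions:
        if exc_dscp == dscp:
            return up
    # 2. Otherwise the first UP whose range contains the DSCP is used.
    for up, (low, high) in enumerate(ranges):
        if low <= dscp <= high:
            return up
    # 3. No match at all: fall back to UP 0.
    return 0


# qos_map_set=46,6,8,1,0,7,255,255,255,255,8,31,32,39,40,47,48,55,56,255
exceptions = [(46, 6), (8, 1)]
ranges = [(0, 7), (255, 255), (255, 255), (8, 31),
          (32, 39), (40, 47), (48, 55), (56, 255)]

for dscp in (0, 8, 10, 34, 46, 48, 56):
    print(f"DSCP {dscp:2d} -> UP {dscp_to_up(dscp, exceptions, ranges)}")
```

With this map, CS1 (8) lands in UP 1 through its exception even though the UP 3 range also starts at 8, because exceptions are checked before ranges.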

I disagree, the UP 1 and UP 2 ranges 16-23 and 24-31 respectively map 16 DSCPs to background priority, while effectively only one single DSCP should map to background. So probably my ranges were also off and UP 0 should range from 0 to 31...

IMHO ndpi is never the solution, but rather part of the problem :wink:

No, this is again rather a heuristic. I prefer to have the sending applications/the sending OS set the desired DSCPs instead of trying to figure things out post-hoc on the router, but I realize that this approach might not give enough control in more hostile environments than my personal home-net.

I would propose end-2-end dscp markings with slight sanity checking by the network (say only allow CS7 from a few known IP addresses, but remap to CS0 or CS1 for all others).
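
As a sketch of what such "slight sanity checking" could look like, the Python snippet below keeps sender-chosen DSCPs as they are and only remaps CS7 from sources that are not on a small allow-list. The function name sanitize_dscp and the address 192.168.1.10 are hypothetical; on a real router this logic would live in firewall rules rather than in Python.

```python
import ipaddress

CS0, CS1, CS7 = 0, 8, 56

# Hypothetical allow-list of hosts that may use CS7.
TRUSTED_CS7_SOURCES = {ipaddress.ip_address("192.168.1.10")}


def sanitize_dscp(src_ip, dscp, fallback=CS0):
    """Trust the sender's DSCP, except for CS7 from non-allow-listed sources."""
    if dscp == CS7 and ipaddress.ip_address(src_ip) not in TRUSTED_CS7_SOURCES:
        return fallback
    return dscp


print(sanitize_dscp("192.168.1.10", CS7))                # allow-listed: stays CS7
print(sanitize_dscp("192.168.1.20", CS7))                # remapped to CS0
print(sanitize_dscp("192.168.1.20", CS7, fallback=CS1))  # or to CS1
print(sanitize_dscp("192.168.1.20", 46))                 # EF passes through
```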

I overlooked that you used an overwrite there. I apologize :pensive:

Hmm, that still doesn't answer what 255 really means.
If someone uses 8,255,
does it now match all values from 8 to 255?
Why does it explicitly state "range 0 to 63 inclusive, or 255"?
They could have just written 0 to 255.
But a max value of 63 makes more sense for DSCP (max value for 6 bits?).
So maybe 255 has a special purpose? (besides setting both high and low to 255 to disable a UP)

Are you sure that only CS1 should go into the background class? Isn't AF1x even lower priority (increased drop probability)?

That is also the problem i have. You can't always trust the sender.

But this is also not a 100% solution.
If you want to remap you still have to identify the traffic :wink:

Please don't, that qos_map format is opaque enough, and I think I still got it wrong with the 8 UP range definitions :wink:

But that would cover 8 bits, and only 6 bits are available; I take it to mean 0-63 or an explicit value-not-set marker. Now I agree that the case where only half of a range is set to 255 is not terribly well defined. Looking at the kernel, it becomes clear that this will simply match all DSCPs from the minimum upwards (if the maximum is set to 255) or no DSCPs at all (if the minimum is set to 255). I am not sure though whether these values get sanitized somewhere between hostapd and the kernel.
On the other hand I do not care too much, I would simply follow the instructions in hostapd.conf :wink:
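
The half-255 cases are easy to enumerate, assuming the range check is a plain inclusive comparison low <= dscp <= high, which is how the post above reads the kernel code. This Python sketch (helper names are made up) counts which of the 64 possible DSCPs each (low, high) pair would match under that assumption; it ignores any sanitizing that might happen before the values reach the kernel.

```python
def range_matches(dscp, low, high):
    """Plain inclusive comparison, assumed here to mirror the kernel's check."""
    return low <= dscp <= high


def matched_dscps(low, high):
    """All valid DSCPs (0..63) that fall inside the given range."""
    return [d for d in range(64) if range_matches(d, low, high)]


for low, high in [(8, 8), (8, 255), (255, 63), (255, 255)]:
    hits = matched_dscps(low, high)
    span = f"{hits[0]}..{hits[-1]}" if hits else "none"
    print(f"low={low:3d} high={high:3d} -> {len(hits):2d} DSCPs ({span})")
```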

Well, at home I am willing to give my machines the benefit of the doubt, but more in earnest, getting packets categorized correctly in a hostile environment seems pretty difficult (short of white listing a few IP or MAC addresses, which can also be spoofed). I admit though that having more checks/heuristics will make it harder to abuse the marking system.
Now mismarking has different effects in different priority schemes; in a pure precedence model allowing an application to cheat itself into a higher priority class will allow it to starve all lower precedence levels completely, but sqm-scripts priority tiers will not allow this.

No, but it probably is a 99% solution with minimal configuration effort and little chance of doing harm :wink:

Sure, but whitelisting the few MAC/IP addresses you want to allow to use CS7 seems rather simple compared to trying to collect all IPs belonging to a certain service...

Again, I am not advocating against your detailed and well-thought out set of rules, I am just not sure whether one can not get like 95% of the behaviour without having to do this rather complex dance :wink: (and I happily admit that for quite a number of folks 95% might not be good enough).

Anyway, I am still trying to figure out from your great setup whether I can borrow a simple way of generically copying the egress DSCP markings to the conn-tracked ingress packets, as that seems rather desirable to solve the problem that DSCPs are always a bit dubious on ingress. That seems like a great optional feature for sqm-scripts. Combined with openwrt's ability to create DSCP rewrite rules, it should be sufficient for simpler cases than yours (like putting traffic of a specific IP into a fixed priority tier).

Best Regards & thanks for the great discussion

Cake already makes use of conntrack for its nat feature, doesn't it?
Maybe it can be expanded to also copy the markings over.

I'm also not sure if this is a great setup x)
I haven't used Linux for a long time.
When using connmarks to set DSCP marks, the connection can't be marked for anything else.
So this is some kind of a drawback.
Or maybe some clever use of the mask feature can help here.
(l)uci also has support for setting marks and matching against marks.
But I have a few problems with that. Only one zone or all zones can be specified.
No custom targets are possible. (-j DSCP)
It would be great to have custom target support. (also for ipsets)
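
The mask idea mentioned above boils down to reserving a few bits of the 32-bit mark for the DSCP and leaving the rest free for other uses. The following Python sketch shows only that bit layout; the choice of the top six bits (DSCP_SHIFT = 26) and the helper names are assumptions for illustration, and the firewall rules that would apply such a mask are not shown.

```python
MARK_BITS = 0xFFFFFFFF
DSCP_SHIFT = 26                   # assumption: DSCP lives in the top 6 bits
DSCP_FIELD = 0x3F << DSCP_SHIFT   # 0xFC000000


def store_dscp(mark, dscp):
    """Write dscp into the reserved bits of mark, keeping all other bits."""
    return (mark & ~DSCP_FIELD & MARK_BITS) | ((dscp & 0x3F) << DSCP_SHIFT)


def load_dscp(mark):
    """Read the DSCP back out of the reserved bits."""
    return (mark & DSCP_FIELD) >> DSCP_SHIFT


mark = 0x00000001              # some unrelated use of the low bits
mark = store_dscp(mark, 46)    # remember EF for this connection
print(hex(mark), load_dscp(mark), hex(mark & ~DSCP_FIELD & MARK_BITS))
```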

I have to thank you :wink:

I also have some offtopic question regarding cake and i don't want to create a separate topic.

root@LEDE:~# tc -s qdisc show dev ifb4eth1
qdisc cake 8009: root refcnt 2 bandwidth 150Mbit diffserv4 dual-dsthost nat ingress split-gso rtt 50.0ms noatm overhead 18 mpu 64
 Sent 68851578560 bytes 48260674 pkt (dropped 19511, overlimits 81070945 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 1942336b of 4Mb
 capacity estimate: 150Mbit
 min/max network layer size:           46 /    1500
 min/max overhead-adjusted size:       64 /    1518
 average network hdr offset:           14

                   Bulk  Best Effort        Video        Voice
  thresh       9375Kbit      150Mbit       75Mbit    37500Kbit
  target          2.5ms        2.5ms        2.5ms        2.5ms
  interval       50.0ms       50.0ms       50.0ms       50.0ms
  pk_delay        1.2ms        105us        6.0ms         18us
  av_delay        691us         38us        4.6ms          5us
  sp_delay         84us          3us        298us          2us
  pkts         31917695      3458189     11881351      1022950
  bytes     47775352030   3831953540  16912571600    360978176
  way_inds       369666        23386         2752          115
  way_miss         1366        99461         4959          182
  way_cols            0            0            0            0
  drops           15380          578         3553            0
  marks               0            0            0            0
  ack_drop            0            0            0            0
  sp_flows            1            2            1            1
  bk_flows            1            0            0            0
  un_flows            0            0            0            0
  max_len          1514         4542         1514         1514
  quantum           300         1514         1514         1144

root@LEDE:~# tc -s qdisc show dev eth1
qdisc cake 8008: root refcnt 9 bandwidth 18Mbit diffserv4 dual-srchost nat wash ack-filter split-gso rtt 50.0ms noatm overhead 18 mpu 64
 Sent 2403145179 bytes 25997794 pkt (dropped 708036, overlimits 15973834 requeues 881)
 backlog 0b 0p requeues 881
 memory used: 1752256b of 4Mb
 capacity estimate: 18Mbit
 min/max network layer size:           28 /    1500
 min/max overhead-adjusted size:       64 /    1518
 average network hdr offset:           14

                   Bulk  Best Effort        Video        Voice
  thresh       1125Kbit       18Mbit        9Mbit     4500Kbit
  target         16.1ms        2.5ms        2.5ms        4.0ms
  interval       63.6ms       50.0ms       50.0ms       51.5ms
  pk_delay          9us       36.3ms         27us        346us
  av_delay          2us       18.2ms         10us        102us
  sp_delay          1us         67us          1us          2us
  pkts         16550062      2355063      6794883      1005822
  bytes      1493488099    337479186    512747631    110600731
  way_inds       383967        26734       134694          121
  way_miss         1372        90514         8869          485
  way_cols            0            0            0            0
  drops             984          406            1            0
  marks               0            0            0            0
  ack_drop        39541        75808       591296            0
  sp_flows            1            2            1            1
  bk_flows            1            0            0            0
  un_flows            0            0            0            0
  max_len          1514         1886         1486         1194
  quantum           300          549          300          300

qdisc ingress ffff: parent ffff:fff1 ----------------
 Sent 68941877579 bytes 56775711 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

Why does the best effort tin have so much delay? (on egress)
I sometimes see that the bulk tin has more delay.
What do "target" and "interval" mean?

And there is also a row "marks" will this be used for connmark tracking?

Good question.

Hmm, @nbd proposed the following https://forum.openwrt.org/t/advanced-policy-routing-where-does-it-go/12402/11?u=moeller0 which I still want to test drive... I have a hunch this could be sufficient for the de-prioritize/prioritize-a-specific-IP crowd.

Sorry, no idea.

Target and interval are parameters for the core AQM part of cake; they work as in codel/fq_codel. IIRC, if the minimum sojourn time (the delay between enqueue and dequeue) exceeds target for at least interval, then cake will drop or ECN-mark packets. Effectively, target defines the standing queue one is willing to tolerate, and interval is the time window one is willing to give the flow's endpoints to react to drops or marks.
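
A heavily simplified Python sketch of that target/interval rule follows. It only models the "above target for a whole interval" condition; the part of CoDel that schedules follow-up drops, and whatever cake adds on top, are left out. The class name SojournMonitor is made up, and the default values are the 2.5 ms target and 50 ms interval from the tc output above, expressed in microseconds.

```python
class SojournMonitor:
    """Simplified target/interval check (not the full CoDel control law)."""

    def __init__(self, target_us=2500, interval_us=50000):
        self.target_us = target_us
        self.interval_us = interval_us
        # Time at which the queue will have been above target for a full interval.
        self.deadline_us = None

    def should_signal(self, sojourn_us, now_us):
        """Return True if a drop or ECN mark would be due for this packet."""
        if sojourn_us < self.target_us:
            # One packet below target means the minimum dipped: start over.
            self.deadline_us = None
            return False
        if self.deadline_us is None:
            self.deadline_us = now_us + self.interval_us
            return False
        return now_us >= self.deadline_us


monitor = SojournMonitor()
samples = [(0, 4000), (20000, 4000), (40000, 4000), (60000, 4000), (70000, 1000)]
for now_us, sojourn_us in samples:
    print(now_us, sojourn_us, monitor.should_signal(sojourn_us, now_us))
```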

In this context, mark denotes ECN congestion encountered (CE) marking. On ECN-enabled flows the endpoints will react to CE-marked packets as they would to dropped packets (except that the actual data was transferred, and the signal is potentially a bit faster than a drop, as a sender typically needs 3 duplicate ACKs before treating a packet as lost/dropped).

Best Regards
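
To illustrate the mark-versus-drop distinction described above, here is a small Python sketch of the decision an AQM makes once it wants to signal congestion: packets from ECN-capable flows get the CE codepoint, everything else is dropped. The function name signal_congestion is hypothetical; the four ECN codepoints are the standard ones.

```python
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def signal_congestion(tos):
    """Return (tos, dropped) after the AQM decides to signal congestion."""
    ecn = tos & 0x03
    if ecn == NOT_ECT:
        return tos, True       # flow is not ECN-capable: drop the packet
    return tos | CE, False     # ECT(0), ECT(1) or CE: deliver it CE-marked


print(signal_congestion(0xB8))  # EF, Not-ECT -> dropped
print(signal_congestion(0xBA))  # EF, ECT(0)  -> CE-marked, DSCP unchanged
```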

Hey @moeller0 did you ever figure out where to put the qos_map_set config?

I'm planning to try out the following map set:

qos_map_set=0,3,255,255,8,15,255,255,16,23,24,39,40,45,46,63,255,255

this puts CS0 into class 3 which goes to BestEffort, then doesn't use class 0, puts CS1 into class 1, doesn't use class 2, puts DSCP 16-23 into class 3 (best effort), 24-39 into class 4 (video), 40-45 into class 5 (also video), 46-63 into class 6 (VOICE), doesn't use class 7

That corresponds more or less to what I want, and kinda matches the TP-link cheap managed switches.

The way this works is also in the hostapd default config, on the left is the "class" I'm talking about above, it's based on 802.1d tags 0-7

# 802.1D Tag (= UP) to AC mappings
# WMM specifies following mapping of data frames to different ACs. This mapping
# can be configured using Linux QoS/tc and sch_pktpri.o module.
# 802.1D Tag	802.1D Designation	Access Category	WMM Designation
# 1		BK			AC_BK		Background
# 2		-			AC_BK		Background
# 0		BE			AC_BE		Best Effort
# 3		EE			AC_BE		Best Effort
# 4		CL			AC_VI		Video
# 5		VI			AC_VI		Video
# 6		VO			AC_VO		Voice
# 7		NC			AC_VO		Voice
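
Combining the table above with the proposed map set gives the access category for each DSCP range. The Python sketch below hard-codes both (the ranges are the last eight pairs of the proposed qos_map_set; the leading "0,3" exception sends DSCP 0 to UP 3) and is only meant as a reading aid. The helper name describe is made up.

```python
# 802.1D tag (= UP) to WMM access category, as listed in hostapd.conf.
UP_TO_AC = {
    1: "AC_BK", 2: "AC_BK",
    0: "AC_BE", 3: "AC_BE",
    4: "AC_VI", 5: "AC_VI",
    6: "AC_VO", 7: "AC_VO",
}

# UP 0..7 ranges of the proposed map; (255, 255) marks an unused UP.
PROPOSED_RANGES = [(255, 255), (8, 15), (255, 255), (16, 23),
                   (24, 39), (40, 45), (46, 63), (255, 255)]


def describe(ranges):
    """Return (up, access_category, range_or_None) for each of the 8 UPs."""
    rows = []
    for up, (low, high) in enumerate(ranges):
        used = (low, high) != (255, 255)
        rows.append((up, UP_TO_AC[up], (low, high) if used else None))
    return rows


for up, ac, dscp_range in describe(PROPOSED_RANGES):
    label = f"DSCP {dscp_range[0]}-{dscp_range[1]}" if dscp_range else "unused"
    print(f"UP {up} ({ac}): {label}")
```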

Nope, -ENOTIME. I actually got stuck trying to figure out how to measure the status quo (that is, which DSCPs are mapped to which ACs in the default configuration). I seem to recall that for TX the question should be answered by:
cat /sys/kernel/debug/ieee80211/phy0/ath9k/xmit

but I have no clue yet for the receive side, and without being able to measure the current status, and hence any differences, playing with qos_map_set seemed futile. (Especially since I consider WMM to be less than ideal*).

Best Regards

*) My main gripe is that AC_VI and AC_VO have no real rate limit and hence allow people to misuse them by marking all packets AC_VO which will almost completely starve the other ACs, clobbering bandwidth for everybody (and that includes other APs/Stations on the same frequency).