Ultimate SQM settings: Layer_cake + DSCP marks

Thanks for your feedback @QOS. hisham and I spent a lot of time trying out many different things. He has a fairly unusual network connection, with different channels to caches that offer him higher speed than the general internet. In any case, I agree with you that when you mark with DSCP and get end-to-end QoS including WMM, you do feel the snap of it all for realtime communications such as VoIP or games.

kmod-sched-ctinfo

What does it do? And when should I use it?

ctinfo action in tc(8) Linux ctinfo action in tc(8)

NAME
ctinfo - tc connmark processing action

SYNOPSIS
tc ... action ctinfo [ dscp MASK [STATEMASK] ] [ cpmark [MASK] ] [ zone ZONE ] [ CONTROL ] [ index ]

DESCRIPTION
CTINFO (Conntrack Information) is a tc action for retrieving data from conntrack marks into various fields. At
present it has two independent processing modes which may be viewed as sub-functions.

   DSCP mode copies a DSCP stored in conntrack's connmark into the IPv4/v6 diffserv field.  The copying may
   conditionally occur based on a flag also stored in the connmark.  DSCP mode was designed to assist in restoring
   packet classifications on ingress, classifications which may then be used by qdiscs such as CAKE.  It may be
   used in any circumstance where ingress classification needs to be maintained across links that otherwise bleach
   or remap according to their own policies.

   CPMARK (copymark) mode copies the conntrack connmark into the packet's mark field.  Without additional
   parameters it is functionally completely equivalent to the existing connmark action.  An optional mask may be
   specified to mask which bits of the connmark are restored.  This may be useful when DSCP and CPMARK modes are
   combined.

   Simple  statistics  (tc  -s)  on DSCP restores and CPMARK copies are maintained where values for set indicate a
   count of packets altered for that mode.  DSCP includes an error count where the destination  packet's  diffserv
   field was unwriteable.

PARAMETERS
DSCP mode parameters:
   mask   A mask of 6 contiguous bits indicating where the DSCP value is located in the 32 bit conntrack mark
          field.  A mask must be provided for this mode.  mask is a 32 bit unsigned value.

   statemask
          A mask of at least 1 bit indicating where a conditional restore flag is located in the 32 bit conntrack
          mark field.  The statemask bit/s must NOT overlap the mask bits.  The DSCP will be restored if the
          conntrack mark logically ANDed with the statemask yields a non-zero result.  statemask is an optional
          unsigned 32 bit value.

CPMARK mode parameters:
   mask   Store the logically ANDed result of conntrack mark and mask into the packet's mark field.  Default is
          0xffffffff i.e. the whole mark field.  mask is an optional unsigned 32 bit value.

Overall action parameters:
   zone   Specify the conntrack zone when doing conntrack lookups for packets.  zone is a 16bit unsigned decimal
          value.  Default is 0.

   CONTROL
          The  following  keywords allow to control how the tree of qdisc, classes, filters and actions is further
          traversed after this action.

          reclassify
                 Restart with the first filter in the current list.

          pipe   Continue with the next action attached to the same filter.

          drop   Drop the packet.

          shot   synonym for drop

          continue
                 Continue classification with the next filter in line.

          pass   Finish classification process and return to calling qdisc for further packet processing. This  is
                 the default.

   index  Specify  an  index  for  this action in order to being able to identify it in later commands. index is a
          32bit unsigned decimal value.

EXAMPLES
Example showing conditional restoration of DSCP on ingress via an IFB

          #Set up the IFB interface
          tc qdisc add dev ifb4eth0 handle ffff: ingress

          #Put CAKE qdisc on it
          tc qdisc add dev ifb4eth0 root cake bandwidth 40mbit

          #Set interface UP
          ip link set dev ifb4eth0 up

          #Add 2 actions, ctinfo to restore dscp & mirred to redirect the packets to IFB
          tc filter add dev eth0 parent ffff: protocol all prio 10 u32 \
              match u32 0 0 flowid 1:1 action    \
              ctinfo dscp 0xfc000000 0x01000000  \
              mirred egress redirect dev ifb4eth0

          tc -s qdisc show dev eth0 ingress

           filter parent ffff: protocol all pref 10 u32 chain 0
           filter parent ffff: protocol all pref 10 u32 chain 0 fh 800: ht divisor 1
           filter parent ffff: protocol all pref 10 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1
          not_in_hw
            match 00000000/00000000 at 0
              action order 1: ctinfo zone 0 pipe
              index  2  ref  1 bind 1 dscp 0xfc000000 0x01000000 installed 72 sec used 0 sec DSCP set 1333 error 0
          CPMARK set 0
              Action statistics:
              Sent 658484 bytes 1833 pkt (dropped 0, overlimits 0 requeues 0)
              backlog 0b 0p requeues 0

              action order 2: mirred (Egress Redirect to device ifb4eth0) stolen
              index 1 ref 1 bind 1 installed 72 sec used 0 sec
              Action statistics:
              Sent 658484 bytes 1833 pkt (dropped 0, overlimits 0 requeues 0)
              backlog 0b 0p requeues 0

   Example showing conditional restoration of DSCP on egress

   This may appear nonsensical since iptables marking of egress packets is easy to achieve, however  the  iptables
   flow  classification rules may be extensive and so some sort of set once and forget may be useful especially on
   cpu constrained devices.

          # Send unmarked connections to a marking chain which needs to store a DSCP and set statemask bit in the
          # connmark
          iptables -t mangle -A POSTROUTING -o eth0 -m connmark \
              --mark 0x00000000/0x01000000 -g CLASS_MARKING_CHAIN

          # Apply marked DSCP to the packets
          tc filter add dev eth0 protocol all prio 10 u32 \
              match u32 0 0 flowid 1:1 action \
              ctinfo dscp 0xfc000000 0x01000000

          tc -s filter show dev eth0
           filter parent 800e: protocol all pref 10 u32 chain 0
           filter parent 800e: protocol all pref 10 u32 chain 0 fh 800: ht divisor 1
           filter parent 800e: protocol all pref 10 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1
          not_in_hw
            match 00000000/00000000 at 0
              action order 1: ctinfo zone 0 pipe
              index 1 ref 1 bind 1 dscp 0xfc000000 0x01000000 installed 7414 sec used 0 sec DSCP set 53404 error 0
          CPMARK set 0
              Action statistics:
              Sent 32890260 bytes 120441 pkt (dropped 0, overlimits 0 requeues 0)
              backlog 0b 0p requeues 0

SEE ALSO
tc(8), tc-cake(8) tc-connmark(8) tc-mirred(8)

AUTHORS
ctinfo was written by Kevin Darbyshire-Bryant.
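
To make the mask and statemask rules from the PARAMETERS section concrete, here is a minimal shell sketch that checks a pair of values before handing them to tc. The function name `check_ctinfo_masks` is made up for this illustration and is not part of tc or iproute2. Note that tc treats the statemask as optional, while this sketch insists on one because the conditional restore in the examples above depends on it. It also assumes a shell with 64-bit arithmetic.

```shell
#!/bin/sh
# Hypothetical helper, not part of tc or iproute2: sanity-check a
# "ctinfo dscp MASK STATEMASK" pair before passing it to tc.
# Assumes the shell does 64-bit arithmetic, so 0xfc000000 stays positive.
check_ctinfo_masks() {
    mask=$(( $1 ))
    statemask=$(( $2 ))

    if [ "$mask" -eq 0 ] || [ "$statemask" -eq 0 ]; then
        echo "invalid: both masks must be non-zero"
        return 1
    fi

    # Right-align the mask; six contiguous bits then equal 0x3f (63).
    shifted=$mask
    while [ $(( shifted & 1 )) -eq 0 ]; do
        shifted=$(( shifted >> 1 ))
    done
    if [ "$shifted" -ne 63 ]; then
        echo "invalid: mask is not 6 contiguous bits"
        return 1
    fi

    # The restore flag must live outside the DSCP bits.
    if [ $(( mask & statemask )) -ne 0 ]; then
        echo "invalid: statemask overlaps mask"
        return 1
    fi

    echo "valid"
}
```

Calling `check_ctinfo_masks 0xfc000000 0x01000000` accepts the pair used in the man page examples, while a statemask such as 0x04000000 is rejected because it falls inside the DSCP bits.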


Part of this was answered while typing, but:

I'll quote the author:

ctinfo is a new tc filter action module. It is designed to restore DSCPs
stored in conntrack marks into the ipv4/v6 diffserv field.

The feature is intended for use and has been found useful for restoring
ingress classifications based on egress classifications across links
that bleach or otherwise change DSCP, typically home ISP Internet links.
Restoring DSCP on ingress on the WAN link allows qdiscs such as CAKE to
shape inbound packets according to policies that are easier to indicate
on egress.

Ingress classification is traditionally a challenging task since
iptables rules haven't yet run and tc filter/eBPF programs are pre-NAT
lookups, hence are unable to see internal IPv4 addresses as used on the
typical home masquerading gateway.

Using sqm-scripts, it is added to a QOS script like the one I posted above. You'll notice it at the end of the egress function. And for ingress, the filter used that steals packets for cake in the original script is:

    $TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
	match u32 0 0 flowid 1:1 action mirred egress redirect dev $DEV

We change that to:

    $TC filter add dev $IFACE parent ffff: protocol all prio 10 u32 \
	match u32 0 0 action \
	ctinfo dscp 0xfc000000 0x01000000 \
	mirred egress redirect dev $DEV

You'll notice ctinfo is added before mirred, so the stored DSCP is written back onto the packets before cake gets them. At this point in the ingress path netfilter has not yet seen those packets, which is why we couldn't use iptables rules to mark them on ingress.
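
As a rough illustration of what that ctinfo action decides per packet, here is a small shell sketch of the arithmetic only. The function name `ctinfo_restore_dscp` is invented for this example, and the shift of 26 is hard-coded on the assumption that the mask is 0xfc000000 and the statemask is 0x01000000, as in the filter above. It does not touch any real packets.

```shell
#!/bin/sh
# Illustrative only: mimics in shell arithmetic what
# "ctinfo dscp 0xfc000000 0x01000000" decides for one packet.
# The shift of 26 is hard-coded to match the 0xfc000000 mask.
ctinfo_restore_dscp() {
    connmark=$(( $1 ))
    if [ $(( connmark & 0x01000000 )) -ne 0 ]; then
        # Flag bit set: the top six bits hold the stored DSCP.
        echo $(( (connmark >> 26) & 0x3f ))
    else
        # Flag bit clear: the packet's diffserv field is left alone.
        echo "unchanged"
    fi
}
```

For a connmark of 0xb9000000 this prints 46, which is the DSCP value for EF. The same value without the flag bit, 0xb8000000, prints "unchanged".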


With kmod-sched-ctinfo you don't need kmod-sched-connmark. ctinfo can do the same thing as sched-connmark; in fact it is slightly more powerful, in that you can mask which bits of the connmark to restore.
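
A quick sketch of what that masking amounts to, in plain shell arithmetic. The function name `cpmark_apply` is made up for illustration. The idea is that a cpmark mask of 0x00ffffff would keep the DSCP and flag bits (0xfc000000 and 0x01000000 in the examples above) out of the packet mark, which is only an assumption about how one might combine the two modes.

```shell
#!/bin/sh
# Illustrative only: the packet mark that "ctinfo cpmark [MASK]" would
# produce from a given connmark. MASK defaults to 0xffffffff, matching
# the default documented in the man page.
cpmark_apply() {
    connmark=$(( $1 ))
    mask=$(( ${2:-0xffffffff} ))
    printf '0x%08x\n' $(( connmark & mask ))
}
```

For example, `cpmark_apply 0xb90000ab 0x00ffffff` prints 0x000000ab, while leaving the mask off copies the whole connmark.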

The elephant in the room is that kmod-sched-ctinfo, which copies what is stored in the conntrack connmark field to the packet's DSCP/mark fields, is in the kernel, but the thing that puts the DSCP into the conntrack connmark field is at present an OpenWrt special and iptables only.

Upstream kernel wants an nftables implementation of CONNMARK --savedscp-mark, as found in "iptables -A QOS_MARK_eth0 -t mangle -j CONNMARK --savedscp-mark 0xfc000000/0x01000000".
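
For the storing side, here is a hedged sketch of the bit layout that a 0xfc000000/0x01000000 pair implies: the DSCP goes into the top six bits and the flag bit is set. The function name `save_dscp_to_connmark` is invented, and preserving the remaining connmark bits is my assumption about the behaviour rather than something taken from the patch itself.

```shell
#!/bin/sh
# Illustrative only: compute the connmark that would result from storing
# a DSCP with "--savedscp-mark 0xfc000000/0x01000000".
# Assumption: bits outside the DSCP field are kept from the old connmark.
save_dscp_to_connmark() {
    oldmark=$(( $1 ))
    dscp=$(( $2 & 0x3f ))   # DSCP is a 6-bit value, 0..63
    # 0x03ffffff clears the old DSCP bits; the flag bit is then set.
    printf '0x%08x\n' $(( (oldmark & 0x03ffffff) | (dscp << 26) | 0x01000000 ))
}
```

Storing EF (46) into an empty connmark gives 0xb9000000, and storing CS1 (8) into a connmark of 0x000000ff gives 0x210000ff, leaving the low byte untouched.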


This reminds me, I still want to include a version of @ldir's helpful qos script into sqm-scripts, but I never seem to get around to actually testing it first (and getting the required kmod installed in the first place).

Well, a quick check showed kmod-sched-ctinfo to be installed already, so all I need is time to test...

Does that require a specific package to be installed or does this come with OpenWrt's default iptables packages?

I've switched to nftables and found that it dramatically eased the process of managing dscp marks, in part because there are ingress hooks where you can run full nftables filtering. I suspect you can do your own connmark style management fairly easily, though maybe not in full generality... You could mark and manually match marks and set dscp, but I don't think you can just copy the mark to the dscp field

It comes by default: it is a patch to netfilter ('645-netfilter-connmark-introduce-savedscp.patch') and iptables ('010-add-savedscp-support.patch').

Speaking of reminders, I believe at some point you mentioned investigating the "flowid 1:1" breaking flow isolation that @ldir discovered.


Thanks for the reminder, but that is below getting @ldir's script into sqm-scripts, as I seem to recall that the "flowid 1:1" issue does not trigger in the qos scripts distributed as part of sqm-scripts. If that assumption is wrong, please holler, as that will move this up my loose priority heap (but that still does not solve my time problem :wink: )

I'm using a derivative of ldir's early qos file and opted to omit it for myself, but in what I believe to be the current revision of ldir's script (here) it is using 'flowid 1:'. Whether that merits any consideration I'll leave to you.


Great news, now I am flat out of excuses for not tackling that :wink: (except for the elephant in the room thing, and I believe there is a better chance of getting the iptables stuff into the kernel when there are actually users for it in the real world).


It doesn't affect the usual sqm scripts because a) none of them use a tc filter on egress (which is where I was being bitten) and b) although flowid 1:1 is used on ingress, it is part of the IFB redirect so it doesn't actually affect anything. Technically I think it is incorrect, but it doesn't actually bite anything because the packets get diverted off anyway.

I got bitten because a) I'm the only idiot applying a filter on egress traffic and b) I 'cargo culted' the existing tc filter incantations, modified them, without really understanding. :slight_smile:


That is also something on the radar for sqm-scripts; at some point we need to figure out how to support both nftables and iptables as alternatives.

You are a pathfinder and explorer here! The idiot qualifier belongs to the core sqm-team (including me) for

I am confident that the only reason we use 1:1 is because it worked in one script and no one ever revisited that...


Okay, @Barrakketh maybe there actually is only one issue on my todo list...

It's not a question, it's a discussion thread!
I will try to create a Telegram channel to keep this thread clean!

Using just c6, c7 and EF will cause problems.

You shouldn't use the Extra arguments box to set DSCP!
Use iptables instead!

Are you using the veth method?

It was a great experience, but if you have problems just turn off ack-filter by removing the parameter.

Thanks for commenting, how are you?!