# nft -c -f /etc/nftables.conf
/etc/nftables.conf:249:31-38: Error: Right hand side of binary operation (|) must be constant
ct mark set ((0x00000000 or ip6 dscp) lshift 26) counter
~~~~~~~~~~~~~~^^^^^^^^
/etc/nftables.conf:460:54-55: Error: Left shift of 26 bits is undefined for type of 12 bits width
ct mark set ((ip6 dscp or 0x00000000) lshift 26) counter
~~~~~~~~~~~~~~~~~~~~~~ ^^
Success! Avoid the types completely for IPv6.
ct mark set ((@nh,0,16 & 4032) >> 6) counter
ct mark set ct mark lshift 26 or 0x2000000
Dumb question, how to restrict that to ip6?
meta nfproto ipv6
What is (@nh,0,16 & 4032)? I'm not familiar with the @nh notation.
Never mind, it's described here:
https://www.netfilter.org/projects/nftables/manpage.html
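For anyone else puzzled by the raw payload expression, here is a minimal Python sketch of the arithmetic behind (@nh,0,16 & 4032) >> 6. It is not nft code. It relies on the IPv6 header layout: 4 bits of version, 8 bits of traffic class (6 DSCP bits plus 2 ECN bits), then the flow label. The helper build_ipv6_prefix is a made-up name used only to produce test input.

```python
def ipv6_dscp(header: bytes) -> int:
    """Mimic (@nh,0,16 & 4032) >> 6 on a raw IPv6 header."""
    # @nh,0,16 reads 16 bits starting at bit offset 0 of the network header
    first16 = int.from_bytes(header[:2], "big")
    # 4032 == 0x0FC0 keeps only the six DSCP bits; the shift by 6 drops
    # the 4 flow-label bits and the 2 ECN bits that sit below them
    return (first16 & 4032) >> 6


def build_ipv6_prefix(dscp: int, ecn: int = 0) -> bytes:
    """Hypothetical helper: first two bytes of an IPv6 header, flow label bits zero."""
    traffic_class = (dscp << 2) | ecn
    return ((6 << 12) | (traffic_class << 4)).to_bytes(2, "big")


print(ipv6_dscp(build_ipv6_prefix(8)))      # DSCP 8 is cs1
print(ipv6_dscp(build_ipv6_prefix(46, 3)))  # DSCP 46 (ef) with both ECN bits set
```

Because the result is already a plain 6-bit number, a later 26-bit left shift places it in the top six bits of the 32-bit mark.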
You could probably shove that through a map as well if you wanted. You'd probably have to change the type of the map.
((@nh, 0, 16) & 4032) map @dscptoct
though really not necessary.
So is this what you ended up with? It is definitely working for my IPv6 traffic now!
ct mark set ip dscp map @dscpct counter
ct mark set ((@nh,0,16 & 4032) >> 6) meta nfproto ipv6 counter
ct mark set ct mark lshift 26 or 0x02000000 counter
I don’t think a map is necessary in this case anymore.
order matters, switch the order:
meta nfproto ipv6 ct mark set ...
otherwise you'll be setting the ct mark regardless of the type, but the counter won't fire unless it's ipv6
also add the ipv6 to the second command:
meta nfproto ipv6 ct mark lshift ...
also now that we're reading 16 bits with the 2 bottom bits set 0, do we want a 26 bit shift or a 24 bit shift? @dave14305
Something doesn't seem right here...
Config:
ct mark set ip dscp map @dscpct counter
meta nfproto ipv6 ct mark set ((@nh,0,16 & 4032) >> 6) counter
ct mark set ct mark lshift 26 or 0x02000000 counter
meta nfproto ipv6 ct mark set ct mark lshift 26 or 0x02000000 counter
Ruleset (once loaded):
ct mark set (@nh,8,8 & 252) >> 2 map @dscpct counter packets 6 bytes 364
ct mark set @nh,0,16 & 4032 >> 6 counter packets 9 bytes 948
ct mark set ct mark << 26 | 0x02000000 counter packets 15 bytes 1312
meta nfproto ipv6 ct mark set ct mark << 26 | 0x02000000 counter packets 9 bytes 948
Doesn't this handle the lshift for both protocols, so the second one with meta nfproto ipv6 isn't needed?
ct mark set ct mark lshift 26 or 0x02000000 counter
@dave14305 Conntrack marks after these changes:
IPv6 Marks:
30 mark=0x0
13 mark=0x2000000
6 mark=0x62000000
1 mark=0x82000000
IPv4 Marks:
81 mark=0x0
23 mark=0x2000000
34 mark=0x52000000
2 mark=0x5a000000
2 mark=0x6b000000
1 mark=0x82000000
right shift 4 maybe? see packet header structure here: https://en.wikipedia.org/wiki/IPv6_packet
oh never mind, the 6 is because of the ecn bits!
Yes, I guess you can use a single rule for both ipv4 and ipv6 shift and or... definitely don't do it twice!
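The conntrack marks listed above can be sanity-checked by splitting each value back into its parts. This is only a sketch in Python, assuming the DSCP sits in the top six bits (the 26-bit shift) and the two bits directly below it are flag bits. The names decode_mark, DSCP_SHIFT and FLAG_MASK are made up for illustration.

```python
DSCP_SHIFT = 26          # DSCP stored in bits 31..26 of the mark
FLAG_MASK = 0x03000000   # assumed flag bits directly below the DSCP


def decode_mark(mark: int) -> tuple[int, int]:
    """Split a 32-bit conntrack mark into (dscp, flag bits)."""
    return mark >> DSCP_SHIFT, mark & FLAG_MASK


# Mark values copied from the conntrack listing in this thread
for mark in (0x2000000, 0x62000000, 0x82000000,
             0x52000000, 0x5a000000, 0x6b000000):
    dscp, flags = decode_mark(mark)
    print(f"{mark:#010x}: dscp={dscp} flags={flags:#x}")
```

Every non-zero mark decodes to a valid 6-bit DSCP, which suggests the 26-bit shift is the right one.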
Noted! This is really great progress! I'm excited!
So for anyone tracking this thread, here's what I have now (thanks to @dave14305 and @dlakelan) that appears to be equivalent in behavior to the ctinfo_layercake.qos setup for iptables:
ct mark set ip dscp map @dscpct counter
meta nfproto ipv6 ct mark set ((@nh,0,16 & 4032) >> 6) counter
ct mark set ct mark lshift 26 or 0x02000000 counter
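As a rough model of what these three rules do to a packet, here is a Python sketch. It assumes the @dscpct map is an identity mapping, since its contents are not shown in the thread, and reads the IPv4 DSCP the way the loaded ruleset prints it, (@nh,8,8 & 252) >> 2. The function names are invented for illustration.

```python
def ipv4_dscp(header: bytes) -> int:
    """Mimic (@nh,8,8 & 252) >> 2: the TOS byte is the second byte of the IPv4 header."""
    return (header[1] & 252) >> 2


def save_dscp(dscp: int, flag: int = 0x02000000) -> int:
    """Model the two ct mark statements applied in sequence."""
    mark = dscp                   # ct mark set <dscp>
    mark = (mark << 26) | flag    # ct mark set ct mark lshift 26 or 0x02000000
    return mark & 0xFFFFFFFF      # the conntrack mark is 32 bits wide


# Minimal IPv4 header: version/IHL byte, then TOS carrying DSCP 8 (cs1), ECN 0
ipv4_header = bytes([0x45, 8 << 2]) + bytes(18)
print(hex(save_dscp(ipv4_dscp(ipv4_header))))
```

Both address families end up with a plain DSCP number in the mark before the shared shift-and-or rule runs, which is why one copy of that last rule is enough.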
Looking good. Here's an nftrace of a speedtest packet.
trace id 7f0bfd45 inet cttags cttags packet: iif "eth0" oif "eth1" ether saddr xx:xx:xx:e1:a4:ff ether daddr xx:xx:xx:65:78:2e ip6 saddr 2601:xxxx:xxxx:xxxx:d8ba:eb19:5dd:9454 ip6 daddr 2001:558:fe11:30::2 ip6 dscp cs1 ip6 ecn not-ect ip6 hoplimit 63 ip6 flowlabel 797489 ip6 nexthdr tcp ip6 length 44 tcp sport 62763 tcp dport 6020 tcp flags == syn tcp window 65535
trace id 7f0bfd45 inet cttags cttags rule tcp dport 6020-6030 ip6 dscp set cs1 meta nftrace set 1 comment "Comcast Speedtest CS1" (verdict continue)
trace id 7f0bfd45 inet cttags cttags rule ct mark set @nh,0,16 & 4032 >> 6 counter packets 0 bytes 0 (verdict continue)
trace id 7f0bfd45 inet cttags cttags rule ct mark set ct mark << 26 | 0x02000000 (verdict continue)
trace id 7f0bfd45 inet cttags cttags verdict continue
trace id 7f0bfd45 inet cttags in_dscp verdict continue
trace id 7f0bfd45 inet cttags in_dscp policy accept
And with IPv4:
trace id d1b0ed35 inet cttags cttags packet: iif "eth0" oif "eth1" ether saddr xx:xx:xx:e1:a4:ff ether daddr xx:xx:xx:65:78:2e ip saddr 192.168.1.106 ip daddr 69.252.92.74 ip dscp cs1 ip ecn not-ect ip ttl 63 ip id 0 ip protocol tcp ip length 64 tcp sport 60499 tcp dport 6020 tcp flags == syn tcp window 65535
trace id d1b0ed35 inet cttags cttags rule tcp dport 6020-6030 ip dscp set cs1 counter packets 0 bytes 0 meta nftrace set 1 comment "Comcast Speedtest CS1" (verdict continue)
trace id d1b0ed35 inet cttags cttags rule ct mark set (@nh,8,8 & 252) >> 2 counter packets 0 bytes 0 (verdict continue)
trace id d1b0ed35 inet cttags cttags rule ct mark set ct mark << 26 | 0x02000000 (verdict continue)
trace id d1b0ed35 inet cttags cttags verdict continue
trace id d1b0ed35 inet cttags in_dscp verdict continue
trace id d1b0ed35 inet cttags in_dscp policy accept
trace id d1b0ed35 ip masq masqout packet: iif "eth0" oif "eth1" ether saddr xx:xx:xx:e1:a4:ff ether daddr xx:xx:xx:65:78:2e ip saddr 192.168.1.106 ip daddr 69.252.92.74 ip dscp cs1 ip ecn not-ect ip ttl 63 ip id 0 ip length 64 tcp sport 60499 tcp dport 6020 tcp flags == syn tcp window 65535
trace id d1b0ed35 ip masq masqout rule oifname "eth1" masquerade (verdict accept)
Now I need to build dnsmasq with nft sets!
Did you come up with a way to build for your platform? I build mine in a container on my MacBook.
I'm using WSL on a Windows laptop from 2012. Slow going. I'm also using stintel's repo with firewall4 for this build for nftables all the way. We'll see how it goes.
Good evening everyone, and great job. Do you think I can test it now, or should I wait for you to update it?
I've removed the maps, added timeouts to the sets, and updated the ct mark statements. If you use ctinfo_layercake.qos, comment out the ipt_setup line near the bottom of the script:
sqm_prepare_script() {
do_modules
verify_qdisc $QDISC "cake" || return 1
# ipt_setup
}
Below is the updated cttags table from my own file.
/etc/nftables.conf excerpt
table inet cttags {
set bulk4 {
type ipv4_addr
timeout 1d
counter
comment "Bulk IPv4"
}
set bulk6 {
type ipv6_addr
timeout 1d
counter
comment "Bulk IPv6"
}
set besteffort4 {
type ipv4_addr
timeout 1d
counter
comment "BE IPv4"
}
set besteffort6 {
type ipv6_addr
timeout 1d
counter
comment "BE IPv6"
}
set video4 {
type ipv4_addr
timeout 1d
counter
comment "Video IPv4"
}
set video6 {
type ipv6_addr
timeout 1d
counter
comment "Video IPv6"
}
set voice4 {
type ipv4_addr
timeout 1d
counter
comment "Voice IPv4"
}
set voice6 {
type ipv6_addr
timeout 1d
counter
comment "Voice IPv6"
}
define facetime_ports = { 3478-3497, 16384-16387, 16393-16402 }
define zoom_ports = { 8801-8810 }
chain in_dscp {
type filter hook postrouting priority 0; policy accept;
oifname $wan ct mark and 0x1c00000 == 0 jump qos_sqm
}
chain qos_sqm {
ct mark and 0x2000000 == 0 counter goto cttags
}
chain qos_sqm_remap {
# Add rules to modify non-zero DSCP incoming from LAN
# Copy the current DSCP value, read from the raw header, into the conntrack mark
# Then save it in the high bits of the mark for restoration with act_ctinfo
meta nfproto ipv4 ct mark set (@nh,8,8 & 252) >> 2 counter
meta nfproto ipv6 ct mark set ((@nh,0,16 & 4032) >> 6) counter
ct mark set ct mark lshift 26 or 0x3000000
}
chain cttags {
# meta nftrace set 1
ip dscp != 0 counter goto qos_sqm_remap
ip6 dscp != 0 counter goto qos_sqm_remap
# match sets (populated externally by dnsmasq, et al)
ip daddr @bulk4 ip dscp set cs1 comment "bulk4 to CS1"
ip6 daddr @bulk6 ip6 dscp set cs1 comment "bulk6 to CS1"
#ip daddr @besteffort4 ct mark set 0x1000000 comment "besteffort4 to CS0"
#ip6 daddr @besteffort6 ct mark set 0x1000000 comment "besteffort6 to CS0"
ip daddr @video4 ip dscp set af41 comment "video4 to AF41"
ip6 daddr @video6 ip6 dscp set af41 comment "video6 to AF41"
ip daddr @voice4 ip dscp set cs6 comment "voice4 to CS6"
ip6 daddr @voice6 ip6 dscp set cs6 comment "voice6 to CS6"
# individual IP or port rules
ip daddr 17.0.0.0/8 tcp dport { 993, 5223 } ip dscp set cs0 counter comment "Apple Mail and APNS CS0"
udp sport $facetime_ports udp dport $facetime_ports ip dscp set af41 counter comment "Facetime AF41"
udp sport $facetime_ports udp dport $facetime_ports ip6 dscp set af41 counter comment "Facetime AF41"
udp dport $zoom_ports ip dscp set cs3 counter comment "Zoom CS3"
udp dport $zoom_ports ip6 dscp set cs3 counter comment "Zoom CS3"
udp sport 4500 udp dport 4500 ip dscp set cs6 counter comment "WiFi Calling CS6"
udp sport 4500 udp dport 4500 ip6 dscp set cs6 counter comment "WiFi Calling CS6"
tcp dport { 6020-6030 } ip dscp set cs1 counter comment "Comcast Speedtest CS1"
tcp dport { 6020-6030 } ip6 dscp set cs1 counter comment "Comcast Speedtest CS1"
# Copy the current DSCP value, read from the raw header, into the conntrack mark
# Then save it in the high bits of the mark for restoration with act_ctinfo
meta nfproto ipv4 ct mark set (@nh,8,8 & 252) >> 2 counter
meta nfproto ipv6 ct mark set ((@nh,0,16 & 4032) >> 6) counter
ct mark set ct mark lshift 26 or 0x2000000
}
}
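To make the chain flow in this table easier to follow, here is a simplified Python model of how a packet's conntrack mark changes. It only mirrors the mark tests and the final ct mark statements. The name process and the parameter rule_dscp (standing in for whatever DSCP the set and port rules assign) are hypothetical. Note that nft evaluates ct mark and 0x2000000 == 0 as (mark & 0x2000000) == 0, so the Python version needs explicit parentheses.

```python
def process(mark: int, dscp: int, out_wan: bool = True, rule_dscp=None) -> int:
    """Return the conntrack mark after the in_dscp -> qos_sqm -> cttags flow."""
    # chain in_dscp: only WAN egress with the 0x1c00000 bits clear is inspected
    if not out_wan or (mark & 0x1C00000) != 0:
        return mark
    # chain qos_sqm: connections that already carry the 0x2000000 flag are skipped
    if (mark & 0x2000000) != 0:
        return mark
    # chain cttags: a packet that arrives with a non-zero DSCP goes to qos_sqm_remap
    if dscp != 0:
        return ((dscp << 26) | 0x3000000) & 0xFFFFFFFF
    # otherwise a set or port rule may assign a DSCP before it is saved
    if rule_dscp is not None:
        dscp = rule_dscp
    return ((dscp << 26) | 0x2000000) & 0xFFFFFFFF


print(hex(process(0, 0)))                # untagged packet, no rule matched
print(hex(process(0, 8)))                # packet already marked cs1 by the LAN host
print(hex(process(0, 0, rule_dscp=34)))  # a rule tagged the packet af41
print(hex(process(0x22000000, 46)))      # connection was classified earlier
```

In this model only the first qualifying packet of a connection is classified; later packets keep the stored mark.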
I think the next step is to update the sqm script to run all the individual nft commands to create this cttags table, so that it is tied to the sqm script and not the default nftables ruleset.
OK, thanks. If I remember correctly, I have to build from my MacBook with the terminal command to get the idir script,
or can I simply wget it from its address?
I'll look at the idir script later; I won't comment out that line for the moment. Thank you.
On the individual IP/port rules, wouldn't you want to continue checking for ip protocol... and ip6 nexthdr... on the front side so you don't evaluate all the port matches for every IPv4 and IPv6 packet?
It’s smart enough already since we’re specifying ip6 dscp or ip dscp. Magic!
OK, so the end of the script looks like this?
## lets create a tagger table for more sophisticated tagging that
## can't happen at ingress since that's too early in the packet
## processing chain
table inet cttags {
set bulk4 {
type ipv4_addr
timeout 1d
counter
comment "Bulk IPv4"
}
set bulk6 {
type ipv6_addr
timeout 1d
counter
comment "Bulk IPv6"
}
set besteffort4 {
type ipv4_addr
timeout 1d
counter
comment "BE IPv4"
}
set besteffort6 {
type ipv6_addr
timeout 1d
counter
comment "BE IPv6"
}
set video4 {
type ipv4_addr
timeout 1d
counter
comment "Video IPv4"
}
set video6 {
type ipv6_addr
timeout 1d
counter
comment "Video IPv6"
}
set voice4 {
type ipv4_addr
timeout 1d
counter
comment "Voice IPv4"
}
set voice6 {
type ipv6_addr
timeout 1d
counter
comment "Voice IPv6"
}
define facetime_ports = { 3478-3497, 16384-16387, 16393-16402 }
define zoom_ports = { 8801-8810 }
chain in_dscp {
type filter hook postrouting priority 0; policy accept;
oifname $wan ct mark and 0x1c00000 == 0 jump qos_sqm
}
chain qos_sqm {
ct mark and 0x2000000 == 0 counter goto cttags
}
chain qos_sqm_remap {
# Add rules to modify non-zero DSCP incoming from LAN
# Copy the current DSCP value, read from the raw header, into the conntrack mark
# Then save it in the high bits of the mark for restoration with act_ctinfo
meta nfproto ipv4 ct mark set (@nh,8,8 & 252) >> 2 counter
meta nfproto ipv6 ct mark set ((@nh,0,16 & 4032) >> 6) counter
ct mark set ct mark lshift 26 or 0x3000000
}
chain cttags {
Is that all?
nft list ruleset runs without error, which seems good to me.
Here is the complete script, with a config for the RT3200 router:
`
## this assumes eth0 is LAN and eth1 is WAN, modify as needed
flush ruleset
## change these
define wan = wan
define lan = br-lan
define guest = eth0.10 # or remove these if you don't use guest or iot networks
define iot = eth0.5 ## but be sure to also remove the reference to them below
define bulksize = 35000000 ## total transfer before being sent to CS1
define voipservers = {10.0.98.113} ## add ipv4 addresses of fixed voip / telephone servers you use here
define cs5ports = {3074,10000-30000,3659,30000-45000} # high priority ports udp + tcp
define af41ports = {10000,8801-8810} # jitsi meet, and zoom
table inet filter {
# if you want to do flow offloading, uncomment this, fix the devices (should be lan and wan)
## and uncomment the lines in the forward path below.
## if you do offloading, then only the first packet from offloaded flows will go through the full path
## so if you want comprehensive dscp marking in the nftables, then you shouldn't do offloading.
#flowtable fastpath {
# hook ingress priority 0 devices = {eth1,eth0};
#}
chain input {
type filter hook input priority 0; policy drop;
# established/related connections
ct state established,related accept
# loopback interface
iifname lo accept
## icmpv6 is a critical part of the protocol, we just
## accept everything, you can look into making this
## more restrictive but be careful
ip6 nexthdr icmpv6 accept
# we are more restrictive for ipv4 icmp
ip protocol icmp icmp type { destination-unreachable, router-solicitation, router-advertisement, time-exceeded, parameter-problem } accept
ip protocol igmp accept
ip protocol icmp meta iifname $lan accept
## ntp protocol accept from LAN
udp dport ntp iifname $lan accept
## DHCP accept
iifname $lan ip protocol udp udp sport bootpc udp dport bootps log prefix "FIREWALL ACCEPT DHCP: " accept
## DHCPv6 accept from LAN
iifname $lan udp sport dhcpv6-client udp dport dhcpv6-server accept
## allow dhcpv6 from router to ISP
iifname $wan udp sport dhcpv6-server udp dport dhcpv6-client accept
# SSH (port 22), limited to 10 connections per minute per source host,
# if you want to allow from wan or other networks, add them to the iifname check
iifname $lan ct state new tcp dport ssh meter ssh-meter4 {ip saddr limit rate 10/minute burst 15 packets} accept
iifname $lan ct state new ip6 nexthdr tcp tcp dport ssh meter ssh-meter6 {ip6 saddr limit rate 10/minute burst 15 packets} accept
## allow access to LUCI from LAN
iifname $lan tcp dport {http,https} accept
## DNS for main LAN, we limit the rates allowed from each LAN host to reduce chance of denial of service
iifname $lan udp dport domain meter dommeter4 { ip saddr limit rate 240/minute burst 240 packets} accept
iifname $lan udp dport domain meter dommeter6 { ip6 saddr limit rate 240/minute burst 240 packets} accept
iifname $lan tcp dport domain meter dommeter4tcp { ip saddr limit rate 240/minute burst 240 packets} accept
iifname $lan tcp dport domain meter dommeter6tcp { ip6 saddr limit rate 240/minute burst 240 packets} accept
## allow remote syslog input? you might want this, or remove this
# iifname $lan udp dport 514 accept
counter log prefix "FIREWALL INPUT DROP: " drop
}
chain forward {
type filter hook forward priority 0; policy drop;
## tcp MSS clamping, https://wiki.nftables.org/wiki-nftables/index.php/Mangling_packet_headers
tcp flags syn tcp option maxseg size set 1450 # or "set rt mtu" according to reference
# if you want to do flow offloading try this?
#ip protocol {tcp,udp} flow offload @fastpath
#ip6 nexthdr {tcp,udp} flow offload @fastpath
ct state established,related accept
iifname $wan ct status dnat accept # for port forwarding, apparently needed?
iifname lo accept
iifname {$guest,$iot} oifname $lan drop ## guests can't connect to LAN
iifname {$lan,$guest,$iot} oifname $wan accept ## allow inside to forward to WAN
counter log prefix "FIREWALL FAIL FORWARDING: " drop
}
}
## masquerading for ipv4 output on WAN
table ip masq {
map portmaps {
type inet_service : ipv4_addr
elements = {1935 : 192.168.2.160, 3480 : 192.168.2.160, 3074 : 192.168.2.160, 3075 : 192.168.2.160, 3076 : 192.168.2.160, 3077 : 192.168.2.160, 3478 : 192.168.2.160, 3479 : 192.168.2.160, 9308 : 192.168.2.160, 3659 : 192.168.2.160 } # set these up to map ports to specific internal IPs
}
chain masqout {
type nat hook postrouting priority 0; policy accept;
oifname $wan masquerade
}
## this empty table is required to make the kernel do the unmasquerading
chain masqin {
type nat hook prerouting priority 0; policy accept;
iifname $wan dnat to tcp dport map @portmaps
iifname $wan dnat to udp dport map @portmaps
#if you want to do a DMZ style "all ports sent to this address" do this:
# iifname $wan ip protocol {tcp,udp} dnat to 10.38.99.107
# if you want to hijack all DNS to go to the router's port 53 (or substitute a different dest port)
#iifname $lan udp dport 53 redirect to 53
}
}
## experimental ingress washing feature. This can avoid problems with for example Comcast sending cs1 tag
## and slowing your ingress connection
table netdev tagingress {
set lowpriosrc4 {
typeof ip saddr
elements = {10.29.31.101} # add ip addresses you want to down-prioritize, like for example if you want to punish YouTube.
}
set lowpriosrc6 {
typeof ip6 saddr
elements = {fe80::9091:1212} # add ip6 addresses you want to down-prioritize
}
### EDIT BELOW: set device for wanin and for lanin
chain dscpwanin {
type filter hook ingress device wan priority 0; ## EDIT ME: set the device here
ip dscp set cs0
ip6 dscp set cs0
jump porttags
}
chain dscplanin {
type filter hook ingress device br-lan priority 0; ## EDIT ME: set the device here
ip dscp set cs0
ip6 dscp set cs0
jump porttags
}
chain porttags {
## VOIP servers UDP traffic
ip saddr $voipservers ip protocol udp ip dscp set cs6
ip daddr $voipservers ip protocol udp ip dscp set cs6
#ntp, playstation network ports and minecraft java edition
udp dport $cs5ports ip dscp set cs5
udp dport $cs5ports ip6 dscp set cs5
udp sport $cs5ports ip dscp set cs5
udp sport $cs5ports ip6 dscp set cs5
# jitsi meet and zoom
udp sport $af41ports ip dscp set af41
udp dport $af41ports ip dscp set af41
#down prioritize certain source/dest
ip saddr @lowpriosrc4 ip dscp set cs1
ip6 saddr @lowpriosrc6 ip6 dscp set cs1
ip daddr @lowpriosrc4 ip dscp set cs1
ip6 daddr @lowpriosrc6 ip6 dscp set cs1
}
}
## lets create a tagger table for more sophisticated tagging that
## can't happen at ingress since that's too early in the packet
## processing chain
table inet cttags {
set bulk4 {
type ipv4_addr
timeout 1d
counter
comment "Bulk IPv4"
}
set bulk6 {
type ipv6_addr
timeout 1d
counter
comment "Bulk IPv6"
}
set besteffort4 {
type ipv4_addr
timeout 1d
counter
comment "BE IPv4"
}
set besteffort6 {
type ipv6_addr
timeout 1d
counter
comment "BE IPv6"
}
set video4 {
type ipv4_addr
timeout 1d
counter
comment "Video IPv4"
}
set video6 {
type ipv6_addr
timeout 1d
counter
comment "Video IPv6"
}
set voice4 {
type ipv4_addr
timeout 1d
counter
comment "Voice IPv4"
}
set voice6 {
type ipv6_addr
timeout 1d
counter
comment "Voice IPv6"
}
define facetime_ports = { 3478-3497, 16384-16387, 16393-16402 }
define zoom_ports = { 8801-8810 }
chain in_dscp {
type filter hook postrouting priority 0; policy accept;
oifname $wan ct mark and 0x1c00000 == 0 jump qos_sqm
}
chain qos_sqm {
ct mark and 0x2000000 == 0 counter goto cttags
}
chain qos_sqm_remap {
# Add rules to modify non-zero DSCP incoming from LAN
# Copy the current DSCP value, read from the raw header, into the conntrack mark
# Then save it in the high bits of the mark for restoration with act_ctinfo
meta nfproto ipv4 ct mark set (@nh,8,8 & 252) >> 2 counter
meta nfproto ipv6 ct mark set ((@nh,0,16 & 4032) >> 6) counter
ct mark set ct mark lshift 26 or 0x3000000
}
chain cttags {
# meta nftrace set 1
ip dscp != 0 counter goto qos_sqm_remap
ip6 dscp != 0 counter goto qos_sqm_remap
# match sets (populated externally by dnsmasq, et al)
ip daddr @bulk4 ip dscp set cs1 comment "bulk4 to CS1"
ip6 daddr @bulk6 ip6 dscp set cs1 comment "bulk6 to CS1"
#ip daddr @besteffort4 ct mark set 0x1000000 comment "besteffort4 to CS0"
#ip6 daddr @besteffort6 ct mark set 0x1000000 comment "besteffort6 to CS0"
ip daddr @video4 ip dscp set af41 comment "video4 to AF41"
ip6 daddr @video6 ip6 dscp set af41 comment "video6 to AF41"
ip daddr @voice4 ip dscp set cs6 comment "voice4 to CS6"
ip6 daddr @voice6 ip6 dscp set cs6 comment "voice6 to CS6"
# individual IP or port rules
ip daddr 17.0.0.0/8 tcp dport { 993, 5223 } ip dscp set cs0 counter comment "Apple Mail and APNS CS0"
udp sport $facetime_ports udp dport $facetime_ports ip dscp set af41 counter comment "Facetime AF41"
udp sport $facetime_ports udp dport $facetime_ports ip6 dscp set af41 counter comment "Facetime AF41"
udp dport $zoom_ports ip dscp set cs3 counter comment "Zoom CS3"
udp dport $zoom_ports ip6 dscp set cs3 counter comment "Zoom CS3"
udp sport 4500 udp dport 4500 ip dscp set cs6 counter comment "WiFi Calling CS6"
udp sport 4500 udp dport 4500 ip6 dscp set cs6 counter comment "WiFi Calling CS6"
tcp dport { 6020-6030 } ip dscp set cs1 counter comment "Comcast Speedtest CS1"
tcp dport { 6020-6030 } ip6 dscp set cs1 counter comment "Comcast Speedtest CS1"
# Copy the current DSCP value, read from the raw header, into the conntrack mark
# Then save it in the high bits of the mark for restoration with act_ctinfo
meta nfproto ipv4 ct mark set (@nh,8,8 & 252) >> 2 counter
meta nfproto ipv6 ct mark set ((@nh,0,16 & 4032) >> 6) counter
ct mark set ct mark lshift 26 or 0x2000000
}
}`