NFtables and QoS in 2021

Same here.. was wondering if it was just me.

On a separate note, I grabbed the dnsmasq 2.87test4 code and cross-compiled it for my owrt system. I am going to be testing with it this evening since it does show the --nftset=... argument is available.

Will reply back when I see if there's any success to be shared. :slight_smile:

** UPDATE 1 **
I hit a snag with dnsmasq: it compiled fine originally, but then I realized it was missing the nftset compile-time option. When I went to re-compile with that option set, I hit some bumps with required libraries and such, which I'm currently working through. Needless to say, no testing has occurred as of yet. More to come...

I tried, but it doesn't work for me either.

nft list ruleset doesn't work after each reboot

Do you have any named counters or just anonymous counters? It will only return named counters.
https://wiki.nftables.org/wiki-nftables/index.php/Counters

@dlakelan good morning! Could you please point me to what I can tweak to increase performance? I can't reach 1 Gbit/s with nftables, and often I'm reaching only 600-700 Mbit/s. With stock 19.07.8 and 21.02.1 I have no problem reaching 1 Gbit/s, without offloading turned on.
I'm also not using SQM at the moment.
Just clean OpenWrt + nftables.
Thanks!

Short answer, no. I missed that part of the docs, but I'm a little confused because the way the docs read it almost sounds like the nft list counters command would show all counters. Regardless, I'll try naming some today and test again.

There’s definitely an issue using the map with ip6. I just don’t know if it’s due to the rules I’m using or a flaw in nftables. I even split the map into 2 different ones for ip dscp and ip6 dscp in case it made a difference (it didn’t). I can see the correct TC (ipv6 dscp) value when logging the packets, so that rule works fine, but looking up the ip6 dscp in the map always puts zero in the ct mark and then that gets shifted and OR’d to 33554432 (which is expected if the original mark was still zero).

If anyone else can reproduce this behavior that would be appreciated. Or maybe @dlakelan can test this on his Debian install to see if it’s specific to our version of nftables.

table inet cttags {

        map dscpct4 {
                typeof ip dscp : ct mark
                        elements = {
                                cs0 : 0x00,
                                cs1 : 0x08,
                                cs2 : 0x10,
                                cs3 : 0x18,
                                cs4 : 0x20,
                                cs5 : 0x28,
                                cs6 : 0x30,
                                cs7 : 0x38,
                                be : 0x00,
                                af11 : 0x0a,
                                af12 : 0x0c,
                                af13 : 0x0e,
                                af21 : 0x12,
                                af22 : 0x14,
                                af23 : 0x16,
                                af31 : 0x1a,
                                af32 : 0x1c,
                                af33 : 0x1e,
                                af41 : 0x22,
                                af42 : 0x24,
                                af43 : 0x26,
                                ef : 0x2e
                        }
        }

        map dscpct6 {
                typeof ip6 dscp : ct mark
                        elements = {
                                cs0 : 0x00,
                                cs1 : 0x08,
                                cs2 : 0x10,
                                cs3 : 0x18,
                                cs4 : 0x20,
                                cs5 : 0x28,
                                cs6 : 0x30,
                                cs7 : 0x38,
                                be : 0x00,
                                af11 : 0x0a,
                                af12 : 0x0c,
                                af13 : 0x0e,
                                af21 : 0x12,
                                af22 : 0x14,
                                af23 : 0x16,
                                af31 : 0x1a,
                                af32 : 0x1c,
                                af33 : 0x1e,
                                af41 : 0x22,
                                af42 : 0x24,
                                af43 : 0x26,
                                ef : 0x2e
                        }
        }

        set bulk4 {
                type ipv4_addr
                counter
                comment "Bulk IPv4"
        }

        set bulk6 {
                type ipv6_addr
                counter
                comment "Bulk IPv6"
        }

        set besteffort4 {
                type ipv4_addr
                counter
                comment "BE IPv4"
        }

        set besteffort6 {
                type ipv6_addr
                counter
                comment "BE IPv6"
        }

        set video4 {
                type ipv4_addr
                counter
                comment "Video IPv4"
        }

        set video6 {
                type ipv6_addr
                counter
                comment "Video IPv6"
        }

        set voice4 {
                type ipv4_addr
                counter
                comment "Voice IPv4"
        }

        set voice6 {
                type ipv6_addr
                counter
                comment "Voice IPv6"
        }

        define facetime_ports = { 3478-3497, 16384-16387, 16393-16402 }
        define zoom_ports = { 8801-8810 }

        chain in_dscp {
                type filter hook postrouting priority 0; policy accept;

                oifname $wan ct mark and 0x1c00000 == 0 jump qos_sqm
        }

        chain qos_sqm {
                ct mark and 0x2000000 == 0 counter goto cttags
        }

        chain qos_sqm_remap {
                # Add rules to modify non-zero DSCP incoming from LAN

                # Convert the current DSCP value to an equivalent conntrack mark using the map
                # Then save it in the high bits of the mark for restoration with act_ctinfo
                ct mark set ip dscp map @dscpct4 counter
                ct mark set ip6 dscp map @dscpct6 counter
                ct mark set ct mark lshift 26 or 0x3000000
        }

        chain cttags {
                # meta nftrace set 1
                ip dscp != 0 counter goto qos_sqm_remap
                ip6 dscp != 0 counter goto qos_sqm_remap

                # match sets (populated externally by dnsmasq, et al)
                ip daddr @bulk4 ip dscp set cs1 comment "bulk4 to CS1"
                ip6 daddr @bulk6 ip6 dscp set cs1 comment "bulk6 to CS1"
                #ip daddr @besteffort4 ct mark set 0x1000000 comment "besteffort4 to CS0"
                #ip6 daddr @besteffort6 ct mark set 0x1000000 comment "besteffort6 to CS0"
                ip daddr @video4 ip dscp set af41 comment "video4 to AF41"
                ip6 daddr @video6 ip6 dscp set af41 comment "video6 to AF41"
                ip daddr @voice4 ip dscp set cs6 comment "voice4 to CS6"
                ip6 daddr @voice6 ip6 dscp set cs6 comment "voice6 to CS6"

                # individual IP or port rules
                ip daddr 17.0.0.0/8 tcp dport { 993, 5223 } ip dscp set cs0 counter comment "Apple Mail and APNS CS0"
                udp sport $facetime_ports udp dport $facetime_ports ip dscp set af41 counter comment "Facetime AF41"
                udp sport $facetime_ports udp dport $facetime_ports ip6 dscp set af41 counter comment "Facetime AF41"
                udp dport $zoom_ports ip dscp set cs3 counter comment "Zoom CS3"
                udp dport $zoom_ports ip6 dscp set cs3 counter comment "Zoom CS3"
                udp sport 4500 udp dport 4500 ip dscp set cs6 counter comment "WiFi Calling CS6"
                udp sport 4500 udp dport 4500 ip6 dscp set cs6 counter comment "WiFi Calling CS6"
                tcp dport { 6020-6030 } ip dscp set cs1 counter comment "Comcast Speedtest CS1"
                tcp dport { 6020-6030 } ip6 dscp set cs1 comment "Comcast Speedtest CS1"

                # Convert the current DSCP value to an equivalent conntrack mark using the map
                # Then save it in the high bits of the mark for restoration with act_ctinfo
                ct mark set ip dscp map @dscpct4 counter
                ct mark set ip6 dscp map @dscpct6 counter
                ct mark set ct mark lshift 26 or 0x2000000
        }
}
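For anyone following the mark arithmetic at the end of the cttags chain, here is a rough Python sketch of what I understand those rules to compute. It assumes the expression is evaluated as (mark lshift 26) or 0x2000000, which matches the 33554432 seen when the map lookup yields zero. The names DSCP_TO_MARK and encode_mark are made up for illustration, and the dictionary is only a subset of the map above.

```python
# Hypothetical model of:  ct mark set <map lookup>;  ct mark set ct mark lshift 26 or 0x2000000
# Assumption: the shift is applied first, then the OR.

DSCP_TO_MARK = {  # subset of the dscpct4/dscpct6 maps above
    "cs0": 0x00,
    "cs1": 0x08,
    "af41": 0x22,
    "cs6": 0x30,
    "ef": 0x2E,
}

SHIFT = 26          # moves the 6-bit DSCP value into the top bits of the 32-bit mark
FLAG = 0x2000000    # flag bit OR'd in by the cttags chain


def encode_mark(lookup_result: int) -> int:
    """Shift the looked-up value into the high bits and set the flag bit."""
    return (lookup_result << SHIFT) | FLAG


print(encode_mark(0))                          # 33554432, i.e. the lookup returned zero
print(hex(encode_mark(DSCP_TO_MARK["af41"])))  # 0x8a000000, what a working AF41 lookup would give
```

So a mark of exactly 33554432 means only the flag bit was set, which is consistent with the ip6 lookup returning zero rather than the shift or OR going wrong.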

What's the ruleset you are using? nftables shouldn't be heavier weight than iptables. But the best way to get good performance is to use sets and maps instead of a large number of rules.

This is weird. I don't see any flaws in your code. Maybe try logging the packets? Are only cs0 packets hitting that rule?

Here’s a log from Wednesday when I was testing with AF41 instead of CS1.

Wed Nov 24 22:57:19 2021 kern.warn kernel: [42356.480003] IN=eth0 OUT=eth1 MAC=xx SRC=2601:xxx DST=2001:0558:fe11:0031:0000:0000:0000:0002 LEN=84 TC=136 HOPLIMIT=63 FLOWLBL=259856 PROTO=TCP SPT=65270 DPT=6020 WINDOW=65535 RES=0x00 SYN URGP=0

TC=136 is the traffic class byte in decimal (0x88); its upper six bits are the DSCP, which here is 34, i.e. AF41. I just wonder if the IPv6 headers are different from what the map lookup expects.
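As a quick sanity check of the TC value in that log line, this small Python sketch splits the 8-bit traffic class into DSCP (upper six bits) and ECN (lower two bits). The function name is just illustrative.

```python
def split_traffic_class(tc: int):
    """Split an 8-bit IPv6 traffic class / IPv4 TOS byte into (dscp, ecn)."""
    dscp = tc >> 2      # upper six bits
    ecn = tc & 0x3      # lower two bits
    return dscp, ecn


dscp, ecn = split_traffic_class(136)
print(dscp, hex(dscp), ecn)  # 34 0x22 0
```

0x22 is the value the maps above list for af41, so the logged packet really does carry AF41.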

Older commits that kinda rhyme with this issue:
https://git.netfilter.org/nftables/commit/?id=48632359f4dea5ee2484debba498ba069229e6d0
https://git.netfilter.org/nftables/commit/?id=715c8e7b625a48d3a64d9d2a7f83e33e458b1355

Your stock nftables configuration. To me it looks like the packets are being dropped, but I'm not sure.

Is anyone here a Makefile/OpenWRT build system expert? If so, I could use some help with getting the dnsmasq 2.87test4 build to complete. Would be happy to share a patch for the dnsmasq Makefile to show where I'm at.

I'm willing to work "offline" via PM if that's better for this thread.

I haven't done this myself (my machines are too old to build in a reasonable timeframe), but ldir has a branch where he has incorporated 2.87test4.

https://git.openwrt.org/?p=openwrt/staging/ldir.git;a=commitdiff;h=37c2e397f94ad31cd6f9f5297e68a96204d5d985;hp=cca5b0a869200f161e4c4a4f2e2bb076c8ff2e69

Wow! I had zero clue that existed, but oddly my changes I was making are nearly 1-to-1 with his. Guess I am on some sort of a "right" track.

However, I am getting this error when trying to build and I'm out of clues as to how to resolve from here. Any tips are greatly appreciated.

Error:

ccache_cc -L/home/user/openwrt/staging_dir/toolchain-x86_64_gcc-11.2.0_musl/usr/lib -L/home/user/openwrt/staging_dir/toolchain-x86_64_gcc-11.2.0_musl/lib -DPIC -fpic -specs=/home/user/openwrt/include/hardened-ld-pie.specs -znow -zrelro -flto=jobserver -o dnsmasq cache.o rfc1035.o util.o option.o forward.o network.o dnsmasq.o dhcp.o lease.o rfc2131.o netlink.o dbus.o bpf.o helper.o tftp.o log.o conntrack.o dhcp6.o rfc3315.o dhcp-common.o outpacket.o radv.o slaac.o auth.o ipset.o pattern.o domain.o dnssec.o blockdata.o tables.o loop.o inotify.o poll.o rrfilter.o edns0.o arp.o crypto.o dump.o ubus.o metrics.o hash-questions.o domain-match.o nftset.o -L/home/user/openwrt/staging_dir/target-x86_64_musl/usr/lib -lnetfilter_conntrack -lnfnetlink     -L/home/user/openwrt/staging_dir/target-x86_64_musl/usr/lib -lnettle -lhogweed  -lgmp -lubox -lubus
lto-wrapper: warning: jobserver is not available: '--jobserver-auth=' is not present in 'MAKEFLAGS'
/home/user/openwrt/staging_dir/toolchain-x86_64_gcc-11.2.0_musl/lib/gcc/x86_64-openwrt-linux-musl/11.2.0/../../../../x86_64-openwrt-linux-musl/bin/ld: /home/user/openwrt/tmp/ccHThSZg.ltrans2.ltrans.o: in function `main':
<artificial>:(.text.startup+0xa5d): undefined reference to `nft_ctx_new'
/home/user/openwrt/staging_dir/toolchain-x86_64_gcc-11.2.0_musl/lib/gcc/x86_64-openwrt-linux-musl/11.2.0/../../../../x86_64-openwrt-linux-musl/bin/ld: <artificial>:(.text.startup+0xa85): undefined reference to `nft_ctx_buffer_error'
/home/user/openwrt/staging_dir/toolchain-x86_64_gcc-11.2.0_musl/lib/gcc/x86_64-openwrt-linux-musl/11.2.0/../../../../x86_64-openwrt-linux-musl/bin/ld: /home/user/openwrt/tmp/ccHThSZg.ltrans4.ltrans.o: in function `process_reply.constprop.0':
<artificial>:(.text+0x910e): undefined reference to `nft_run_cmd_from_buffer'
/home/user/openwrt/staging_dir/toolchain-x86_64_gcc-11.2.0_musl/lib/gcc/x86_64-openwrt-linux-musl/11.2.0/../../../../x86_64-openwrt-linux-musl/bin/ld: <artificial>:(.text+0x911d): undefined reference to `nft_ctx_get_error_buffer'
collect2: error: ld returned 1 exit status

cc: @ldir

** UPDATE **
I managed to get it to compile. Happy to share what I have learned if anyone else is going to try it as well. Dnsmasq is now showing nftset in the compile time options. I copied one of my existing ipset=... designations and re-wrote it as nftset=.... When I query the domain I picked, I am getting an error (note: three IPs are returned from the domain I am testing with):

Fri Nov 26 14:53:49 2021 daemon.err dnsmasq[8712]: nftset voice4 Error: syntax error, unexpected '{', expecting string
Fri Nov 26 14:53:49 2021 daemon.err dnsmasq[8712]: nftset voice4 Error: syntax error, unexpected '{', expecting string
Fri Nov 26 14:53:49 2021 daemon.err dnsmasq[8712]: nftset voice4 Error: syntax error, unexpected '{', expecting string

Still going to poke around with it more, but at least I got this far... finally... :unamused:

https://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2021q3/015667.html

https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=0e96e068672ee09e90c57c6b6cd0fc2c83d59868

My only idea is that you somehow need to link against libnftables, per the links above.

Here we go! The syntax is a little wonky, but this is working:

root@OpenWrt:~# cat /etc/dnsmasq.conf
nftset=/nextdns.io/4#inet#cttags#voice4,6#inet#cttags#voice6
...

root@OpenWrt:~# dig +short nextdns.io
104.26.1.148
104.26.0.148
172.67.72.46
root@OpenWrt:~# dig +short nextdns.io AAAA
2606:4700:20::681a:94
2606:4700:20::681a:194
2606:4700:20::ac43:482e

root@OpenWrt:~# nft list ruleset
...
	set voice4 {
		type ipv4_addr
		counter
		comment "Voice IPv4"
		elements = { 104.26.0.148 counter packets 0 bytes 0, 104.26.1.148 counter packets 0 bytes 0,
			     172.67.72.46 counter packets 0 bytes 0 }
	}

	set voice6 {
		type ipv6_addr
		counter
		comment "Voice IPv6"
		elements = { 2606:4700:20::681a:94 counter packets 0 bytes 0,
			     2606:4700:20::681a:194 counter packets 0 bytes 0,
			     2606:4700:20::ac43:482e counter packets 0 bytes 0 }
	}
...
  • UPDATED: Based on nftset syntax correction from:

I thought the syntax was:

nftset=/nextdns.io/4#inet#cttags#voice4,6#inet#cttags#voice6
# Add the IPs of all queries to yahoo.com, google.com, and their
# subdomains to netfilters sets, which is equivalent to
# 'nft add element ip test vpn { ... }; nft add element ip test search { ... }'
#nftset=/yahoo.com/google.com/ip#test#vpn,ip#test#search

# Use netfilters sets for both IPv4 and IPv6:
# This adds all addresses in *.yahoo.com to vpn4 and vpn6 for IPv4 and IPv6 addresses.
#nftset=/yahoo.com/4#ip#test#vpn4
#nftset=/yahoo.com/6#ip#test#vpn6
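To make the pattern in those examples explicit, here is a small Python sketch that assembles an nftset= line from a list of domains and set targets. It only mirrors the shapes shown above (an optional 4# or 6# prefix, then family#table#set, with targets separated by commas); format_nftset is a hypothetical helper, not something dnsmasq ships.

```python
def format_nftset(domains, targets):
    """Build a dnsmasq nftset= line.

    domains: list of domain names.
    targets: list of (ip_version, family, table, set_name) tuples, where
             ip_version is 4, 6, or None (no address-family prefix).
    """
    specs = []
    for ip_version, family, table, set_name in targets:
        prefix = f"{ip_version}#" if ip_version in (4, 6) else ""
        specs.append(f"{prefix}{family}#{table}#{set_name}")
    return "nftset=/" + "/".join(domains) + "/" + ",".join(specs)


print(format_nftset(
    ["nextdns.io"],
    [(4, "inet", "cttags", "voice4"), (6, "inet", "cttags", "voice6")],
))
# nftset=/nextdns.io/4#inet#cttags#voice4,6#inet#cttags#voice6
```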

Yup--indeed you are correct. Not sure how I missed that, but it works (still) with the syntax you stated PLUS no errors in the dnsmasq log. Thank you!

Apparently I have been staring at this too long and am missing obvious stuff. Maybe I should take a break :laughing:

...after you share the details of your custom build. :slight_smile:

Here we go: https://github.com/Fail-Safe/OpenWrtNFTables/blob/main/dnsmasq287test4.patch

I'm not a Makefile expert, so I'm going to own the fact that line 168 in the Makefile (once patched) is bad form. The pkgconfig path, while working right now, should not be hardcoded as I have it. If anyone can help me correct it to use dynamic paths and make it more proper, please let me know.

When doing a speed test, what is the % idle in the output of top? Have you enabled receive packet steering? How about irqbalance? What device are you using? How many cores, and what is the CPU speed?