Nft sets: Terrible performance

add set ip filter fb_ips { type ipv4_addr ; auto-merge; flags interval; }
time nft get element ip filter fb_ips { 52.216.24.236 } >/dev/null
real 0m 0.15s
user 0m 0.10s
sys 0m 0.05s

This is the typical time to check whether a Facebook IP is in the set, measured on an MT7620-based device. This looks quite slow to me. Any comments, or timing examples from a more capable device?

Please add the exact device details, i.e. the output of: ubus call system board
Since we get approximately the same startup time, it might be worth repeating the get a thousand times to obtain usable timings in which load time plays no role.
(firewalld on rawhide)

time nft list ruleset | time nft -c -f -
real    0m 0.03s
user    0m 0.01s
sys     0m 0.01s
real    0m 0.13s
user    0m 0.05s
sys     0m 0.00s
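
If you want to try the repeated-get measurement suggested above, here is a minimal sketch that should also run in BusyBox ash. The bench helper is a name I made up, not part of nft. It reads /proc/uptime, which only has 0.01 s resolution, so use a large run count. Each run still pays the nft process startup cost, so it may help to run the same loop on a trivial command such as "nft -v" as a baseline and compare.

```shell
#!/bin/sh
# bench N CMD...: run CMD N times, print total and per-run wall-clock time.
# Timestamps come from /proc/uptime (0.01 s resolution), so the helper does
# not depend on the shell's "time" implementation.
bench() {
    n=$1
    shift
    read -r start rest < /proc/uptime
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null 2>&1
        i=$((i + 1))
    done
    read -r end rest < /proc/uptime
    awk -v s="$start" -v e="$end" -v n="$n" \
        'BEGIN { printf "%d runs: %.2f s total, %.4f s per run\n", n, e - s, (e - s) / n }'
}

# Example (set name and address taken from the first post):
# bench 1000 nft get element ip filter fb_ips '{ 52.216.24.236 }'
# Baseline for process startup only:
# bench 1000 nft -v
```

This gives an average only. It says nothing about the worst case of a single lookup.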

ubus call system board
{
	"kernel": "5.15.162",
	"hostname": "........",
	"system": "MediaTek MT7620A ver:2 eco:6",
	"model": "Zbtlink ZBT-WE826 (16M)",
	"board_name": "zbtlink,zbt-we826-16m",
	"rootfs_type": "squashfs",
	"release": {
		"distribution": "OpenWrt",
		"version": "23.05.4",
		"revision": "r24012-d8dd03c46f",
		"target": "ramips/mt7620",
		"description": "OpenWrt 23.05.4 r24012-d8dd03c46f"
	}
}

I do not find thousands of gets interesting. The deviation matters more, because I care about the worst case. In the experiment mentioned above, the set had only about 2000 members. Now, with 10 times that size, the results are terrible. For me, nftables sets are not usable for large numbers of members. So I am trying ipsets with iptables instead: iptables is used ONLY for the ipsets, in the raw table, so that it does not interfere with the nft rules.
UPDATE: This is from using ipset with the iptables raw table, for a set of about 20k elements:
time ipset test test_ips 32.240.71.12
Warning: 32.240.71.12 is in set test_ips.
real 0m 0.01s
user 0m 0.00s
sys 0m 0.01s

Populating this ipset using "ipset restore" takes about 5 s. The initial load took about 3 minutes. I could not create a set this large with nft: my router rebooted when I tried to load it. Loading only about 5000 elements did succeed.
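
For anyone who wants to reproduce the "ipset restore" load, here is a rough sketch of how a restore file can be generated from a plain list of addresses. The make_restore helper and the ips.txt file name are made up for illustration, and hash:ip is an assumption, since the set type I used is not stated above. Adjust maxelem if the list is larger than 65536 entries.

```shell
#!/bin/sh
# make_restore SETNAME: read one IPv4 address per line on stdin and print
# an "ipset restore" script on stdout. Blank lines and lines starting
# with "#" are skipped.
# Assumption: a hash:ip set; the set type is not given in the thread.
make_restore() {
    set_name=$1
    echo "create $set_name hash:ip family inet maxelem 65536"
    awk -v s="$set_name" 'NF && $1 !~ /^#/ { print "add", s, $1 }'
}

# Example (ips.txt is a hypothetical input file):
# make_restore test_ips < ips.txt | ipset -exist restore
```

The -exist option makes ipset ignore "already exists" errors, so the same file can be loaded again without first destroying the set.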

Adding an element to an nft set pumps the whole set to userland, adjusts it there, then pumps it back to the kernel. That takes significantly more RAM than the task requires.

What does that mean in detail?

See:

Some thoughts about nftables performance

Bottom line: some operations are slow with current nft sets (and get progressively slower as the element count goes up), but mostly there are workarounds. Memory consumption while populating elements also gets progressively worse with larger sets, to the point where atomic set replacement is no longer viable on memory-constrained devices. There are workarounds for this as well, although memory consumption is still significantly higher than with ipset, even with the workarounds. The netfilter people are aware of most of the bugs, although I'm not sure that there is a bug filed specifically for the nft get element command.

Use ipset+iptables for large sets, if possible. I now run it in parallel with nftables, but iptables (+ipset) is used ONLY for the large set, which blocks certain incoming packets in the special "raw" table. (You might think about blocking DoH servers ...)
It seems to have no negative impact, but I need more time to be sure. I should note that I do not use fw4 at all, only pure nft rules plus pure iptables rules. When using certain functionality (transparent proxy, captive portal etc.), fw3 and fw4 were/are a PITA.
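
Roughly, the iptables side looks like the sketch below. The helper names run and block_from_set are made up for this post, and the choice of the PREROUTING chain with a source-address match is an assumption about a typical setup, not a copy of my exact rules. It also assumes the iptables and ipset packages with the "set" match are installed. With DRY_RUN=1 the command is only printed, which is handy for checking it first.

```shell
#!/bin/sh
# run CMD...: execute CMD, or only print it when DRY_RUN is set.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# block_from_set SETNAME: drop packets whose source address is in the
# given ipset, in the raw table, i.e. before connection tracking.
# Assumption: PREROUTING + "src" match; adapt to your own setup.
block_from_set() {
    set_name=$1
    run iptables -t raw -I PREROUTING -m set --match-set "$set_name" src -j DROP
}

# Example:
# DRY_RUN=1 block_from_set test_ips   # print the rule only
# block_from_set test_ips             # install the rule
```

The ipset must exist before the rule is inserted, otherwise iptables rejects the rule.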