Netfilter "Flow offload" / HW NAT

@lsiudut Once a connection has been established with the negotiated MTUs, the clients at both ends will remember the path's maximum MTU, so they will continue to send using the negotiated value until the connection closes. The next connection then starts from scratch, and since it is a new connection, the offloading engine will not interfere.

Does the above answer your question?

That's the point: a new connection should renegotiate the MSS and succeed, but this is not what's happening unless I deliberately set firewall rules to pass the first packets through without offloading.

Initially I thought that the offload engine jumps in too early, but I don't think that's right, for a few reasons:

  1. starting with the most trivial one: I've seen it on brand-new connections that had not been initiated before,
  2. it seems that offloading handles only the heavy lifting, which is the transfer itself; it steps aside once a decision needs to be made (like RST/FIN), which is easy to verify with tcpdump. It also makes sense, as the OS needs to purge connection entries from all the tables,
  3. similar to (2): I also verified with tcpdump that all handshakes are there, including for connections that eventually stall, though I never checked the MSS explicitly,
  4. and finally, looking at the source code of the mtk offload which I'm using: the hash of an existing connection (which is submitted to hardware) consists of the src/dst ports and src/dst addresses, so it's highly unlikely to hit the same ports in a subsequent connection; see the drivers/net/ethernet/mediatek/mtk_offload.c file.
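
Since the MSS was never checked explicitly in (3), one rough way to do it is to capture only SYN packets with verbose tcpdump and pull out the advertised MSS. This is just a sketch: `syn_mss` is a made-up helper name, it assumes tcpdump's usual `options [mss 1460,...]` output format, and the interface name in the usage comment is a placeholder.

```shell
# Print the MSS value advertised in each SYN / SYN-ACK line read from stdin.
# Assumes input in the format of `tcpdump -n -v`, i.e. "options [mss 1460,...]".
syn_mss() {
    grep -o 'mss [0-9][0-9]*' | awk '{ print $2 }'
}

# Usage on the router (pppoe-wan is a placeholder interface name):
#   tcpdump -l -i pppoe-wan -n -v 'tcp[tcpflags] & tcp-syn != 0' | syn_mss
```

Comparing the values seen on the LAN side and on the WAN side should show whether the SYN actually got clamped on its way through the router.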

If I'm not wrong, you would have applied the MSS clamping rule in the FORWARD chain of the mangle table. I would guess (I may be wrong) that the offload engine takes over at the start of the INPUT path if you configured it to start processing immediately, thereby bypassing the mangle FORWARD chain?

You can easily check this by reverting to the settings where the offload engine does its job immediately for new connections, then resetting the iptables counters and running your test. If the packet count of the MSS clamping rule doesn't increase, the rule is being bypassed, which would explain the PMTU issue.
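
The counter check could look roughly like the sketch below. It assumes the clamping rule uses the TCPMSS target in the mangle FORWARD chain (adjust the chain if yours differs); `tcpmss_pkts` is just an illustrative helper name.

```shell
# Sum the packet counters of all TCPMSS rules found on stdin.
# Expects the output of: iptables -t mangle -v -n -x -L FORWARD
# (columns: pkts bytes target prot opt in out source destination ...)
tcpmss_pkts() {
    awk '$3 == "TCPMSS" { sum += $1 } END { print sum + 0 }'
}

# Usage on the router:
#   iptables -t mangle -Z FORWARD                         # reset the counters
#   ... run the test download ...
#   iptables -t mangle -v -n -x -L FORWARD | tcpmss_pkts  # 0 means bypassed
```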

In that case, either there is a setting in the offload rule to apply MSS clamping, or the offload engine has to be enhanced. Otherwise, just have the offload engine take over after a few packets.


Sorry, I assumed that you were familiar with how offload is implemented in OpenWrt. I mentioned it a few times already. It's not in mangle, but in FORWARD:

-A FORWARD -m comment --comment "!fw3: Traffic offloading" -m conntrack --ctstate RELATED,ESTABLISHED -j FLOWOFFLOAD --hw

As you can see, it matches ESTABLISHED, so it should apply only after the full packet exchange.

Another possibility is that for some reason netfilter marks connections as ESTABLISHED before the full three-way handshake, but I find that unlikely. I will check on that later.


These rules are not working, it stalls.

What I experienced so far is that the presence of the issue is not consistent:
If I execute several consecutive, separate (i.e. no connection reuse) curl calls, then sometimes it works and other times it stalls for quite a long time.

Here's the script that I use.
It downloads ~21M from dockerhub (hosted by Cloudflare, multiple IPs based on geo):

repo="library/mysql"
url="blobs/sha256:2a72cbf407d67c7a7a76dd48e432091678e297140dce050ad5eccad918a9f8d6"
token=$(curl "https://auth.docker.io/token?service=registry.docker.io&scope=repository:$repo:pull" | jq -r .token)
curl -v https://registry-1.docker.io/v2/$repo/$url -H "Authorization: Bearer $token" -L > /dev/null
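
To put numbers on how often it stalls, the download can be wrapped in a small loop. This is only a minimal sketch: `run_trials` is a made-up helper, and the 60-second limit in the usage comment is an arbitrary choice.

```shell
# Run a command N times and report how many runs succeeded or failed.
# A run counts as failed when the command exits non-zero; for curl with
# --max-time, that includes a transfer that stalled past the limit.
run_trials() {
    n=$1; shift
    ok=0; fail=0; i=0
    while [ "$i" -lt "$n" ]; do
        if "$@" >/dev/null 2>&1; then
            ok=$((ok + 1))
        else
            fail=$((fail + 1))
        fi
        i=$((i + 1))
    done
    echo "ok=$ok fail=$fail"
}

# Usage with the download above:
#   run_trials 10 curl -s --max-time 60 -L -o /dev/null \
#       -H "Authorization: Bearer $token" "https://registry-1.docker.io/v2/$repo/$url"
```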


My PPTP client flow offload issue is finally resolved:

The trick is that installing "kmod-nf-nathelper-extra" is not enough; the "kmod-nf-nathelper" module also needs to be installed.

Apparently the dependencies changed, as installing "kmod-nf-nathelper-extra" previously pulled in "kmod-nf-nathelper" as well.

So beware, PPTP users: both packages are needed. Now both SW and HW offload work fine.
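
A quick way to check for both packages could look like the sketch below; `check_nathelper` is just an illustrative helper that reads the `opkg list-installed` output.

```shell
# Read `opkg list-installed` output on stdin and report whichever of the
# two NAT helper packages is missing. Returns non-zero if any is missing.
check_nathelper() {
    installed=$(awk '{ print $1 }')   # keep only the package names
    rc=0
    for pkg in kmod-nf-nathelper kmod-nf-nathelper-extra; do
        # -x matches the whole line, so "-extra" does not satisfy the base name
        if ! printf '%s\n' "$installed" | grep -qxF "$pkg"; then
            echo "missing: $pkg"
            rc=1
        fi
    done
    return $rc
}

# Usage on the router:
#   opkg list-installed | check_nathelper
```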


How to use HW NAT by nftables?

Hi,

Is "flow offload" / HW NAT already in kernel 4.14.98, or do I still need to install more packages? It is not clear at the moment. I compiled an image without LuCI for my TP-Link WR801v2; it includes wireguard, and wpad instead of wpad-mini for mesh. Unfortunately there is no space left for the kmod-nft-XX packages. Here is the information about my device:

 -----------------------------------------------------
 OpenWrt SNAPSHOT, r9330-880f8e6
 -----------------------------------------------------
root@LEDE:~# opkg list-installed
base-files - 197-r9330-880f8e6
busybox - 1.30.0-4
dnsmasq - 2.80-8
dropbear - 2017.75-9
firewall - 2019-01-02-70f8785b-2
fstools - 2018-12-28-af93f4b8-4
fwtool - 1
hostapd-common - 2018-12-02-c2c6c01b-1
ip-tiny - 4.20.0-2
ip6tables - 1.8.2-3
iptables - 1.8.2-3
iw-full - 4.14-1
jshn - 2018-07-25-c83a84af-2
jsonfilter - 2018-02-04-c7e938d6-1
kernel - 4.14.98-1-b0cc2154e4305712751a088e71e866fd
kmod-ath - 4.14.98+4.19.7-1-2
kmod-ath9k - 4.14.98+4.19.7-1-2
kmod-ath9k-common - 4.14.98+4.19.7-1-2
kmod-cfg80211 - 4.14.98+4.19.7-1-2
kmod-gpio-button-hotplug - 4.14.98-2
kmod-ip6tables - 4.14.98-1
kmod-ipt-conntrack - 4.14.98-1
kmod-ipt-core - 4.14.98-1
kmod-ipt-nat - 4.14.98-1
kmod-ipt-offload - 4.14.98-1
kmod-mac80211 - 4.14.98+4.19.7-1-2
kmod-nf-conntrack - 4.14.98-1
kmod-nf-conntrack6 - 4.14.98-1
kmod-nf-flow - 4.14.98-1
kmod-nf-ipt - 4.14.98-1
kmod-nf-ipt6 - 4.14.98-1
kmod-nf-nat - 4.14.98-1
kmod-nf-reject - 4.14.98-1
kmod-nf-reject6 - 4.14.98-1
kmod-udptunnel4 - 4.14.98-1
kmod-udptunnel6 - 4.14.98-1
kmod-wireguard - 4.14.98+0.0.20190123-1
libblobmsg-json - 2018-07-25-c83a84af-2
libc - 1.1.21-1
libgcc1 - 7.4.0-1
libip4tc0 - 1.8.2-3
libip6tc0 - 1.8.2-3
libjson-c2 - 0.12.1-3
libjson-script - 2018-07-25-c83a84af-2
libmnl0 - 1.0.4-2
libnl-tiny - 0.1-5
libpthread - 1.1.21-1
libubox20170601 - 2018-07-25-c83a84af-2
libubus20170705 - 2018-10-06-221ce7e7-1
libuci20130104 - 2018-08-11-4c8b4d6e-2
libuclient20160123 - 2018-11-24-3ba74ebc-1
libxtables12 - 1.8.2-3
logd - 2018-12-18-876c7f5b-1
mtd - 24
netifd - 2019-01-31-5cd7215a-1
odhcp6c - 2019-01-11-d2e247d8-16
odhcpd-ipv6only - 2019-01-16-0a367680-3
openwrt-keyring - 2018-05-18-103a32e9-1
opkg - 2019-01-31-d4ba162b-1
procd - 2018-12-27-e2b055ed-1
swconfig - 12
uboot-envtools - 2018.03-3
ubox - 2018-12-18-876c7f5b-1
ubus - 2018-10-06-221ce7e7-1
ubusd - 2018-10-06-221ce7e7-1
uci - 2018-08-11-4c8b4d6e-2
uclient-fetch - 2018-11-24-3ba74ebc-1
usign - 2015-07-04-ef641914-1
wireguard - 0.0.20190123-1
wireguard-tools - 0.0.20190123-1
wireless-regdb - 2017-10-20-4343d359
wpad - 2018-12-02-c2c6c01b-1
root@LEDE:~# df -h
Filesystem                Size      Used Available Use% Mounted on
/dev/root                 3.0M      3.0M         0 100% /rom
tmpfs                    13.3M    748.0K     12.5M   6% /tmp
/dev/mtdblock3          320.0K    292.0K     28.0K  91% /overlay
overlayfs:/overlay      320.0K    292.0K     28.0K  91% /
tmpfs                   512.0K         0    512.0K   0% /dev
root@LEDE:~# cat /etc/config/firewall

config defaults
	option syn_flood '1'
	option input 'ACCEPT'
	option output 'ACCEPT'
	option drop_invalid '1'
	option forward 'ACCEPT'
	option flow_offloading '1'
        option flow_offloading_hw '1'

config zone
	option name 'lan'
	option input 'ACCEPT'
	option output 'ACCEPT'
	option forward 'ACCEPT'
	option network 'lan LANv6'

config zone
	option name 'wan'
	option output 'ACCEPT'
	option masq '1'
	option mtu_fix '1'
	option input 'ACCEPT'
	option network ' '
	option forward 'ACCEPT'

config rule
	option name 'Allow-DHCP-Renew'
	option src 'wan'
	option proto 'udp'
	option dest_port '68'
	option target 'ACCEPT'
	option family 'ipv4'

config rule
	option name 'Allow-Ping'
	option src 'wan'
	option proto 'icmp'
	option icmp_type 'echo-request'
	option family 'ipv4'
	option target 'ACCEPT'

config rule
	option name 'Allow-IGMP'
	option src 'wan'
	option proto 'igmp'
	option family 'ipv4'
	option target 'ACCEPT'

config rule
	option name 'Allow-DHCPv6'
	option src 'wan'
	option proto 'udp'
	option src_ip 'fc00::/6'
	option dest_ip 'fc00::/6'
	option dest_port '546'
	option family 'ipv6'
	option target 'ACCEPT'

config rule
	option name 'Allow-MLD'
	option src 'wan'
	option proto 'icmp'
	option src_ip 'fe80::/10'
	list icmp_type '130/0'
	list icmp_type '131/0'
	list icmp_type '132/0'
	list icmp_type '143/0'
	option family 'ipv6'
	option target 'ACCEPT'

config rule
	option name 'Allow-ICMPv6-Input'
	option src 'wan'
	option proto 'icmp'
	list icmp_type 'echo-request'
	list icmp_type 'echo-reply'
	list icmp_type 'destination-unreachable'
	list icmp_type 'packet-too-big'
	list icmp_type 'time-exceeded'
	list icmp_type 'bad-header'
	list icmp_type 'unknown-header-type'
	list icmp_type 'router-solicitation'
	list icmp_type 'neighbour-solicitation'
	list icmp_type 'router-advertisement'
	list icmp_type 'neighbour-advertisement'
	option limit '1000/sec'
	option family 'ipv6'
	option target 'ACCEPT'

config rule
	option name 'Allow-ICMPv6-Forward'
	option src 'wan'
	option dest '*'
	option proto 'icmp'
	list icmp_type 'echo-request'
	list icmp_type 'echo-reply'
	list icmp_type 'destination-unreachable'
	list icmp_type 'packet-too-big'
	list icmp_type 'time-exceeded'
	list icmp_type 'bad-header'
	list icmp_type 'unknown-header-type'
	option limit '1000/sec'
	option family 'ipv6'
	option target 'ACCEPT'

config rule
	option name 'Allow-IPSec-ESP'
	option src 'wan'
	option dest 'lan'
	option proto 'esp'
	option target 'ACCEPT'

config rule
	option name 'Allow-ISAKMP'
	option src 'wan'
	option dest 'lan'
	option dest_port '500'
	option proto 'udp'
	option target 'ACCEPT'

config include
	option path '/etc/firewall.user'

config forwarding
	option dest 'wan'
	option src 'lan'

config zone
	option input 'ACCEPT'
	option output 'ACCEPT'
	option name 'wireguard'
	option forward 'ACCEPT'
	option masq '1'
	option mtu_fix '1'
	option network 'wireguard'

config forwarding
	option dest 'lan'
	option src 'wireguard'

config forwarding
	option dest 'wan'
	option src 'wireguard'

config forwarding
	option dest 'wireguard'
	option src 'lan'

config forwarding
	option dest 'wireguard'
	option src 'wan'

root@LEDE:~# 

Would it be possible to upgrade the RAM to 64 MB and the flash to 16 MB? I got a good offer for the chips and read about upgrading flash, but I am confused about what kind of images I have to use afterwards and whether I would still be able to compile my own images with Chef Imagebuilder.

My needs are

  1. wireguard
  2. mesh-networks (wifi)
  3. low power consumption
  4. highest possible throughput, as I have read is possible with "Flow offload" / HW NAT

I also have some other 4/32 devices lying around, and I have read the warnings. I have 2x 841, 1x 740 and 1x wa730rev2, and thought about upgrading them all with new flash and RAM from here:
RAM and 16 MB Winbond-Flash.

I also already read this about upgrading Flash to 16MB: Flash-Upgrade

SW flow offload is there by default on 4.14.x images. In the file /etc/config/firewall add

config defaults
	option flow_offloading '1'

I don't think there is HW offload support on that device.
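
If you prefer not to edit the file by hand, the same option can be set through UCI. A rough sketch, assuming the firewall config has a single `defaults` section; `enable_sw_offload` and the `UCI` override are only illustrative.

```shell
# Enable software flow offloading through UCI.
# UCI can be overridden (e.g. UCI=echo) to preview the commands without
# touching the configuration.
enable_sw_offload() {
    ${UCI:-uci} set "firewall.@defaults[0].flow_offloading=1" &&
    ${UCI:-uci} commit firewall
}

# Usage on the router:
#   enable_sw_offload && /etc/init.d/firewall restart
```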


Where is more documentation or support list and what are the requirements?

I'm not aware of any docs as such, but here is the only HW offload plugin that has been added so far, I think.

How can we be sure that it is working and properly enabled? I added the config option to my firewall config.

Run the following command to see that the rule has been added:

# iptables -v -n -L FORWARD | grep FLOW
63015 7080K FLOWOFFLOAD  all  --  *      *       0.0.0.0/0            0.0.0.0/0            /* !fw3: Traffic offloading */ ctstate RELATED,ESTABLISHED FLOWOFFLOAD
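
Besides the rule counters, you can also look at the conntrack table itself. The sketch below assumes that offloaded flows are tagged `[OFFLOAD]` (or `[HW_OFFLOAD]`) in /proc/net/nf_conntrack; the exact tag may differ between kernel versions, and `count_offloaded` is a made-up helper name.

```shell
# Count conntrack entries flagged as offloaded in the text read from stdin.
# Assumes the "[OFFLOAD]" / "[HW_OFFLOAD]" tags; prints 0 if none are found.
count_offloaded() {
    grep -c -E '\[(HW_)?OFFLOAD\]' || true
}

# Usage on the router, while a large download is running:
#   count_offloaded < /proc/net/nf_conntrack
```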

Hmm...I guess not. Must be doing something wrong.

You did of course reboot?

Yes. For sure.

OpenWrt:~# opkg list_installed | egrep 'offload|flow'
kmod-ipt-offload - 4.14.103-1
kmod-nf-flow - 4.14.103-1

Can someone else confirm that software offload breaks IPv6 connectivity on kernel 4.19 after a few hours?
I have to disable software offload and reboot the router to get prefix delegation working again.
Most of the time, though, it takes a while until prefix delegation works again.
I guess my ISP is blocking something there.
Without software offload, prefix delegation works flawlessly for many days.

// edit2
Hmm...
So I had MLD blocked in my firewall for quite some time.
Everything was working fine, but my device kept losing the delegated prefix over the last couple of days.
IPv6 uses multicast for nearly everything: neighbor discovery, DHCP and so on.
After re-enabling MLD, the problem with losing the prefix seems to be fixed.
I also added noserverunicast '1' to my config.
But enabling software flow offload still breaks IPv6 connectivity on the clients after some time.

It seems that the compatibility problem between flow offload and wireguard has finally been solved, as described below.

IPSec still needs some special firewall rules in order to work with flow offload, but wireguard has released excellent Android and iOS apps, so just get rid of IPSec and everything is wonderful. Thanks to all the devs.

Hi,
Recently I have had a problem with mwan3 and flow offload.
My router is a K2P, and the firmware is OpenWrt 18.06.2, r7676-cddd7b4c77.
I enabled flow offload (software or HW) and mwan3 (WAN=pppoe, WANB=dhcp) and left everything else at the defaults, then opened www.speedtest.net to test my connection. Sometimes the test is OK, sometimes it fails with an error.
When I access other web pages, they sometimes stay pending or fail.

With flow offload disabled, everything is OK.
Thanks