NAT6 script with netfilter / fw4

(Hello! First time posting in OpenWrt forums; please let me know if there is a better place for this topic.)

I'm trying to translate the nat6 script from https://openwrt.org/docs/guide-user/network/ipv6/ipv6.nat6 to work in OpenWrt 22.03 (thanks for another great release!) with fw4 and without installing legacy iptables tools.

Some parts of this work are easy. For example, I can just use ip6tables-translate to go from

ip6t_add "${postrouting_chain}" -t nat -m comment --comment "!fw3" -j MASQUERADE

to

nft6_add rule ip6 nat "${postrouting_chain}" counter masquerade   comment \"!fw3\"

(with the script's nft6_add function suitably defined, just as ip6t_add currently is).

However I would appreciate some guidance to finish this translation:

  • Every new rule has a comment starting with !fw3. Should I make my rules have comments starting with !fw4? Do they have special semantic meaning? (FWIW, for stock firewall on OpenWrt 22.03 (output of "nft list ruleset") every nftables rule that has a comment has one that starts with !fw4.)
  • How should nft6_add be defined? Currently ip6t_add calls ip6tables -C to check whether the rule exists and, if it does not, calls ip6tables -I to add it. I asked in #netfilter on Libera, and it looks like the nft userspace utility has no --check/-C equivalent for testing whether a rule already exists. I am not familiar enough with fw4 to know whether making this function explicitly idempotent matters (e.g. maybe the script is only called on an empty firewall, or only called once, etc.).
  • How should one translate nat6_init?
nat6_init() {
    iptables-save -t nat \
    | sed -e "
        /\sMASQUERADE$/d
        /\s[DS]NAT\s/d
        /\s--match-set\s\S*/s//\06/
        /,BROADCAST\s/s// /" \
    | ip6tables-restore -T nat
}

Looks like it wants to delete all MASQUERADE/DNAT/SNAT targets (makes sense because script later sets up its own MASQUERADE and DNAT rules), but what is the purpose of filtering for ,BROADCAST and --match-set? (Edit: not sure if globally deleting MASQUERADE/DNAT/SNAT targets makes sense: what about networks that have manually-configured targets? I'm pretty sure it should only be doing this for the particular zone(s) this script is active for (i.e. uci set firewall.@zone[1].masq6="1").)

Here is what I do for NAT6 on my device. Note: this is supposed to be used with mwan3, but you can edit that part out and hard-code the list of WAN interfaces in the for loop.

Save this as /etc/firewall.nat6:

#!/bin/sh

source /lib/functions/network.sh

# IPv6 NAT (horrible)
ip6tables -t nat -F PREROUTING
ip6tables -t nat -F POSTROUTING
ULA=$(uci get network.globals.ula_prefix)
for IFACE in $(uci show mwan3 | sed -n '/=interface/s/^mwan3\.\(.*\)=interface/\1/ p') ; do
  network_get_device DEVICE $IFACE || continue
  network_get_prefix6 PREFIX $IFACE || continue
  BITS=${PREFIX##*/}
  if [ "$BITS" -le 48 ] ; then BITS=48 ; fi
  ULA_PART=${ULA%%/*}/$BITS
  FIRST_IP=${PREFIX%%:/*}:1
  echo "Mapping $ULA_PART <-> $PREFIX for $IFACE (on $DEVICE)"
  ip6tables -t nat -A PREROUTING -d $FIRST_IP -j REDIRECT
  ip6tables -t nat -A PREROUTING -d $PREFIX -j NETMAP --to $ULA_PART
  ip6tables -t nat -A POSTROUTING -s $ULA_PART -m conntrack --ctorigdst $PREFIX -j NETMAP --to $PREFIX
  ip6tables -t nat -A POSTROUTING -s $ULA_PART -o $DEVICE -j NETMAP --to $PREFIX
  ip6tables -t nat -A POSTROUTING -o $DEVICE -j MASQUERADE
done

This script does network prefix translation between the ULAs and the public prefix when it can, and NATs all ULAs that are too far from the beginning of the ULA range.
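
The prefix arithmetic in the loop can be tried on its own, without touching the firewall. This is a minimal sketch: the ULA fd12:3456:789a::/48 and the delegated prefix 2001:db8:1:2::/64 are made-up example values, and derive_mapping is a hypothetical helper name that is not part of the script above.

```shell
#!/bin/sh
# derive_mapping is a hypothetical helper that repeats the parameter
# expansions from the script; it sets BITS, ULA_PART and FIRST_IP.
derive_mapping() {
    ULA="$1"      # e.g. the value of network.globals.ula_prefix
    PREFIX="$2"   # e.g. what network_get_prefix6 returns

    BITS=${PREFIX##*/}                        # prefix length after the slash
    if [ "$BITS" -le 48 ]; then BITS=48; fi   # clamp short prefixes to /48
    ULA_PART=${ULA%%/*}/$BITS                 # ULA base address, public prefix length
    FIRST_IP=${PREFIX%%:/*}:1                 # "2001:db8:1:2::/64" -> "2001:db8:1:2::1"
}

# Example values only (assumptions, not taken from a real router)
derive_mapping "fd12:3456:789a::/48" "2001:db8:1:2::/64"
echo "Mapping $ULA_PART <-> $PREFIX (first IP: $FIRST_IP)"
```

With these example values the script prints "Mapping fd12:3456:789a::/64 <-> 2001:db8:1:2::/64 (first IP: 2001:db8:1:2::1)".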

To enable this script, add this to /etc/config/firewall:

config include
	option path '/etc/firewall.nat6'

To persist it after an attended sysupgrade, add this to /etc/sysupgrade.conf:

/etc/firewall.nat6

You would need ip6tables-nft for this to work. On the LAN, please add this to /etc/config/network:

config interface 'lan'
	option device 'br-lan'
	...
	list ip6class 'local'
	option ip6hint '0'   # if you have several LANs, use '0', '10', '20', ... uniquely
	option ip6assign '60'

Note that the above post does not really correspond to your question. Let me answer what I can.

Regarding the !fw3 / !fw4 comment, this has the effect that the rules are removed when the firewall is reloaded. For custom rules, this is useful in order to avoid duplicate rules being inserted on every firewall reload.

Regarding the method to check whether the rule already exists, I would say, it's an antipattern. Add a unique comment to each of your rules, and then check whether this comment exists in nft list ruleset.
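
The "unique comment" approach could look roughly like the sketch below. The helper name nft6_add_once and the NFT override variable are my own inventions for illustration, and I have not tested this against a live fw4 ruleset, so treat it as a starting point rather than a finished function.

```shell
#!/bin/sh
# NFT can be pointed at a stub for dry runs; it defaults to the real nft.
NFT="${NFT:-nft}"

# Hypothetical helper: add a rule only if its unique comment is not
# already present anywhere in the ruleset.
# usage: nft6_add_once <unique-comment> <family> <table> <chain> <rule...>
nft6_add_once() {
    tag="$1"; shift
    # nft prints comments as: comment "text"
    if $NFT list ruleset | grep -qF "comment \"$tag\""; then
        return 0    # rule already there, nothing to do
    fi
    # The inner quotes must reach nft, hence the escaped quotes.
    $NFT add rule "$@" comment "\"$tag\""
}

# Example call on a router (table and chain names are assumptions):
#   nft6_add_once "nat6-masq" inet fw4 srcnat_wan meta nfproto ipv6 counter masquerade
```

The check greps the whole ruleset, so the comment has to be unique across all tables, not just within one chain.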

Regarding the translation of nat6_init, don't translate, because it doesn't translate. First understand what is needed, and then write it. In this case, just insert the rules that implement your form of IPv6 NAT.

What the original tries to do is take the IPv4 NAT rules, remove the MASQUERADE/DNAT/SNAT rules (because the script then inserts its own), rename the ipsets by appending "6" (e.g. "rkn" -> "rkn6", which also makes sense, to keep the v4 and v6 ipsets separate), and make a poor attempt at removing BROADCAST from the addrtype match (because BROADCAST is invalid for IPv6).
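
To see what those sed expressions do, you can feed them a few iptables-save style lines by hand. In this sketch the sample rules are made up, nat6_filter is a hypothetical wrapper name, and I wrote "&" where the original uses "\0" (both stand for the matched text).

```shell
#!/bin/sh
# nat6_filter repeats the sed expressions from nat6_init, reading
# iptables-save style lines on stdin. No iptables is needed to try it.
nat6_filter() {
    sed -e '
        /\sMASQUERADE$/d
        /\s[DS]NAT\s/d
        /\s--match-set\s\S*/s//&6/
        /,BROADCAST\s/s// /'
}

# Made-up sample rules, not taken from a real router
nat6_filter <<'EOF'
-A POSTROUTING -o eth0 -j MASQUERADE
-A PREROUTING -p tcp -j DNAT --to-destination 192.168.1.2
-A PREROUTING -m set --match-set rkn dst -j ACCEPT
-A PREROUTING -m addrtype --dst-type LOCAL,BROADCAST -j ACCEPT
EOF
```

The first two sample lines are dropped, the ipset name becomes rkn6, and "LOCAL,BROADCAST" becomes "LOCAL".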

None of this "translate the existing IPv4 rules" business makes sense at all with nftables.

And remember that you can still use iptables (via ip6tables-nft).

I would like to point out that fw4 supports IPv6 NAT natively, there is no custom scripting required.

For zones you can set option masq6 in addition to option masq. There are also explicit config nat rules, which you can combine with the usual selectors (option proto/src_ip/dest_ip/src_port/dest_port/...):

# Example masquerade rule
config nat
  option family ipv6
  option src wan
  option proto all
  option target MASQUERADE

# Example SNAT rule
config nat
  option family ipv6
  option src wan
  option proto all
  option target SNAT
  option snat_ip 2001:db8::1  # at least one of these two
  option snat_port 1234       # options is required for SNAT
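
For the zone-level option, setting masq6 from the command line could look like the sketch below. The helper name enable_masq6 and the UCI override variable are hypothetical, and the zone index is an assumption: check "uci show firewall" to find which section is your wan zone.

```shell
#!/bin/sh
# UCI can be pointed at a stub for dry runs; it defaults to the real uci.
UCI="${UCI:-uci}"

# Hypothetical helper. $1 is the firewall zone section, e.g. "@zone[1]"
# (often the wan zone in a stock config, but verify on your device).
enable_masq6() {
    $UCI set "firewall.$1.masq6=1" &&
    $UCI commit firewall
}

# On a router, followed by a firewall restart to apply the change:
#   enable_masq6 '@zone[1]' && /etc/init.d/firewall restart
```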

Didn't know that, thanks. Would it be possible to express my case (network prefix translation for lowest-numbered part of the ULA range (i.e. /64 or /56, depending on which ISP is connected), fallback to NAT otherwise) without hard-coding the prefix lengths delegated by ISPs?

Thanks for the excellent fw4 explanations!

Wow! I misread nat6_init as filtering the output of ip6tables, but you are right that it actually tries to patch the iptables IPv4 ruleset into an IPv6 one. That's way too automatic and fragile for my taste.

So NPTv6 is automatically done via nftables?

No - not automatically. For NAT6, you can configure it without any scripts. For NPT6, you need a script.

Well, I read there are two possible ways to do NPT. One looks awfully similar to the script you posted, while the other involves the mangle prerouting/postrouting chains.

Here is a link to mangle table method:

Here is what your official nft replacement method would look like, I haven't actually seen

#!/bin/sh

source /lib/functions/network.sh

nft add chain inet fw4 dstnat_wan
nft insert rule inet fw4 dstnat iifname "eth0" counter jump dstnat_wan comment \"!fw4: Handle wan IPv4/IPv6 dstnat traffic\"
nft add chain inet fw4 dstnat_lan
nft insert rule inet fw4 dstnat iifname "br-lan" counter jump dstnat_lan comment \"!fw4: Handle lan IPv4/IPv6 dstnat traffic\"
nft add chain inet fw4 srcnat_lan
nft insert rule inet fw4 srcnat oifname "br-lan" counter jump srcnat_lan comment \"!fw4: Handle lan IPv4/IPv6 srcnat traffic\"
ULA=$(uci get network.globals.ula_prefix)
for IFACE in $(uci show mwan3 | sed -n '/=interface/s/^mwan3\.\(.*\)=interface/\1/ p'); do # most setups would replace this with wan6
  network_get_device DEVICE $IFACE || continue
  network_get_prefix6 PREFIX $IFACE || continue
  BITS=${PREFIX##*/}
  if [ "$BITS" -le 48 ] ; then BITS=48 ; fi
  ULA_PART=${ULA%%/*}/$BITS
  FIRST_IP=${PREFIX%%:/*}:1
  echo "Mapping $ULA_PART <-> $PREFIX for $IFACE (on $DEVICE)"
  nft add rule inet fw4 dstnat_wan meta nfproto ipv6 ip6 daddr $FIRST_IP counter redirect comment \"!fw4: Redirect IPv6 First Addr traffic\"
  nft add rule inet fw4 dstnat_wan meta nfproto ipv6 ip6 daddr $PREFIX counter dnat ip6 prefix to $ULA_PART comment \"!fw4: Map IPv6 Prefix traffic\"
  nft add rule inet fw4 srcnat_wan meta nfproto ipv6 ip6 saddr $ULA_PART counter ct original ip6 daddr $PREFIX snat ip6 prefix to $PREFIX comment \"!fw4: Map IPv6 Prefix traffic\"
  nft add rule inet fw4 srcnat_wan meta nfproto ipv6 ip6 saddr $ULA_PART oif $DEVICE counter snat ip6 prefix to $PREFIX comment \"!fw4: Map IPv6 Prefix traffic\"
  nft add rule inet fw4 srcnat_wan meta nfproto ipv6 oif $DEVICE counter masquerade comment \"!fw4: Masquerade IPv6 traffic\"
done

There are some rules whose intended purpose I have yet to understand, such as

nft add rule inet fw4 dstnat_wan meta nfproto ipv6 ip6 daddr $FIRST_IP counter redirect comment \"!fw4: Redirect IPv6 First Addr traffic\" # why is this necessary?
 
and

nft add rule inet fw4 srcnat_wan meta nfproto ipv6 ip6 saddr $ULA_PART counter ct original ip6 daddr $PREFIX snat ip6 prefix to $PREFIX comment \"!fw4: Map IPv6 Prefix traffic\" # Are we adding conntrack to the connection?

EDIT: I haven't checked if the rules are translated correctly.

Regarding the rules that you don't understand:

The rule that redirects the first IP in the delegated prefix is actually very simple. Without the NPT6 setup (i.e. with a normal setup with a delegated prefix), this IP normally gets assigned to br-lan. If you ping or otherwise attempt to connect to this IP from outside, the router replies (unless the firewall says no). With NPT6, however, this IP is not assigned to anything, yet it still logically corresponds to the router (i.e. you still want it to be connectable and pingable). The normal prefix-to-prefix translation applies only to forwarded traffic and doesn't catch this special case, hence the separate rule.

The other rule applies if you have a system in the LAN which tries to connect to something in the delegated prefix. It is a direct equivalent to the NAT loopback rule for IPv4. Without this rule, the packet would travel from the source ULA to the destination in the public prefix, then the router would translate the destination IP to the ULA, then the destination system would reply from its ULA to the source IP, which is the ULA. This reply would therefore bypass the router (because both addresses are in the same subnet) and wreak havoc: the original system would expect the reply to come from the public IP that it connected to, not from the ULA. Translating the source IP using this rule fixes this: the original packet goes from the ULA of the first host to the public IP of the second host, gets translated by the router twice (so that the source IP is now public and the destination IP is the ULA of the second system), then the reply also has to go through the router which undoes this transformation.

Yep, I pretty much went through your script line by line and converted it to nftables. I had to do additional research on the conntrack rule, but it is comparable to the rule you had. The biggest problem was that iptables-translate didn't know how to interpret most of those rules, since they rely on a relatively new nftables feature that maps one prefix to another via snat/dnat.

The better option is to use mangle rules, but nftables is slow to adopt new options for that. The only way to do it with mangle on nftables is to use a bitwise technique.

Thinking about it again, I came to the conclusion that the mangle table cannot be a solution. Here is why: it can only implement stateless NPT, while we do want stateful connection tracking, for the situation when both WANs are up. In other words, we need to track which WAN should be used for every connection, and the solution based on the mangle table only cannot do that.

I would have to say that I concur with your argument simply because the mangle rules look like something thrown together to only perform a specific task with the headers.

  nft add rule inet fw4 mangle_prerouting iif $DEVICE ip6 daddr $PREFIX counter ip6 daddr set ip6 daddr and ::ffff:ffff:ffff:ffff or ${ULA%%/*} notrack
  nft add rule inet fw4 mangle_postrouting oif $DEVICE ip6 saddr $ULA_PART ip6 daddr != $ULA_PART counter ip6 saddr set ip6 saddr and ::ffff:ffff:ffff:ffff or ${PREFIX%%/*} notrack

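Keep in mind that this uses the variables from the earlier script. Also note that the mask ::ffff:ffff:ffff:ffff keeps the low 64 bits of the address, so as written the bitwise mapping assumes a /64 prefix; a /48 mapping would need ::ffff:ffff:ffff:ffff:ffff instead.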
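
The mask in the bitwise rules depends on the prefix length, because the host part is everything after the first N bits. Here is a minimal sketch of that arithmetic. The helper name host_mask is hypothetical, and it only handles prefix lengths that are multiples of 4 between 16 and 124.

```shell
#!/bin/sh
# host_mask <prefix-length>: print the IPv6 mask that keeps only the
# host bits, e.g. 64 -> ::ffff:ffff:ffff:ffff
host_mask() {
    bits="$1"
    if [ "$bits" -lt 16 ] || [ "$bits" -gt 124 ] || [ $((bits % 4)) -ne 0 ]; then
        echo "unsupported prefix length: $bits" >&2
        return 1
    fi
    host=$((128 - bits))    # number of host bits
    full=$((host / 16))     # complete 16-bit groups of ffff
    rem=$((host % 16))      # leftover bits in the leading group
    mask=":"
    case "$rem" in
        4)  mask="$mask:f" ;;
        8)  mask="$mask:ff" ;;
        12) mask="$mask:fff" ;;
    esac
    while [ "$full" -gt 0 ]; do
        mask="$mask:ffff"
        full=$((full - 1))
    done
    echo "$mask"
}

host_mask 64
host_mask 56
host_mask 48
```

This prints ::ffff:ffff:ffff:ffff for /64, ::ff:ffff:ffff:ffff:ffff for /56 and ::ffff:ffff:ffff:ffff:ffff for /48, so the mask would have to be computed per WAN if the ISPs delegate different prefix lengths.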

Note that the mangle example above is just an assumption and may not be the exact nftables equivalent required. It was pieced together from this reference: