(with script's nft6_add function being suitably defined, just like ip6t_add currently is.)
However I would appreciate some guidance to finish this translation:
Every new rule has a comment starting with !fw3. Should I make my rules have comments starting with !fw4? Do they have special semantic meaning? (FWIW, for stock firewall on OpenWrt 22.03 (output of "nft list ruleset") every nftables rule that has a comment has one that starts with !fw4.)
How to define nft6_add? Currently ipt6_add does the following: it calls ip6tables -C to check if the rule exists and, provided it does not, calls ip6tables -I to add one. I asked in #netfilter on Libera, and it looks like the nft userspace utility has no --check/-C option to test whether a rule already exists. I am not sufficiently familiar with fw4 to know if making this function explicitly idempotent matters (e.g. maybe the script is only called on an empty firewall, or only called once, etc.).
Looks like it wants to delete all MASQUERADE/DNAT/SNAT targets (makes sense because script later sets up its own MASQUERADE and DNAT rules), but what is the purpose of filtering for ,BROADCAST and --match-set? (Edit: not sure if globally deleting MASQUERADE/DNAT/SNAT targets makes sense: what about networks that have manually-configured targets? I'm pretty sure it should only be doing this for the particular zone(s) this script is active for (i.e. uci set firewall.@zone[1].masq6="1").)
Here is what I do for NAT6 on my device. Note: this is supposed to be used with mwan3, but you can edit that part out and hard-code the list of WAN interfaces in the for loop.
Save this as /etc/firewall.nat6:
#!/bin/sh
source /lib/functions/network.sh
# IPv6 NAT (horrible)
ip6tables -t nat -F PREROUTING
ip6tables -t nat -F POSTROUTING
ULA=$(uci get network.globals.ula_prefix)
for IFACE in $(uci show mwan3 | sed -n '/=interface/s/^mwan3\.\(.*\)=interface/\1/ p') ; do
    network_get_device DEVICE $IFACE || continue
    network_get_prefix6 PREFIX $IFACE || continue
    BITS=${PREFIX##*/}
    if [ "$BITS" -le 48 ] ; then BITS=48 ; fi
    ULA_PART=${ULA%%/*}/$BITS
    FIRST_IP=${PREFIX%%:/*}:1
    echo "Mapping $ULA_PART <-> $PREFIX for $IFACE (on $DEVICE)"
    ip6tables -t nat -A PREROUTING -d $FIRST_IP -j REDIRECT
    ip6tables -t nat -A PREROUTING -d $PREFIX -j NETMAP --to $ULA_PART
    ip6tables -t nat -A POSTROUTING -s $ULA_PART -m conntrack --ctorigdst $PREFIX -j NETMAP --to $PREFIX
    ip6tables -t nat -A POSTROUTING -s $ULA_PART -o $DEVICE -j NETMAP --to $PREFIX
    ip6tables -t nat -A POSTROUTING -o $DEVICE -j MASQUERADE
done
This script does network prefix translation between the ULAs and the public prefix when it can, and NATs all ULAs that are too far from the beginning of the ULA range.
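To make the string handling in the script a bit more concrete, here is a small sketch that applies the same parameter expansions to sample values. The function name npt6_params and the addresses (the 2001:db8::/32 documentation prefix and an arbitrary ULA) are made up for illustration and are not part of the script.

```sh
#!/bin/sh
# Illustrative only: the same expansions as in the script, on sample values.
# npt6_params is a hypothetical helper name.
npt6_params() {
    ULA=$1      # e.g. fd12:3456:789a::/48 (value of network.globals.ula_prefix)
    PREFIX=$2   # e.g. 2001:db8:1234:5600::/56 (prefix delegated by the ISP)

    BITS=${PREFIX##*/}                        # prefix length: text after the last "/"
    if [ "$BITS" -le 48 ]; then BITS=48; fi   # a shorter prefix is treated as a /48
    ULA_PART=${ULA%%/*}/$BITS                 # start of the ULA range, same length
    FIRST_IP=${PREFIX%%:/*}:1                 # strip ":/56", append ":1" -> "...::1"

    echo "$ULA_PART $PREFIX $FIRST_IP"
}

npt6_params fd12:3456:789a::/48 2001:db8:1234:5600::/56
```

For the sample values this prints fd12:3456:789a::/56 2001:db8:1234:5600::/56 2001:db8:1234:5600::1, i.e. the first /56 of the ULA range is mapped one-to-one onto the delegated /56.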
To enable this script, add this to /etc/config/firewall:
config include
option path '/etc/firewall.nat6'
To persist it after an attended sysupgrade, add this to /etc/sysupgrade.conf:
/etc/firewall.nat6
You would need ip6tables-nft for this to work. On the LAN, please add this to /etc/config/network:
config interface 'lan'
option device 'br-lan'
...
list ip6class 'local'
option ip6hint '0' # if you have several LANs, use '0', '10', '20', ... uniquely
option ip6assign '60'
Note that the above post does not really correspond to your question. Let me answer what I can.
Regarding the !fw3 / !fw4 comment, this has the effect that the rules are removed when the firewall is reloaded. For custom rules, this is useful in order to avoid duplicate rules being inserted on every firewall reload.
Regarding the method to check whether the rule already exists: I would say it's an antipattern. Instead, add a unique comment to each of your rules, and then check whether this comment exists in the output of nft list ruleset.
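A rough sketch of what such a comment-based nft6_add could look like follows. The argument order and the example chain are my assumptions, not something fw4 prescribes, so treat it as a starting point rather than a drop-in replacement.

```sh
#!/bin/sh
# Hypothetical helper: add a rule only if no rule carrying the same comment
# is present in the ruleset yet.
# usage: nft6_add <unique-tag> <family> <table> <chain> <rule tokens...>
nft6_add() {
    tag=$1; family=$2; table=$3; chain=$4
    shift 4

    # The rule is considered present if its comment shows up in the listing.
    if nft list ruleset | grep -qF "comment \"$tag\""; then
        return 0
    fi

    # nft parses its joined arguments, so the quotes must reach it literally.
    nft add rule "$family" "$table" "$chain" "$@" comment "\"$tag\""
}

# Example (assumes an fw4 chain named srcnat and a WAN device eth0):
#   nft6_add nat6-masq-eth0 inet fw4 srcnat oifname eth0 meta nfproto ipv6 masquerade
```

The tag has to be unique per rule, because the check only looks at the comment and not at the rule body.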
Regarding the translation of nat6_init, don't translate, because it doesn't translate. First understand what is needed, and then write it. In this case, just insert the rules that implement your form of IPv6 NAT.
What the original tries to do is take the IPv4 NAT rules, remove the MASQUERADE/DNAT/SNAT rules (because the script then inserts its own), rename the ipsets by appending "6" (e.g. "rkn" -> "rkn6", which also makes sense, to keep v4 and v6 ipsets separate), and make a poor attempt at removing BROADCAST from the addrtype match (because BROADCAST is invalid for IPv6).
None of this "translate the existing IPv4 rules" business makes sense at all with nftables.
And remember that you can still use iptables (via ip6tables-nft).
I would like to point out that fw4 supports IPv6 NAT natively; no custom scripting is required.
For zones you can set option masq6 in addition to option masq. There's also explicit config nat rules which you can combine with the usual selectors (option proto/src_ip/dest_ip/src_port/dest_port/...):
# Example masquerade rule
config nat
option family ipv6
option src wan
option proto all
option target MASQUERADE
# Example SNAT rule
config nat
option family ipv6
option src wan
option proto all
option target SNAT
option snat_ip 2001:db8::1 # at least one of those two
option snat_port 1234 # options is required for SNAT
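For the zone-level option, the same setting can also be applied from the shell. This is a minimal sketch; enable_masq6 is a hypothetical wrapper name, and the zone index is an assumption that should be checked with uci show firewall first.

```sh
#!/bin/sh
# Hypothetical wrapper around uci: enable masq6 on one firewall zone.
# The zone index is an assumption; verify it with "uci show firewall".
enable_masq6() {
    zone_index=$1
    uci set "firewall.@zone[$zone_index].masq6=1" || return 1
    uci commit firewall
}

# Example (run on the router, then reload the firewall):
#   enable_masq6 1 && /etc/init.d/firewall reload
```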
Didn't know that, thanks. Would it be possible to express my case (network prefix translation for lowest-numbered part of the ULA range (i.e. /64 or /56, depending on which ISP is connected), fallback to NAT otherwise) without hard-coding the prefix lengths delegated by ISPs?
Wow! I misread nat6_init as filtering the output of ip6tables, but you are right that it actually tries to patch the iptables IPv4 ruleset into an IPv6 one. That's way too automatic and fragile for my taste.
Well, I read that there are two possible ways to do NPT. One looks awfully similar to the script you posted, while the other involves the mangle prerouting/postrouting chains.
EDIT: I haven't checked if the rules are translated correctly.
Regarding the rules that you don't understand:
The rule that redirects the first IP in the delegated prefix is actually very easy to explain. Without the NPT6 setup (i.e. in a normal setup with a delegated prefix), this IP normally gets assigned to br-lan. If you ping or otherwise attempt to connect to this IP from outside, the router replies (unless the firewall says no). With NPT6, however, this IP is not assigned to anything, while it still logically corresponds to the router (i.e. you still want it to be connectable and pingable). The normal prefix-to-prefix translation applies only to forwarded traffic and doesn't catch this special case, hence the separate rule.
The other rule applies if you have a system in the LAN which tries to connect to something in the delegated prefix. It is a direct equivalent to the NAT loopback rule for IPv4. Without this rule, the packet would travel from the source ULA to the destination in the public prefix, then the router would translate the destination IP to the ULA, then the destination system would reply from its ULA to the source IP, which is the ULA. This reply would therefore bypass the router (because both addresses are in the same subnet) and wreak havoc: the original system would expect the reply to come from the public IP that it connected to, not from the ULA. Translating the source IP using this rule fixes this: the original packet goes from the ULA of the first host to the public IP of the second host, gets translated by the router twice (so that the source IP is now public and the destination IP is the ULA of the second system), then the reply also has to go through the router which undoes this transformation.
Yep, I pretty much went through line by line and converted your script to nftables. I had to do additional research on the ct state rule, but it is comparable to the rule you had. The biggest problem was that iptables-translate didn't know how to interpret most of those rules, since they use a relatively new nftables option to map one prefix to another via snat/dnat.
The better option is to use mangle rules, but nftables is slow to adopt new options for that. The only way to do it with mangle on nftables is to use a bitwise technique.
Thinking about it again, I came to the conclusion that the mangle table cannot be a solution. Here is why: it can only implement stateless NPT, while we do want stateful connection tracking, for the situation when both WANs are up. In other words, we need to track which WAN should be used for every connection, and the solution based on the mangle table only cannot do that.
I would have to say that I concur with your argument simply because the mangle rules look like something thrown together to only perform a specific task with the headers.
nft add rule inet fw4 mangle_prerouting iif $DEVICE ip6 daddr $PREFIX counter ip6 daddr set ip6 daddr and ::ffff:ffff:ffff:ffff or ${ULA%%/*} notrack
nft add rule inet fw4 mangle_postrouting oif $DEVICE ip6 saddr $ULA_PART ip6 daddr !=$ULA_PART counter oif $DEVICE ip6 saddr set ip6 saddr and ::ffff:ffff:ffff:ffff or ${PREFIX%%/*} notrack
Note that this reuses the variables from the earlier script. The bitwise mapping also assumes a fixed prefix length (/48 was the intent); be aware that the mask ::ffff:ffff:ffff:ffff as written keeps only the low 64 bits of the address.
Note that the above mangle example is just an assumption and may not be the exact nftables equivalent that is required. It was pieced together from this reference:
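To see which mask goes with which prefix length in rules like the two above, here is a rough sketch of a helper that prints the host-part mask for a given length. host_mask6 is a made-up name, not part of nft or OpenWrt, and it prints the mask uncompressed (all eight groups) to keep the logic simple.

```sh
#!/bin/sh
# Hypothetical helper: print the IPv6 mask that keeps only the host bits
# (everything after the first $1 bits), as eight uncompressed hex groups.
host_mask6() {
    bits=$1
    mask=""
    i=0
    while [ "$i" -lt 8 ]; do
        # Number of prefix bits that fall inside this 16-bit group (0..16).
        n=$((bits - i * 16))
        if [ "$n" -lt 0 ]; then n=0; fi
        if [ "$n" -gt 16 ]; then n=16; fi
        # The host bits of this group are its low (16 - n) bits.
        group=$(( (1 << (16 - n)) - 1 ))
        mask="${mask}${mask:+:}$(printf '%x' "$group")"
        i=$((i + 1))
    done
    echo "$mask"
}

host_mask6 64
host_mask6 48
```

A length of 64 gives 0:0:0:0:ffff:ffff:ffff:ffff, which is the same address as ::ffff:ffff:ffff:ffff, while 48 gives 0:0:0:ffff:ffff:ffff:ffff:ffff. So the mask in the rules has to be chosen to match the prefix length that is actually delegated.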