Firewall post-routing masquerade NAT flags

For upstream IPv4 NAT, it would be useful to be able to specify options appended to the masquerade rule in chain srcnat_wan and the like.

In particular, when a zone has option masq '1' in /etc/config/firewall, it would be good for paranoid types like myself to also have UCI options for choosing the random or fully-random nftables NAT flags.

Icing on the cake would be if this were supported in LuCI and also for NAT6. :)

That reduces the NAT traversal success rate.

Why would that be? There's never a guarantee that the masqueraded port used on the router that the remote end sees will be the same as the port from which the internal host is sending datagrams.

Where ports are opened by the client end, as with non-PASV FTP, we have conntrack helpers and, these days, STUN/TURN.

If two downstream hosts send from the same port number, one of them has to be given a different port on the upstream endpoint for full-cone NAT, which is quite common. Even with nftables' more restrictive policy, it's not unlikely that two downstream hosts will contact the same remote addr:port endpoint, and if they do so from the same local port then one has to lose. Technically it's NAPT, not NAT, isn't it?
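To make the collision concrete, here is a toy Python model of a port-translating NAT. It is only a sketch: the `Napt` class, its method names and the addresses are made up for illustration, and the allocation logic (keep the source port if it is free, otherwise take the lowest free port in the range) is a simplification, not how the kernel actually allocates ports.

```python
import random


class Napt:
    """Toy NAPT table (hypothetical model, not the kernel's algorithm)."""

    def __init__(self, port_range=(1024, 65535), randomize=False, seed=None):
        self.lo, self.hi = port_range
        self.randomize = randomize
        self.rng = random.Random(seed)
        # (external_port, remote) -> (internal_ip, internal_port)
        self.in_use = {}

    def translate(self, internal_ip, internal_port, remote):
        key = (internal_ip, internal_port)
        # An existing flow keeps the mapping it already has.
        for (ext_port, rem), owner in self.in_use.items():
            if rem == remote and owner == key:
                return ext_port
        # Without randomisation, try to preserve the original source port.
        if not self.randomize and (internal_port, remote) not in self.in_use:
            ext_port = internal_port
        else:
            ext_port = self._pick_free(remote)
        self.in_use[(ext_port, remote)] = key
        return ext_port

    def _pick_free(self, remote):
        free = [p for p in range(self.lo, self.hi + 1)
                if (p, remote) not in self.in_use]
        if not free:
            raise RuntimeError("port range exhausted")
        return self.rng.choice(free) if self.randomize else free[0]


nat = Napt()
remote = ("203.0.113.10", 443)                      # example remote endpoint
a = nat.translate("192.168.1.10", 50000, remote)    # first host keeps its port
b = nat.translate("192.168.1.11", 50000, remote)    # second host must be remapped
print(a, b)
```

The second host cannot keep port 50000 towards the same remote endpoint, so the remote end never has a guarantee of seeing the original source port, with or without the random flags.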

Or am I missing something?

At least
ref https://www.rfc-editor.org/rfc/rfc4787
ref https://www.rfc-editor.org/rfc/rfc5382
ref https://www.rfc-editor.org/rfc/rfc6888

You can NFS-mount and run one standard non-NAT IPsec session through the default Linux NAT.
Also, exchanging two SYN+ACKs establishes a TCP connection.

Anyway, if you want to try it your way (fully-random has been equivalent to random for a while now):
/etc/nftables.d/whatever.nft

chain srcnat_wan {
        meta nfproto ipv4 meta l4proto { tcp , udp } masquerade to :1024-65535 random
}

The option as such would be good. For example, *BSD uses random ports by default and the looser modes are a config option, and then some games don't work. Go figure which is better.

Thanks for the reply. I wouldn't have thought running NFS through NAT is very common. I'm not sure I'd want to expose NFS on a publicly accessible interface any more than I'd expose SMB/CIFS, even with IP-based access restrictions. That's what routed VPNs are for. (I use WireGuard for everything these days; in client/server 'dial-in' scenarios it is pretty much NAT-proof at the client end and needs at most a simple UDP port forward at the other.)

It's not so much what nft rules to add, it's having the option for the rule that OpenWrt's firewall4 adds to carry NAT flags. Munging the chains, even the user ones in /etc/nftables.d/10-custom-filter-chains.nft, seems a bit fragile to changes in how the firewall service converts the UCI to nft. It's also fragile to me forgetting that I've got rules in /etc/nftables.d when using UCI or LuCI later!

Anyway, it was just a suggestion. I'm not going to get bent out of shape if it's not deemed appropriate/important enough.

Cheers.

Well, in case you did not know, NFS has little to do with SMB: you need two connections that keep their source ports.

Nothing at all will happen with that rule. If you rename the IPv4 wan interface, you will just have an unreachable rule hanging around doing nothing.

The point I was making was that I'd not expose any file server on an external interface, whether that be NFS, NetBIOS/SMB/CIFS, AFP or even IP-encapsulated IPX/SPX. The details of the protocols were not the issue.

It is not "exposing" anything: source port numbers are encoded in the protocol, and to connect you need to preserve the port numbers after NAT. I had an idea to implement "NAT types" along the recommended lines, but that means a bulk of parameters set in sync, and it would still only approach, not fully satisfy, the "RFC requirements".

How about adding "option masq '2'" that does the random thing?

+1 for fully-random support. Looking at the output of conntrack -S, I see many dropped packets due to NAT port conflicts (insert_failed counters). This is a known issue with the kernel’s default allocation strategy, which uses an incremental counter: packets being processed by different CPUs have a chance of allocating the same port, and as this is done in a softirq, the kernel just drops the packet rather than unwinding all the work to try a different port. For TCP, this introduces a delay until the SYN retry; for other things like DNS it can be much worse, since there is no retry attempt.

This article, while focused on container NAT use cases, does a great job of explaining the problem.

https://tech.new-work.se/a-reason-for-unexplained-connection-timeouts-on-kubernetes-docker-abd041cf7e02
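The race described above can be sketched with a small Python toy model. Everything here is an assumption made for illustration: the function names, the batch model (every "CPU" in a batch looks at the same snapshot of the table before anyone inserts) and the port range. It is deliberately a worst case for the incremental strategy and does not reproduce the kernel's real code path.

```python
import random


def allocate_batch(batch_size, strategy, used, rng, port_range=(1024, 65535)):
    """Model `batch_size` CPUs picking a port from the same snapshot of `used`,
    then inserting one after another. Returns the number of failed inserts."""
    lo, hi = port_range
    snapshot = set(used)  # what every CPU sees before the batch commits
    choices = []
    for _ in range(batch_size):
        if strategy == "incremental":
            # Everyone walks up from the same starting point.
            port = next(p for p in range(lo, hi + 1) if p not in snapshot)
        else:  # "random"
            port = rng.randrange(lo, hi + 1)
            while port in snapshot:
                port = rng.randrange(lo, hi + 1)
        choices.append(port)

    failed = 0
    for port in choices:
        if port in used:
            failed += 1      # clash found at insert time: packet is dropped
        else:
            used.add(port)
    return failed


def simulate(strategy, rounds=1000, batch_size=2, seed=1):
    rng = random.Random(seed)
    used = set()
    return sum(allocate_batch(batch_size, strategy, used, rng)
               for _ in range(rounds))


incremental_failures = simulate("incremental")
random_failures = simulate("random")
print(incremental_failures, random_failures)
```

In this model the incremental strategy loses one packet in every concurrent batch, while random selection only clashes when two CPUs happen to draw the same port. Real traffic is nowhere near this adversarial, but it shows why randomising the choice reduces insert_failed.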

fully-random equals random. It is hard to determine which rule filled the conntrack table; that rule could be replaced with a notrack gadget (just rewriting ports and addresses both ways) or rate-limited.
Usually one doubles the conntrack capacity and hopes for the best.
Today I learned something new: port clashes cause insert_failed, 10x more often with Windows and <5000 source ports....

Add to /etc/nftables.d/something.nft

chain srcnat_wan {
# expand windows dns src port range
        ct state new meta nfproto ipv4 meta l4proto udp ct original proto-dst 53 ct original proto-src 1024-5000 masquerade to :1024-65535 random
}

alternative is dns intercept (see wiki)

I want to add that pfSense/OPNsense use random source ports in NAT by default. See source ports and static ports in their docs.

If this is implemented, there has to be a mechanism to opt out, i.e. it should be possible to create a firewall rule that leaves the source port unmodified for some traffic. This is crucial for some protocols like SIP.

ct state new meta nfproto ipv4 meta l4proto udp ct original proto-src 5060  masquerade to :ct original proto-src

(th sport / udp sport etc. are valid too, just a bit slower)

Did more research; this would be pfSense emulation (by way of iptables-extensions(8)) in nftables:

chain srcnat_wan {
        meta nfproto ipv4 udp sport 500 masquerade
        meta nfproto ipv4 meta l4proto {tcp, udp, dccp, sctp} masquerade to :th sport map {1-511 : 1-511 , 512-1023 : 1-1023, 1024-65535 : 1024-65535} random
}

But it glitches in completely different ways on 24 (rule display is faulty, the rule itself is ok) and 25 (input rejected because the first key is not a port).
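For anyone trying to follow what the map in that rule is meant to do, here is the same port-class logic as a small Python sketch. The names `PORT_CLASSES`, `translated_range` and `pick_port` are made up for illustration; it only models the second rule (the map), not the udp sport 500 exception, and it says nothing about how nftables evaluates the map internally.

```python
import random

# Same classes as the nft map above:
# original source-port range -> range the translated port is drawn from.
PORT_CLASSES = [
    ((1, 511), (1, 511)),
    ((512, 1023), (1, 1023)),
    ((1024, 65535), (1024, 65535)),
]


def translated_range(sport):
    """Return the (low, high) range allowed for a given original source port."""
    for (lo, hi), target in PORT_CLASSES:
        if lo <= sport <= hi:
            return target
    raise ValueError(f"not a valid source port: {sport}")


def pick_port(sport, rng=random):
    """Pick a random translated port from the class the source port falls in."""
    lo, hi = translated_range(sport)
    return rng.randint(lo, hi)


for sport in (123, 600, 40000):
    print(sport, translated_range(sport), pick_port(sport))
```

So privileged source ports stay privileged after translation, and everything else is randomised across the unprivileged range.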

Would make more sense to resolve "next port in range" not as X++ but as random in kernel.

I ended up settling on something like this, as using fully-random without exclusions ended up breaking NAT hole punching.

meta l4proto tcp counter masquerade fully-random
udp dport { 53, 443 } counter masquerade fully-random
counter masquerade

I randomize all TCP traffic, since that is the majority of outbound client traffic and does not need to be mapped back to the same port. I also randomize UDP for DNS and QUIC, since neither of those protocols needs full-cone NAT. All other UDP ports and protocols I leave alone, so they can try to keep the original port in case inbound replies from other hosts are needed.
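Read top to bottom, the three rules above behave like a first-match decision. A minimal Python sketch of that decision follows; the function name `masquerade_mode` and the returned labels are just illustrative, not anything nftables or firewall4 exposes.

```python
def masquerade_mode(l4proto, dport):
    """Return which of the three rules a new outbound flow would hit."""
    if l4proto == "tcp":
        return "fully-random"          # rule 1: all TCP
    if l4proto == "udp" and dport in (53, 443):
        return "fully-random"          # rule 2: DNS and QUIC over UDP
    return "default"                   # rule 3: plain masquerade


flows = [("tcp", 443), ("udp", 53), ("udp", 443), ("udp", 3478), ("icmp", None)]
for proto, dport in flows:
    print(proto, dport, masquerade_mode(proto, dport))
```

Anything that falls through to the last rule, such as UDP to a STUN server, gets the plain masquerade and so still has a chance of keeping its original source port.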

I still get a small amount of insert_failed unfortunately, but it's a lot less than it used to be.