MAP portsets not fully utilized and only 16 ports available due to discrepancies between connlimit and NAT?

Hello,
I am using OpenWrt 19.07 on Buffalo WZR-HP-AG300H.
The ISP I recently switched to provides IPv4 over IPv6 via MAP-E draft 3. The 4in6 tunnel portion works flawlessly, but MAP requires the source port of outgoing IPv4 connections to be restricted to a statically allocated set of ports, so that the IPv4 address can be shared statelessly with other users.

In my case 15 sets of 16 ports = 240 ports are allocated. (The first four and the last four bits of the port number vary while the middle eight bits are fixed; the set whose first four bits are all zero is excluded to avoid the well-known range.) Unfortunately, this non-contiguous allocation makes it complicated to use the existing NAT implementations in firewalls.
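To make the allocation concrete, here is a minimal sketch that enumerates the portsets. It assumes the usual MAP port layout (offset bits, then PSID, then the remaining low bits) with offset 4 and PSID length 8 as in my config; the PSID value 0xc6 is inferred from the port ranges in the generated rules, and the function name is just for illustration.

```python
def map_port_sets(psid, psid_len, offset):
    """Return the port ranges usable with the given PSID (assumes offset > 0)."""
    low_bits = 16 - offset - psid_len         # bits free to vary inside one set
    sets = []
    for a in range(1, 1 << offset):           # the leading bits; 0 is skipped
        base = (a << (16 - offset)) | (psid << low_bits)
        sets.append(range(base, base + (1 << low_bits)))
    return sets

# PSID 0xc6 is an assumption read off the first range (7264 = 0x1c60).
port_sets = map_port_sets(psid=0xC6, psid_len=8, offset=4)
print(len(port_sets), sum(len(r) for r in port_sets))
for r in port_sets[:3]:
    print(f"{r.start}-{r.stop - 1}")
```

The first three ranges it prints match the --to-source ranges in the rules generated by map.sh.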

The MAP package that comes with the OpenWrt base repo implements this by creating a separate firewall rule for each portset and attempting to distribute connections across the portsets with connlimit. The first NAT rule SNATs connections to the first 16 allocated ports. Once all 16 ports are used up for a given destination, the rule no longer matches because of --connlimit-upto 16, so the connection falls through to the next rule, which SNATs to the next 16 allocated ports.

However, if my understanding is correct, this implementation is flawed: connlimit does not count conntrack entries in the TCP_CONNTRACK_{TIME_WAIT,CLOSE} states (see already_closed() in nf_conncount.c), while NAT does (see nf_conntrack_tuple_taken()). As a result, connlimit considers the first portset to be available while NAT finds all of its ports taken, so port allocation fails without ever using any portset other than the first one.
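The mismatch can be illustrated with a toy model. This is only a sketch of the behaviour as I understand it, not kernel code: the function name, the way entries are attributed to a rule, and the "first free port" choice are all simplifications of mine.

```python
# Toy model of the mismatch; it imitates the behaviour described above
# and is not kernel code.
CLOSED_STATES = {"TIME_WAIT", "CLOSE"}

def try_snat(entries, port_sets, limit=16):
    """entries: (state, nat_port) pairs, all towards one destination.
    Returns the port picked for a new connection, or None if allocation fails."""
    taken = {port for _state, port in entries}     # NAT: every entry holds its port
    for ports in port_sets:                        # rules are evaluated in order
        # connlimit: closed entries are skipped; the new connection counts as one.
        # Simplification: entries are attributed to a rule by their NAT port.
        active = sum(1 for state, port in entries
                     if port in ports and state not in CLOSED_STATES)
        if active + 1 > limit:
            continue                               # no match, fall through
        free = [p for p in ports if p not in taken]
        return free[0] if free else None           # matched, but the set may be full
    return None

PORT_SETS = [range(7264, 7280), range(11360, 11376)]
closed = [("TIME_WAIT", p) for p in PORT_SETS[0]]
established = [("ESTABLISHED", p) for p in PORT_SETS[0]]
stalled = try_snat(closed, PORT_SETS)        # first rule matches, no free port
spilled = try_snat(established, PORT_SETS)   # first rule skipped, second set used
print(stalled, spilled)
```

In this model, 16 established connections spill over to the second portset as intended, but 16 closed ones make the first rule match and then fail.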

Below is the entire output of conntrack -L --src-nat | grep 203.0.113.1, when a private host 172.24.245.144 tried to connect to a website on 203.0.113.1 via a shared global address 192.0.2.1. (IPv4 addresses are replaced with bogus ones just in case)

tcp      6 117 TIME_WAIT src=172.24.245.144 dst=203.0.113.1 sport=45984 dport=443 packets=13 bytes=2048 src=203.0.113.1 dst=192.0.2.1 sport=443 dport=7269 packets=13 bytes=5544 [ASSURED] mark=0 use=1
tcp      6 117 TIME_WAIT src=172.24.245.144 dst=203.0.113.1 sport=45988 dport=443 packets=48 bytes=4034 src=203.0.113.1 dst=192.0.2.1 sport=443 dport=7270 packets=46 bytes=55682 [ASSURED] mark=0 use=1
tcp      6 117 TIME_WAIT src=172.24.245.144 dst=203.0.113.1 sport=45978 dport=443 packets=10 bytes=1881 src=203.0.113.1 dst=192.0.2.1 sport=443 dport=7267 packets=10 bytes=3487 [ASSURED] mark=0 use=1
tcp      6 117 TIME_WAIT src=172.24.245.144 dst=203.0.113.1 sport=45966 dport=443 packets=12 bytes=1999 src=203.0.113.1 dst=192.0.2.1 sport=443 dport=7277 packets=11 bytes=3877 [ASSURED] mark=0 use=1
tcp      6 117 TIME_WAIT src=172.24.245.144 dst=203.0.113.1 sport=45964 dport=443 packets=9 bytes=1822 src=203.0.113.1 dst=192.0.2.1 sport=443 dport=7276 packets=9 bytes=619 [ASSURED] mark=0 use=1
tcp      6 117 TIME_WAIT src=172.24.245.144 dst=203.0.113.1 sport=45972 dport=443 packets=8 bytes=1023 src=203.0.113.1 dst=192.0.2.1 sport=443 dport=7264 packets=8 bytes=567 [ASSURED] mark=0 use=1
tcp      6 117 TIME_WAIT src=172.24.245.144 dst=203.0.113.1 sport=45958 dport=443 packets=15 bytes=2164 src=203.0.113.1 dst=192.0.2.1 sport=443 dport=7273 packets=14 bytes=4755 [ASSURED] mark=0 use=1
tcp      6 117 TIME_WAIT src=172.24.245.144 dst=203.0.113.1 sport=45980 dport=443 packets=17 bytes=2277 src=203.0.113.1 dst=192.0.2.1 sport=443 dport=7268 packets=16 bytes=6435 [ASSURED] mark=0 use=1
tcp      6 117 TIME_WAIT src=172.24.245.144 dst=203.0.113.1 sport=45956 dport=443 packets=13 bytes=2054 src=203.0.113.1 dst=192.0.2.1 sport=443 dport=7272 packets=12 bytes=3056 [ASSURED] mark=0 use=1
tcp      6 117 TIME_WAIT src=172.24.245.144 dst=203.0.113.1 sport=45960 dport=443 packets=13 bytes=2082 src=203.0.113.1 dst=192.0.2.1 sport=443 dport=7274 packets=13 bytes=4010 [ASSURED] mark=0 use=2
tcp      6 117 TIME_WAIT src=172.24.245.144 dst=203.0.113.1 sport=45970 dport=443 packets=8 bytes=1023 src=203.0.113.1 dst=192.0.2.1 sport=443 dport=7279 packets=8 bytes=567 [ASSURED] mark=0 use=1
tcp      6 117 TIME_WAIT src=172.24.245.144 dst=203.0.113.1 sport=45976 dport=443 packets=11 bytes=1933 src=203.0.113.1 dst=192.0.2.1 sport=443 dport=7266 packets=11 bytes=2439 [ASSURED] mark=0 use=1
tcp      6 7 CLOSE src=172.24.245.144 dst=203.0.113.1 sport=45968 dport=443 packets=7 bytes=853 src=203.0.113.1 dst=192.0.2.1 sport=443 dport=7278 packets=5 bytes=411 [ASSURED] mark=0 use=1
tcp      6 117 TIME_WAIT src=172.24.245.144 dst=203.0.113.1 sport=45962 dport=443 packets=53 bytes=4407 src=203.0.113.1 dst=192.0.2.1 sport=443 dport=7275 packets=48 bytes=44883 [ASSURED] mark=0 use=1
tcp      6 117 TIME_WAIT src=172.24.245.144 dst=203.0.113.1 sport=45952 dport=443 packets=74 bytes=5586 src=203.0.113.1 dst=192.0.2.1 sport=443 dport=7271 packets=83 bytes=78571 [ASSURED] mark=0 use=1
tcp      6 117 TIME_WAIT src=172.24.245.144 dst=203.0.113.1 sport=45974 dport=443 packets=8 bytes=1023 src=203.0.113.1 dst=192.0.2.1 sport=443 dport=7265 packets=8 bytes=567 [ASSURED] mark=0 use=1

Below is a part of the output of iptables-save -t nat, showing rules generated by map.sh.

-A POSTROUTING -o map-wan4 -p icmp -m connlimit --connlimit-upto 16 --connlimit-mask 32 --connlimit-daddr -m comment --comment "!fw3: ubus:wan4[map] nat 0" -j SNAT --to-source 192.0.2.1:7264-7279
-A POSTROUTING -o map-wan4 -p tcp -m connlimit --connlimit-upto 16 --connlimit-mask 32 --connlimit-daddr -m comment --comment "!fw3: ubus:wan4[map] nat 1" -j SNAT --to-source 192.0.2.1:7264-7279
-A POSTROUTING -o map-wan4 -p udp -m connlimit --connlimit-upto 16 --connlimit-mask 32 --connlimit-daddr -m comment --comment "!fw3: ubus:wan4[map] nat 2" -j SNAT --to-source 192.0.2.1:7264-7279
-A POSTROUTING -o map-wan4 -p icmp -m connlimit --connlimit-upto 16 --connlimit-mask 32 --connlimit-daddr -m comment --comment "!fw3: ubus:wan4[map] nat 3" -j SNAT --to-source 192.0.2.1:11360-11375
-A POSTROUTING -o map-wan4 -p tcp -m connlimit --connlimit-upto 16 --connlimit-mask 32 --connlimit-daddr -m comment --comment "!fw3: ubus:wan4[map] nat 4" -j SNAT --to-source 192.0.2.1:11360-11375
-A POSTROUTING -o map-wan4 -p udp -m connlimit --connlimit-upto 16 --connlimit-mask 32 --connlimit-daddr -m comment --comment "!fw3: ubus:wan4[map] nat 5" -j SNAT --to-source 192.0.2.1:11360-11375
-A POSTROUTING -o map-wan4 -p icmp -m connlimit --connlimit-upto 16 --connlimit-mask 32 --connlimit-daddr -m comment --comment "!fw3: ubus:wan4[map] nat 6" -j SNAT --to-source 192.0.2.1:15456-15471
...

The MAP interface section of /etc/config/network (IP addresses are fake and are inconsistent with the ones shown previously):

config interface 'wan4'
	option proto 'map'
	option peeraddr '2001:db8:f00::64'
	option ipaddr '192.0.0.0'
	option ip4prefixlen '15'
	option ip6prefix '2001:db8::'
	option ip6prefixlen '31'
	option ealen '25'
	option psidlen '8'
	option offset '4'
	option mtu '1460'

As can be seen from the conntrack output, only the first portset 7264...7279 (0x1c60...0x1c6f) is used: the connlimit count has not reached its limit, so the first rule keeps matching, yet NAT cannot allocate any more ports from that set. The web browser on 172.24.245.144 stalls, waiting for the other connections it started to be established.

One workaround I've seen is to distribute connections across portsets in a round-robin fashion by MARKing packets with -m statistic --mode nth (probably equivalent to multiple --to-source statements on pre-2.6.11 kernels, according to iptables-extensions(8)). This alleviates the problem significantly and certainly allows more than 16 TCP connections per IPv4 host, but it still does not use the ports in the most reliable way, because the selected portset could be full while others are available.
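For reference, a rough sketch of what round-robin rules could look like. This is a variation that puts the statistic match directly on the SNAT rules instead of MARKing packets first; it is untested on OpenWrt and is not what map.sh generates. The helper name is hypothetical. Each statistic match keeps its own counter and only sees packets the earlier rules did not take, hence the decreasing --every values.

```python
def round_robin_rules(addr, port_sets, proto="tcp", iface="map-wan4"):
    """Build iptables-save style SNAT rules that rotate through port_sets."""
    rules = []
    total = len(port_sets)
    for i, (low, high) in enumerate(port_sets):
        remaining = total - i
        # The last rule is unconditional so that nothing falls through unNATed.
        match = ("" if remaining == 1 else
                 f"-m statistic --mode nth --every {remaining} --packet 0 ")
        rules.append(f"-A POSTROUTING -o {iface} -p {proto} {match}"
                     f"-j SNAT --to-source {addr}:{low}-{high}")
    return rules

# Address and ranges are the bogus ones used earlier in this post.
rules = round_robin_rules("192.0.2.1",
                          [(7264, 7279), (11360, 11375), (15456, 15471)])
print("\n".join(rules))
```

Like the MARK-based variant, this spreads connections evenly but does not check whether the chosen portset still has a free port for that destination.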

To summarize,
*My IPv4 connection over MAP-E works most of the time, but it is usually limited to 16 TCP connections per IPv4 destination despite 240 ports being available, and some IPv4-only websites fail to load because of this.
*Is there a better way to implement MAP on OpenWrt, or on Linux netfilter/iptables/nftables in general, ideally one as efficient as if the port allocation were contiguous?
*If not, are there any suggestions on how to implement the round-robin method inside the MAP package? Once I learn more about the internals of OpenWrt I will try to contribute.

This is my first post as a newcomer; I hope it is appropriate.