I ran some benchmarks and got quite surprising results. I am pretty sure I did something wrong.
The tests were done on an AR9344 platform (DIR-835). This CPU is rather old, which is exactly what I needed to measure IPv6 performance without NAT.
Results:
both lan, ipv4
Server listening on 5201
Accepted connection from 192.168.5.227, port 49800
[ 5] local 192.168.5.203 port 5201 connected to 192.168.5.227 port 49801
[ ID] Interval Transfer Bandwidth
[ 5] 0.00-1.00 sec 106 MBytes 892 Mbits/sec
[ 5] 1.00-2.00 sec 112 MBytes 942 Mbits/sec
[ 5] 2.00-3.00 sec 112 MBytes 942 Mbits/sec
[ 5] 3.00-4.00 sec 112 MBytes 942 Mbits/sec
[ 5] 4.00-5.00 sec 112 MBytes 942 Mbits/sec
[ 5] 5.00-6.00 sec 112 MBytes 942 Mbits/sec
[ 5] 6.00-7.00 sec 112 MBytes 942 Mbits/sec
[ 5] 7.00-8.00 sec 112 MBytes 942 Mbits/sec
[ 5] 8.00-9.00 sec 112 MBytes 942 Mbits/sec
[ 5] 9.00-10.00 sec 112 MBytes 942 Mbits/sec
[ 5] 10.00-10.05 sec 6.05 MBytes 937 Mbits/sec
both lan, ipv6
Accepted connection from fe80::4d5:5a31:4769:eb39, port 49834
[ 5] local fe80::3d70:2aa5:2ab7:3a92 port 5201 connected to fe80::4d5:5a31:4769:eb39 port 49835
[ ID] Interval Transfer Bandwidth
[ 5] 0.00-1.00 sec 105 MBytes 881 Mbits/sec
[ 5] 1.00-2.00 sec 111 MBytes 929 Mbits/sec
[ 5] 2.00-3.00 sec 111 MBytes 929 Mbits/sec
[ 5] 3.00-4.00 sec 111 MBytes 929 Mbits/sec
[ 5] 4.00-5.00 sec 111 MBytes 929 Mbits/sec
[ 5] 5.00-6.00 sec 111 MBytes 929 Mbits/sec
[ 5] 6.00-7.00 sec 111 MBytes 929 Mbits/sec
[ 5] 7.00-8.00 sec 111 MBytes 929 Mbits/sec
[ 5] 8.00-9.00 sec 111 MBytes 929 Mbits/sec
[ 5] 9.00-10.00 sec 111 MBytes 929 Mbits/sec
[ 5] 10.00-10.05 sec 5.69 MBytes 930 Mbits/sec
[ ID] Interval Transfer Bandwidth
[ 5] 0.00-10.05 sec 0.00 Bytes 0.00 bits/sec sender
[ 5] 0.00-10.05 sec 1.08 GBytes 924 Mbits/sec receiver
wan lan bridged, ipv4 (full sirq)
Accepted connection from 192.168.5.227, port 49849
[ 5] local 192.168.5.203 port 5201 connected to 192.168.5.227 port 49850
[ ID] Interval Transfer Bandwidth
[ 5] 0.00-1.00 sec 103 MBytes 866 Mbits/sec
[ 5] 1.00-2.00 sec 112 MBytes 941 Mbits/sec
[ 5] 2.00-3.00 sec 105 MBytes 879 Mbits/sec
[ 5] 3.00-4.00 sec 109 MBytes 917 Mbits/sec
[ 5] 4.00-5.00 sec 109 MBytes 912 Mbits/sec
[ 5] 5.00-6.00 sec 112 MBytes 942 Mbits/sec
[ 5] 6.00-7.00 sec 108 MBytes 908 Mbits/sec
[ 5] 7.00-8.00 sec 92.8 MBytes 778 Mbits/sec
[ 5] 8.00-9.00 sec 89.7 MBytes 753 Mbits/sec
[ 5] 9.00-10.00 sec 112 MBytes 942 Mbits/sec
[ 5] 10.00-10.05 sec 5.47 MBytes 939 Mbits/sec
wan lan bridged, ipv6 (full sirq)
Accepted connection from fe80::4d5:5a31:4769:eb39, port 49843
[ 5] local fe80::3d70:2aa5:2ab7:3a92 port 5201 connected to fe80::4d5:5a31:4769:eb39 port 49844
[ ID] Interval Transfer Bandwidth
[ 5] 0.00-1.00 sec 104 MBytes 873 Mbits/sec
[ 5] 1.00-2.00 sec 111 MBytes 928 Mbits/sec
[ 5] 2.00-3.00 sec 110 MBytes 920 Mbits/sec
[ 5] 3.00-4.00 sec 111 MBytes 928 Mbits/sec
[ 5] 4.00-5.00 sec 110 MBytes 922 Mbits/sec
[ 5] 5.00-6.00 sec 110 MBytes 924 Mbits/sec
[ 5] 6.00-7.00 sec 110 MBytes 921 Mbits/sec
[ 5] 7.00-8.00 sec 111 MBytes 928 Mbits/sec
[ 5] 8.00-9.00 sec 110 MBytes 921 Mbits/sec
[ 5] 9.00-10.00 sec 109 MBytes 918 Mbits/sec
[ 5] 10.00-10.05 sec 5.49 MBytes 924 Mbits/sec
nat, ipv4 (full sirq)
[ 4] local 192.168.5.203 port 62774 connected to 192.168.2.2 port 5201
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 24.5 MBytes 205 Mbits/sec
[ 4] 1.00-2.00 sec 24.2 MBytes 203 Mbits/sec
[ 4] 2.00-3.00 sec 24.6 MBytes 207 Mbits/sec
[ 4] 3.00-4.00 sec 24.4 MBytes 204 Mbits/sec
[ 4] 4.00-5.00 sec 24.5 MBytes 205 Mbits/sec
[ 4] 5.00-6.00 sec 22.2 MBytes 187 Mbits/sec
[ 4] 6.00-7.00 sec 23.2 MBytes 195 Mbits/sec
[ 4] 7.00-8.00 sec 22.6 MBytes 190 Mbits/sec
[ 4] 8.00-9.00 sec 23.0 MBytes 193 Mbits/sec
[ 4] 9.00-10.00 sec 22.4 MBytes 188 Mbits/sec
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-10.00 sec 236 MBytes 198 Mbits/sec sender
[ 4] 0.00-10.00 sec 236 MBytes 198 Mbits/sec receiver
ipv6 wan - lan (full sirq)
Accepted connection from 2001:db80::1, port 49928
[ 5] local 2001:db80:1::3a92 port 5201 connected to 2001:db80::1 port 49929
[ ID] Interval Transfer Bandwidth
[ 5] 0.00-1.00 sec 26.0 MBytes 218 Mbits/sec
[ 5] 1.00-2.00 sec 27.5 MBytes 230 Mbits/sec
[ 5] 2.00-3.00 sec 27.6 MBytes 232 Mbits/sec
[ 5] 3.00-4.00 sec 27.4 MBytes 230 Mbits/sec
[ 5] 4.00-5.00 sec 27.6 MBytes 231 Mbits/sec
[ 5] 5.00-6.00 sec 27.4 MBytes 230 Mbits/sec
[ 5] 6.00-7.00 sec 27.6 MBytes 231 Mbits/sec
[ 5] 7.00-8.00 sec 27.3 MBytes 229 Mbits/sec
[ 5] 8.00-9.00 sec 27.5 MBytes 231 Mbits/sec
[ 5] 9.00-10.00 sec 27.3 MBytes 229 Mbits/sec
[ 5] 10.00-10.06 sec 1.78 MBytes 232 Mbits/sec
...What? I want to highlight the results for "ipv6 wan - lan (full sirq)" and "wan lan bridged, ipv6 (full sirq)".
There is no NAT going on in "ipv6 wan - lan (full sirq)", so how is it different from "wan lan bridged, ipv6 (full sirq)"? Ideally one should be able to achieve bridge-like performance across WAN and LAN, since the bridged IPv6 result shows that the hardware is capable of it. What extra work is the networking stack doing in "ipv6 wan - lan (full sirq)" that brings its throughput (about 230 Mbits/sec) down to roughly that of NATed IPv4 (about 198 Mbits/sec), where more processing is needed?
* I ran the "ipv6 wan - lan (full sirq)" test both with and without the firewall, but it did not have much impact on the iperf results.
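To put a number on the gap, here is a small Python sketch that averages the one-second intervals from the two IPv6 runs pasted above. The function name `mean_mbits` and the regular expression are my own for illustration, not part of iperf3, and the pattern assumes the log format shown in this post (rates reported in Mbits/sec).

```python
import re

# Matches one iperf3 interval line, e.g.
# "[ 5] 0.00-1.00 sec 26.0 MBytes 218 Mbits/sec"
INTERVAL = re.compile(
    r"(\d+\.\d+)-(\d+\.\d+)\s+sec\s+[\d.]+\s+\w+\s+([\d.]+)\s+Mbits/sec"
)


def mean_mbits(log: str) -> float:
    """Average the bandwidth over the full one-second intervals of a log."""
    rates = []
    for match in INTERVAL.finditer(log):
        start, end, rate = (float(g) for g in match.groups())
        # Skip the short trailing interval and any whole-run summary line.
        if abs((end - start) - 1.0) < 0.01:
            rates.append(rate)
    if not rates:
        raise ValueError("no one-second intervals found")
    return sum(rates) / len(rates)


# "wan lan bridged, ipv6 (full sirq)" run, copied from the post.
BRIDGED_IPV6 = """
[ 5] 0.00-1.00 sec 104 MBytes 873 Mbits/sec
[ 5] 1.00-2.00 sec 111 MBytes 928 Mbits/sec
[ 5] 2.00-3.00 sec 110 MBytes 920 Mbits/sec
[ 5] 3.00-4.00 sec 111 MBytes 928 Mbits/sec
[ 5] 4.00-5.00 sec 110 MBytes 922 Mbits/sec
[ 5] 5.00-6.00 sec 110 MBytes 924 Mbits/sec
[ 5] 6.00-7.00 sec 110 MBytes 921 Mbits/sec
[ 5] 7.00-8.00 sec 111 MBytes 928 Mbits/sec
[ 5] 8.00-9.00 sec 110 MBytes 921 Mbits/sec
[ 5] 9.00-10.00 sec 109 MBytes 918 Mbits/sec
[ 5] 10.00-10.05 sec 5.49 MBytes 924 Mbits/sec
"""

# "ipv6 wan - lan (full sirq)" run, copied from the post.
ROUTED_IPV6 = """
[ 5] 0.00-1.00 sec 26.0 MBytes 218 Mbits/sec
[ 5] 1.00-2.00 sec 27.5 MBytes 230 Mbits/sec
[ 5] 2.00-3.00 sec 27.6 MBytes 232 Mbits/sec
[ 5] 3.00-4.00 sec 27.4 MBytes 230 Mbits/sec
[ 5] 4.00-5.00 sec 27.6 MBytes 231 Mbits/sec
[ 5] 5.00-6.00 sec 27.4 MBytes 230 Mbits/sec
[ 5] 6.00-7.00 sec 27.6 MBytes 231 Mbits/sec
[ 5] 7.00-8.00 sec 27.3 MBytes 229 Mbits/sec
[ 5] 8.00-9.00 sec 27.5 MBytes 231 Mbits/sec
[ 5] 9.00-10.00 sec 27.3 MBytes 229 Mbits/sec
[ 5] 10.00-10.06 sec 1.78 MBytes 232 Mbits/sec
"""

bridged = mean_mbits(BRIDGED_IPV6)
routed = mean_mbits(ROUTED_IPV6)
print(f"bridged IPv6: {bridged:.1f} Mbits/sec")
print(f"routed IPv6:  {routed:.1f} Mbits/sec")
print(f"ratio:        {bridged / routed:.1f}x")
```

With the numbers above this comes out to roughly 918 Mbits/sec bridged versus 229 Mbits/sec routed, so the routed IPv6 path is about 4x slower than the bridged one on the same hardware.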