Netfilter "Flow offload" / HW NAT

It seems that the MT7621's HW offload breaks the network under the latest snapshot. I can ping from LAN to WAN, but cannot open websites.

It happens under both the latest snapshot and the 19.07 snapshot; 18.06.4 works well.

Resurrecting this old thread because it's pretty impossible to figure out what's the current status (wiki page needed?).

My RBM33G (MT7621) takes advantage of flow offloading on OpenWrt 19.07.4, r11208-ce6496d796; I get 10-15% more performance (eyeballed, no exact methodology, no benchmarks). What about other platforms, if any?

ar71xx - ???
ar79 - ???
mipsel - ???
...

mt7621: minor improvements, and using VLANs with hw offload enabled breaks things.

So not really usable, plus the latest OpenWrt on kernel 5.4 currently has no hw offload.

Really ?! I didn't test properly, but I noticed 10-15% less soft irqs (using top) when flow offloading was enabled. I couldn't notice any difference using sw or hw offloading, but something was happening.

Ehm, is it a definitive feature drop, or a temporary lack of flow offloading because of the kernel version bump?

ar71xx/ath79: with software flow offload, a TP-Link Archer C7 goes from 250 Mbps to 700 Mbps, but all instances of software-based flow offload currently break long-lived idle TCP connections. The timeout is for some reason always 120s (and ignores the net.netfilter.nf_conntrack_tcp_timeout_established sysctl value). See "Software flow offloading and conntrack timeouts" for a more detailed report.

It was added back to work with DSA. On MT7622 it can do 940 Mbps NAT over PPPoE at 0% CPU load.

"On MT7622" - on which router exactly?

Tested on Banana Pi R64

I don't think MT7622 is a good measure of whether flow offload is working better or not.
The ARM Cortex-A53 is too fast a CPU to show much change in performance.

I am testing flow offload on MT7621 and it seems to be stuck at 700-800 Mbps.

It's definitely visible in sirq.

Without flow offload:

CPU: 0% usr 0% sys 0% nic 56% idle 0% io 0% irq 43% sirq

With flow offload:

CPU: 0% usr 0% sys 0% nic 99% idle 0% io 0% irq 0% sirq
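
If you want to compare those two lines by script instead of by eye, here is a minimal Python sketch. It assumes the BusyBox top "CPU:" summary format exactly as pasted above; the function name parse_busybox_cpu is made up for illustration and is not part of any tool.

```python
import re

def parse_busybox_cpu(line):
    """Turn a BusyBox top 'CPU:' summary line into {field: percent}."""
    # Each field looks like '43% sirq': a number, a percent sign, a name.
    pairs = re.findall(r"(\d+)%\s+(\w+)", line)
    return {name: int(value) for value, name in pairs}

# The two lines quoted in this post.
without_offload = parse_busybox_cpu(
    "CPU: 0% usr 0% sys 0% nic 56% idle 0% io 0% irq 43% sirq")
with_offload = parse_busybox_cpu(
    "CPU: 0% usr 0% sys 0% nic 99% idle 0% io 0% irq 0% sirq")

# Percentage points of softirq time saved by enabling flow offload.
print(without_offload["sirq"] - with_offload["sirq"])  # prints 43
```

This only reads what top already printed; it says nothing about whether the traffic was actually routed through the box or generated on it.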

How did you get that? Zero sirq!?

EDIT: the following tests aren't useful as I was running iperf on the router. They only show that sw offloading is working, but can't say anything about hw offloading.

My RBM33G gets some benefit, but most of the CPU time still goes to sirq and sys when hw offloading is enabled:

Mem: 67660K used, 186060K free, 1544K shrd, 1732K buff, 16228K cached
CPU:   0% usr  25% sys   0% nic  37% idle   0% io   0% irq  36% sirq
Load average: 1.18 0.40 0.13 4/87 18703
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
18694 18685 root     R     1148   0%  25% iperf3 -s

[ 5] 0.00-94.00 sec 7.99 GBytes 730 Mbits/sec 4 sender

Sw offload gives better throughput but higher CPU usage:

Mem: 54752K used, 198968K free, 236K shrd, 1588K buff, 10140K cached
CPU:   1% usr  25% sys   0% nic  29% idle   0% io   0% irq  43% sirq
Load average: 0.94 0.50 0.21 3/93 3205
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
 3205  3196 root     R     1136   0%  25% iperf3 -s

[ 5] 0.00-205.90 sec 21.5 GBytes 898 Mbits/sec 70 sender

No offload gives the lowest throughput; the CPU line shows 29% sirq and 43% idle:

Mem: 55576K used, 198144K free, 236K shrd, 1588K buff, 10104K cached
CPU:   0% usr  26% sys   0% nic  43% idle   0% io   0% irq  29% sirq
Load average: 0.72 0.40 0.19 4/93 3225
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
 3215  3207 root     R     1132   0%  25% iperf3 -s

[ 5] 0.00-255.19 sec 20.7 GBytes 697 Mbits/sec 79 sender

Note: OpenWrt 19.07.4, r11208-ce6496d796. I rebooted between tests. The iperf client is connected by gigabit Ethernet directly to one of the RBM33G's Ethernet ports; the 3 ports are separated using VLANs.
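
As a sanity check on the three iperf3 summary lines above, the reported rate can be recomputed from the transferred size and the duration. This is a rough Python sketch, assuming iperf3 prints GBytes in binary units (2**30 bytes) and Mbits/sec in decimal units (10**6 bits); the helper name iperf_mbits_per_sec is hypothetical.

```python
def iperf_mbits_per_sec(gbytes, seconds):
    """Recompute an iperf3 rate from its GBytes and duration columns."""
    # Assumption: 1 GByte = 2**30 bytes, 1 Mbit = 10**6 bits.
    bits = gbytes * 2**30 * 8
    return bits / seconds / 1e6

# (label, GBytes, seconds, reported Mbits/sec) copied from the runs above.
runs = [
    ("hw offload", 7.99, 94.00, 730),
    ("sw offload", 21.5, 205.90, 898),
    ("no offload", 20.7, 255.19, 697),
]

for label, gbytes, seconds, reported in runs:
    computed = iperf_mbits_per_sec(gbytes, seconds)
    print(f"{label}: computed {computed:.0f} Mbits/sec, reported {reported}")
```

The recomputed values land within about 1 Mbit/s of what iperf3 printed; the small gap on the sw offload run is consistent with the GBytes column being rounded to three digits.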

Do not run iperf3 on your router. Flow offloading does not work that way.

His router has an ARM Cortex-A53 core; our MT7621ATs are MIPS 1004Kc.
I am getting the same numbers as you, but I have seen better numbers on MT7621 before this, with 1 Gbps throughput working, so I am not sure what caused the regression.
I run jperf as a standalone client/server.

BTW, MIPS1004Kc? I'm using MIPS24Kc binaries! Should I switch to MIPS1004Kc?

It doesn't matter. OpenWrt labels them MIPS24Kc, but the binaries are MIPS32R2, so they are compatible.

Ok, to recap:

  • the MT7621 is partly broken in 19.07.4 but works in trunk thanks to nbd's feature rewrite.
  • ARM Cortex A53 is working.
  • ... ?

What about ar71/ar79? There are a ton of Atheros/Qualcomm devices out there! On the MikroTik hAP ac (ar71xx) it was not working at all, not even sw offloading.

Hardware Flow Offloading only works on MediaTek SoCs (for example MT7621) and, as far as I know, only on 19.07.4 or 19.07-snapshot (also in old versions of 18.06.x), not on the 5.4 kernel. With it, sirq usage should be 0% and speeds should exceed 900 Mbps.

Is it working correctly? I am currently on 19.07.4 but I still have not been able to test it, because I use VPN policy routing and it seems to cause problems with more than one routing table.

With Software Flow Offloading you get faster speeds with less CPU usage than without it. As far as I know, it doesn't cause any problems.

OK, cool, I missed that one.