I believe flow offloading really just helps speed up processing of packets that are traversing the router; packets originating or terminating at the router should not see a big improvement, as far as I know.
Well, I've just read up on it again and that is what it should do, true.
But for some reason the tests above are about 100 Mbits/s faster on MT7621.
TCP is 776 Mbits/s with offloading and 674 Mbits/s without.
UDP is 323 Mbits/s with offloading and 234 Mbits/s without.
I ran the test just now.
No idea what is going on there, maybe some side effect of sorts?
I managed to get my patches into the vanilla kernel about a month back, so OpenWrt will start using them automatically when it gets to the next stable branch (5.4+ probably, a long time ahead).
But about a week ago one of my patches was backported to all stable trees, including the one OpenWrt is using, so I figured I could rebase my patches.
The autobalancing function was removed for the vanilla kernel, so you have to change the interrupt affinity by hand, for example:
(the value is a bitmask; just search for "kernel bitmask affinity" or something like this).
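For anyone unsure about the bitmask: each bit stands for one CPU (bit 0 = CPU 0, bit 1 = CPU 1), and the kernel expects the value in hex when it is written to /proc/irq/<irq>/smp_affinity. Below is a minimal sketch of that conversion. The helper names cpus_to_mask and mask_to_cpus are made up for illustration and are not part of the patches.

```python
def cpus_to_mask(cpus):
    """Return the hex smp_affinity string selecting the given CPU indices.

    Hypothetical helper for illustration only.
    """
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu  # set the bit that corresponds to this CPU
    return format(mask, "x")


def mask_to_cpus(mask_hex):
    """Return the list of CPU indices selected by a hex affinity mask."""
    mask = int(mask_hex, 16)
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


if __name__ == "__main__":
    print(cpus_to_mask([0]))     # first CPU / VPE only
    print(cpus_to_mask([1]))     # second CPU / VPE only
    print(cpus_to_mask([0, 1]))  # both
    print(mask_to_cpus("3"))
```

On a two-VPE system this gives 1 for the first VPE, 2 for the second and 3 for both. Which IRQ number to write the value to depends on your board, so check /proc/interrupts first.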
Copy the files into target/linux/lantiq/patches-4.19 of your openwrt directory and reconfigure the kernel. The kernel I based this on is 4.19.65 and the OpenWrt tree is the current git version (about a day old). Don't forget that pastebin may eat the last empty line.
Too bad, it was a great thing for the router. It automatically spread the interrupts between the two VPEs, but no more. Everything is mapped to VPE 0 even though the smp_affinity settings are set to 3, meaning use both CPUs.
Well, there were objections to it. Switching VPEs on every interrupt event has a big overhead (the code must migrate from one VPE to the other). There is always the possibility of fixing the irqbalance daemon, doing a one-time remapping of the interrupts between cores, or just patching the balancing code (just a few lines).
As of now, the wifi performance is identical whether the IRQ balancing is working or not. IMO it wouldn't really make a big difference in performance, because the CPU gets fully utilized if you use Samba v4. I just did an iperf3 test for 2 minutes and I am getting 56MB/s on average when receiving data from the router and 110MB/s when sending data to the router. As I am using my phone, I can't really test whether the CPU is being fully utilized, but the load average was about 2.5 for a 2-minute UDP data transfer, each way separately.
So I've been using your secondary VPE IRQ patches successfully for a while - just manually punting vrx200_rx, ath9k and ath10k onto the second CPU.
Since your earlier patches have been forgotten on the mailing list and your revised (non-auto-balancing) patches are now upstream (well done on that), have you considered raising a pull request on the OpenWrt GitHub to include these in kernel 4.19 in git?
Just been dealing with the packet steering script issue, which mostly affects VDSL2 users who should be getting above 100Mb/s. See PR2553 for fixing it to use all CPUs but also disable it by default (in preference to proper IRQ balancing).
Well, 19.07.1 is performing fine for LAN->DSL WAN on vrx200/HH5A with the packet steering scripts and software flow offloading. However, because ptm and ath10k are both pinned to CPU0, WLAN0->DSL WAN often tops out at ~85 MB/s, only sometimes reaching 100-104 MB/s to tease that it is capable.
I believe manually punting ath10k away from DSL (ptm_mailbox_isr) will fix this, so I am going to try and test these patches against 19.07.1. Has anyone used these with 18.07.1 for HH5A successfully yet to fix 5G WiFi->DSL performance?
I have had a bad experience with VDSL -> WLAN using the 4.19.65 patches on a W8970.
It barely touches 23Mb in speedtest.net even with software offload and IRQ mapping. I got ~30Mb in 18.104.22.168 with software offloading and packet steering. @achmar16 I'm looking forward to testing the irq-smp patches with the recent 19.07.1.
You need to understand that the 0904 patch is for the Ethernet driver and not for VDSL. I think you need to check your LAN to LAN performance on the router itself. For that, run two instances of iperf3 in TCP mode on 127.0.0.1 on the same router, one as the server and the other as the client. You can also check your CPU load in htop while the iperf3 test is running. I am using patches 0901 and 0902 with v19.07.1, and the best I am getting is 4mb/s with Samba v2 at 100% CPU load while transferring files.
One more thing: if you want more speed, I suggest you change your build environment to v18.06.x and apply the patches there. AFAIR I was able to get 10mb/s at some point in v18.06.