From what I understood of the Fast Path code, IPv6 should also get 'acceleration', as the basic idea of the Fast Path code is to bypass unnecessary filtering checks once a connection has been established. So regardless of whether there's NAT involved, Fast Path should accelerate connections, unless the firewall is turned off.
Some checks would need to be added (some #ifdef, perhaps?), because if you try to disable IPv6 in the entire build system, SFE fails to compile.
I will look into it in depth when I'm free.
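As a rough illustration of the idea described above (skip the filtering checks once a connection has been established), here is a toy model in Python. It is only a sketch: `ToyForwarder`, `allow_https` and the example addresses are made up for illustration and are not part of SFE or fast-classifier, which are kernel modules written in C. The point is that the flow key does not care whether the addresses are IPv4 or IPv6.

```python
from typing import Callable, Set, Tuple

# (source address, destination address, source port, destination port, protocol)
FlowKey = Tuple[str, str, int, int, str]


class ToyForwarder:
    """Toy model: full rule check for the first packet, cached verdict afterwards."""

    def __init__(self, slow_path_check: Callable[[FlowKey], bool]) -> None:
        self.slow_path_check = slow_path_check      # stands in for the firewall rules
        self.established: Set[FlowKey] = set()      # flows that were already accepted
        self.slow_hits = 0
        self.fast_hits = 0

    def forward(self, key: FlowKey) -> bool:
        # Fast path: the flow is known, so the rule check is skipped entirely.
        if key in self.established:
            self.fast_hits += 1
            return True
        # Slow path: run the full check and remember the flow if it is accepted.
        self.slow_hits += 1
        if self.slow_path_check(key):
            self.established.add(key)
            return True
        return False


def allow_https(key: FlowKey) -> bool:
    """Hypothetical firewall rule: accept only TCP traffic to port 443."""
    _src, _dst, _sport, dport, proto = key
    return proto == "tcp" and dport == 443


fw = ToyForwarder(allow_https)
v4_flow = ("192.0.2.10", "198.51.100.7", 50000, 443, "tcp")
v6_flow = ("2001:db8::10", "2001:db8:1::7", 50001, 443, "tcp")
for flow in (v4_flow, v6_flow):
    for _ in range(3):
        fw.forward(flow)
print(fw.slow_hits, fw.fast_hits)  # prints: 2 4
```

Each flow pays for the rule check once and then takes the cached path, which is why an IPv6 flow that never reaches the fast path would show up as much higher CPU load.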
Is it possible to compile Fast Path on the 17.01.5 snapshot with kernel 4.4.147? When I try to compile it I get an error. The router is a WDR3600.
I'm using this.
Thanks for checking it!
When you say you bumped the version, do you mean that it was in the new August release? Because when I tried to sysupgrade, it gave an error saying that the version wasn't generic or something like that.
Am I doing something wrong?
Similar thing happened to me yesterday:
- compiled an image from snapshot openwrt 2 weeks ago (flashed it at that time)
- compiled an image from snapshot openwrt yesterday and tried to flash it: didn't work with the above message
As it turned out, in my case at least (Archer C7 v2), they had renamed the model name in ".config", and that's why the error came up. The solution for me was to force the flash via ssh ("sysupgrade -F name.bin").
I'm not saying that you can do the same, though; it can brick your router!
WNDR4300 V1 LEDE-17.01
Make error: 'struct sk_buff' has no member named 'fast_forwarded'

    make: Entering directory '/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/linux-4.4.151'
      CC [M]  /home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/sfe_ipv4.o
    /home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/sfe_ipv4.c: In function 'sfe_ipv4_recv_udp':
    /home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/sfe_ipv4.c:1366:5: error: 'struct sk_buff' has no member named 'fast_forwarded'
      skb->fast_forwarded = 1;
      ^
    /home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/sfe_ipv4.c: In function 'sfe_ipv4_recv_tcp':
    /home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/sfe_ipv4.c:1909:5: error: 'struct sk_buff' has no member named 'fast_forwarded'
      skb->fast_forwarded = 1;
      ^
    scripts/Makefile.build:269: recipe for target '/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/sfe_ipv4.o' failed
    make: *** [/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/sfe_ipv4.o] Error 1
    Makefile:1421: recipe for target '_module_/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe' failed
    make: *** [_module_/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe] Error 2
    make: Leaving directory '/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/linux-4.4.151'
    Makefile:117: recipe for target '/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/.built' failed
    make: *** [/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/.built] Error 2
    make: Leaving directory '/home/lixiao/lede/source/package/lean/shortcut-fe'
    package/Makefile:105: recipe for target 'package/lean/shortcut-fe/compile' failed
    make: *** [package/lean/shortcut-fe/compile] Error 2
    make: Leaving directory '/home/lixiao/lede/source'
    package/Makefile:101: recipe for target '/home/lixiao/lede/source/staging_dir/target-mips_24kc_musl-1.1.16/stamp/.package_compile' failed
    make: *** [/home/lixiao/lede/source/staging_dir/target-mips_24kc_musl-1.1.16/stamp/.package_compile] Error 2
    make: Leaving directory '/home/lixiao/lede/source'
    /home/lixiao/lede/source/include/toplevel.mk:205: recipe for target 'world' failed
    make: *** [world] Error 2
I see what you're saying, but IPv6 now behaves the way IPv4 did before your "tweaks" to the firmware: sirq with IPv4 stays at approximately 30-40%, while with IPv6 it goes to 99%, slowing my 200Mbps connection down to 110Mbps.
Over the last two updates I saw no change in IPv6's lack of acceleration.
Have you been working on something, @gwlim, or am I missing something?
If it is like you say it is, @quarky, why doesn't it work?
My SFE patches have modifications to work when PBR (policy-based routing) is applied to packets to be routed. Apparently that change also got IPv6 accelerated, while before that it wasn't. I only made changes to the fast-classifier CM, though. I did not use sfe-cm. As my changes worked for what I needed, I didn't spend too much time trying to understand why.
It is likely because the existing fast-classifier code is unable to correctly find the destination output interface for IPv6 packets.
So what's your opinion?
What should I do?
You can try compiling your own build with my SFE patches applied.
Yesterday I installed juppin's build on my Archer C7 v2 and managed to do a speedtest that reached 572Mbps down and 610Mbps up (testing the switch, not wifi). The fastest I'd managed with the normal snapshot build was 350/275. Granted, a better test would have been to switch traffic between two computers rather than download packets over the internet, but it feels like an improvement.
Next I'm going to try installing the breed bootloader on the device and will follow Pedro's instructions to overclock the Archer C7 to 1000MHz. I'll either end up with a brick or with a router that can finally do some level of justice to our new 1Gbps fiber connection.
@czyzczyz it would be really cool to measure how much improvement a ~30% overclock brings in terms of packet routing performance.
Regarding flashing the bootloader, if you follow the instructions step by step and the router doesn't lose power while flashing, everything will be OK.
It looks like replies to that overclocking thread are closed. Pity, as I've got some results to post. They're probably a little relevant to this thread, as I'd assume they'll impact the performance of software flow offloading.
I managed to get the breed bootloader installed and got my CPU running at 1000MHz. When I set the memory speed to Pedro's 760MHz, it left me with a router that handed out an IP address but no gateway and was inaccessible (I tried setting a static IP on my computer, but no dice). So I booted back into breed, dropped the memory speed to 600MHz, and now everything appears to be great. I've got an Archer C7 v2 with a CPU at 1000MHz that's running an OpenWrt snapshot and appears to be doing so with no issues. The router doesn't even feel warm.
Results of "openssl speed md5 sha1 sha256 sha512" before and after overclock:
Before overclock:

| type | 16 bytes | 64 bytes | 256 bytes | 1024 bytes | 8192 bytes |
|---|---|---|---|---|---|
| md5 | 3238.06k | 11667.18k | 31559.12k | 54389.72k | 68505.94k |
| sha1 | 3501.39k | 10919.23k | 26035.01k | 39378.22k | 46105.34k |
| sha256 | 3924.35k | 9100.04k | 15783.04k | 19461.24k | 20831.87k |
| sha512 | 814.69k | 3257.65k | 4490.40k | 6058.87k | 6750.87k |

After overclock:

| type | 16 bytes | 64 bytes | 256 bytes | 1024 bytes | 8192 bytes |
|---|---|---|---|---|---|
| md5 | 5565.14k | 18416.74k | 47276.66k | 77263.15k | 95332.93k |
| sha1 | 5662.74k | 16754.37k | 38145.64k | 55831.23k | 63852.92k |
| sha256 | 5705.37k | 12839.87k | 21992.41k | 26999.59k | 28838.87k |
| sha512 | 1136.98k | 4552.83k | 6248.48k | 8448.00k | 9375.74k |
So a ~39% increase in speed of those calculations?
I ran a stopwatch on my phone for both the before and after runs, and clocked the first run at 1:02 and the second at 1:03 -- I'd assume those are both the same 1 minute of actual time plus a bit of human mechanical latency getting to the 'stop' button.
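For anyone who wants to check the ~39% estimate, here is a small Python sketch that computes the relative gain from the 8192-byte column of the two `openssl speed` runs above and compares it with the clock ratio (stock 720MHz to 1000MHz). The helper name `speedup_percent` is just something made up for this example, and picking the 8192-byte column is an assumption about which figures are most representative.

```python
# Throughput in thousands of bytes per second for 8192-byte blocks,
# copied from the "before" and "after" openssl speed results.
before = {"md5": 68505.94, "sha1": 46105.34, "sha256": 20831.87, "sha512": 6750.87}
after = {"md5": 95332.93, "sha1": 63852.92, "sha256": 28838.87, "sha512": 9375.74}


def speedup_percent(old: float, new: float) -> float:
    """Relative gain of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


# Gain per digest algorithm.
gains = {name: speedup_percent(before[name], after[name]) for name in before}
for name, gain in gains.items():
    print(f"{name}: {gain:.1f}%")

# Gain in CPU clock, for comparison (720MHz stock, 1000MHz overclocked).
clock_gain = speedup_percent(720, 1000)
print(f"clock: {clock_gain:.1f}%")
```

All four digests land between 38% and 40%, which is about the same as the increase in CPU clock, so these hash benchmarks appear to scale almost linearly with the overclock.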
I did a couple of speed tests and got results in the 400s to 600s for both up and down, but it's not really the best test: the limiting factor may be the speed of whichever server the speedtest was using rather than the overclocked Archer C7 v2 with software flow offloading. At the very least, those numbers are around the same as I was getting before the overclock and are better than I ever achieved before installing the firmware that included flow offloading. I'll do some more speed tests after I replace my current router with this one and set it up as the main router on my network, and if it returns better numbers I'll report them.
Ethernet switch speeds appear to have improved with overclocking combined with flow offloading -- this is a high score for today:
Here's a google spreadsheet showing the result of a bunch of tests, including speed tests of the following configurations:
- Cable internet (300Mbps down, 20Mbps up), Archer C7 v4 running the 08-22-2018 snapshot of OpenWrt
- 1Gbps fiber internet, Archer C7 v4 running the 08-22-2018 snapshot of OpenWrt
- 1Gbps fiber internet, Archer C7 v2 running juppin's snapshot build with flow offloading (stock 720MHz CPU speed)
- 1Gbps fiber internet, Archer C7 v2 running juppin's snapshot build with flow offloading, CPU overclocked to 1000MHz, memory overclocked to 600MHz
And a little chart (included in the spreadsheet). Flow offloading led to a clear increase in speed. Conclusions are a little foggy due to the low sample size (I've only got one non-overclocked flow offloading speed test data point) and an unreliable testing method that depends on the speed of remote servers. But I think I'm seeing a bit of a gain from overclocking.
Can you test the difference for overclocked without flow offload?
I can try some more tests later today if I get time. I've got an ODROID-XU4 that ought to be able to saturate things with iperf to cut remote server speed variability out of the results.
But if you look at that one spike in the chart labelled "v2_fiber_720_flow", that's the existing measurement with flow offloading but no overclocking.
On a different note, here's a test result for the router's 5GHz 802.11ac speed with flow offloading and overclocking. I'm fairly distant from the router but with no walls in between, and I'm getting a connection at -93dBm noise and -66dBm RSSI.
(edit - replaced a 271/238 test with a newer 373/293 wifi test result - computer in same position with same connection strength)
It would be interesting to see how the Fast Path patches compare (in a benchmark test) with the “official” flow offload that’s now part of the 4.14+ kernel.