Qualcomm Fast Path For LEDE


#671

From what I understood of the Fast Path code, IPv6 should also get 'acceleration', as the basic idea of the Fast Path code is to bypass unnecessary filtering checks once a connection has been established. So regardless of whether there's NAT involved, Fast Path should accelerate connections, unless the firewall is turned off.


#672

Some checks would need to be added (some ifdef?), because if you try to disable IPv6 in the entire build system, SFE fails to compile.


#673

I will look into it in depth when I have time.


#674

Hi @gwlim
Is it possible to compile Fast Path on the 17.01.5 snapshot with kernel 4.4.147? I get an error when I try to compile. The router is a WDR3600.


#675

I'm using this.


#676

Hey!
Thanks for checking it!
When you say you bumped the version, do you mean that it was in the new August release? Because when I tried to sysupgrade, it gave an error saying that the version wasn't generic or something like that.
Am I doing something wrong?
Thanks!


#677

Similar thing happened to me yesterday:

  • compiled an image from snapshot openwrt 2 weeks ago (flashed it at that time)
  • compiled an image from snapshot openwrt yesterday and tried to flash it: didn't work with the above message

As it turned out, in my case at least (Archer C7 v2), they had renamed the model in ".config", and that's why the error came up. The solution for me was to force flash it via ssh ("sysupgrade -F name.bin").

I'm not saying that you can do the same though, it can brick your router!


#678

WNDR4300 V1 LEDE-17.01

make error: ‘struct sk_buff’ has no member named ‘fast_forwarded’
make[4]: Entering directory '/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/linux-4.4.151'
  CC [M]  /home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/sfe_ipv4.o
/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/sfe_ipv4.c: In function 'sfe_ipv4_recv_udp':
/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/sfe_ipv4.c:1366:5: error: 'struct sk_buff' has no member named 'fast_forwarded'
  skb->fast_forwarded = 1;
     ^
/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/sfe_ipv4.c: In function 'sfe_ipv4_recv_tcp':
/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/sfe_ipv4.c:1909:5: error: 'struct sk_buff' has no member named 'fast_forwarded'
  skb->fast_forwarded = 1;
     ^
scripts/Makefile.build:269: recipe for target '/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/sfe_ipv4.o' failed
make[5]: *** [/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/sfe_ipv4.o] Error 1
Makefile:1421: recipe for target '_module_/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe' failed
make[4]: *** [_module_/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe] Error 2
make[4]: Leaving directory '/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/linux-4.4.151'
Makefile:117: recipe for target '/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/.built' failed
make[3]: *** [/home/lixiao/lede/source/build_dir/target-mips_24kc_musl-1.1.16/linux-ar71xx_nand/shortcut-fe/.built] Error 2
make[3]: Leaving directory '/home/lixiao/lede/source/package/lean/shortcut-fe'
package/Makefile:105: recipe for target 'package/lean/shortcut-fe/compile' failed
make[2]: *** [package/lean/shortcut-fe/compile] Error 2
make[2]: Leaving directory '/home/lixiao/lede/source'
package/Makefile:101: recipe for target '/home/lixiao/lede/source/staging_dir/target-mips_24kc_musl-1.1.16/stamp/.package_compile' failed
make[1]: *** [/home/lixiao/lede/source/staging_dir/target-mips_24kc_musl-1.1.16/stamp/.package_compile] Error 2
make[1]: Leaving directory '/home/lixiao/lede/source'
/home/lixiao/lede/source/include/toplevel.mk:205: recipe for target 'world' failed
make: *** [world] Error 2

#679

Hey!

I see what you're saying, but IPv6 now behaves the way IPv4 did before your "tweaks" to the firmware: sirq with IPv4 stays at approximately 30-40%, while with IPv6 it goes to 99%, slowing my 200mbps connection down to 110mbps.
Over the last two updates I saw no change in IPv6's lack of acceleration.
Have you been working on something, @gwlim, or am I missing something?
If it is like you say it is, @quarky, why doesn't it work?

Thanks!


#680

My SFE patches have modifications to make it work when PBR is applied to packets to be routed. Apparently that change also caused IPv6 to be accelerated, while before that it wasn't. I only made changes to the fast-classifier CM, though. I did not use sfe-cm. As my changes did what I needed, I didn't spend too much time trying to understand why.

It is likely because the existing fast-classifier code was unable to correctly find the destination output interface for IPv6 packets.


#681

So what's your opinion?
What should I do?


#682

You can try compiling your own build with my SFE patches applied.


#683

Yesterday I installed juppin's build on my archer c7 v2 and managed to do a speedtest that reached 572mbps down and 610mbps up (testing the switch, not wifi). The fastest I'd managed with the normal snapshot build was 350/275. Granted a better test would have been to try and switch between two computers rather than download packets over the internet, but it feels like an improvement.

Next I'm going to try installing the breed bootloader on the device and will follow Pedro's instructions to overclock the Archer c7 to 1000MHz. I'll either end up with a brick or with a router that can finally do some level of justice to our new 1gbps fiber connection.


#684

@czyzczyz it would be really cool to measure how much improvement a ~30% overclock brings in terms of packet routing performance.

Regarding flashing the bootloader: if you follow the instructions step by step and the router doesn't lose power while flashing, everything will be OK.

Good luck! :)


#685

It looks like replies to that overclocking thread are closed. Pity, as I've got some results to post. They're probably a little relevant to this thread, as I'd assume they'll impact the performance of software flow offloading.

I managed to get the breed bootloader installed and got my CPU running at 1000MHz. When I set the memory speed to Pedro's 760MHz, it left me with a router that handed out an IP address but no gateway and was inaccessible (I tried setting a static IP on my computer, but no dice). So I booted back into breed, dropped the memory speed to 600MHz, and now everything appears to be great. I've got an Archer c7 v2 with a CPU at 1000MHz that's running openwrt snapshot and appears to be doing so with no issues. The router doesn't even feel warm.

Results of "openssl speed md5 sha1 sha256 sha512" before and after overclock:

before:

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5               3238.06k    11667.18k    31559.12k    54389.72k    68505.94k
sha1              3501.39k    10919.23k    26035.01k    39378.22k    46105.34k
sha256            3924.35k     9100.04k    15783.04k    19461.24k    20831.87k
sha512             814.69k     3257.65k     4490.40k     6058.87k     6750.87k

after:

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5               5565.14k    18416.74k    47276.66k    77263.15k    95332.93k
sha1              5662.74k    16754.37k    38145.64k    55831.23k    63852.92k
sha256            5705.37k    12839.87k    21992.41k    26999.59k    28838.87k
sha512            1136.98k     4552.83k     6248.48k     8448.00k     9375.74k

So a ~39% increase in speed of those calculations?
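To check that estimate, here is a small sketch that computes the percentage gain for every cell of the two tables above. The throughput figures are copied from the tables; the 720MHz stock clock is taken from the test configurations listed later in this thread, and the names (`percent_gain`, `gains_by_size`, `GAINS`, `CLOCK_GAIN`) are just illustrative.

```python
# Throughput figures (thousands of bytes per second) copied from the
# "before" and "after" openssl speed tables above.
SIZES = [16, 64, 256, 1024, 8192]

BEFORE = {
    "md5":    [3238.06, 11667.18, 31559.12, 54389.72, 68505.94],
    "sha1":   [3501.39, 10919.23, 26035.01, 39378.22, 46105.34],
    "sha256": [3924.35, 9100.04, 15783.04, 19461.24, 20831.87],
    "sha512": [814.69, 3257.65, 4490.40, 6058.87, 6750.87],
}

AFTER = {
    "md5":    [5565.14, 18416.74, 47276.66, 77263.15, 95332.93],
    "sha1":   [5662.74, 16754.37, 38145.64, 55831.23, 63852.92],
    "sha256": [5705.37, 12839.87, 21992.41, 26999.59, 28838.87],
    "sha512": [1136.98, 4552.83, 6248.48, 8448.00, 9375.74],
}


def percent_gain(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return (after / before - 1.0) * 100.0


def gains_by_size(before, after):
    """Per-algorithm list of percentage gains, one entry per block size."""
    return {
        algo: [percent_gain(b, a) for b, a in zip(before[algo], after[algo])]
        for algo in before
    }


GAINS = gains_by_size(BEFORE, AFTER)

# Assumption: stock clock of 720MHz (from the configuration list later in
# the thread), overclocked to 1000MHz.
CLOCK_GAIN = percent_gain(720, 1000)

if __name__ == "__main__":
    print("algo    " + "".join(f"{s:>8}B" for s in SIZES))
    for algo, row in GAINS.items():
        print(f"{algo:<8}" + "".join(f"{g:>8.1f}%" for g in row))
    print(f"clock gain 720 -> 1000MHz: {CLOCK_GAIN:.1f}%")
```

In the 8192-byte column the gains come out at roughly 38 to 39%, which is close to the 720 to 1000MHz clock ratio. Some of the small-block rows show larger gains, so the ~39% figure fits the large-block results best.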

I ran a stopwatch on my phone for both the before and after runs, and clocked the first run at 1:02 and the second at 1:03 -- I'd assume those are both the same 1 minute of actual time plus a bit of human mechanical latency getting to the 'stop' button.

I did a couple of speed tests and got into the 400-600s for up and down, but it's not really the best test: the limiting factor may be the speed provided by whichever server the speedtest was using rather than the overclocked Archer c7 v2 with software flow offloading. At the very least, those numbers are around the same as I was getting before the overclock and are better than I ever achieved before installing the firmware that included flow offloading. I'll do some more speed tests after I set this router up as the main router on my network, and if they return better numbers I'll report them.


#686

Ethernet switch speeds appear to have improved with overclocking combined with flow offloading -- this is a high score for today:
[image: speed test screenshot]

Here's a Google spreadsheet showing the result of a bunch of tests, including speed tests of the following configurations:

  • Cable internet (300mbps down, 20mbps up), Archer c7 v4 running the 08-22-2018 snapshot of openwrt
  • 1Gbps Fiber internet, Archer c7 v4 running the 08-22-2018 snapshot of openwrt
  • 1Gbps Fiber internet, Archer c7 v2 running juppin's snapshot build with flow offloading, stock 720MHz CPU speed
  • 1Gbps Fiber internet, Archer c7 v2 running juppin's snapshot build with flow offloading, CPU overclocked to 1000MHz, memory overclocked to 600MHz.

And a little chart (included in the spreadsheet). Flow offloading led to a clear increase in speed. Conclusions are a little foggy due to the low sample size (I've only got one non-overclocked flow offloading speed test data point) and an unreliable testing method that depends on the speed of remote servers. But I think I'm seeing a bit of a gain from overclocking.

[image: chart of the speed test results]


#687

Can you test the difference for overclocked without flow offload?


#688

I can try some more tests later today if I get time. I've got an ODROID-XU4 that ought to be able to saturate things with iperf to cut remote server speed variability out of the results.

But if you look at that one spike in the chart labelled "v2_fiber_720_flow", that's the existing measurement with flow offloading but no overclocking.

On a different note - here's a test result for the router's 5GHz 802.11ac speed with flow offloading and overclocking. I'm fairly distant from the router but with no walls in between and am getting a connection at -93dBm noise and -66dBm RSSI.

[image: wifi speed test screenshot]

(edit - replaced a 271/238 test with a newer 373/293 wifi test result - computer in same position with same connection strength)


#689

It would be interesting to see how the Fast Path patches compare (in benchmark tests) with the “official” flow offload that’s now part of the 4.14+ kernel.


#690

On the TP-Link 1043ND V1, gwlim's FastPath image performs way better than the ath79 flow offload, especially in a PPPoE environment.

FastPath:
DHCP: 641 Mbits/sec
PPPoE: 443 Mbits/sec

K4.14 Flow offload: see juppin's ath79 build topic