Qualcomm Fast Path For LEDE

WiFi is part of the LAN; under Interfaces it is bridged to lan, so most likely this is a different issue.
I am guessing it has to be the switch driver.
http://lists.infradead.org/pipermail/lede-dev/2017-March/006930.html

Thanks for thinking with me. I flashed the original 17.01.2 firmware (downloaded from the site) 2 days ago (after getting the dropping issues for the 2nd time) and it has run fine since then.
That's why I thought I'd try to isolate the problem by selecting just a couple of your patches at a time.

Try removing this patch
https://github.com/gwlim/mips74k-ar71xx-lede-patch/blob/lede-17.01/patch/059-increase-ag71xx-tx-ring-size.patch
Possibly related to this
https://patchwork.ozlabs.org/patch/743498/

Thanks, gwlim, I'll try to compile without it. (I already forked your repo :slight_smile: )

I compiled without it and flashed it, we will see. I'll report back a week later if all goes fine. Thanks!

Could you guys post the output of the following commands:
cat /sys/fast_classifier/debug_info
cat /sys/fast_classifier/exceptions

I've also made it compile in k4.9 trunk
https://github.com/dissent1/r7800/commit/5ae4a7425aac4211550e86060e556e6a3d5435cc

root@lede:~# cat /sys/fast_classifier/exceptions 
NO_IIF = 76348
CT_NO_CONFIRM = 22356
TCP_NOT_ASSURED = 20573
TCP_NOT_ESTABLISHED = 414
UNKNOW_PROTOCOL = 1807
NO_SRC_DEV = 190
NO_DEST_DEV = 7811580
WAIT_FOR_ACCELERATION = 54399
UPDATE_PROTOCOL_FAIL = 414
CT_DESTROY_MISS = 58743
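
For anyone comparing these counters between routers, here is a minimal sketch that parses the `NAME = value` lines and ranks them by count. It only assumes the output format pasted above; the function names `parse_exceptions` and `rank` are made up for illustration and are not part of fast-classifier.

```python
# Sketch: rank the fast_classifier exception counters pasted in this thread.
# Assumes each counter is printed as "NAME = value", as in the output above.

def parse_exceptions(text):
    """Return {counter_name: count} for every 'NAME = value' line."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if not sep:
            continue  # skips the shell prompt line and blank lines
        value = value.strip()
        if value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def rank(counters):
    """Return (name, count, percent_of_total) tuples, largest count first."""
    total = sum(counters.values())
    ordered = sorted(counters.items(), key=lambda kv: kv[1], reverse=True)
    return [(n, c, 100.0 * c / total if total else 0.0) for n, c in ordered]


SAMPLE = """root@lede:~# cat /sys/fast_classifier/exceptions
NO_IIF = 76348
CT_NO_CONFIRM = 22356
TCP_NOT_ASSURED = 20573
TCP_NOT_ESTABLISHED = 414
UNKNOW_PROTOCOL = 1807
NO_SRC_DEV = 190
NO_DEST_DEV = 7811580
WAIT_FOR_ACCELERATION = 54399
UPDATE_PROTOCOL_FAIL = 414
CT_DESTROY_MISS = 58743
"""

if __name__ == "__main__":
    for name, count, pct in rank(parse_exceptions(SAMPLE)):
        print(f"{name:24s} {count:10d} {pct:6.2f}%")
```

On the sample above, NO_DEST_DEV dominates with roughly 97% of all counted exceptions.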

And what's the first line of your debug_info output?
size=1899 offload=0 offload_no_match=0 offloaded=0 done=0 offloaded_fail=0 done_fail=0

root@lede:~# cat /sys/fast_classifier/debug_info
size=73 offload=0 offload_no_match=0 offloaded=0 done=0 offloaded_fail=0 done_fail=0
....list of ip address mac address etc...

As I dig deeper into fast-classifier, more questions arise. Have you verified that it actually gets invoked?

cat /sys/fast_classifier/debug_info shows a lot of tracked connections with none of them offloaded, connmark = 000000, and a very low hit counter. The connection table size keeps rising with each new connection, but the code that frees the connection entry and does sfe_connections_size-- is never invoked... causing a memory leak?
Another thing: cat /sys/fast_classifier/exceptions shows a counter of connections awaiting acceleration that also never seems to decrease, only increase.
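
To check this on your own router, a rough sketch like the one below parses the first line of debug_info and flags the "tracked but never offloaded" situation. It assumes the header format pasted earlier in this thread (space-separated `key=value` pairs); the helper names are hypothetical. To see whether `size` only ever grows, take two samples on the same router some time apart and compare them.

```python
# Sketch: inspect the first line of /sys/fast_classifier/debug_info.
# Assumes the "key=value key=value ..." header format pasted in this thread.

def parse_header(line):
    """Return {key: int} for every key=value token on the header line."""
    stats = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep and value.isdigit():
            stats[key] = int(value)
    return stats


def tracked_but_not_offloaded(stats):
    """True when connections are tracked but none has been offloaded."""
    return stats.get("size", 0) > 0 and stats.get("offloaded", 0) == 0


def size_growth(earlier, later):
    """Difference in tracked connections between two samples of one router."""
    return later.get("size", 0) - earlier.get("size", 0)


HEADER = ("size=73 offload=0 offload_no_match=0 offloaded=0 "
          "done=0 offloaded_fail=0 done_fail=0")

if __name__ == "__main__":
    stats = parse_header(HEADER)
    print(stats)
    print("tracked but not offloaded:", tracked_but_not_offloaded(stats))
```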

Is fast-classifier actually needed?

@gwlim I'm trying your build out on an Archer C7v2, but loading the fast_classifier module causes my SQM to stop working (or at least bufferbloat gets very bad). Any hints as to what I might be doing wrong?

Guys, what about IPQ8074 (802.11ax)? Does LEDE/OpenWrt support this chip? I notice the Linux kernel got qcom.ipq8074.pinctrl support?

Fast Path doesn't mean you can use SQM at gigabit speeds.
If you set SQM to 900 Mbps it is the same as SQM off, because your router still cannot process SQM at 900 Mbps.
On a WDR4300v1 overclocked to 730 MHz:
cake+layer_cake = 571 Mbits/sec
fq_codel+simple.qos = 663 Mbits/sec
Definitely higher than without fast path, but not gigabit either.
For this problem you can try overclocking your router if you want.
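
The point about the 900 Mbps setting can be put as a very simplified model: the shaper only controls the queue when its configured rate is below what the CPU can actually shape. The sketch below uses the two throughput figures quoted above as ceilings; the function names and the min() model are assumptions for illustration, not how SQM is implemented.

```python
# Sketch: simplified model of "SQM set above the CPU ceiling acts like SQM off".
# Ceilings are the WDR4300v1 (730 MHz) figures quoted in this post, in Mbit/s.

MEASURED_CEILING_MBPS = {
    "cake+layer_cake": 571,
    "fq_codel+simple.qos": 663,
}


def effective_rate(configured_mbps, cpu_ceiling_mbps):
    """Throughput is capped by whichever limit is lower."""
    return min(configured_mbps, cpu_ceiling_mbps)


def shaper_in_control(configured_mbps, cpu_ceiling_mbps):
    """The shaper only governs the link if its rate is below the CPU ceiling."""
    return configured_mbps < cpu_ceiling_mbps


if __name__ == "__main__":
    for setup, ceiling in MEASURED_CEILING_MBPS.items():
        for configured in (500, 900):
            print(setup, configured,
                  effective_rate(configured, ceiling),
                  shaper_in_control(configured, ceiling))
```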

I have also tried dslreports; on different web browsers I get different results.
At different times it connects to different servers and I get different results.
And I don't want to register for something as simple as a speedtest.
So I will not take dslreports results too seriously.
As long as I get consistent ping times in my games I am satisfied.

Hi @gwlim, I'm not testing it on a gigabit setting (my humble connection is 12/2.5 Mbps).

On dslreports, without fast_classifier, I get about 10.5/2.3 with low bufferbloat. Once I load the fast_classifier module, the connection goes to 12/2.5ish with large bufferbloat. I'm applying SQM to eth0 (which I think is correct) with 11300/2300 kbps using cake/layer_cake, although fq_codel/simple gave me similar results.

If I load fast_classifier while testing on dslreports (modprobe fast_classifier in the middle of the test), it will immediately raise the connection speed and raise the bufferbloat. But I feel I might be doing something wrong.

You should be applying it to eth0.2, the WAN interface.
Also, Fast Path saturates the connection more quickly than SQM can process; I always get a hump in speedtests, especially on downloads.
On uploads it is fine.
Perhaps since you are on 12 Mbps, fast path does not do anything for you.

This is somewhat to be expected, given that this speedtest is HTML5 based and is pretty much executed in the browser. So different results on different browsers are not noteworthy. The more relevant test is: do you get similar numbers when doing repeated tests (against similarly close servers)? Please also note that the detailed results will contain an estimate of the browser's speed and suitability for the test.

It picks the servers by default based on RTT/proximity and load, but if you do not want that, you are just one free registration away from selecting the set of servers from which your test should be served (so you do not explicitly pick the server, but you can restrict the set from which the test chooses). Oh you can also, and this alone puts the dslreports speedtest into its own category for open speedtests, select the number of measurement streams per direction. And potentially relevant to this topic you can set the test for 1Gbps+ speeds...

You can actually configure a lot of the test's options even without a registration (I believe everything short of the test duration); without a registration these changes just will not be persistent.

Honestly, I am not sure you actually looked that carefully at the test to begin with... :wink: Maybe it is worth your time to revisit some of these issues?

Well, good point, as long as you have a relevant test for latency under load all speedtests basically become "toys" :wink:

Best Regards

Okay, upon further investigation: which connection manager manages the offload, fast-classifier or shortcut-fe-cm, depends on which one is loaded first in the modprobe sequence. If you load fast-classifier before sfe-cm you'll notice the offloaded values rising in debug_info, and it really represents what's going on under the hood.
These 2 connection managers grab the same hook in dev.c, and it seems that only 1 should be loaded at a time. So it's better to test which one performs better.

According to the source code, sfe-cm is for offloading the bridge interface. Offloading NAT is assumed to be done in fast-classifier. If you echo 1 > /sys/fast_classifier/skip_to_bridge_ingress it will also offload the bridge.
But that's not the case in your setup. fast-classifier should be loaded without shortcut-fe-cm, or before shortcut-fe-cm; otherwise you don't get NAT offloading, because sfe-cm gets all the connections and passes them by if they are not on a bridge interface...
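
If you want to check which of the two was loaded first, a small sketch like this can read it out of /proc/modules. Two assumptions to verify on your build: that /proc/modules lists the most recently loaded module first, and that the modules show up as `fast_classifier` and `shortcut_fe_cm` (the second name is my guess at how shortcut-fe-cm appears there). The sample text in the sketch is hypothetical, not real router output.

```python
# Sketch: work out the load order of the two connection managers.
# ASSUMPTION: /proc/modules lists the most recently loaded module first.
# ASSUMPTION: module names below match what your build actually reports.

CONNECTION_MANAGERS = ("fast_classifier", "shortcut_fe_cm")


def load_order(proc_modules_text, names=CONNECTION_MANAGERS):
    """Return the given module names in the order they were loaded."""
    listed = [line.split()[0]
              for line in proc_modules_text.splitlines() if line.strip()]
    oldest_first = reversed(listed)  # undo the newest-first listing
    return [name for name in oldest_first if name in names]


# Hypothetical /proc/modules content, for illustration only.
SAMPLE = """shortcut_fe_cm 16384 0 - Live 0x00000000
fast_classifier 32768 0 - Live 0x00000000
shortcut_fe 65536 2 shortcut_fe_cm,fast_classifier, Live 0x00000000
"""

if __name__ == "__main__":
    order = load_order(SAMPLE)
    print("loaded first:", order[0] if order else "neither is loaded")
    # On a router: load_order(open("/proc/modules").read())
```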

Edit: seems like sfe-cm should be completely dropped... and in that case there is no need for so many netfilter modifications, because there's only 1 listener left: fast-classifier. The code of fast-classifier should be adjusted as well to work with unmodified netfilter, but that's not a problem. This will be a more generic solution that may have a chance to get upstream.

If you had actually bothered to test, you would know all 3 components are required to achieve gigabit NAT on MIPS74kc AR9344 and above.
And fast-classifier has a dependency on shortcut-fe-cm, if you had bothered to check the makefile.
The dependency is there for a reason.
I don't understand why you are second-guessing the code when all you need to do is test the solution.

Ah, so that's your approach: if it visibly works, it doesn't matter what's really happening...
I'll remind you what's happening:

There is not a single offloaded connection that went through all the checks of the fast-classifier driver. Why this combination works and provides such a benefit is an open question. Maybe it fully bypasses the kernel network stack as undefined behavior.

But again, it's up to you to use this as it is, but others should be aware that your solution is not working as defined by the driver.

Instead of throwing words you'd better check the code. Current solution just lacks something and the defined result may not be as high.

And no, fast-classifier does not depend on sfe-cm. It shares a header with sfe-cm, sfe and sfe-ipv6, but does not depend on sfe-cm. If you check the code, fast-classifier is a standalone, extended version of sfe-cm.
