Qualcomm Fast Path For LEDE

The SFE gains differ a lot between SoCs, and the size of the difference is surprising. My NAT speed results are:

TPLink TL-WDR4900:

    Normal LEDE: 460 Mbps
    LEDE with SFE: 630 Mbps

Netgear WNDR3800:

    Normal LEDE: 230 Mbps
    LEDE with SFE: 910 Mbps

Perhaps MIPS benefits more from the code than MPC does.
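
To put the two results side by side, here is a small sketch that turns the figures above into speedup ratios. The `speedup` helper is just an illustrative name, not part of any package.

```shell
#!/bin/sh
# speedup BASE SFE: print the SFE/base throughput ratio with two decimals.
# (Hypothetical helper, only for reading the numbers posted above.)
speedup() {
    awk -v base="$1" -v sfe="$2" 'BEGIN { printf "%.2f\n", sfe / base }'
}

speedup 460 630   # TL-WDR4900: prints 1.37
speedup 230 910   # WNDR3800:   prints 3.96
```

So the WNDR3800 gains roughly 4x while the TL-WDR4900 gains under 1.4x.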

Try unloading netlink and related packages (nf tables?). It seems some package is reserving the hook.
If so, then we need that big netfilter patch I wanted to avoid.


# /etc/init.d/nlbwmon stop
# rmmod nf_conntrack_netlink
# modprobe fast-classifier

The fast-classifier kmod did load this time, with no more "1 module could not be probed" error. But the router died several seconds after that.

I'm reverting the debug commit, and going to try again.

No, don't; that seems to be the reason.
I'll add the patch, enable it and adjust the commit. Sigh.

After reverting the debug commit, the router no longer dies, and shortcut-fe seems to be working (sirq is much lower: 30% vs 90% when downloading at 100 Mbit/s).


# head -n1 /sys/fast_classifier/debug_info 
size=38 offload=0 offload_no_match=0 offloaded=0 done=0 offloaded_fail=10 done_fail=3

Is that normal?

Edit: nlbwmon and kmod-nf-conntrack-netlink were also removed.
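
For anyone who wants to watch a single counter instead of eyeballing the whole line, a minimal sketch follows. It only splits the `key=value` pairs; the thread doesn't document what each counter means, and `fc_field` is a made-up helper name.

```shell
#!/bin/sh
# fc_field NAME LINE: print the value of NAME from a debug_info line.
# (Hypothetical helper; it does not interpret the counters.)
fc_field() {
    name=$1
    line=$2
    for pair in $line; do
        case $pair in
            # Match "NAME=" exactly, so "offload" does not match "offload_no_match".
            "$name="*) echo "${pair#*=}"; return 0 ;;
        esac
    done
    return 1
}

# Sample line copied from the post above. On a router you would use:
#   line=$(head -n1 /sys/fast_classifier/debug_info)
line='size=38 offload=0 offload_no_match=0 offloaded=0 done=0 offloaded_fail=10 done_fail=3'
fc_field offloaded_fail "$line"   # prints 10
```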

Yes, it is. It was because of netlink :) But we still need to patch the kernel a bit more so that this doesn't happen if someone installs the nf-netlink package. Please try now with all your packages installed.

net/netfilter/nf_conntrack_netlink.c: In function 'ctnetlink_conntrack_event':
net/netfilter/nf_conntrack_netlink.c:648:23: error: 'item' undeclared (first use in this function)
  struct nf_conn *ct = item->ct;

The buggy QSDK patch put its changes in the wrong place. I've updated the commit in my previous post.

edit: I've found one more bug in that patch that isn't obvious on the surface. I'll clean things up once again, for sure this time.

edit: try now, updated previous post.

The next patch would create the file net/netfilter/nf_conntrack_rtcache.c,
which already exists!  Applying it anyway.
patching file net/netfilter/nf_conntrack_rtcache.c
Hunk #1 FAILED at 1.
1 out of 1 hunk FAILED -- saving rejects to file net/netfilter/nf_conntrack_rtcache.c.rej
Patch failed!  Please fix /home/azuwis/src/lede/target/linux/generic/patches-4.4/953-net-conntrack-events-support-multiple-registrant.patch!

Would you build a firmware for my 1041N v2 to test? Thank you.

Updated; removed the leftover.

@dissent1: You seem to have taken the lead in this thread, which was started by @gwlim. Did you basically confirm that the kernel module "shortcut-fe-cm" is not required for SFE to work?

After unloading the nf-conntrack-netlink module, the fast-classifier module loads fine.

Great Work!

Yes, I'm pretty confident in that. Upon deeper digging into the code I can conclude that you should use either sfe-cm or fast-classifier. Fast-classifier is a clone of sfe-cm with additional functions and is preferable. It has some additional checks, adds statistics and allows bridge offloading as well:
echo 1 > /sys/fast_classifier/skip_to_bridge_ingress

These two modules shouldn't be used together. Fast-classifier decides to offload a connection once it has seen 128 packets (you can adjust this with /sys/fast_classifier/offload_at_pkts), but sfe-cm offloads immediately, so packets never reach fast-classifier because the offloading rule has already been created by sfe-cm. It's a race condition.
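
If you want to experiment with that threshold, something like the sketch below should do. The sysfs path is the one mentioned above; the function name, the `FC_OFFLOAD_FILE` override and the digits-only check are my own assumptions, not anything the module requires.

```shell
#!/bin/sh
# Path from the post above; overridable so the helper can be tried off-router.
FC_OFFLOAD_FILE=${FC_OFFLOAD_FILE:-/sys/fast_classifier/offload_at_pkts}

# set_offload_threshold N: write N to the threshold file.
# The digits-only check is an assumption, not a documented limit.
set_offload_threshold() {
    case $1 in
        ''|*[!0-9]*)
            echo "threshold must be a whole number" >&2
            return 1
            ;;
    esac
    echo "$1" > "$FC_OFFLOAD_FILE"
}
```

On the router, `set_offload_threshold 64` followed by `cat /sys/fast_classifier/offload_at_pkts` should show the new value, assuming fast-classifier is loaded.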

You should be able to load fast-classifier along with netlink if you use my latest commit. Please confirm; it's important.

Ok. I'm going to test and respond in a few hours.

Bump. Can anyone confirm whether FastPath is, or should be, compiled to work on the x86_64 image?
If not yet, will it be ported?

Cherry-picked 93ae487 onto LEDE 97eb8ab and run-tested on a Mercury MW4530R v1 (ar71xx, mips_24kc): both nlbwmon (which depends on kmod-nf-conntrack-netlink) and shortcut-fe worked.

# head -n1 /sys/fast_classifier/debug_info 
size=62 offload=0 offload_no_match=0 offloaded=0 done=0 offloaded_fail=50 done_fail=38

No errors appeared in dmesg.

One thing I noticed is that shortcut-fe breaks SQM, at least in my case with sch_cake.

My ISP advertises 100 Mbit/s of Internet bandwidth, and I was limiting my download speed to 80000 kbit/s in SQM:

config queue 'wan'
	option qdisc 'fq_codel'
	option qdisc_advanced '0'
	option ingress_ecn 'ECN'
	option egress_ecn 'ECN'
	option qdisc_really_really_advanced '0'
	option itarget 'auto'
	option etarget 'auto'
	option linklayer 'none'
	option interface 'pppoe-wan'
	option upload '4400'
	option script 'piece_of_cake.qos'
	option enabled '1'
	option download '80000'

Tested using iperf3 and ping:

  • SQM enabled, shortcut-fe enabled: iperf3 93.3 Mbits/sec, ping 21.8ms
  • SQM enabled, shortcut-fe disabled: iperf3 75.2 Mbits/sec, ping 6.0ms
  • SQM disabled, shortcut-fe enabled: iperf3 93.4 Mbits/sec, ping 18.3ms
  • SQM disabled, shortcut-fe disabled: iperf3 93.1 Mbits/sec, ping 18.4ms

Shortcut-fe enabled means modprobe fast-classifier, shortcut-fe disabled means rmmod fast-classifier.

Edit: Shortcut-fe only breaks SQM ingress, egress works fine.
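
A rough way to read the numbers above: any download result above the configured 80000 kbit/s limit means the ingress shaper is not being applied. The sketch below only does that comparison; `exceeds_shaper` is an illustrative name, and iperf3 reports payload throughput, so treat it as a sanity check rather than an exact measure.

```shell
#!/bin/sh
# exceeds_shaper MEASURED_MBIT LIMIT_KBIT: succeed if the measured
# throughput is above the shaper limit. (Hypothetical helper.)
exceeds_shaper() {
    awk -v m="$1" -v l="$2" 'BEGIN { exit !(m * 1000 > l) }'
}

# SQM-enabled results from the list above, against "option download '80000'".
for rate in 93.3 75.2; do
    if exceeds_shaper "$rate" 80000; then
        echo "$rate Mbit/s: above the limit, shaper bypassed"
    else
        echo "$rate Mbit/s: within the limit"
    fi
done
```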

Yes, it's presumably architecture-independent. You just pick the package and it's compiled for your target.

Yes, it's a known issue; I'm not sure if it's easily fixable. Kong from dd-wrt says that it requires a QoS kernel modification and that he may try to do it.

@dissent1 I will only use it once it is available as part of the mainline or the stable x86_64 image.