It's already in the Qualcomm Code Aurora git repo IIRC, and they occasionally push things upstream, but I'm not sure how much traction this would get for landing in the mainline kernel.
Many thanks for your instructions!
I implemented it on a Ubiquiti EdgeRouter ER-X with the latest LEDE trunk and easily get 930 Mbit/s NAT performance (quick test).
You can either use:
git reset --hard HEAD
or just run make menuconfig again, deselect the two kmods, then rebuild.
You can also simply un-apply ("revert") the commit as a patch:
patch -R -p 1 -i patchfile
wget https://patch-diff.githubusercontent.com/raw/lede-project/source/pull/1269.patch
git apply --ignore-space-change --ignore-whitespace 1269.patch
then in menuconfig
Kernel Modules > Network Support > kmod-fast-classifier and kmod-shortcut-fe (not kmod-shortcut-fe-cm)
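A slightly more cautious variant of the apply step is to check that the patch fits before touching the tree. This is only a sketch: the function name apply_checked is made up for illustration, and it assumes you run it from the root of the LEDE checkout with 1269.patch already downloaded.

```shell
#!/bin/sh
# Hypothetical helper: apply a patch only if "git apply --check" says it fits.
apply_checked() {
    patchfile="$1"
    # Refuse early if the file is missing (e.g. the wget step failed).
    if [ ! -f "$patchfile" ]; then
        echo "patch file not found: $patchfile" >&2
        return 1
    fi
    # --check reports whether the patch would apply, without changing files.
    if git apply --check --ignore-space-change --ignore-whitespace "$patchfile"; then
        git apply --ignore-space-change --ignore-whitespace "$patchfile"
    else
        echo "patch does not apply cleanly; tree left untouched" >&2
        return 1
    fi
}

# Usage (from the LEDE source root):
#   apply_checked 1269.patch
```

If the check fails, nothing has been modified, so there is nothing to reset afterwards.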
I just did this, and the build was fine for my Archer C7v2. However, I had a problem with my VPN setup (IPsec roadwarrior config using Strongswan): I can VPN into my router just fine, but I could not connect to an RDP session behind my router (rdp-client --> ipsec tunnel over internet --> VPN server on router --> RDP server in local LAN)
The RDP connection would start, but then disconnect after about 10 seconds. I had to run "rmmod fast-classifier" to get everything stable again.
Before the rmmod, fast path seemed to be working ok:
root@router ~# cat /sys/fast_classifier/exceptions
NO_IIF = 12682
NO_CT = 1
CT_NO_CONFIRM = 727
TCP_NOT_ASSURED = 186
WAIT_FOR_ACCELERATION = 10445
CT_DESTROY_MISS = 1220
root@router ~# head -n1 /sys/fast_classifier/debug_info
size=84 offload=0 offload_no_match=0 offloaded=41 done=39 offl_dbg_msg_fail=41 done_dbg_msg_fail=39
Any idea what could be wrong?
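For anyone comparing numbers, the summary line of debug_info quoted above can be split into individual counters with a small helper. This is a sketch only: debug_field is a made-up name, and what each counter actually means is not documented in this thread, so the helper just extracts values.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one key=value field from a
# debug_info summary line read on stdin.
debug_field() {
    key="$1"
    # One key=value pair per line, then match the key exactly so that
    # "offload" does not also match "offloaded".
    tr ' ' '\n' | awk -F= -v k="$key" '$1 == k { print $2 }'
}

# On the router this would be fed from the sysfs file:
#   head -n1 /sys/fast_classifier/debug_info | debug_field offloaded
# Here the line quoted in the post above is used as sample input.
line='size=84 offload=0 offload_no_match=0 offloaded=41 done=39 offl_dbg_msg_fail=41 done_dbg_msg_fail=39'
echo "offloaded: $(echo "$line" | debug_field offloaded)"
echo "done:      $(echo "$line" | debug_field done)"
```

Sampling the same field before and after a test run gives a rough idea of whether connections are being offloaded at all.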
I'm afraid I don't know. It's working fine here with an OpenVPN server running, but I've not used strongSwan before (nor do I know much about it). It might be better to contact the upstream project on CodeAurora about the issue.
Thanks for the hint, but is this not working with the 'lede-17.01' branch? (I just tried it out and there's no such option in the kernel config.)
Worked for me, did you follow the instructions explicitly?
Also, are you configuring the kernel directly (which is possible), or running make menuconfig in the root of the LEDE checkout?
One thing - you may need to move the patches from hack-4.4 to patches-4.4 (look in the patch to see the full path)
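To see which paths the patch touches, the diff headers can be listed directly. A minimal sketch, assuming the patch is a git-style unified diff with "+++ b/<path>" header lines; patch_targets is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: list the paths a git-style unified diff would create
# or modify, read from stdin. Only the "+++ b/<path>" header lines are used.
patch_targets() {
    sed -n 's|^+++ b/||p'
}

# Usage:
#   patch_targets < 1269.patch | grep hack-4.4
```

Anything listed under a hack-4.4 directory is a candidate for moving to the matching patches-4.4 directory, as described above.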
That was it. Also, the 4.9 modifications have to be removed from the patch completely before it will apply on the current lede-17.01 branch. (I haven't compiled it yet, but it should be fine.)
Thanks for your help!
So, do we know exactly how SFE should behave with SQM?
Thanks
Aren't SQM and fastpath opposing tradeoffs?
As in:
SQM: use more CPU to better manage scarce capacity/bandwidth
fastpath: use less CPU to better handle high capacity (because the CPU is too slow otherwise)
No, because no matter how much bandwidth there is, it will fill up at some point.
I've now enabled SQM. Remember that SQM is only on your WAN interface, whereas fastpath will accelerate things locally as well, so they are not necessarily competing.
Question: if you run SQM on your WAN interface, is the sirq load during a (saturating) speedtest lower with Fast Path enabled or not? If yes, by what magnitude? And a final question: is SQM@WAN without Fast Path already throttling your internet (asked differently, does SQM alone already max out your CPU)?
I am trying to figure out what to recommend to SQM users (obviously without needing to test Fast Path myself, -EOUTOFTIME).
Best Regards
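One way to put a number on the sirq question above is to sample the aggregate "cpu" line of /proc/stat before and after a speedtest and compare the softirq counter against the total. This is a rough sketch, not a calibrated measurement: sirq_percent is a made-up name, and summing every field as the total is a simplification.

```shell
#!/bin/sh
# Hypothetical helper: given two aggregate "cpu" lines from /proc/stat
# (before and after a test), print the softirq share of the elapsed CPU time.
# Field order after the "cpu" label: user nice system idle iowait irq softirq ...
sirq_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        { total = 0
          for (i = 2; i <= NF; i++) total += $i   # simplification: sum all fields
          t[NR] = total; s[NR] = $8 }             # $8 is the softirq counter
        END {
          dt = t[2] - t[1]
          if (dt <= 0) { print "0.0"; exit }
          printf "%.1f\n", 100 * (s[2] - s[1]) / dt
        }'
}

# Usage on the router, sampling around a 30 second speedtest:
#   before="$(head -n1 /proc/stat)"; sleep 30; after="$(head -n1 /proc/stat)"
#   sirq_percent "$before" "$after"
```

Running it once with fast-classifier loaded and once after rmmod, at the same SQM settings, would give the two figures to compare.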
It's difficult to monitor in reality as I'm running on an R7800 which doesn't really break a sweat handling the 80/20 FTTC product I'm on even when fastpath is disabled.
FastPath comes into its own for local file transfer though, but then, that doesn't have SQM applied.
-ECATCH22
I finally compiled the lede-17.01 branch with his patch yesterday on an Archer C5 v1, and the debug numbers barely increase. If I disable it, the SQM numbers start to grow. (SQM@WAN)
@moeller0, unfortunately I have a crappy connection (76/20 Mbps) so I can't really test your cases.
Now I'm compiling gwlim's current version; I'll only enable kmod-fast-classifier and see how it behaves with SQM.
I know it might be too much to ask, but did anyone successfully build with this patch on 4MB flash?
@clyang, I haven't tried yet. Technically it shouldn't be a problem. Memory should be enough if you build your own lite (4MB) image. But which device do you have in mind with 4MB and a gigabit switch? On standard 100 Mbps Ethernet there isn't any real performance increase, except a lower SIRQ load, so maybe wifi might benefit from that.
Forgive me since I'm a beginner. I can't really tell if it's working for me either, but I'll post my stats anyway. I'm on an extremely bad connection compared to everyone else here at 10Mbps down and 1Mbps up. I recently was playing around with SQM so the stats for it are a bit younger, but it was running the whole time alongside fast-classifier before tweaking.
root@Downstairs:~# uptime
13:06:20 up 17:51, load average: 0.06, 0.01, 0.00
root@Downstairs:~# cat /sys/fast_classifier/debug_info
size=36 offload=0 offload_no_match=0 offloaded=3338 done=3329 offl_dbg_msg_fail=3338 done_dbg_msg_fail=3329
(Then MAC addresses and local IPs and ports on the left and outside IPs on the right)
root@Downstairs:~# tc -s qdisc
qdisc noqueue 0: dev lo root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0
qdisc cake 802e: dev eth0 root refcnt 2 bandwidth 900Kbit diffserv3 dual-srchost nat rtt 200.0ms noatm overhead 18 via-ethernet mpu 64
Sent 38928241 bytes 405502 pkt (dropped 410, overlimits 495872 requeues 0)
backlog 0b 0p requeues 0
memory used: 386848b of 4Mb
capacity estimate: 900Kbit
                  Bulk Best Effort       Voice
thresh        56248bit     900Kbit     225Kbit
target         323.0ms      20.2ms      80.7ms
interval       646.0ms     210.2ms     161.5ms
pk_delay           0us      15.2ms       6.7ms
av_delay           0us       3.0ms       1.4ms
sp_delay           0us       114us        29us
pkts                 0      403270        2642
bytes                0    38964390      372875
way_inds             0        1432           0
way_miss             0        5309         456
way_cols             0           0           0
drops                0         410           0
marks                0           0           0
sp_flows             0           0           0
bk_flows             0           1           0
un_flows             0           0           0
max_len              0        1514         590
qdisc ingress ffff: dev eth0 parent ffff:fff1 ----------------
Sent 883176955 bytes 997916 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc fq_codel 0: dev eth1 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
Sent 114947366 bytes 320182 pkt (dropped 0, overlimits 0 requeues 3)
backlog 0b 0p requeues 3
maxpacket 549 drop_overlimit 0 new_flow_count 6 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev br-lan root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev wlan0 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc noqueue 0: dev wlan0.sta1 root refcnt 2
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
qdisc cake 802f: dev ifb4eth0 root refcnt 2 bandwidth 9Mbit besteffort dual-dsthost nat wash rtt 200.0ms noatm overhead 18 via-ethernet mpu 64
Sent 895657214 bytes 996912 pkt (dropped 1004, overlimits 898838 requeues 0)
backlog 0b 0p requeues 0
memory used: 371008b of 4Mb
capacity estimate: 9Mbit
Tin 0
thresh 9Mbit
target 10.0ms
interval 200.0ms
pk_delay 166us
av_delay 10us
sp_delay 2us
pkts 997916
bytes 897147779
way_inds 44213
way_miss 5403
way_cols 0
drops 1004
marks 3
sp_flows 0
bk_flows 1
un_flows 0
max_len 1514
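To compare runs with and without fast-classifier, the drop counts in the "Sent ..." lines above can be turned into percentages. A small sketch, assuming the line layout shown in this output; drop_percent is a made-up name, and the percentage is relative to packets sent. For the ifb4eth0 cake line above (1004 drops, 996912 packets) that works out to roughly 0.1%.

```shell
#!/bin/sh
# Hypothetical helper: read "tc -s qdisc" style text on stdin and print, for
# each "Sent ... pkt (dropped N, ..." line, the drops as a percentage of
# packets sent.
drop_percent() {
    awk '$1 == "Sent" && $5 == "pkt" {
        dropped = $7
        sub(/,$/, "", dropped)               # strip the trailing comma
        if ($4 > 0) printf "%.3f\n", 100 * dropped / $4
        else        print "n/a"
    }'
}

# Usage on the router:
#   tc -s qdisc show dev ifb4eth0 | drop_percent
```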
The reason I applied this was to improve wireless transfer speeds with my NAS.
Fast-Classifier with Shortcut-FE only. Obtained from Dissent1's RFC commit.
EDIT:
Pasting my Bufferbloat results (8 streams down, 4 streams up [as much as my cable connection will allow]; High Res Bufferbloat, 30 secs upload and download, Dodge Compression enabled). Don't know if this will help, @moeller0. A relatively quiet network allowed me to test further. Fast-Classifier debug info rose to 3800+ in 3 hrs with around 2 active clients at the moment. @philjohn
(Ignore the A+; I forgot to change it a while ago, prior to upgrading, when it was still 5 Mbps max.)