many thanks for your instructions!
Implemented it on a Ubiquiti EdgeRouter ER-X with the latest LEDE trunk and easily get 930 Mbit/s NAT performance (quick test).
You can either use:
git reset --hard HEAD
or just run make menuconfig again and unselect the two kmods, then build.
You can also simply un-apply ("revert") the commit as a patch:
patch -R -p 1 -i patchfile
git apply --ignore-space-change --ignore-whitespace 1269.patch
then in menuconfig
Kernel Modules > Network Support > kmod-fast-classifier and kmod-shortcut-fe (not kmod-shortcut-fe-cm)
I just did this, and the build was fine for my Archer C7v2. However, I had a problem with my VPN setup (IPsec roadwarrior config using Strongswan): I can VPN into my router just fine, but I could not connect to an RDP session behind my router (rdp-client --> ipsec tunnel over internet --> VPN server on router --> RDP server in local LAN)
The RDP connection would start, but then disconnect after about 10 seconds. I needed to do "rmmod fast-classifier" to get everything stable again.
Before the rmmod, fast path seemed to be working ok:
root@router ~# cat /sys/fast_classifier/exceptions
NO_IIF = 12682
NO_CT = 1
CT_NO_CONFIRM = 727
TCP_NOT_ASSURED = 186
WAIT_FOR_ACCELERATION = 10445
CT_DESTROY_MISS = 1220
root@router ~# head -n1 /sys/fast_classifier/debug_info
size=84 offload=0 offload_no_match=0 offloaded=41 done=39 offl_dbg_msg_fail=41 done_dbg_msg_fail=39
Any idea what could be wrong?
I'm afraid I don't know. It's working fine here with an OpenVPN server running, but I've not used strongswan before (nor do I know much about it). It might be better to contact the upstream project on CodeAurora about the issue.
Thanks for the hint, but isn't this working with 'lede-17.01' branch? (I just tried out and there's no such option in Kernel config.)
Worked for me, did you follow the instructions explicitly?
Also, are you configuring the kernel directly (which is possible), or running make menuconfig in the root LEDE checkout directory?
One thing - you may need to move the patches from hack-4.4 to patches-4.4 (look in the patch to see the full path)
That was it and also: 4.9 modifications have to be removed completely from it to be able to apply the patch on current lede-17.01 branch. (I haven't compiled it yet, but it should be fine.)
Thanks for your help!
So, do we know exactly how SFE should behave with SQM?
Aren't SQM and fastpath opposing tradeoffs?
sqm: use more CPU to better manage scarce capacity/bandwidth
fastpath: use less CPU to better handle high capacity (because the CPU is too slow otherwise)
No, because no matter how much bandwidth you have, it will be full at some point.
I've now enabled SQM. Remember that SQM is only on your WAN interface, whereas fastpath will accelerate things locally as well, so they are not necessarily competing.
Question: if you run SQM on your WAN interface, is the sirq load during a (saturating) speedtest lower with Fast Path enabled or not? If yes, by what magnitude? And a final question: is SQM@WAN without Fast Path already throttling your internet (asked differently, does SQM alone already max out your CPU)?
I am trying to figure out what to recommend to sqm users (obviously without needing to test fast path myself -EOUTOFTIME)
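For anyone who wants to put a number on the sirq question, here is a minimal sketch (not something posted in this thread) that computes the softirq share of CPU time between two samples of the aggregate "cpu" line in /proc/stat. The function name sirq_percent is made up for illustration, and it assumes the standard field order (user nice system idle iowait irq softirq steal) and an awk on the router, which busybox normally provides.

```shell
#!/bin/sh
# Hypothetical helper: print the softirq share (integer percent) between two
# "cpu ..." lines taken from /proc/stat at different times.
# Assumed field order after the "cpu" label:
#   user nice system idle iowait irq softirq steal [guest guest_nice]
sirq_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            # Sum user..steal only (fields 2-9); guest time is already in user/nice.
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            t[NR] = total
            s[NR] = $8          # softirq jiffies
        }
        END {
            dt = t[2] - t[1]
            if (dt <= 0) { print 0; exit }
            printf "%d\n", (s[2] - s[1]) * 100 / dt
        }'
}

# Usage on the router (take one sample before and one during the speedtest):
#   a=$(head -n1 /proc/stat); sleep 10; b=$(head -n1 /proc/stat)
#   sirq_percent "$a" "$b"
```

Running it once with fast-classifier loaded and once after rmmod, during the same kind of speedtest, would give two comparable figures.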
It's difficult to monitor in reality as I'm running on an R7800 which doesn't really break a sweat handling the 80/20 FTTC product I'm on even when fastpath is disabled.
FastPath comes into its own for local file transfer though, but then, that doesn't have SQM applied.
I finally compiled the lede-17.01 branch with his patch yesterday on an Archer C5 v1, and the debug numbers barely increase. If I disable it, the SQM numbers start to grow. (SQM@WAN)
@moeller0, unfortunately I have a crappy connection (76/20 Mbps) so I can't really test your cases.
Now I'm compiling gwlim's current version. I'll only enable kmod-fast-classifier and see how it behaves with SQM.
I know it might be too much to ask. But, did anyone successfully build with this patch on
@clyang, I haven't tried yet. Technically it shouldn't be a problem. Memory should be enough if you made your own lite (4MB) image. But which device do you have in mind with 4MB and a gigabit switch? On standard 100 Mbps Ethernet there isn't any real performance increase, except a lower SIRQ, so maybe wifi might benefit from that.
Forgive me since I'm a beginner. I can't really tell if it's working for me either, but I'll post my stats anyway. I'm on an extremely bad connection compared to everyone else here at 10Mbps down and 1Mbps up. I recently was playing around with SQM so the stats for it are a bit younger, but it was running the whole time alongside fast-classifier before tweaking.
root@Downstairs:~# uptime
 13:06:20 up 17:51, load average: 0.06, 0.01, 0.00
root@Downstairs:~# cat /sys/fast_classifier/debug_info
size=36 offload=0 offload_no_match=0 offloaded=3338 done=3329 offl_dbg_msg_fail=3338 done_dbg_msg_fail=3329
(Then MAC addresses and local IPs and ports on the left and outside IPs on the right)
root@Downstairs:~# tc -s qdisc
qdisc noqueue 0: dev lo root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc cake 802e: dev eth0 root refcnt 2 bandwidth 900Kbit diffserv3 dual-srchost nat rtt 200.0ms noatm overhead 18 via-ethernet mpu 64
 Sent 38928241 bytes 405502 pkt (dropped 410, overlimits 495872 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 386848b of 4Mb
 capacity estimate: 900Kbit
                 Bulk  Best Effort      Voice
  thresh     56248bit      900Kbit    225Kbit
  target      323.0ms       20.2ms     80.7ms
  interval    646.0ms      210.2ms    161.5ms
  pk_delay        0us       15.2ms      6.7ms
  av_delay        0us        3.0ms      1.4ms
  sp_delay        0us        114us       29us
  pkts              0       403270       2642
  bytes             0     38964390     372875
  way_inds          0         1432          0
  way_miss          0         5309        456
  way_cols          0            0          0
  drops             0          410          0
  marks             0            0          0
  sp_flows          0            0          0
  bk_flows          0            1          0
  un_flows          0            0          0
  max_len           0         1514        590
qdisc ingress ffff: dev eth0 parent ffff:fff1 ----------------
 Sent 883176955 bytes 997916 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc fq_codel 0: dev eth1 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 114947366 bytes 320182 pkt (dropped 0, overlimits 0 requeues 3)
 backlog 0b 0p requeues 3
 maxpacket 549 drop_overlimit 0 new_flow_count 6 ecn_mark 0
 new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev br-lan root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev wlan0 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev wlan0.sta1 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc cake 802f: dev ifb4eth0 root refcnt 2 bandwidth 9Mbit besteffort dual-dsthost nat wash rtt 200.0ms noatm overhead 18 via-ethernet mpu 64
 Sent 895657214 bytes 996912 pkt (dropped 1004, overlimits 898838 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 371008b of 4Mb
 capacity estimate: 9Mbit
                Tin 0
  thresh        9Mbit
  target       10.0ms
  interval    200.0ms
  pk_delay      166us
  av_delay       10us
  sp_delay        2us
  pkts         997916
  bytes     897147779
  way_inds      44213
  way_miss       5403
  way_cols          0
  drops          1004
  marks             3
  sp_flows          0
  bk_flows          1
  un_flows          0
  max_len        1514
The reason I applied this was to improve wireless transfer speeds with my NAS.
Fast-Classifier with Shortcut-FE only. Obtained from Dissent1's RFC commit.
Pasting my Bufferbloat results (8 streams down, 4 streams up [as much as my cable connection will allow]; High Res Bufferbloat, 30 secs upload and download, Dodge Compression enabled). Don't know if this will help, @moeller0. A relatively quiet network allowed me to test further. The Fast-Classifier debug info rose to 3800+ in 3 hrs with around 2 active clients at the moment. @philjohn
(ignore the A+; I forgot to change it a while ago, prior to upgrading, when it was still 5 Mbps max)
The offloaded count of 3338 in debug_info shows that it's working.
Check that number keeps increasing, but looks like you're good to go.
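To make "check that number keeps increasing" concrete, here is a rough sketch in plain POSIX sh (so it should run under busybox ash). It only relies on the header line format pasted earlier in this thread; the function names offloaded_count and offloaded_increased are invented for illustration and are not part of fast-classifier.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the "offloaded=" field from a
# debug_info header line such as
#   size=36 offload=0 offload_no_match=0 offloaded=3338 done=3329 ...
offloaded_count() {
    for field in $1; do
        case "$field" in
            # Matches only "offloaded=", not "offload=" or "offload_no_match=".
            offloaded=*) echo "${field#offloaded=}"; return 0 ;;
        esac
    done
    return 1    # field not found
}

# Succeeds (exit 0) only if the counter grew between two header lines.
offloaded_increased() {
    before=$(offloaded_count "$1") || return 2
    after=$(offloaded_count "$2") || return 2
    [ "$after" -gt "$before" ]
}

# Usage on the router, with some traffic flowing in between:
#   a=$(head -n1 /sys/fast_classifier/debug_info); sleep 60
#   b=$(head -n1 /sys/fast_classifier/debug_info)
#   offloaded_increased "$a" "$b" && echo "still offloading"
```

A counter that stays flat on a busy network would suggest connections are no longer being offloaded.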
Which version of dissent1's patch did you apply, and to which branch? And was SQM enabled on the WAN interface?
I tried this pull request with lede-17.01 branch with SQM applied to WAN, but they didn't work together well.
Current gwlim's version when only fast-classifier compiled with SQM@WAN:
- it only "accelerated" 1 out of 2 separate VPN connections (don't ask why)
Current gwlim's version with both fast-classifier and shortcut-fe-cm compiled and SQM@WAN and
rmmod shortcut-fe-cm (!!!):
- both separate VPN connections were accelerated
So, for now, I added this to rc.local with B.b) gwlim's version:
echo 1 > /sys/fast_classifier/skip_to_bridge_ingress
rmmod shortcut-fe-cm
Thanks to both of you for your work!
EDIT: It turned out that both B.a) and B.b) are wrong with gwlim's patch.