OK, good news: SW and HW offloading seem to work again, though as advised by @nikito7 I performed a reboot first.
basic 520 Mbit/s
SW 675 Mbit/s
HW 975 Mbit/s
I will leave the router for a couple of days to test the stability.
So a reboot is required after toggling the firewall setting? Is `grep OFFLOAD /proc/net/nf_conntrack` displaying any results even before the reboot?
I didn't try both before rebooting.
I'm using an R6220 with OpenWrt SNAPSHOT r18781-8d8d26ba42 / LuCI Master git-22.025.79016-22e2bfb
WAN: 1 Gbit/s, IPv4 only
With the current snapshot you don't need a reboot.
I have just done tests with and without HW offloading. I can confirm it doesn't need a reboot to toggle.
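For anyone who prefers the CLI to LuCI, the same toggle-and-verify cycle can be done with uci; a minimal sketch, assuming the stock fw4 defaults section in /etc/config/firewall:

# enable software flow offloading; flow_offloading_hw additionally enables hardware offload
uci set firewall.@defaults[0].flow_offloading='1'
uci set firewall.@defaults[0].flow_offloading_hw='1'
uci commit firewall
/etc/init.d/firewall restart

# pass some traffic, then look for [OFFLOAD] / [HW_OFFLOAD] entries
grep OFFLOAD /proc/net/nf_conntrack | tail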
Hi all.
21h running.
So far, everything is working as expected.
I'll keep it running until 24h.
Does HW offload work with PPPoE and/or VLAN?
EDIT: Ignore the nonsense below and see my post several posts down. Software and hardware offloading are working on my MT7621 ER-X just fine.
I'd say "No." MT7621 offloading still does not work. At least it does not do anything on my ER-X.
The following speed tests on my ER-X are, in order: no offloading, software offloading, and hardware offloading.
Average CPU load over all 4 CPUs (threads, actually; it's only a dual core) was ~57% in all three tests.
Latency is nothing to write home about with SQM disabled.
If I run CAKE, I cannot get more than about ~100 Mbps download. With SQM I cannot get more than ~10 Mbps upload without latency degrading, which seems weird because the CPU can handle 100 Mbps on the download. But it does drop latency to ~35 ms, so at least there is that.
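(For context, SQM here is the sqm-scripts package; a minimal /etc/config/sqm sketch with cake, where the interface name and bandwidth figures are hypothetical placeholders:)

config queue
	option enabled '1'
	option interface 'eth0'        # hypothetical WAN interface
	option download '450000'       # kbit/s, placeholder shaped rate
	option upload '20000'          # kbit/s, placeholder shaped rate
	option qdisc 'cake'
	option script 'piece_of_cake.qos'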
OpenWrt SNAPSHOT, r18785-8072bf3322
-----------------------------------------------------
root@ER-X:~# speedtest-netperf.sh
2022-02-11 13:55:52 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are sequential, each with 5 simultaneous streams.
............................................................
Download: 469.54 Mbps
Latency: [in msec, 60 pings, 0.00% packet loss]
Min: 25.043
10pct: 83.626
Median: 91.570
Avg: 90.871
90pct: 98.602
Max: 118.500
CPU Load: [in % busy (avg +/- std dev), 57 samples]
cpu0: 76.8 +/- 5.2
cpu1: 41.2 +/- 4.9
cpu2: 57.8 +/- 5.9
cpu3: 52.6 +/- 5.0
Overhead: [in % used of total CPU available]
netperf: 40.3
.............................................................
Upload: 22.43 Mbps
Latency: [in msec, 61 pings, 0.00% packet loss]
Min: 16.432
10pct: 68.120
Median: 119.730
Avg: 121.320
90pct: 163.118
Max: 237.840
CPU Load: [in % busy (avg +/- std dev), 58 samples]
cpu0: 5.8 +/- 2.4
cpu1: 7.6 +/- 3.7
cpu2: 0.8 +/- 0.9
cpu3: 6.6 +/- 3.3
Overhead: [in % used of total CPU available]
netperf: 0.9
root@ER-X:~# speedtest-netperf.sh
2022-02-11 13:58:43 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are sequential, each with 5 simultaneous streams.
............................................................
Download: 473.15 Mbps
Latency: [in msec, 60 pings, 0.00% packet loss]
Min: 13.494
10pct: 64.737
Median: 78.561
Avg: 76.163
90pct: 84.895
Max: 87.701
CPU Load: [in % busy (avg +/- std dev), 57 samples]
cpu0: 93.5 +/- 2.6
cpu1: 77.6 +/- 3.8
cpu2: 22.6 +/- 9.5
cpu3: 33.4 +/- 8.3
Overhead: [in % used of total CPU available]
netperf: 43.8
.............................................................
Upload: 23.11 Mbps
Latency: [in msec, 61 pings, 0.00% packet loss]
Min: 24.566
10pct: 74.462
Median: 124.533
Avg: 122.029
90pct: 159.199
Max: 202.953
CPU Load: [in % busy (avg +/- std dev), 58 samples]
cpu0: 8.6 +/- 2.9
cpu1: 8.1 +/- 2.9
cpu2: 0.2 +/- 0.5
cpu3: 0.2 +/- 0.4
Overhead: [in % used of total CPU available]
netperf: 1.0
root@ER-X:~# speedtest-netperf.sh
2022-02-11 14:01:27 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are sequential, each with 5 simultaneous streams.
............................................................
Download: 471.98 Mbps
Latency: [in msec, 60 pings, 0.00% packet loss]
Min: 35.867
10pct: 83.508
Median: 90.189
Avg: 89.929
90pct: 96.680
Max: 104.288
CPU Load: [in % busy (avg +/- std dev), 57 samples]
cpu0: 70.8 +/- 5.6
cpu1: 38.4 +/- 6.2
cpu2: 56.6 +/- 5.7
cpu3: 63.4 +/- 5.4
Overhead: [in % used of total CPU available]
netperf: 40.1
.............................................................
Upload: 23.07 Mbps
Latency: [in msec, 61 pings, 0.00% packet loss]
Min: 30.433
10pct: 72.117
Median: 113.173
Avg: 116.564
90pct: 159.658
Max: 197.925
CPU Load: [in % busy (avg +/- std dev), 58 samples]
cpu0: 7.7 +/- 3.1
cpu1: 6.5 +/- 3.1
cpu2: 2.8 +/- 1.7
cpu3: 3.7 +/- 2.1
Overhead: [in % used of total CPU available]
netperf: 0.9
root@ER-X:~#
Can you please check the output of `grep OFFLOAD /proc/net/nf_conntrack` during any of these test cases? And check that the flowtable was properly initialized: `nft list flowtables`?
Can't tell, I don't use either.
Without any offloading checked, no results.
With software offloading enabled, here is a snippet of the grep output:
ipv4 2 tcp 6 src=10.23.40.236 dst=104.16.248.249 sport=53642 dport=443 packets=7805 bytes=920995 src=104.16.248.249 dst=172.x.x.x sport=443 dport=53642 packets=6339 bytes=1847039 [OFFLOAD] mark=0 zone=0 use=3
ipv6 10 udp 17 src=2603:6081:8e00:00a7:x:x:x:x dst=2a03:2880:f02c:010e:face:b00c:0000:0002 sport=41186 dport=443 packets=2 bytes=1413 src=2a03:2880:f02c:010e:face:b00c:0000:0002 dst=2603:6081:8e00:00a7:x:x:x:x sport=443 dport=41186 packets=15 bytes=6877 [OFFLOAD] mark=0 zone=0 use=3
ipv6 10 tcp 6 src=2603:6081:8e00:00a7:x:x:x:x dst=2a03:2880:f011:001e:face:b00c:0000:2825 sport=47030 dport=443 packets=243 bytes=43664 src=2a03:2880:f011:001e:face:b00c:0000:2825 dst=2603:6081:8e00:00a7:x:x:x:x sport=443 dport=47030 packets=316 bytes=42269 [OFFLOAD] mark=0 zone=0 use=3
ipv4 2 tcp 6 src=10.23.43.196 dst=52.87.247.190 sport=64091 dport=31006 packets=21 bytes=1259 src=52.87.247.190 dst=172.x.x.x sport=31006 dport=64091 packets=14 bytes=584 [OFFLOAD] mark=0 zone=0 use=3
ipv6 10 udp 17 src=2603:6081:8e00:00a7:x:x:x:x dst=2606:4700:0000:0000:0000:0000:6812:1690 sport=55471 dport=443 packets=61 bytes=9656 src=2606:4700:0000:0000:0000:0000:6812:1690 dst=2603:6081:8e00:00a7:x:x:x:x sport=443 dport=55471 packets=157 bytes=181332 [OFFLOAD] mark=0 zone=0 use=3
and after a test with hardware offloading checked, these are the last few lines of grep output:
ipv4 2 udp 17 src=10.23.43.126 dst=208.83.246.21 sport=35384 dport=53 packets=1 bytes=62 src=208.83.246.21 dst=172.x.x.x sport=53 dport=35384 packets=1 bytes=146 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv6 10 udp 17 src=2603:6081:8e00:00a7:x:x:x:x dst=2607:f8b0:4002:0c08:0000:0000:0000:005f sport=58900 dport=443 packets=2 bytes=2556 src=2607:f8b0:4002:0c08:0000:0000:0000:005f dst=2603:6081:8e00:00a7:x:x:x:x sport=443 dport=58900 packets=22 bytes=6246 [OFFLOAD] mark=0 zone=0 use=3
ipv4 2 tcp 6 src=10.23.40.236 dst=35.186.227.140 sport=46182 dport=443 packets=1 bytes=60 src=35.186.227.140 dst=172.x.x.x sport=443 dport=46182 packets=1 bytes=60 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv4 2 tcp 6 src=10.23.40.106 dst=34.107.221.82 sport=37906 dport=80 packets=1 bytes=60 src=34.107.221.82 dst=172.x.x.x sport=80 dport=37906 packets=1 bytes=60 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv6 10 udp 17 src=2603:6081:8e00:00a7:x:x:x:x dst=2607:f8b0:4002:0c06:0000:0000:0000:0063 sport=59571 dport=443 packets=2 bytes=1519 src=2607:f8b0:4002:0c06:0000:0000:0000:0063 dst=2603:6081:8e00:00a7:x:x:x:x sport=443 dport=59571 packets=13 bytes=10655 [OFFLOAD] mark=0 zone=0 use=3
The version of LuCI I'm using is: LuCI Master git-22.025.79016-22e2bfb
So some flows are supposedly getting offloaded, but they all look like forwarded traffic.
Btw, the flowtable does not include `lo`, so I am not surprised that localhost-generated traffic is not offloaded at all. Your console outputs above indicate that you ran those tests on the device?
Correct. Otherwise I'm on WiFi to an AP connected to the device, which tops out at ~230 Mbps.
I did also run some tests earlier, watching CPU load in htop, using iperf3 with the ER-X as both client and server in turn and running traffic between the ER-X and an AP (an EA8500), to see if changing offload settings had any effect on CPU load. It did not.
With the ER-X as the server, throughput topped out at ~500 Mbps and CPU load was maxed out as well (on at least 1-2 threads). In the other direction, throughput was higher with the EA8500 as the server (I don't recall exactly, something like ~800 Mbps), and traffic between two APs connected through the ER-X ran at line rate (~935 Mbps) with no CPU usage to speak of.
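For anyone reproducing this, a sketch of the two iperf3 topologies (addresses hypothetical). An on-device test terminates on the router and never traverses the forwarding path, so the flowtable rule cannot match; a through-device test exercises exactly the forwarded path that offloading accelerates:

# on-device test: traffic terminates on the ER-X, offload cannot engage
iperf3 -s                 # on the ER-X
iperf3 -c 192.168.1.1     # from a LAN host, targeting the router itself

# through-device test: traffic is forwarded across the ER-X
iperf3 -s                 # on host A behind one switch port
iperf3 -c 192.168.1.50    # from host B behind another port (add -R for the reverse direction)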
OK, I'm a little slow. I thought I was being lazy (I did not want to walk to a wired PC upstairs) and clever (running the speed tests on the device and iperf3 between APs). Well, it isn't the first time I've been both lazy and ignorant. I've gotten used to it.
I ran a speed test from a wired PC with no offloading, software offloading and hardware offloading while monitoring ER-X CPU usage in htop. I can now confirm hardware offloading works as expected.
CPU usage is pretty near zero while downloading ~470 Mbps if hardware offloading is checked. With just software offloading checked, CPU usage is ~30%, and with no offloading checked, ~64%.
I just did a new build (SNAPSHOT r18792-337e942290 2022-02-11).
I can also confirm that HW Flow Offload is now working with Firewall4 and Kernel 5.10 (monitored with htop while doing a heavy download @ 350Mbps).
I will monitor over the next few days to see whether the random reboots (that existed with Kernel 5.10 + Firewall3 + HW Flow Offload) are also solved.
After 24h, nothing to report, everything is working fine.
@jow I think this isn't your work, but there is a pull request on GitHub; it seems like the best way to finish all the work on offloading.
I'm starting tests now with kernel 5.10 + nftables.
Yes, it does work, but not as well as in OpenWrt 19.07.
Tested with DIR-860L B1 + 1000/300 PPPoE:
OpenWrt 19.07 uses 0% sirq if hardware offload is enabled.
`nft list flowtables`:
table inet fw4 {
flowtable ft {
hook ingress priority filter
devices = { lan1, lan2, lan3, lan4 }
flags offload
}
}
`grep OFFLOAD /proc/net/nf_conntrack | tail`:
ipv4 2 tcp 6 src=192.168.x.y dst=185.72.16.27 sport=48234 dport=8080 packets=1 bytes=60 src=185.72.16.27 dst=87.97.93.137 sport=8080 dport=48234 packets=1 bytes=60 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv4 2 tcp 6 src=192.168.x.z dst=13.49.168.130 sport=55408 dport=8008 packets=275 bytes=22771 src=13.49.168.130 dst=87.97.93.137 sport=8008 dport=55408 packets=1779 bytes=450556 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv4 2 tcp 6 src=192.168.x.y dst=142.250.27.108 sport=51730 dport=993 packets=80 bytes=4552 src=142.250.27.108 dst=87.97.93.137 sport=993 dport=51730 packets=39 bytes=2726 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv4 2 tcp 6 src=192.168.x.y dst=185.72.16.27 sport=48230 dport=8080 packets=1 bytes=60 src=185.72.16.27 dst=87.97.93.137 sport=8080 dport=48230 packets=1 bytes=60 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv4 2 tcp 6 src=192.168.x.y dst=185.72.16.27 sport=48214 dport=8080 packets=1 bytes=60 src=185.72.16.27 dst=87.97.93.137 sport=8080 dport=48214 packets=1 bytes=60 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv4 2 tcp 6 src=192.168.x.y dst=185.72.16.27 sport=48218 dport=8080 packets=1 bytes=60 src=185.72.16.27 dst=87.97.93.137 sport=8080 dport=48218 packets=1 bytes=60 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv4 2 tcp 6 src=192.168.x.y dst=3.126.186.102 sport=34044 dport=443 packets=97 bytes=10694 src=3.126.186.102 dst=87.97.93.137 sport=443 dport=34044 packets=757 bytes=139469 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv4 2 tcp 6 src=192.168.x.y dst=139.59.210.197 sport=33300 dport=443 packets=1 bytes=60 src=139.59.210.197 dst=87.97.93.137 sport=443 dport=33300 packets=1599 bytes=852777 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv4 2 tcp 6 src=192.168.x.z dst=51.195.89.38 sport=40984 dport=12020 packets=1 bytes=60 src=51.195.89.38 dst=87.97.93.137 sport=12020 dport=40984 packets=1077 bytes=333683 [HW_OFFLOAD] mark=0 zone=0 use=3
ipv4 2 tcp 6 src=192.168.x.z dst=62.4.9.11 sport=58402 dport=80 packets=3 bytes=164 src=62.4.9.11 dst=87.97.93.137 sport=80 dport=58402 packets=2 bytes=112 [OFFLOAD] mark=0 zone=0 use=3
Tested with Xiaomi R3G v1 + 1000/200 IPoE:
OpenWrt SNAPSHOT r18792-337e942290 (5.10.96)
I set HW offloading and rebooted. After the reboot:
Looks like your build is newer than mine:
And yes, I rebooted and retested, and got the same non-zero sirq.
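In case it helps others compare, the sirq share can be watched live with BusyBox top while a transfer runs; a minimal sketch (the figures in the comment are illustrative only):

# run on the router during a large download and watch the sirq field
top -d 1
# CPU:  1% usr  3% sys  0% nic 45% idle  0% io  0% irq 51% sirq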
Ohh, got it!
My ISP uses "IP packets encapsulated in PPP, which is in turn encapsulated in Ethernet", a.k.a. PPPoE, and not IPoE!
So PPPoE is still not fully hardware offloaded (as it was in 19.07).
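A quick way to quantify this from the conntrack table is to count hardware- versus software-offloaded flows while WAN traffic is running; if PPPoE flows only ever carry the plain [OFFLOAD] tag, that matches the non-zero sirq above:

# flows offloaded in hardware
grep -c HW_OFFLOAD /proc/net/nf_conntrack
# flows offloaded only in software
grep OFFLOAD /proc/net/nf_conntrack | grep -vc HW_OFFLOAD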