jow
February 12, 2022, 12:47am
62
Not all flows are HW offloaded immediately after switching it on; established conntrack connections will continue to be handled in software. A reboot forces a fresh conntrack table, which might explain the sirq discrepancy after reboots.
@xabolcs - check /proc/net/nf_conntrack and compare flows with [OFFLOAD] and [HW_OFFLOAD] vs. ones without. Maybe you do have traffic (non-TCP or UDP?) that is not offloaded for some reason.
Another thing I noticed - the offload table doesn’t include the wan port in your case
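A rough sketch of that comparison, assuming the flags appear literally as [OFFLOAD] and [HW_OFFLOAD] in each conntrack entry as described above. The helper name count_offload is made up for illustration and is not part of OpenWrt:

```shell
# count_offload [FILE]: tally conntrack entries by offload flag.
# Hypothetical helper; FILE defaults to /proc/net/nf_conntrack.
count_offload() {
    file="${1:-/proc/net/nf_conntrack}"
    awk '
        /\[HW_OFFLOAD\]/ { hw++; next }   # flow handled by hardware
        /\[OFFLOAD\]/    { sw++; next }   # flow handled by software flowtable
                         { none++ }       # flow still on the normal path
        END {
            printf "hw_offload=%d sw_offload=%d not_offloaded=%d\n", hw, sw, none
        }
    ' "$file"
}
```

Running count_offload on the router while traffic is flowing should give a quick idea of how many flows never get a flag; grepping those lines out afterwards would show which protocols they are.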
3 Likes
xabolcs
February 12, 2022, 1:10am
63
Does it have to include it?
My DIR-860L's /etc/config/network is:
config interface 'loopback'
	option device 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option packet_steering '1'
	option ula_prefix 'fdba:caae:47a2::/48'

config device
	option name 'br-lan'
	option type 'bridge'
	list ports 'lan1'
	list ports 'lan2'
	list ports 'lan3'
	list ports 'lan4'

config device
	option name 'lan1'
	option macaddr 'e4:6f:13:xx:xx:x0'

config device
	option name 'lan2'
	option macaddr 'e4:6f:13:xx:xx:x0'

config device
	option name 'lan3'
	option macaddr 'e4:6f:13:xx:xx:x0'

config device
	option name 'lan4'
	option macaddr 'e4:6f:13:xx:xx:x0'

config interface 'lan'
	option device 'br-lan'
	option proto 'static'
	option ipaddr '192.168.xx.xx'
	option netmask '255.255.255.0'
	option ip6assign '60'

config device
	option name 'wan'
	option macaddr 'e4:6f:13:xx:xx:x3'

config interface 'wan'
	option device 'wan'
	option proto 'pppoe'
	option username 'username'
	option password 'password'
	option ipv6 'auto'
	option keepalive '80 20'
	option dns '1.1.1.1'

config interface 'wan6'
	option device 'wan'
	option proto 'dhcpv6'
Mushoz
February 12, 2022, 8:48am
64
I can confirm that the old 19.07 releases would have 0% sirq usage, whereas new builds still have quite a bit of CPU usage. This is also with a PPPoE connection. @jow it seems that PPPoE isn't fully offloaded like it was on 19.07. Is this a bug, or some sort of limitation of DSA?
1 Like
jow
February 12, 2022, 9:06am
65
Maybe. Is the layer 2 wan interface absent from the flowtable in your case as well? If so, try dumping the ruleset, manually adding the wan interface (not the pppoe-x one but the actual ethernet port) to the flowtable declaration, and applying the modified ruleset, then retest.
nft list ruleset > /tmp/rules.nft
vi /tmp/rules.nft
nft -f /tmp/rules.nft
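If editing by hand in vi is awkward, the manual step could be scripted roughly like this. This is only a sketch: add_flowtable_device is a hypothetical helper (not part of nft or fw4), and it assumes the dump contains a single-line "devices = { ... }" list inside the flowtable declaration:

```shell
# add_flowtable_device FILE DEV
# Hypothetical helper: append DEV to the "devices = { ... }" list(s) in a
# dumped ruleset FILE. Assumes the list sits on one line.
add_flowtable_device() {
    file="$1"
    dev="$2"
    # Nothing to do if DEV is already listed on a devices line.
    if grep -q "devices = {.*[ ,]$dev[ ,}]" "$file"; then
        return 0
    fi
    # Insert ", DEV" before the closing brace of each devices list.
    sed "s/\(devices = {[^}]*[^ }]\) *}/\1, $dev }/" "$file" > "$file.new" &&
        mv "$file.new" "$file"
}

# Intended use on the router (not executed here):
#   nft list ruleset > /tmp/rules.nft
#   add_flowtable_device /tmp/rules.nft wan
#   nft -f /tmp/rules.nft
```

Pass the actual ethernet port name (wan here), not the pppoe-x interface.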
2 Likes
xabolcs
February 12, 2022, 9:54am
66
Confirmed! Adding the wan port to the offload table makes sirq drop to 0~2% while maxing out my 1000/300 PPPoE ISP speeds.
nft list flowtables
table inet fw4 {
	flowtable ft {
		hook ingress priority filter
		devices = { lan1, lan2, lan3, lan4, wan }
		flags offload
	}
}
4 Likes
jow
February 12, 2022, 10:01am
67
Great. Will add a fix for this tonight. Can you please post the output of ls -l /sys/class/net/pppoe-*/ - just want to confirm something.
5 Likes
xabolcs
February 12, 2022, 10:02am
68
-r--r--r-- 1 root root 4096 Feb 11 23:07 addr_assign_type
-r--r--r-- 1 root root 4096 Feb 11 23:07 addr_len
-r--r--r-- 1 root root 4096 Feb 11 23:07 address
-r--r--r-- 1 root root 4096 Feb 11 23:07 broadcast
-rw-r--r-- 1 root root 4096 Feb 11 23:07 carrier
-r--r--r-- 1 root root 4096 Feb 11 23:07 carrier_changes
-r--r--r-- 1 root root 4096 Feb 11 23:07 carrier_down_count
-r--r--r-- 1 root root 4096 Feb 11 23:07 carrier_up_count
-r--r--r-- 1 root root 4096 Feb 11 23:07 dev_id
-r--r--r-- 1 root root 4096 Feb 11 23:07 dev_port
-r--r--r-- 1 root root 4096 Feb 11 23:07 dormant
-r--r--r-- 1 root root 4096 Feb 11 23:07 duplex
-rw-r--r-- 1 root root 4096 Feb 11 23:07 flags
-rw-r--r-- 1 root root 4096 Feb 11 23:07 gro_flush_timeout
-rw-r--r-- 1 root root 4096 Feb 11 23:07 ifalias
-r--r--r-- 1 root root 4096 Feb 11 23:07 ifindex
-r--r--r-- 1 root root 4096 Feb 11 23:07 iflink
-r--r--r-- 1 root root 4096 Feb 11 23:07 link_mode
-rw-r--r-- 1 root root 4096 Feb 11 23:07 mtu
-r--r--r-- 1 root root 4096 Feb 11 23:07 name_assign_type
-rw-r--r-- 1 root root 4096 Feb 11 23:07 napi_defer_hard_irqs
-rw-r--r-- 1 root root 4096 Feb 11 23:07 netdev_group
-r--r--r-- 1 root root 4096 Feb 11 23:07 operstate
-r--r--r-- 1 root root 4096 Feb 11 23:07 phys_port_id
-r--r--r-- 1 root root 4096 Feb 11 23:07 phys_port_name
-r--r--r-- 1 root root 4096 Feb 11 23:07 phys_switch_id
-rw-r--r-- 1 root root 4096 Feb 11 23:07 proto_down
drwxr-xr-x 4 root root 0 Feb 11 23:07 queues
-r--r--r-- 1 root root 4096 Feb 11 23:07 speed
drwxr-xr-x 2 root root 0 Feb 11 23:07 statistics
lrwxrwxrwx 1 root root 0 Feb 11 23:07 subsystem -> ../../../../class/net
-r--r--r-- 1 root root 4096 Feb 11 23:07 testing
-rw-r--r-- 1 root root 4096 Feb 11 23:07 threaded
-rw-r--r-- 1 root root 4096 Feb 11 23:07 tx_queue_len
-r--r--r-- 1 root root 4096 Feb 11 23:07 type
-rw-r--r-- 1 root root 4096 Feb 11 23:07 uevent
It looks like the other virtual device, lo:
ls -l /sys/class/net/lo/
-r--r--r-- 1 root root 4096 Feb 11 23:06 addr_assign_type
-r--r--r-- 1 root root 4096 Feb 11 23:06 addr_len
-r--r--r-- 1 root root 4096 Feb 11 23:06 address
-r--r--r-- 1 root root 4096 Feb 11 23:06 broadcast
-rw-r--r-- 1 root root 4096 Feb 11 23:06 carrier
-r--r--r-- 1 root root 4096 Feb 11 23:06 carrier_changes
-r--r--r-- 1 root root 4096 Feb 11 23:06 carrier_down_count
-r--r--r-- 1 root root 4096 Feb 11 23:06 carrier_up_count
-r--r--r-- 1 root root 4096 Feb 11 23:06 dev_id
-r--r--r-- 1 root root 4096 Feb 11 23:06 dev_port
-r--r--r-- 1 root root 4096 Feb 11 23:06 dormant
-r--r--r-- 1 root root 4096 Feb 11 23:06 duplex
-rw-r--r-- 1 root root 4096 Feb 11 23:06 flags
-rw-r--r-- 1 root root 4096 Feb 11 23:06 gro_flush_timeout
-rw-r--r-- 1 root root 4096 Feb 11 23:06 ifalias
-r--r--r-- 1 root root 4096 Feb 11 23:06 ifindex
-r--r--r-- 1 root root 4096 Feb 11 23:06 iflink
-r--r--r-- 1 root root 4096 Feb 11 23:06 link_mode
-rw-r--r-- 1 root root 4096 Feb 11 23:06 mtu
-r--r--r-- 1 root root 4096 Feb 11 23:06 name_assign_type
-rw-r--r-- 1 root root 4096 Feb 11 23:06 napi_defer_hard_irqs
-rw-r--r-- 1 root root 4096 Feb 11 23:06 netdev_group
-r--r--r-- 1 root root 4096 Feb 11 23:06 operstate
-r--r--r-- 1 root root 4096 Feb 11 23:06 phys_port_id
-r--r--r-- 1 root root 4096 Feb 11 23:06 phys_port_name
-r--r--r-- 1 root root 4096 Feb 11 23:06 phys_switch_id
-rw-r--r-- 1 root root 4096 Feb 11 23:06 proto_down
drwxr-xr-x 4 root root 0 Feb 11 23:06 queues
-r--r--r-- 1 root root 4096 Feb 11 23:06 speed
drwxr-xr-x 2 root root 0 Feb 11 23:06 statistics
lrwxrwxrwx 1 root root 0 Feb 11 23:06 subsystem -> ../../../../class/net
-r--r--r-- 1 root root 4096 Feb 11 23:06 testing
-rw-r--r-- 1 root root 4096 Feb 11 23:06 threaded
-rw-r--r-- 1 root root 4096 Feb 11 23:06 tx_queue_len
-r--r--r-- 1 root root 4096 Feb 11 23:06 type
-rw-r--r-- 1 root root 4096 Feb 11 23:06 uevent
1 Like
jow
February 12, 2022, 7:52pm
69
Fixes pushed with
committed 07:41PM - 11 Feb 22 UTC
53caa1a fw4: resolve zone layer 2 devices for hw flow offloading
9fe58f5 fw4: re… work and fix family inheritance logic
8795296 tests: mocklib: fix infinite recursion in wrapped print()
281b1bc tests: change mocked wan interface type to PPPoE
93b710d tests: mocklib: forward compatibility change
1a94915 fw4: only stage reflection rules if all required addrs are known
5c21714 fw4: add device iifname/oifname matches to DSCP and MARK rules
3eacc97 tests: adjust 01_ruleset test case to latest changes
Signed-off-by: Jo-Philipp Wich <jo@mein.io>
(Upstream fix https://git.openwrt.org/?p=project/firewall4.git;a=commitdiff;h=53caa1a762125a71389a486aa913e4fbdf3650cf )
6 Likes
xabolcs
February 12, 2022, 7:58pm
70
Thank you!
Going to test with the next nightly build!
Mushoz
February 13, 2022, 2:10pm
72
What issues are back? You really need to be more descriptive if you want developers to be able to look into said issues.
2 Likes
Which issues, which build, which device?
nikito7
February 13, 2022, 4:14pm
74
Model: Zbtlink ZBT-WG3526 (16M)
Architecture: MediaTek MT7621 ver:1 eco:3
Target Platform: ramips/mt7621
Crash or reboot
Yesterday snapshot or 1 day older
nikito7:
Zbtlink ZBT-WG3526
I have noticed that everyone reporting a crash/reboot has an mt7621AT (dual core) CPU, while my R6220 has an mt7621ST (single core) CPU. I have never had a crash/reboot with it. I can't tell whether this matters, but could load balancing across the CPU cores be involved?
1 Like
eginnc
February 13, 2022, 5:58pm
76
I've been running r18785-8072bf3322 for a couple of days on an MT7621AT (2C/4T) with only SW offload enabled. No crashes or reboots so far. HW offload is not enabled - I'm running SQM.
I'm getting decent throughput (~185 Mbps) using fq_codel/simple.qos with this snapshot, and with irqbalance enabled the load spreads across all four threads fairly well. CAKE still tops out at ~100 Mbps (and tends to max out a single CPU thread). With fq_codel dropping latency nearly as much as CAKE, I think I'll stay here for a while.
xabolcs
February 13, 2022, 8:45pm
77
No problem yet, almost 2 days uptime.
BusyBox v1.35.0 (2022-02-10 10:55:22 UTC) built-in shell (ash)
  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
-----------------------------------------------------
OpenWrt SNAPSHOT, r18785-8072bf3322
-----------------------------------------------------
| Machine: D-Link DIR-860L B1 |
| Uptime: 1d, 20:20:31 |
| Load: 0.07 0.13 0.09 |
| Flash: total: 5.4MB, free: 3.9MB, used: 28% |
| Memory: total: 117.4MB, free: 81.1MB, used: 30% |
| WAN: xx.xx.xx.xx, proto: pppoe |
| LAN: 192.168.xx.xx, leases: xx |
-----------------------------------------------------
Will check the latest r18801-56256259a1 SNAPSHOT soon.
1 Like
nikito7
February 13, 2022, 8:55pm
78
I'm testing now with 'Packet Steering' disabled
dsouza
February 13, 2022, 9:14pm
79
Archer C6 v3.2 here (SNAPSHOT r18792-337e942290, kernel 5.10.96, firewall4 2022-02-07.1-a0518b6d-1).
HW flow offload enabled, Packet steering enabled, IPv4 only (no IPv6), DHCP WAN (not PPPoE), no SQM.
HW flow offload working OK (350Mbps download with no significant CPU usage).
Rock solid for 2 days now.
1 Like
daniel
February 13, 2022, 11:35pm
81
hrindr13:
this one?
Yes, that will work. Just make sure to connect only GND, RX and TX; do not connect VCC. RTS and CTS are unused.