SQM not working in 22.03.3

Hi,

I've been running OpenWRT on a Raspberry Pi 4B for 2.5 years, and I've been using SQM since I first started.

Until today, the firmware I was running was 21.02.1. This weekend I decided to finally upgrade to 22.03.3.

I took a backup of the 21.02.1 installation, then compiled my own 22.03.3 image, swapped out the SD cards, and restored the 21.02.1 backup onto the new 22.03.3 installation.

I configure SQM from LuCI. Everything seems to be working in version 22.03.3 as far as I can tell (and I can get to the Internet), except that when SQM is enabled, shaping doesn't work. My current ISP speed is 900 Mbps down and 24 Mbps up.

The way I set up OpenWRT on the Raspberry Pi 4B is that I use one interface for both WAN and LAN. About 2.5 years ago I created a VLAN interface, eth0.10 (VLAN 10), and assigned WAN to it. In my setup, LAN is on interface eth0, so WAN is tagged with VLAN 10 and LAN is untagged. A Layer 3 switch assists this setup: the cable modem is plugged into an access switch port in VLAN 10, and the Raspberry Pi 4B is connected to another switch port configured as a trunk port, with VLAN 10 tagged and VLAN 150 untagged (native VLAN). The rest of the ports on the switch are configured either as access ports (in any VLAN other than VLAN 10) or as trunk ports (with VLAN 10 excluded).

When I configured SQM, I assigned it to interface eth0.10. Download shaping worked perfectly fine in OpenWRT version 21.02.1, as well as in previous versions: it would limit the download to the amount I specified (or a little lower). However, when SQM is applied to interface eth0.10 in OpenWRT 22.03.3, shaping no longer works. Regardless of what I set for the download bandwidth, I get the full 900+ Mbps (904, 905, 907, etc.).

I've tried disabling and re-enabling SQM, removing the VLAN interface eth0.10 and re-creating it, and restarting the WAN interface assigned to eth0.10. I can't get shaping to work when SQM is applied to interface eth0.10. As a troubleshooting step, I applied the same shaping policy (800000 kbps down and 24000 kbps up) to interface eth0 (my LAN interface), and the shaping worked: when I ran a test from speedtest.net, I got a download speed in the low 20s Mbps. So it appears that shaping works when applied to a physical interface but not when applied to a VLAN interface (whether that's because it's a subinterface of the physical interface or because it carries a VLAN tag).
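(For anyone checking along: the counters that show whether the shapers actually see traffic can be inspected with something like the commands below; ifb4eth0.10 is the IFB device that sqm-scripts creates for the ingress/download side.)

tc -s qdisc show dev eth0.10        # egress (upload) shaper on the WAN VLAN
tc -s qdisc show dev ifb4eth0.10    # ingress (download) shaper on the companion IFB

If the "Sent ... bytes/pkt" counters on ifb4eth0.10 stay near zero during a download test, the ingress redirect isn't seeing the traffic at all.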

Here's my SQM config:

root@firewall:/etc/config# cat sqm

config queue
	option download '800000'
	option upload '23000'
	option debug_logging '0'
	option verbosity '5'
	option qdisc 'cake'
	option script 'piece_of_cake.qos'
	option qdisc_advanced '1'
	option squash_dscp '1'
	option squash_ingress '1'
	option ingress_ecn 'ECN'
	option egress_ecn 'NOECN'
	option qdisc_really_really_advanced '1'
	option iqdisc_opts 'mpu 64'
	option eqdisc_opts 'mpu 64'
	option linklayer 'ethernet'
	option overhead '18'
	option enabled '1'
	option interface 'eth0.10'

Here's my network config

root@firewall:/etc/config# cat network

config interface 'loopback'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'
	option device 'lo'

config globals 'globals'
	option ula_prefix 'fd73:79f7:33c5::/48'
	option packet_steering '1'

config interface 'lan'
	option proto 'static'
	option netmask '255.255.255.0'
	option ip6assign '60'
	option ipaddr '192.168.150.1'
	option device 'eth0'

config interface 'MRA'
	option proto 'static'
	option netmask '255.255.255.0'
	option ipaddr '192.168.220.1'
	option device 'eth0.220'

config interface 'ColLab_1'
	option proto 'static'
	option netmask '255.255.255.0'
	option ipaddr '192.168.240.1'
	option device 'eth0.241'

config interface 'VOICE_HOME'
	option proto 'static'
	option netmask '255.255.255.0'
	option ipaddr '192.168.250.1'
	option device 'eth0.250'

config interface 'UniFi_MGMT'
	option proto 'static'
	option netmask '255.255.255.0'
	option ipaddr '192.168.2.1'
	option device 'eth0.2'

config interface 'MIDI'
	option proto 'static'
	option netmask '255.255.255.0'
	option ipaddr '192.168.255.1'
	option device 'eth0.255'

config route
	option target '172.18.224.0'
	option netmask '255.255.255.0'
	option gateway '192.168.150.154'
	option interface 'lan'

config route
	option interface 'lan'
	option target '172.19.224.0/24'
	option netmask '255.255.255.0'
	option gateway '192.168.150.154'

config route
	option interface 'lan'
	option target '10.100.100.0'
	option netmask '255.255.255.0'
	option gateway '192.168.150.154'

config route
	option interface 'lan'
	option target '10.46.3.0'
	option netmask '255.255.255.0'
	option gateway '192.168.150.154'

config interface 'UniFi_CFG'
	option proto 'static'
	option netmask '255.255.255.0'
	option ipaddr '192.168.1.1'
	option device 'eth0.1'

config interface 'DATA_HOME'
	option proto 'static'
	option ipaddr '192.168.200.1'
	option netmask '255.255.255.0'
	option device 'eth0.200'

config interface 'ColLab_2'
	option proto 'static'
	option ipaddr '192.168.241.1'
	option netmask '255.255.255.0'
	option device 'eth0.240'

config interface 'WG'
	option proto 'wireguard'
	list addresses '10.0.0.1/32'
	option listen_port '59575'

config wireguard_WG
	option description 'Client'
	option route_allowed_ips '1'
	option persistent_keepalive '25'
	list allowed_ips '10.0.0.2/32'

config route
	option interface 'lan'
	option target '192.168.234.0'
	option netmask '255.255.255.0'
	option gateway '192.168.150.254'

config device
	option name 'eth0.2'
	option type '8021q'

config device
	option name 'eth0.1'
	option type '8021q'

config device
	option type '8021q'
	option ifname 'eth0'
	option vid '10'
	option name 'eth0.10'

config interface 'WAN'
	option proto 'dhcp'
	option device 'eth0.10'
	option hostname '*'

Here are the SQM logs:

root@firewall:/tmp/run/sqm# cat eth0.10.start-sqm.log
start-sqm: Log for interface eth0.10: Sun Mar 12 16:31:30 EDT 2023

Sun Mar 12 16:31:30 EDT 2023: Starting.
Starting SQM script: piece_of_cake.qos on eth0.10, in: 800000 Kbps, out: 23000 Kbps
fn_exists: function candidate name: sqm_start
fn_exists: TYPE_OUTPUT: sqm_start: not found
fn_exists: return value: 1
Using generic sqm_start_default function.
fn_exists: function candidate name: sqm_prepare_script
fn_exists: TYPE_OUTPUT: sqm_prepare_script is a function
fn_exists: return value: 0
sqm_start_default: starting sqm_prepare_script
cmd_wrapper: COMMAND: /sbin/ip link add name SQM_IFB_9306c type ifb
cmd_wrapper: ip: SUCCESS: /sbin/ip link add name SQM_IFB_9306c type ifb
cmd_wrapper: COMMAND: /sbin/tc qdisc replace dev SQM_IFB_9306c root cake
cmd_wrapper: tc: SUCCESS: /sbin/tc qdisc replace dev SQM_IFB_9306c root cake
QDISC cake is useable.
cmd_wrapper: COMMAND: /sbin/ip link set dev SQM_IFB_9306c down
cmd_wrapper: ip: SUCCESS: /sbin/ip link set dev SQM_IFB_9306c down
cmd_wrapper: COMMAND: /sbin/ip link delete SQM_IFB_9306c type ifb
cmd_wrapper: ip: SUCCESS: /sbin/ip link delete SQM_IFB_9306c type ifb
cmd_wrapper: COMMAND: /sbin/ip link add name SQM_IFB_aa3e6 type ifb
cmd_wrapper: ip: SUCCESS: /sbin/ip link add name SQM_IFB_aa3e6 type ifb
cmd_wrapper: COMMAND: /sbin/tc qdisc replace dev SQM_IFB_aa3e6 root cake
cmd_wrapper: tc: SUCCESS: /sbin/tc qdisc replace dev SQM_IFB_aa3e6 root cake
QDISC cake is useable.
cmd_wrapper: COMMAND: /sbin/ip link set dev SQM_IFB_aa3e6 down
cmd_wrapper: ip: SUCCESS: /sbin/ip link set dev SQM_IFB_aa3e6 down
cmd_wrapper: COMMAND: /sbin/ip link delete SQM_IFB_aa3e6 type ifb
cmd_wrapper: ip: SUCCESS: /sbin/ip link delete SQM_IFB_aa3e6 type ifb
sqm_start_default: Starting piece_of_cake.qos
ifb associated with interface eth0.10: 
Currently no ifb is associated with eth0.10, this is normal during starting of the sqm system.
cmd_wrapper: COMMAND: /sbin/ip link add name ifb4eth0.10 type ifb
cmd_wrapper: ip: SUCCESS: /sbin/ip link add name ifb4eth0.10 type ifb
fn_exists: function candidate name: egress
fn_exists: TYPE_OUTPUT: egress is a function
fn_exists: return value: 0
egress
cmd_wrapper: tc: invocation silenced by request, FAILURE either expected or acceptable.
cmd_wrapper: COMMAND: /sbin/tc qdisc del dev eth0.10 root
cmd_wrapper: tc: FAILURE (2): /sbin/tc qdisc del dev eth0.10 root
cmd_wrapper: tc: LAST ERROR: RTNETLINK answers: No such file or directory
LLA: default link layer adjustment method for cake is cake
cake link layer adjustments:  overhead 18 mpu 0
cmd_wrapper: COMMAND: /sbin/tc qdisc add dev eth0.10 root cake bandwidth 23000kbit overhead 18 mpu 0 besteffort
cmd_wrapper: tc: SUCCESS: /sbin/tc qdisc add dev eth0.10 root cake bandwidth 23000kbit overhead 18 mpu 0 besteffort
sqm_start_default: egress shaping activated
cmd_wrapper: COMMAND: /sbin/ip link add name SQM_IFB_6fc4e type ifb
cmd_wrapper: ip: SUCCESS: /sbin/ip link add name SQM_IFB_6fc4e type ifb
cmd_wrapper: COMMAND: /sbin/tc qdisc replace dev SQM_IFB_6fc4e ingress
cmd_wrapper: tc: SUCCESS: /sbin/tc qdisc replace dev SQM_IFB_6fc4e ingress
QDISC ingress is useable.
cmd_wrapper: COMMAND: /sbin/ip link set dev SQM_IFB_6fc4e down
cmd_wrapper: ip: SUCCESS: /sbin/ip link set dev SQM_IFB_6fc4e down
cmd_wrapper: COMMAND: /sbin/ip link delete SQM_IFB_6fc4e type ifb
cmd_wrapper: ip: SUCCESS: /sbin/ip link delete SQM_IFB_6fc4e type ifb
fn_exists: function candidate name: ingress
fn_exists: TYPE_OUTPUT: ingress is a function
fn_exists: return value: 0
ingress
cmd_wrapper: tc: invocation silenced by request, FAILURE either expected or acceptable.
cmd_wrapper: COMMAND: /sbin/tc qdisc del dev eth0.10 handle ffff: ingress
cmd_wrapper: tc: FAILURE (2): /sbin/tc qdisc del dev eth0.10 handle ffff: ingress
cmd_wrapper: tc: LAST ERROR: RTNETLINK answers: Invalid argument
cmd_wrapper: COMMAND: /sbin/tc qdisc add dev eth0.10 handle ffff: ingress
cmd_wrapper: tc: SUCCESS: /sbin/tc qdisc add dev eth0.10 handle ffff: ingress
cmd_wrapper: tc: invocation silenced by request, FAILURE either expected or acceptable.
cmd_wrapper: COMMAND: /sbin/tc qdisc del dev ifb4eth0.10 root
cmd_wrapper: tc: FAILURE (2): /sbin/tc qdisc del dev ifb4eth0.10 root
cmd_wrapper: tc: LAST ERROR: RTNETLINK answers: No such file or directory
LLA: default link layer adjustment method for cake is cake
cake link layer adjustments:  overhead 18 mpu 0
cmd_wrapper: COMMAND: /sbin/tc qdisc add dev ifb4eth0.10 root cake bandwidth 800000kbit overhead 18 mpu 0 besteffort wash
cmd_wrapper: tc: SUCCESS: /sbin/tc qdisc add dev ifb4eth0.10 root cake bandwidth 800000kbit overhead 18 mpu 0 besteffort wash
cmd_wrapper: COMMAND: /sbin/ip link set dev ifb4eth0.10 up
cmd_wrapper: ip: SUCCESS: /sbin/ip link set dev ifb4eth0.10 up
cmd_wrapper: COMMAND: /sbin/tc filter add dev eth0.10 parent ffff: protocol all prio 10 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb4eth0.10
cmd_wrapper: tc: SUCCESS: /sbin/tc filter add dev eth0.10 parent ffff: protocol all prio 10 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev ifb4eth0.10
sqm_start_default: ingress shaping activated
piece_of_cake.qos was started on eth0.10 successfully

root@firewall:/tmp/run/sqm# cat eth0.10.stop-sqm.log
stop-sqm: Log for interface eth0.10: Sun Mar 12 16:31:30 EDT 2023

Sun Mar 12 16:31:30 EDT 2023: Stopping.
Stopping SQM on eth0.10
ifb associated with interface eth0.10: ifb4eth0.10
cmd_wrapper: COMMAND: /sbin/tc qdisc del dev eth0.10 ingress
cmd_wrapper: tc: SUCCESS: /sbin/tc qdisc del dev eth0.10 ingress
cmd_wrapper: COMMAND: /sbin/tc qdisc del dev eth0.10 root
cmd_wrapper: tc: SUCCESS: /sbin/tc qdisc del dev eth0.10 root
cmd_wrapper: COMMAND: /sbin/tc qdisc del dev ifb4eth0.10 root
cmd_wrapper: tc: SUCCESS: /sbin/tc qdisc del dev ifb4eth0.10 root
/usr/lib/sqm/stop-sqm: ifb4eth0.10 shaper deleted
cmd_wrapper: COMMAND: /sbin/ip link set dev ifb4eth0.10 down
cmd_wrapper: ip: SUCCESS: /sbin/ip link set dev ifb4eth0.10 down
cmd_wrapper: COMMAND: /sbin/ip link delete ifb4eth0.10 type ifb
cmd_wrapper: ip: SUCCESS: /sbin/ip link delete ifb4eth0.10 type ifb
/usr/lib/sqm/stop-sqm: ifb4eth0.10 interface deleted
root@firewall:/tmp/run/sqm# cat eth0.10.state
ALL_MODULES="sch_cake sch_ingress act_mirred cls_fw cls_flow cls_u32 sch_htb"
AUTOFLOW="0"
CUR_DIRECTION="ingress"
DOWNLINK="800000"
EECN="NOECN"
EGRESS_CAKE_OPTS="besteffort"
ELIMIT=""
EQDISC_OPTS=""
ESHAPER_BURST_DUR_US="1000"
ESHAPER_QUANTUM_DUR_US="1000"
ETARGET=""
IECN="ECN"
IFACE="eth0.10"
IGNORE_DSCP_INGRESS="1"
ILIMIT=""
INGRESS_CAKE_OPTS="besteffort wash"
INSMOD="/sbin/modprobe -q"
IP="ip_wrapper"
IP6TABLES="ip6tables_wrapper"
IP6TABLES_BINARY=""
IPTABLES="iptables_wrapper"
IPTABLES_ARGS="-w 1"
IPTABLES_BINARY="/usr/sbin/iptables-nft"
IPT_MASK="0xff"
IPT_TRANS_LOG="/var/run/sqm/eth0.10.iptables.log"
IP_BINARY="/sbin/ip"
IQDISC_OPTS=""
ISHAPER_BURST_DUR_US="1000"
ISHAPER_QUANTUM_DUR_US="1000"
ITARGET=""
LIMIT="1001"
LINKLAYER="ethernet"
LLAM="default"
OUTPUT_TARGET="/var/run/sqm/eth0.10.start-sqm.log"
OVERHEAD="18"
QDISC="cake"
SCRIPT="piece_of_cake.qos"
SHAPER_BURST_DUR_US="1000"
SHAPER_QUANTUM_DUR_US="1000"
SILENT="0"
SQM_DEBUG="1"
SQM_DEBUG_LOG="/var/run/sqm/eth0.10.start-sqm.log"
SQM_DEBUG_STEM="/var/run/sqm/eth0.10"
SQM_START_LOG="/var/run/sqm/eth0.10.start-sqm.log"
SQM_STOP_LOG="/var/run/sqm/eth0.10.stop-sqm.log"
SQM_VERBOSITY_MAX="10"
SQM_VERBOSITY_MIN="0"
STAB_MPU="0"
STAB_MTU="2047"
STAB_TSIZE="512"
TARGET="5ms"
TC="tc_wrapper"
TC_BINARY="/sbin/tc"
UPLINK="23000"
VERBOSITY_DEBUG="8"
VERBOSITY_ERROR="1"
VERBOSITY_INFO="5"
VERBOSITY_SILENT="0"
VERBOSITY_TRACE="10"
VERBOSITY_WARNING="2"
ZERO_DSCP_INGRESS="1"

I think I've reached the limit of my knowledge with OpenWRT here, so I desperately need help. Thank you!

Here's another strange thing. The "devices" that I hadn't tried to reconfigure in LuCI (under Network -> Interfaces -> Devices) were grayed out. They also didn't show up in the file /etc/config/network.

However, after I clicked the "Configure" button, they would no longer be grayed out in LuCI, and after I did this for all the remaining interfaces and Saved/Applied, these devices got populated in /etc/config/network.

You can compare the contents of the /etc/config/network file with these "devices" grayed out (see the post above) against the version after they were no longer grayed out, shown here:

root@firewall:/etc/config# cat network

config interface 'loopback'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'
	option device 'lo'

config globals 'globals'
	option ula_prefix 'fd73:79f7:33c5::/48'
	option packet_steering '1'

config interface 'lan'
	option proto 'static'
	option netmask '255.255.255.0'
	option ip6assign '60'
	option ipaddr '192.168.150.1'
	option device 'eth0'

config interface 'MRA'
	option proto 'static'
	option netmask '255.255.255.0'
	option ipaddr '192.168.220.1'
	option device 'eth0.220'

config interface 'ColLab_1'
	option proto 'static'
	option netmask '255.255.255.0'
	option ipaddr '192.168.240.1'
	option device 'eth0.241'

config interface 'VOICE_HOME'
	option proto 'static'
	option netmask '255.255.255.0'
	option ipaddr '192.168.250.1'
	option device 'eth0.250'

config interface 'UniFi_MGMT'
	option proto 'static'
	option netmask '255.255.255.0'
	option ipaddr '192.168.2.1'
	option device 'eth0.2'

config interface 'MIDI'
	option proto 'static'
	option netmask '255.255.255.0'
	option ipaddr '192.168.255.1'
	option device 'eth0.255'

config route
	option target '172.18.224.0'
	option netmask '255.255.255.0'
	option gateway '192.168.150.154'
	option interface 'lan'

config route
	option interface 'lan'
	option target '172.19.224.0/24'
	option netmask '255.255.255.0'
	option gateway '192.168.150.154'

config route
	option interface 'lan'
	option target '10.100.100.0'
	option netmask '255.255.255.0'
	option gateway '192.168.150.154'

config route
	option interface 'lan'
	option target '10.46.3.0'
	option netmask '255.255.255.0'
	option gateway '192.168.150.154'

config interface 'UniFi_CFG'
	option proto 'static'
	option netmask '255.255.255.0'
	option ipaddr '192.168.1.1'
	option device 'eth0.1'

config interface 'DATA_HOME'
	option proto 'static'
	option ipaddr '192.168.200.1'
	option netmask '255.255.255.0'
	option device 'eth0.200'

config interface 'ColLab_2'
	option proto 'static'
	option ipaddr '192.168.241.1'
	option netmask '255.255.255.0'
	option device 'eth0.240'

config interface 'WG'
	option proto 'wireguard'
	list addresses '10.0.0.1/32'
	option listen_port '59575'

config wireguard_WG
	option description 'client'
	option route_allowed_ips '1'
	option persistent_keepalive '25'
	list allowed_ips '10.0.0.2/32'

config route
	option interface 'lan'
	option target '192.168.234.0'
	option netmask '255.255.255.0'
	option gateway '192.168.150.254'

config device
	option name 'eth0.2'
	option type '8021q'
	option ifname 'eth0'
	option vid '2'

config device
	option name 'eth0.1'
	option type '8021q'

config device
	option type '8021q'
	option ifname 'eth0'
	option vid '10'
	option name 'eth0.10'
	option ipv6 '0'

config interface 'WAN'
	option proto 'dhcp'
	option device 'eth0.10'
	option hostname '*'

config device
	option name 'eth0'

config device
	option name 'eth0.200'
	option type '8021q'
	option ifname 'eth0'
	option vid '200'

config device
	option name 'eth0.220'
	option type '8021q'
	option ifname 'eth0'
	option vid '220'

config device
	option name 'eth0.240'
	option type '8021q'
	option ifname 'eth0'
	option vid '240'

config device
	option name 'eth0.241'
	option type '8021q'
	option ifname 'eth0'
	option vid '241'

config device
	option name 'eth0.250'
	option type '8021q'
	option ifname 'eth0'
	option vid '250'

config device
	option name 'eth0.255'
	option type '8021q'
	option ifname 'eth0'
	option vid '255'

config device
	option name 'WG'

You have to install iptables-nft.
If, when you try to install it, it complains that iptables-zz-legacy already provides iptables, then you have to remove iptables-zz-legacy first.
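Roughly, something like this should do it (a sketch; only remove the legacy package if opkg actually complains about the conflict):

opkg update
opkg remove iptables-zz-legacy    # only if it is reported as conflicting
opkg install iptables-nft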

Could you post the output of:

  1. tc -d qdisc
  2. tc -s qdisc

please, so we can see whether any cake instances actually got instantiated and how they are configured.

Here you go:

root@firewall:/etc/config# tc -d qdisc
qdisc noqueue 0: dev lo root refcnt 2 
qdisc cake 8045: dev eth0 root refcnt 6 bandwidth 23Mbit besteffort triple-isolate nonat nowash no-ack-filter split-gso rtt 100ms noatm overhead 18 
qdisc ingress ffff: dev eth0 parent ffff:fff1 ---------------- 
qdisc noqueue 0: dev WG root refcnt 2 
qdisc noqueue 0: dev eth0.200 root refcnt 2 
qdisc noqueue 0: dev eth0.220 root refcnt 2 
qdisc noqueue 0: dev eth0.240 root refcnt 2 
qdisc noqueue 0: dev eth0.241 root refcnt 2 
qdisc noqueue 0: dev eth0.250 root refcnt 2 
qdisc noqueue 0: dev eth0.255 root refcnt 2 
qdisc noqueue 0: dev eth0.2 root refcnt 2 
qdisc noqueue 0: dev eth0.10 root refcnt 2 
qdisc cake 8046: dev ifb4eth0 root refcnt 2 bandwidth 800Mbit besteffort triple-isolate nonat wash no-ack-filter split-gso rtt 100ms noatm overhead 18 
root@firewall:/etc/config# 
root@firewall:/etc/config# 
root@firewall:/etc/config# tc -s qdisc
qdisc noqueue 0: dev lo root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc cake 8045: dev eth0 root refcnt 6 bandwidth 23Mbit besteffort triple-isolate nonat nowash no-ack-filter split-gso rtt 100ms noatm overhead 18 
 Sent 268428511 bytes 832514 pkt (dropped 3516, overlimits 928758 requeues 2) 
 backlog 0b 0p requeues 2
 memory used: 3366744b of 4Mb
 capacity estimate: 23Mbit
 min/max network layer size:           28 /    1500
 min/max overhead-adjusted size:       46 /    1518
 average network hdr offset:           14

                  Tin 0
  thresh         23Mbit
  target            5ms
  interval        100ms
  pk_delay        246us
  av_delay         21us
  sp_delay          2us
  backlog            0b
  pkts           837173
  bytes       275448584
  way_inds        15336
  way_miss        13975
  way_cols            0
  drops            3516
  marks              22
  ack_drop            0
  sp_flows            7
  bk_flows            1
  un_flows            0
  max_len         13626
  quantum           701

qdisc ingress ffff: dev eth0 parent ffff:fff1 ---------------- 
 Sent 255710286 bytes 816134 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev WG root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.200 root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.220 root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.240 root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.241 root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.250 root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.255 root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.2 root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.10 root refcnt 2 
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 0b 0p requeues 0
qdisc cake 8046: dev ifb4eth0 root refcnt 2 bandwidth 800Mbit besteffort triple-isolate nonat wash no-ack-filter split-gso rtt 100ms noatm overhead 18 
 Sent 267390846 bytes 816134 pkt (dropped 0, overlimits 69562 requeues 0) 
 backlog 0b 0p requeues 0
 memory used: 27520b of 15140Kb
 capacity estimate: 800Mbit
 min/max network layer size:           46 /    1500
 min/max overhead-adjusted size:       64 /    1518
 average network hdr offset:           14

                  Tin 0
  thresh        800Mbit
  target            5ms
  interval        100ms
  pk_delay         13us
  av_delay          2us
  sp_delay          1us
  backlog            0b
  pkts           816134
  bytes       267390846
  way_inds         9265
  way_miss        14574
  way_cols            0
  drops               0
  marks               0
  ack_drop            0
  sp_flows            7
  bk_flows            1
  un_flows            0
  max_len         11728
  quantum          1514

@bluewavenet

I already have iptables-nft installed:

root@firewall:/etc/config# opkg list-installed iptables-nft
iptables-nft - 1.8.7-7

Trying to brainstorm here. I see why @bluewavenet mentioned the "iptables-nft" package. According to this article:

Since OpenWrt 22.03, fw4 is used by default, and it generates nftables rules.

This same article says that the easiest way to install all the modules (I have them all installed, by the way) is to install the nftables package. I don't have the nftables package installed.

root@firewall:/etc/config# opkg list-installed nftables
root@firewall:/etc/config# 

Is that what I need to install? What should I have selected in menuconfig to properly install nftables and all of its components?
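(If it helps, a rough way to see which firewall bits are already present on an image would be:)

opkg list-installed | grep -E 'firewall|nft'
nft list ruleset | head    # only works if the nft userspace tool is installed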

I've done some more troubleshooting. It seems that the package nft-qos conflicts with the package sqm-scripts, so I went ahead and removed nft-qos and then did the following:

root@firewall:/etc/config# opkg remove luci-app-sqm --force-depends
Removing package luci-app-sqm from root...
root@firewall:/etc/config# opkg remove sqm-scripts
Removing package sqm-scripts from root...

root@firewall:/etc/config# opkg install sqm-scripts
Installing sqm-scripts (1.5.2-1) to root...
Downloading https://downloads.openwrt.org/releases/22.03.3/packages/aarch64_cortex-a72/packages/sqm-scripts_1.5.2-1_all.ipk
Configuring sqm-scripts.
root@firewall:/etc/config# opkg install luci-app-sqm
Installing luci-app-sqm (git-23.063.28820-8c4562d) to root...
Downloading https://downloads.openwrt.org/releases/22.03.3/packages/aarch64_cortex-a72/luci/luci-app-sqm_git-23.063.28820-8c4562d_all.ipk
Configuring luci-app-sqm.
uci: Entry not found
uci: Entry not found

Then, I decided to install another package that listed sqm in its name:

root@firewall:/etc/config# opkg install collectd-mod-sqm
Installing collectd-mod-sqm (5.12.0-33) to root...
Downloading https://downloads.openwrt.org/releases/22.03.3/packages/aarch64_cortex-a72/packages/collectd-mod-sqm_5.12.0-33_aarch64_cortex-a72.ipk
Installing libltdl7 (2.4.6-2) to root...
Downloading https://downloads.openwrt.org/releases/22.03.3/packages/aarch64_cortex-a72/base/libltdl7_2.4.6-2_aarch64_cortex-a72.ipk
Installing collectd (5.12.0-33) to root...
Downloading https://downloads.openwrt.org/releases/22.03.3/packages/aarch64_cortex-a72/packages/collectd_5.12.0-33_aarch64_cortex-a72.ipk
Installing collectd-mod-exec (5.12.0-33) to root...
Downloading https://downloads.openwrt.org/releases/22.03.3/packages/aarch64_cortex-a72/packages/collectd-mod-exec_5.12.0-33_aarch64_cortex-a72.ipk
Configuring libltdl7.
Configuring collectd.
Configuring collectd-mod-exec.
Configuring collectd-mod-sqm.

I tried to use SQM with the default settings, except for changing the interface from eth1 to eth0.10:

config queue 'eth1'
	option download '85000'
	option upload '10000'
	option qdisc 'cake'
	option linklayer 'none'
	option enabled '1'
	option interface 'eth0.10'
	option debug_logging '0'
	option verbosity '5'
	option script 'layer_cake.qos'

Then I restarted SQM:

root@firewall:/etc/config# /etc/init.d/sqm restart
SQM: Stopping SQM on eth0.10
SQM: Starting SQM script: layer_cake.qos on eth0.10, in: 85000 Kbps, out: 10000 Kbps
SQM: layer_cake.qos was started on eth0.10 successfully

Unfortunately, I'm still getting the full download speed on the WAN interface.

Also, hoping I might be able to configure shaping using nft-qos, I tried the following:

root@firewall:~# opkg install luci-app-nft-qos
Installing luci-app-nft-qos (git-22.026.38400-f30b673) to root...
Downloading https://downloads.openwrt.org/releases/22.03.3/packages/aarch64_cortex-a72/luci/luci-app-nft-qos_git-22.026.38400-f30b673_all.ipk
Installing nft-qos (1.0.6-4) to root...
Downloading https://downloads.openwrt.org/releases/22.03.3/packages/aarch64_cortex-a72/packages/nft-qos_1.0.6-4_all.ipk
Configuring nft-qos.
Configuring luci-app-nft-qos.
Collected errors:
 * pkg_hash_check_unresolved: cannot find dependency kernel (= 5.10.161-1-61a27be69d8fdf73a5b94559a7a5730a) for kmod-nft-netdev
 * pkg_hash_fetch_best_installation_candidate: Packages for kmod-nft-netdev found, but incompatible with the architectures configured
 * pkg_hash_check_unresolved: cannot find dependency kernel (= 5.10.161-1-61a27be69d8fdf73a5b94559a7a5730a) for kmod-nft-bridge
 * pkg_hash_fetch_best_installation_candidate: Packages for kmod-nft-bridge found, but incompatible with the architectures configured

So, what is this "cannot find dependency kernel" message? Why am I pulling packages that are not compatible with my kernel? I'm not running a snapshot image; this is a stable image.

My kernel version:

root@firewall:~# uname -r
5.10.161

I've found the solution.

Software flow offloading was enabled. Either I enabled it a while back (at least a year ago) and it didn't have any effect on SQM in OpenWRT 21.02.1, or OpenWRT 22.03.3 compiles with this option enabled. After I disabled software flow offloading, shaping in SQM started working.

Read this post for the details.
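For anyone else running into this: in LuCI the setting is the "Software flow offloading" checkbox under Network -> Firewall. From the CLI, disabling it looks roughly like this (a sketch, assuming the stock firewall defaults section):

uci set firewall.@defaults[0].flow_offloading='0'
uci commit firewall
/etc/init.d/firewall restart
/etc/init.d/sqm restart    # re-apply the shaper afterwards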

This is odd, as your cake instances do see packets and at least the egress instance also dropped packets according to the statistics...

Interesting; as far as I've heard, software flow offloading did work with SQM in the past. However, I never tried it myself. I would not be amazed if the problem is related to your use of one Ethernet interface for both WAN and LAN, probably with an explicit VLAN only for the WAN traffic. Maybe try configuring the LAN side to use VLAN 11 (this also requires a matching change in the managed switch) and see whether software flow offloading still "knocks out" SQM? (See the sketch below for the OpenWrt side of such a change.)
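Roughly, the OpenWrt side would look something like this (an untested sketch; VLAN 11 is just an example, and the switch trunk port would need VLAN 11 tagged as well):

uci add network device
uci set network.@device[-1].name='eth0.11'
uci set network.@device[-1].type='8021q'
uci set network.@device[-1].ifname='eth0'
uci set network.@device[-1].vid='11'
uci set network.lan.device='eth0.11'
uci commit network
/etc/init.d/network restart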

I vaguely remember enabling software offloading a year ago in version 21.02.1. I remember reading up on what it did and deciding to enable it and try it. SQM definitely worked properly with software offloading in 21.02.1, but shaping doesn't work properly with software offloading in version 22.03.3 (I tried enabling and disabling software offloading several times and tested the download speed with SQM enabled; the results are conclusive and easily reproducible).

I used the untagged VLAN on the openWRT LAN interface on purpose. This way I can unplug the switch, plug a computer directly into the Raspberry Pi’s single Ethernet port and be able to configure OpenWRT without having to enable VLAN tagging on the computer’s NIC.

I did not doubt that you had good reasons to configure it that way; I was just curious whether this uncommon configuration might be part of the problem. If you are fine with software offloading disabled, just go with that. And given the RPi4B's general capabilities, I see no need for software flow offloading, especially in a single-NIC configuration where you essentially already have a hard limit of ~940 Mbps for the sum of concurrent up- and download traffic.

What I'm learning, though, is that the Raspberry Pi 4B (at least its first-generation release with 1.5 GHz clocked cores) can't handle SQM-based shaping when it needs to push over 800 Mbps bidirectionally (as in my case). Additionally, with SQM disabled and the bidirectional bandwidth exceeding 900 Mbps, the Raspberry Pi drops about 1.2 - 1.5% of packets during a speed test. Re-enabling SQM stops the packet drops, but then it can't forward more than 810 Mbps bidirectionally. I've only discovered this in the last few days, but the behavior is consistent across versions 21.02.1 and 22.03.3.

Tonight I will try overclocking the Raspberry Pi 4B to 2.0 GHz to see if it can handle above 810 Mbps of bidirectional traffic on a single Gigabit Ethernet interface with SQM shaping enabled. I will also test the overclocked Raspberry Pi 4B with SQM disabled to see if it can forward 900 and above Mbps bidirectionally without dropping packets.

It's only recently that my setup has approached the capacity of the single Raspberry Pi CPU core assigned to the interface, and the capacity of a one-interface setup, to handle all the bandwidth (download and upload) available to me from my ISP (Comcast). I've remained on Comcast's mid tier, but they have now boosted the bandwidth on this tier to 800 Mbps download (officially), with me getting over 900 Mbps most of the time. Because Comcast keeps the upload speeds low (25 Mbps, give or take), I can still theoretically use my one-interface setup as long as the overclocked Raspberry Pi 4B can handle this amount of bidirectional traffic.

So this is not totally unexpected, if your speedtest numbers are net goodput numbers... as Gigabit ethernet has a gross rate per direction of 1000 Mbps, but in your configuration each packet needs to traverse both the ingress and egress side of the ethernet link and hidden data like ACK packets also eat into the total capacity.

So, assuming you measure e.g. 900 Mbps download and 23 Mbps upload capacity when measured unidirectionally, what you have in reality is something like:

download 900 Mbps:
ACK traffic volume (assuming 1 ACK every two full-MTU packets): 900 * 1/40 = 22.5 Mbps
(900 + 22.5) * ((1500 + 38 + 4) / (1500 - 20 - 20)) = 974.31 Mbps gross traffic

if we add the 23 Mbps upload we see:
(900 + 22.5 + 23 + 23/40) * ((1500 + 38 + 4) / (1500 - 20 - 20)) = 999.21 Mbps

So this kind of traffic simply eats up all your ethernet capacity...
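Just to make the arithmetic reusable, here is the same estimate as a quick awk sketch (38 + 4 is Ethernet framing plus the VLAN tag, 20 + 20 the IPv4 + TCP headers; plug in your own measured numbers):

awk 'BEGIN {
    down = 900; up = 23                        # measured goodput in Mbps
    ack_d = down / 40; ack_u = up / 40         # ~1 ACK per two full-MTU data packets
    fac = (1500 + 38 + 4) / (1500 - 20 - 20)   # on-wire bytes per goodput byte
    printf "download only: %.2f Mbps gross\n", (down + ack_d) * fac
    printf "down + up:     %.2f Mbps gross\n", (down + ack_d + up + ack_u) * fac
}'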

Once you push that much over the link without SQM, the Ethernet queue first fills up and then overflows, and it is that overrun that causes the packet loss. However, packet loss during a saturating speedtest is per se not a bad thing, as it is what signals the flows to slow down once they have reached (exceeded) the available capacity. It is just that a dumb FIFO tends to start out not dropping enough, only to switch to dropping almost everything. That is where AQMs like cake and fq_codel help: instead of waiting until the last moment to start dropping and then having to drop like crazy, these AQMs start dropping gently considerably earlier, thereby helping the capacity-seeking flows find that capacity without too much dropping carnage.

BTW, not knowing your exact throughput numbers, the above is only a hypothesis/illustration of how to check whether a test might have been Ethernet-limited or not.

If you repost your current sqm configuration we can also estimate what goodput you might measure maximally...

Regarding your CPU issue, have a look at both /proc/interrupts and /proc/softirqs before and after a throughput test; maybe too much work piles up on the same CPU, maxing it out. If some of this work could be moved to other CPUs, either via CPU affinity or via irqbalance, you might be able to see more throughput...
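Something along these lines is enough for that comparison (a sketch; if diff is not built into your busybox, just compare the two files by eye):

cat /proc/interrupts /proc/softirqs > /tmp/irq_before
# ... run the speed test ...
cat /proc/interrupts /proc/softirqs > /tmp/irq_after
diff /tmp/irq_before /tmp/irq_after    # whichever CPU column grows the most is the busy one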

Or you could simply invest in a TP-Link UE300 USB 3 Gigabit Ethernet dongle and use it as the WAN interface...

If you look at the calculations above, it might be that your Pi is not CPU limited but that you are already maxing out your single ethernet interface...

I will know if my Pi is CPU limited tonight when I overclock it to 2.0 GHz.

I intentionally moved away from a USB-based Gigabit Ethernet dongle. That was my initial setup, but I went to a one-interface setup because I historically didn't have a good experience with USB-emulated Ethernet adapters on small-server hardware in my lab. With a USB3 dongle, there is no way to arrive at consistent latency with specific SQM settings. I need my connection to be rock-solid with consistent latency and bandwidth, and that's not something I was able to get on the Raspberry Pi 4B with a USB Gigabit Ethernet dongle.

My next project is to ditch the Raspberry Pi 4B and go with the NanoPi R6S with the RK3588S CPU, 8 GB RAM, 32 GB eMMC, and two 2.5 Gbps NICs. I am upgrading my network to 2.5 Gbps (replacing APs, the switches, and going above 1 Gbps on my WAN connection).

With a 2.5 Gbps LAN interface on the NanoPi R6S, I will be able to replace my enterprise Cisco 3560 L3 switch with a 2.5 Gbps-per-port managed L2 switch and do the L3 switching in OpenWRT. I don't have a lot of L3-switched traffic on my network, but I have some (and I need multiple VLANs because I have a few labs in my home office). So, now that an affordable device with a 2.5 Gbps LAN interface is available, I can flatten my network by getting rid of the L3 switch, making it easier and faster to perform topology changes depending on the project I am working on.

I'm looking forward to all the issues that I will initially encounter trying to run a flavor of OpenWRT (FriendlyWRT) on the NanoPi R6S. :slight_smile:

If you are happy to run somewhat critical infrastructure like your internet access router outside its design envelope, I am not stopping you, but I am also not sure I would consider that wise either :wink:

While I have not tried it myself, others here in the forum have done so successfully. However, not all USB Ethernet dongles are equally useful; the TP-Link UE300 seems to belong to the better part of the pack. And folks are using it as a WAN interface with SQM, as far as I can tell pretty successfully.

I would buy this rather than use the awful "share one NIC for WAN and LAN" combination... but your network, your choice...

While I cannot say anything about that device (yet), I can understand you not going the USB 3 dongle route if you are already planning a replacement. However, in that case, simply set your SQM download traffic shaper to 500 Mbps* and be done with it (also set the per-packet overhead to 42 bytes and you should be golden)... unless you enjoy tweaking your configuration.
The only reason, IMHO, to set the shaper higher than that is if you routinely have big and time-critical bulk downloads; otherwise 500 versus 900 Mbps will have little impact on your network's usability (which for most use cases is dominated by responsiveness under working conditions, aka latency increase under load).

*) A 500 Mbps gross rate with correct overhead accounting guarantees that downloads can at best hog around 50% of the NIC's ingress and egress rate, which should give you enough headroom not to care.
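In terms of the config posted above, that suggestion boils down to something like this (a sketch, assuming the queue is the first sqm section):

uci set sqm.@queue[0].download='500000'    # 500 Mbps ingress shaper
uci set sqm.@queue[0].overhead='42'        # per-packet overhead: Ethernet framing plus VLAN tag
uci commit sqm
/etc/init.d/sqm restart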

I see you practice a very conservative approach to all things in life. I'm halfway there, as in my quarter-of-a-century IT career, I've grown much more conservative about my hardware choices (not that I was ever super liberal).

Overclocking the Raspberry Pi 4B to 2.0 GHz has been proven completely safe and reliable for at least 2 years now. There are multiple posts on this all over the web, and it even runs stably overclocked when using Raspberry Pi OS with a GUI, or even Ubuntu with a GUI. So, I'm 99.9% sure that running OpenWRT on the Raspberry Pi 4B with its cores overclocked to 2.0 GHz will be no problem. Whether or not overclocking to 2.0 GHz will allow me to push 900 Mbps through SQM is something I will see tonight.

I never chased download bandwidth in my entire life of using broadband Internet (about 20 years now). Once my download bandwidth exceeded 100 Mbps (around 2013-2014) and I had to replace my Cisco ASA5505 (which could only handle up to 100 Mbps of routed throughput), I realized that anything above 100 Mbps is just a number for most applications. If one could get 100 Mbps of symmetric throughput, that is all most households would ever need. I vacation in rural Quebec in the summer, where 50 Mbps / 5 Mbps is still a coveted bandwidth plan, with not everyone able to get that kind of home service (be it cable, DSL, or cellular). The only provider who can give them more than 100 Mbps is Starlink, but that's cost-prohibitive for most households. Every summer, after I spend a couple of months in Quebec, I come home and realize that I'm seriously blessed to have Comcast as my ISP (all of Comcast's shortcomings notwithstanding).

Because I'm rather conservative about paying for more download bandwidth than I need, I've stayed on a mid-tier Comcast (internet-only) plan since 2013, when I moved to my current house. However, Comcast has been great about continually increasing the download bandwidth (and, over the past several years, dropping the price of the mid tier in my area to $50 all included). This is, of course, driven by competition: my area now has AT&T fiber (though not available in my neighborhood), and recently T-Mobile fixed 5G Internet has come to the area, priced at $50/month. I happen to be very close to a T-Mobile 5G tower, so I will test their service this week.

American ISPs continually increasing their download bandwidth and lowering prices has led to me having to upgrade my routing hardware several times in the past decade. I went from a Cisco 831 router, to a Cisco ASA5505 firewall, to a pfSense box running on a Fitlet mini-computer with an AMD CPU, to a Raspberry Pi 4B, and now to a NanoPi R6S - all in less than 10 years.

So, I'm not particularly chasing the bandwidth, and I'm certainly not paying hundreds of dollars per month just to get a higher download number. However, when I do get higher bandwidth at no additional cost to me, I try to take full advantage of it. So, setting SQM to 500 Mbps when Comcast now gives me 800 Mbps (officially) and over 900 Mbps (according to my speed tests) is not something I am willing to do. I will try to find affordable hardware that can outperform the bandwidth offered to me at a mid-tier service plan by my ISP.

Fair enough; you seem to be enjoying the process, and then tinkering is just fine.

However, checking /proc/interrupts and /proc/softirqs is something I would include in my tinkering if I were you, as any improvement in CPU placement will come on top of any improvement from a higher frequency. I guess I have nothing more productive to contribute, so good luck and enjoy the "ride".

So, here are my results with the Raspberry Pi 4B overclocked with the following settings added to the end of the /boot/config.txt file.

over_voltage=6
arm_freq=2000

I can shape up to 920000 kbps of downstream bandwidth in SQM with no errors showing in the Ookla Speedtest app for macOS. The most bandwidth I was able to get with SQM (layer_cake) shaping applied was 886 Mbps.

  • When I set the shaping download bandwidth to 930000, I get occasional tests with a very low error rate, but most tests are error free.
  • With 935000 configured in SQM, I start getting around 0.6% of errors consistently.
  • When I set the shaping download bandwidth to 940000 in SQM, I start getting around 1.2% to 1.6% of packet loss.

This has been tested multiple times and the results are consistent.

Without an SQM policy applied, I get about 908 Mbps of download bandwidth in the Ookla Speedtest app, although once it went up to 918 Mbps. The error rate without an SQM policy applied is between 1% and 1.8%.

Before I overclocked the Raspberry Pi 4B, I was getting (according to one of my earlier posts in this thread):

##############
What I'm learning, though, is that the Raspberry Pi 4B (at least its first-generation release with 1.5 GHz clocked cores) can't handle SQM-based shaping when it needs to push over 800 Mbps bidirectionally (as in my case). Additionally, with SQM disabled and the bidirectional bandwidth exceeding 900 Mbps, the Raspberry Pi drops about 1.2 - 1.5% of packets during a speed test. Re-enabling SQM stops the packet drops, but then it can't forward more than 810 Mbps bidirectionally. I've only discovered this in the last few days, but the behavior is consistent across versions 21.02.1 and 22.03.3.

Tonight I will try overclocking the Raspberry Pi 4B to 2.0 GHz to see if it can handle above 810 Mbps of bidirectional traffic on a single Gigabit Ethernet interface with SQM shaping enabled. I will also test the overclocked Raspberry Pi 4B with SQM disabled to see if it can forward 900 and above Mbps bidirectionally without dropping packets.
################

So, my conclusion is as follows:

Overclocking the Raspberry Pi 4B's CPU cores to 2 GHz definitely improves SQM's ability to forward download traffic out of a shaped queue at higher throughput. Before being overclocked, the Raspberry Pi 4B couldn't route download traffic at a rate higher than 810 Mbps. After being overclocked, it was able to forward download traffic out of a shaped queue at up to 886 Mbps (with 0% errors and 2 ms latency shown in the Ookla Speedtest app for macOS). Additionally, I can reach the shaping rates (935000 and 940000) at which the Ookla download test starts showing errors (0.6% and 1.2 - 1.6%, respectively), which tells me that the shaping policy still works even when the download bandwidth is set as high as 940000 (since I start getting errors there that I don't get when the download bandwidth in SQM is set to 920000). Why my throughput doesn't increase above 886 Mbps, no matter what shaping bandwidth is configured in SQM, is something I can't explain, especially because when no shaping policy is applied, I consistently get download bandwidth above 900 Mbps (with errors, though).

My conclusion: it definitely makes sense to overclock the Raspberry Pi 4B when running OpenWRT (v. 22.03.3 in this case) in order to boost the amount of shaped download throughput (if the ISP can deliver above 810 Mbps of throughput).