Packet Steering

Hi, would packet steering need to be enabled under Interfaces -> Global network options for these settings to take effect?

# Pin IRQ 35 to CPU1 (mask 0x2) and IRQ 36 to CPU3 (mask 0x8)
echo 2 > /proc/irq/35/smp_affinity
echo 8 > /proc/irq/36/smp_affinity

# Steer receive packet processing (RPS) for each NIC's first RX queue
# onto the CPUs in mask 0xe (CPUs 1-3)
echo e > /sys/class/net/eth0/queues/rx-0/rps_cpus
echo e > /sys/class/net/eth1/queues/rx-0/rps_cpus
echo e > /sys/class/net/eth2/queues/rx-0/rps_cpus
echo e > /sys/class/net/eth4/queues/rx-0/rps_cpus
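
Note these settings don't survive a reboot. A minimal sketch to reapply them at boot via /etc/rc.local, assuming IRQ numbers 35/36 stay stable across boots (they are not guaranteed to):

# Append before the final "exit 0" in /etc/rc.local
# Pin the two NIC IRQs to CPU1 and CPU3
echo 2 > /proc/irq/35/smp_affinity
echo 8 > /proc/irq/36/smp_affinity
# Steer receive processing for each interface onto CPUs 1-3 (mask 0xe)
for dev in eth0 eth1 eth2 eth4; do
    echo e > "/sys/class/net/$dev/queues/rx-0/rps_cpus"
done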

If LuCI (the graphical interface) is installed:
Network > Interfaces > Global network options,
at the Packet Steering option, select Enabled.

Someone more knowledgeable may explain the difference(s) between Enabled vs. Enabled (all CPUs).

Just a guess, but Enabled may just run on one CPU vs. multithreaded.

You seem to have a 4-core CPU (cores 0-3); 2 is core 1, 8 is core 3.
Note that e designates cores 1-3 (2+4+8); all four cores would be f (1+2+4+8).
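
The mask is just a hex bitmap with bit N set for CPU N, so you can compute it in any POSIX shell; a couple of examples:

# CPU0 = 0x1, CPU1 = 0x2, CPU2 = 0x4, CPU3 = 0x8
printf '%x\n' $(( (1<<1) | (1<<2) | (1<<3) ))           # cores 1-3 -> e
printf '%x\n' $(( (1<<0) | (1<<1) | (1<<2) | (1<<3) ))  # all four  -> f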

If your CPU can deal with irqbalance I would use irqbalance, but some recent cores do not like IRQs being switched while processing, and then you need to set things manually at startup.
Check how many cores/threads your router has.
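
If you want to try irqbalance on OpenWrt, something like the following should work; this assumes the package's usual UCI layout, where the service ships disabled:

opkg update && opkg install irqbalance
# Turn it on in /etc/config/irqbalance and start the service
uci set irqbalance.irqbalance.enabled='1'
uci commit irqbalance
/etc/init.d/irqbalance enable
/etc/init.d/irqbalance start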

As for packet steering, I think it depends on your usage; do some speed testing while the router is operational.

It depends heavily on the NIC: some run one IRQ per queue, and irqbalance is sufficient.
Enabled means affining queues to the first hyperthread of each core, which should usually perform better than artificially torturing the half-speed pseudo-cores.
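
To see whether your NIC runs one IRQ per queue, check how many queues the driver exposes and which CPUs have been servicing its interrupts (eth0 is just an example name):

# Multiqueue NICs show rx-0, rx-1, ... here
ls /sys/class/net/eth0/queues/
# Per-CPU interrupt counts for each NIC IRQ line
grep eth /proc/interrupts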

The biggest effect of enabling receive packet steering is typically that receive and transmit processing (think qdiscs like traffic shapers) are far more likely to end up on different CPUs, so for SQM on multi-core routers packet steering tends to be a win; but the details matter, i.e. how many and which CPUs are used.
As @egc explained, the value is interpreted as the hexadecimal representation of a CPU bitmap.

Addendum: OpenWrt's packet steering has defaulted to exempting the CPUs involved in processing an interface's interrupts from packet steering. This is a decent approach for devices with many CPUs, but for routers with 2 or 4 cores it can be suboptimal. Especially on dual-core devices it often means only one CPU can be used for packet steering... this is why the "on all CPUs" option can be helpful.
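
For reference, the LuCI option writes a UCI setting in /etc/config/network; a sketch, assuming a recent build where '2' corresponds to "Enabled (all CPUs)" (worth double-checking on your version):

# 0 = disabled, 1 = enabled, 2 = enabled on all CPUs (recent builds)
uci set network.globals.packet_steering='2'
uci commit network
/etc/init.d/network reload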

Note that e amounts to all cores but core 0 (2^1 + 2^2 + 2^3 = 0xe). Your rps_cpus settings won't survive long anyway, as the steering scripts will overwrite them.
Furthermore, are you running a snapshot build? I'm having some problems with the current iteration of the steering scripts: https://github.com/openwrt/openwrt/issues/15445
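
You can watch the overwrite happen yourself; a quick check (eth0 as an example):

# Note the current mask, reload networking, then re-check:
cat /sys/class/net/eth0/queues/rx-0/rps_cpus
/etc/init.d/network reload
cat /sys/class/net/eth0/queues/rx-0/rps_cpus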

Interesting find. For my MT6000 (4 cores), if I use SQM, CAKE completely bogs down core 3 (100% load) when processing rules at about 500 Mbps. With packet steering ON across all CPU cores, the load gets spread, and the CPU only becomes the bottleneck (100% utilization) at around 800 Mbps.

So it does seem to help with SQM tasks, but now it gets me wondering whether I can direct SQM/CAKE to run on a dedicated core manually instead of enabling packet steering for the whole router. :raccoon:
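
There is no direct "run CAKE on core N" knob, since the qdisc work runs in softirq context on whichever CPU processes the packets, but you can approximate it by hand-steering the WAN IRQ and the RPS masks. A rough, hypothetical sketch; the IRQ number 40 and the interface names are placeholders, so check /proc/interrupts on your device first:

# Find the WAN NIC's IRQ line
grep eth1 /proc/interrupts
# Pin that IRQ (assumed to be 40 here) to CPU1
echo 2 > /proc/irq/40/smp_affinity
# Push receive processing for the LAN side onto CPUs 2-3 (mask 0xc)
echo c > /sys/class/net/eth0/queues/rx-0/rps_cpus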

I have the Flint 2 and I see better latency without packet steering.

The load gets spread across my CPUs at around 300 Mbps.

With packet steering on I see higher CPU usage, so I have it off. Maybe the MT76 driver already has packet steering enabled?

In my testing, the exact opposite is true: packet steering massively improves processing and drops the latency at speeds over 500 Mbps.

If you allow me to ask...

What do you mean by latency? The ping in ms, or something else?


I had the same issue on the BPI-R4: when I activate packet steering I'm not getting full speeds, so I need to set the IRQ affinities manually across CPUs to get full speeds.

I saw your other post, and it seems you are also using DSCP Classify with the kmod package.

I wonder if DSCP Classify is benefiting from the packet steering setting in the GUI. By the way, it's the setting that has a checkbox, right? It's either on or off?

DSCP marking happens on the same CPU, so yes: 10 CPUs, 10 marking queues.

Speed test from Cloudflare, Waveform bufferbloat test, plus ping (1000 packets at least) under typical network usage.
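
For a concrete baseline, something like this while the link is busy (1.1.1.1 is just an arbitrary target; at the default 1 s interval, 1000 packets take about 17 minutes):

# Run during a speed test / normal load, then compare min/avg/max
ping -c 1000 1.1.1.1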

The issue with not using packet steering is that at speeds above 500 Mbps, my CPU1, which handles the eth1 (WAN) port and apparently SQM/CAKE, gets overwhelmed and starts to drop packets / add latency. This gets solved when I enable packet steering, as eth1 and SQM/CAKE then run on separate CPUs and the bottleneck happens very near line speed (960 Mbps).

I think you can achieve the same effect with manual IRQ scheduling. It just so happens that for the Flint 2, packet steering seems to work out of the box.
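
Either way, it's worth verifying where the interrupts and steering masks actually end up; a quick check, with eth1 standing in for the WAN port:

# Which CPU services the WAN IRQ
grep eth1 /proc/interrupts
# Which CPUs the RPS mask currently covers
cat /sys/class/net/eth1/queues/rx-0/rps_cpus
# Per-CPU softirq counters (watch NET_RX/NET_TX grow during a speed test)
cat /proc/softirqs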