Irqbalance isn't working? (RPi4)

I'm running a snapshot build on an RPi4 with the irqbalance 1.9.0-6 package installed.
/etc/config/irqbalance:

config irqbalance 'irqbalance'
	option enabled '1'

	# Level at which irqbalance partitions cache domains.
	# Default is 2 (L2$).
	#option deepestcache '2'

	# The default value is 10 seconds
	#option interval '10'

	# List of IRQ's to ignore
	#list banirq '36'
	#list banirq '69'

It's enabled and running, yet it doesn't seem to be spreading network traffic and interrupts across all 4 cores. Here is the screenshot:

I even tested with and without qosify, flow offloading, and packet steering.
Is this normal?

You are correct, it's not working as you expect. I ended up adding this to my /etc/rc.local:

# Move RPS/XPS to CPU2 and CPU3 only (disable packet steering or this gets overwritten)
echo c > /sys/class/net/eth0/queues/rx-*/rps_cpus
echo c > /sys/class/net/eth0/queues/tx-*/xps_cpus
echo c > /sys/class/net/eth1/queues/rx-*/rps_cpus
#echo c > /sys/class/net/eth1/queues/tx-*/xps_cpus # not available in eth1 (USB NIC)

# Pin eth0 (LAN NIC) IRQs to CPU1 (rx/tx) to help improve bandwidth (might hurt latency?)
echo 2 > /proc/irq/39/smp_affinity
echo 2 > /proc/irq/40/smp_affinity
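
For reference, the values written to smp_affinity, rps_cpus and xps_cpus are hexadecimal CPU bitmasks (bit 0 = CPU0), so on a 4-core RPi4:

# CPU bitmask cheat sheet (bit 0 = CPU0):
#   1 = CPU0   2 = CPU1   4 = CPU2   8 = CPU3   c = CPU2+CPU3   f = all cores
# Check the mask an IRQ currently has (IRQ 39 taken from the snippet above):
cat /proc/irq/39/smp_affinity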
1 Like

irqbalance is designed to move each distinct interrupt to the 'least busy' core per sampling period.

It's not possible to "split" single-interrupt traffic (a single NIC's inbound queue, for instance) across multiple cores; delivery is necessarily serial. You can flip that interrupt around the cores as often as you like, but that really just makes a bad problem worse.
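
What can be spread is the software side of packet processing: Receive Packet Steering (RPS) lets the other cores pick up the protocol work even though the hardware IRQ itself still lands on one core. A minimal sketch, assuming eth0 with a single rx-0 queue (adjust the queue name and mask for your NIC):

# Hand post-IRQ packet processing for eth0's first rx queue to CPU1-CPU3
# (mask e = binary 1110); the hardware interrupt itself still fires on a single core
echo e > /sys/class/net/eth0/queues/rx-0/rps_cpus

This is essentially what OpenWrt's Packet Steering option automates, which is why the earlier reply disables that option before setting the masks by hand.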

2 Likes

I uninstalled irqbalance, enabled packet steering (although it didn't seem to do anything), then ran:

# IRQ numbers are specific to this build; check /proc/interrupts for yours
echo 8 >/proc/irq/18/smp_affinity
echo 8 >/proc/irq/32/smp_affinity
echo 2 >/proc/irq/39/smp_affinity
echo 4 >/proc/irq/40/smp_affinity

Result:

Is this wrong, or does it need more changes?

It should be fine if that is the distribution you want. I don't see a problem with it; test it.
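
One simple way to test it is to compare the per-CPU interrupt counters before and after pushing some traffic through; the eth0/xhci pattern below is just a guess at the relevant lines, so match whatever names /proc/interrupts shows on your build:

# Snapshot the counters, generate traffic for a while, then compare
grep -E 'eth0|xhci' /proc/interrupts
sleep 30
grep -E 'eth0|xhci' /proc/interrupts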

1 Like

Came here to report back. At the moment I couldn't get the snapshot to work with samba4, so I'm running the 22.03.2 firmware with packet steering off and the following code in /etc/rc.local:

echo 8 >/proc/irq/18/smp_affinity
echo 8 >/proc/irq/32/smp_affinity
echo 2 >/proc/irq/39/smp_affinity     
echo 4 >/proc/irq/40/smp_affinity 
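
A quick sanity check after boot, assuming the same IRQ numbers as in the snippet above (they can change between kernel builds):

# Confirm that the affinity masks set from /etc/rc.local actually stuck
for irq in 18 32 39 40; do
    printf 'IRQ %s -> mask ' "$irq"
    cat /proc/irq/$irq/smp_affinity
done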

Performance is awesome, speed is stable. I'm lovin' it :wink:

2 Likes

I added a PR to master for arm64 (aarch64) routers like RT3200, RPi4 etc.

The new package version will be 1.9.2-2

(If it works, I will backport it to 22.03)

2 Likes

Thank you. I'll try it out soon and report back.
Update: @hnyman I installed a snapshot along with irqbalance:

base-files bcm27xx-gpu-fw brcmfmac-nvram-43455-sdio busybox ca-bundle cypress-firmware-43455-sdio dnsmasq dropbear e2fsprogs firewall4 fstools iwinfo kmod-brcmfmac kmod-fs-vfat
kmod-nft-offload kmod-nls-cp437 kmod-nls-iso8859-1 kmod-r8169 kmod-sound-arm-bcm2835 kmod-sound-core kmod-usb-hid kmod-usb-net-lan78xx libc libgcc libustream-mbedtls logd
mkf2fs mtd netifd nftables odhcp6c odhcpd-ipv6only opkg partx-utils ppp ppp-mod-pppoe procd procd-seccomp procd-ujail uci uclient-fetch urandom-seed wpad-basic-wolfssl kmod-usb-net-rtl8152
kmod-usb-net-asix kmod-usb-net-asix-ax88179 luci-ssl luci-app-upnp tcpdump kmod-netem qosify kmod-sched kmod-tcp-bbr bash curl luci-proto-relay ca-certificates block-mount fdisk irqbalance

/etc/config/irqbalance:

config irqbalance 'irqbalance'
	option enabled '1'

	# Level at which irqbalance partitions cache domains.
	# Default is 2 (L2$).
	#option deepestcache '2'

	# The default value is 10 seconds
	#option interval '10'

	# List of IRQ's to ignore
	#list banirq '36'
	#list banirq '69'

Restarted the router and still nothing.

The irqbalance version says 1.9.2-1.

You have the old package.
Build from source yourself, or wait until the buildbot has built it (this may take 2-3 days).
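
To see when the feed actually carries the new version, the standard opkg commands are enough:

opkg update
opkg list irqbalance                     # version offered by the feed
opkg list-installed | grep irqbalance    # version currently installed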

1 Like

Same problem here; it does not seem to work.

@hnyman I installed irqbalance on an OpenWrt 22.03.3 build. It's working, thank you!

  1. But why do we have to enable it via option enabled '1' in the config file even when it already shows as enabled in the LuCI Startup list?
  2. It also seems like it doesn't use CPU2 (I may be wrong), and idle CPU is about 81% under load, whereas the manual approach (echo 8 >/proc/irq/18/smp_affinity, echo 8 >/proc/irq/32/smp_affinity, echo 2 >/proc/irq/39/smp_affinity, echo 4 >/proc/irq/40/smp_affinity) performs better with less CPU usage (~85% idle under load).
    What do you think?

It is two different things:

  • The Startup list in LuCI shows all services whose init scripts are enabled, irqbalance among them. That is a pretty low-level on/off toggle, and it does not survive a sysupgrade.
  • Most applications additionally offer detailed configuration via a uci config file, which includes its own separate enable/disable option (example below).
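
A minimal sketch of the two switches for irqbalance specifically:

# init-script level: whether the service is started at boot at all
/etc/init.d/irqbalance enable
/etc/init.d/irqbalance start

# uci level: the separate enabled option in /etc/config/irqbalance
uci set irqbalance.irqbalance.enabled='1'
uci commit irqbalance
/etc/init.d/irqbalance restart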

Irqbalance is no magic. It has some guessing logic about the role of the IRQs, but it might not correctly recognize all IRQs, and may leave them unhandled.

2 Likes

In my case, manually assigned IRQ affinity works better than irqbalance. Would you recommend going manual, then, or should I stick with irqbalance because it has more pros than cons compared to the manual approach?

I have recently noted with ipq807x/DL-WRX36 that the dynamic IRQ assignments from irqbalance may make the router crash, while manual affinity assignment done once seems to work well.

Irqbalance is no magic. It started in the x86 world and may not be perfect for ARM chips.
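
If you want to keep irqbalance running but leave specific IRQs alone, the banirq list from the config shown earlier in the thread can exclude them; the IRQ number below is only an example:

config irqbalance 'irqbalance'
	option enabled '1'
	# keep a manually pinned IRQ out of irqbalance's reach
	list banirq '50'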

3 Likes

Hey!
Any idea why I cannot balance IRQs between the 4 cores of my Raspberry Pi 4B?

I have installed the latest irqbalance version (1.9.2-2).
After that I set enabled '1' in the config file
and also checked the Packet Steering option in LuCI.

But it still seems to be using mostly CPU0:

           CPU0       CPU1       CPU2       CPU3
 25:          0          0          0          0     GICv2  29 Level     arch_timer
 26:      10448       5295      22314       3954     GICv2  30 Level     arch_timer
 29:       3610          0          0          0     GICv2  65 Level     fe00b880.mailbox
 32:         12          0          0          0     GICv2 153 Level     uart-pl011
 35:         32          0          0          0     GICv2 114 Level     DMA IRQ
 42:          1          0          0          0     GICv2  66 Level     VCHIQ doorbell
 43:      15744          0          0          0     GICv2 158 Level     mmc1, mmc0
 50:    1515577          0          0          0     GICv2 189 Level     eth0
 51:     366783          0          0          0     GICv2 190 Level     eth0
 57:          0          0          0          0     GICv2 175 Level     PCIe PME, aerdrv
 58:     891159          0          0          0  BRCM STB PCIe MSI 524288 Edge      xhci_hcd
IPI0:          0          0          0          0  CPU wakeup interrupts
IPI1:          0          0          0          0  Timer broadcast interrupts
IPI2:       1052       1118       1313       1114  Rescheduling interrupts
IPI3:       4299     282170     294255     466883  Function call interrupts
IPI4:          0          0          0          0  CPU stop interrupts
IPI5:       3549       1823      14761       1045  IRQ work interrupts
IPI6:          0          0          0          0  completion interrupts
Err:          0

My WAN interface is on eth0, and I have 2 LAN networks, one on eth1 and the other on eth2.

This isn't a solution but a mere workaround.
Try this at startup:

echo 8 >/proc/irq/29/smp_affinity    # fe00b880.mailbox -> CPU3
echo 8 >/proc/irq/43/smp_affinity    # mmc0/mmc1 -> CPU3
echo 2 >/proc/irq/50/smp_affinity    # eth0 -> CPU1
echo 4 >/proc/irq/51/smp_affinity    # eth0 -> CPU2
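
To keep it across reboots, those lines can go into /etc/rc.local above the final exit 0, as earlier posts in this thread do; afterwards the eth0 counters should start climbing on CPU1 and CPU2 instead of CPU0:

grep eth0 /proc/interrupts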