Irqbalance throwing errors on some routers after latest build following git pull

I build my own images from source.
My previous build was: OpenWrt 23.05-SNAPSHOT r23630-842932a63d / LuCI openwrt-23.05 branch git-23.315.63824-5a81162
With this build irqbalance was enabled and working on all my routers.
Content of /etc/config/irqbalance:

config irqbalance 'irqbalance'
	option enabled '1'

After today's git pull the build is OpenWrt 23.05-SNAPSHOT r23645-7606dac661 / LuCI openwrt-23.05 branch git-23.315.63824-5a81162

After flashing this new build, irqbalance started throwing the following errors:

Dec  7 15:46:07 vespero /usr/sbin/irqbalance: Cannot change IRQ 44 affinity: Invalid argument
Dec  7 15:46:07 vespero /usr/sbin/irqbalance: Cannot change IRQ 54 affinity: Invalid argument
Dec  7 15:46:07 vespero /usr/sbin/irqbalance: Cannot change IRQ 46 affinity: Invalid argument
Dec  7 15:46:07 vespero /usr/sbin/irqbalance: Cannot change IRQ 53 affinity: Invalid argument
Dec  7 15:46:07 vespero /usr/sbin/irqbalance: Cannot change IRQ 48 affinity: Invalid argument
Dec  7 15:46:07 vespero /usr/sbin/irqbalance: Cannot change IRQ 49 affinity: Invalid argument
Dec  7 15:46:07 vespero /usr/sbin/irqbalance: Cannot change IRQ 50 affinity: Invalid argument

on the following routers:
Linksys EA8300, Linksys MR8300, Netgear R7800.

It continues to work fine on the Asus TUF AX4200, Linksys EA8500 and BT Home Hub 5.

What changed, and why did irqbalance stop working on some of the routers listed above?

I have the same problem on my R7800 router:

Thu Dec  7 01:15:21 2023 daemon.warn /usr/sbin/irqbalance: Cannot change IRQ 44 affinity: Invalid argument
Thu Dec  7 01:15:21 2023 daemon.warn /usr/sbin/irqbalance: Cannot change IRQ 54 affinity: Invalid argument
Thu Dec  7 01:15:21 2023 daemon.warn /usr/sbin/irqbalance: Cannot change IRQ 46 affinity: Invalid argument
Thu Dec  7 01:15:21 2023 daemon.warn /usr/sbin/irqbalance: Cannot change IRQ 53 affinity: Invalid argument
Thu Dec  7 01:15:21 2023 daemon.warn /usr/sbin/irqbalance: Cannot change IRQ 48 affinity: Invalid argument
Thu Dec  7 01:15:21 2023 daemon.warn /usr/sbin/irqbalance: Cannot change IRQ 49 affinity: Invalid argument
Thu Dec  7 01:15:21 2023 daemon.warn /usr/sbin/irqbalance: Cannot change IRQ 50 affinity: Invalid argument
Thu Dec  7 01:15:31 2023 daemon.warn /usr/sbin/irqbalance: Cannot change IRQ 44 affinity: Invalid argument
Thu Dec  7 01:15:31 2023 daemon.warn /usr/sbin/irqbalance: Cannot change IRQ 54 affinity: Invalid argument
Thu Dec  7 01:15:31 2023 daemon.warn /usr/sbin/irqbalance: Cannot change IRQ 46 affinity: Invalid argument

The reason seems to be upstream irqbalance changes to the meson build system (which was pushed upstream by @neheb and is still considered a sideshow, as discussed in https://github.com/Irqbalance/irqbalance/pull/279).

There are post-1.9.3 fixes for meson upstream, and I will try them.
Apparently the current code in 1.9.3 requires dependency adjustments for meson.

OT I guess…

The project author’s reasoning is flawed. If the meson stuff is hidden away in a contrib directory, nobody will find and use meson.

Anyway, meson is much easier to patch than autoconf files.

Hi Hnyman,

Thank you for your effort. Irqbalance is a really important package for performance, and I hope you manage to fix it.

All the best

It seems the error is not actually on the meson side (although that might also need attention, as we are still using your meson build file from 1.9.0 instead of the one from 1.9.3).

The error comes from irqbalance treating some errors as transient and not disabling the affected IRQs at the first failure.

Specifically, some 23.05 devices are hitting the EINVAL error.
This is from my R7800, log and strace for one interrupt (48):

Sat Dec  9 10:17:27 2023 daemon.warn irqbalance: Cannot change IRQ 53 affinity: Invalid argument
Sat Dec  9 10:17:27 2023 daemon.warn irqbalance: Cannot change IRQ 24 affinity: I/O error
Sat Dec  9 10:17:27 2023 daemon.warn irqbalance: IRQ 24 affinity is now unmanaged
Sat Dec  9 10:17:27 2023 daemon.warn irqbalance: Cannot change IRQ 48 affinity: Invalid argument


open("/proc/irq/48/smp_affinity", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 6
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6ed9000
ioctl(6, TIOCGWINSZ, 0xbeb134e8)        = -1 ENOTTY (Not a tty)
writev(6, [{iov_base="00000001", iov_len=8}, {iov_base=NULL, iov_len=0}], 2) = -1 EINVAL (Invalid argument)
close(6)                                = 0
munmap(0xb6ed9000, 4096)                = 0
clock_gettime64(CLOCK_REALTIME, {tv_sec=1702109847, tv_nsec=340549081}) = 0
sendto(5, "<28>Dec  9 08:17:27 irqbalance: "..., 80, 0, NULL, 0) = 80
writev(1, [{iov_base="Cannot change IRQ 48 affinity: I"..., iov_len=47}, {iov_base="\n", iov_len=1}], 2) = 48
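
The failing write in the strace can be tried outside irqbalance with a few lines of C. This is only a minimal sketch: `try_set_affinity()` is a hypothetical helper, not an irqbalance function, and it uses a plain `open()`/`write()` instead of the stdio calls that produced the `O_CREAT|O_TRUNC` and `writev()` seen above.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of irqbalance): write a hex CPU mask
 * string to an smp_affinity-style file.
 * Returns 0 on success, or the errno value of the call that failed.
 */
int try_set_affinity(const char *path, const char *mask)
{
	int fd = open(path, O_WRONLY);
	if (fd < 0)
		return errno;

	int err = 0;
	/* The kernel validates the mask at write time, so this is
	 * where an error such as EINVAL would show up. */
	if (write(fd, mask, strlen(mask)) < 0)
		err = errno;

	close(fd);
	return err;
}
```

Going by the strace, a call like `try_set_affinity("/proc/irq/48/smp_affinity", "00000001")` run as root on an affected device would be expected to return `EINVAL`, but the sketch has not been run on those devices.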

Changing EINVAL to be a permanent error fixes it, and reverts to the earlier behaviour:

--- a/activate.c
+++ b/activate.c
@@ -98,11 +98,11 @@ error:
 	case ENOSPC: /* Specified CPU APIC is full. */
 	case EAGAIN: /* Interrupted by signal. */
 	case EBUSY: /* Affinity change already in progress. */
-	case EINVAL: /* IRQ would be bound to no CPU. */
 	case ERANGE: /* CPU in mask is offline. */
 	case ENOMEM: /* Kernel cannot allocate CPU mask. */
 		/* Do not blacklist the IRQ on transient errors. */
 		break;
+	case EINVAL: /* IRQ would be bound to no CPU. */
 	default:
 		/* Any other error is considered permanent. */
 		info->flags |= IRQ_FLAG_AFFINITY_UNMANAGED;

I will probably add that as a quick fix for 23.05, but I wonder why the unpatched 1.9.3 seemed to work in the main/master build for the R7800.
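
To illustrate the idea behind the patch, here is a minimal standalone sketch of the transient/permanent split. It is not the actual irqbalance code: `struct irq_sketch`, `is_transient()` and `note_affinity_error()` are hypothetical names and the flag value is made up. Only the errno list and the flag name `IRQ_FLAG_AFFINITY_UNMANAGED` are taken from the diff above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag name from the diff; the numeric value is an assumption. */
#define IRQ_FLAG_AFFINITY_UNMANAGED (1u << 0)

/* Hypothetical stand-in for irqbalance's per-IRQ bookkeeping. */
struct irq_sketch {
	int irq;
	unsigned int flags;
};

/* Errors that remain in the "transient" list after the patch. */
static bool is_transient(int err)
{
	switch (err) {
	case ENOSPC: /* Specified CPU APIC is full. */
	case EAGAIN: /* Interrupted by signal. */
	case EBUSY:  /* Affinity change already in progress. */
	case ERANGE: /* CPU in mask is offline. */
	case ENOMEM: /* Kernel cannot allocate CPU mask. */
		return true;
	default:     /* Everything else, now including EINVAL. */
		return false;
	}
}

/* Record a failed affinity write: permanent errors stop further attempts. */
void note_affinity_error(struct irq_sketch *info, int err)
{
	assert(info != NULL);
	if (!is_transient(err))
		info->flags |= IRQ_FLAG_AFFINITY_UNMANAGED;
}
```

With this split, an `EBUSY` leaves the IRQ managed so it is retried on the next pass, while `EINVAL` or `EIO` sets the unmanaged flag at the first failure.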

I pulled the latest and I saw your patches. It is all good now. Thank you for fixing it quickly!

How long will it take until the package can be installed normally via the package manager?

Depends on queuing luck in the buildbot round-robin logic...
It can take 1-3 days before your architecture is built and irqbalance 1.9.3-2 is downloadable.

And 23.05 is built by a separate buildbot.

Hi,

I'm having irqbalance issues on a DAP-2610 running a snapshot build.

root@DAP2610:/# opkg list-installed | grep -i irqbal
irqbalance - 1.9.3-2
root@DAP2610:/#

Thu Dec 14 09:54:54 2023 daemon.warn /usr/sbin/irqbalance: Cannot change IRQ 26 affinity: I/O error
Thu Dec 14 09:54:54 2023 daemon.warn /usr/sbin/irqbalance: IRQ 26 affinity is now unmanaged
Thu Dec 14 09:54:54 2023 daemon.warn /usr/sbin/irqbalance: Cannot change IRQ 57 affinity: Invalid argument
Thu Dec 14 09:54:54 2023 daemon.warn /usr/sbin/irqbalance: IRQ 57 affinity is now unmanaged

Any idea what this is about?

Thanks,

That is normal.
Those two IRQs are simply recognized as unmanageable and marked as such ("is now unmanaged"), so that the error is not repeated on every run (every 10 seconds).
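
As a rough illustration of why the messages appear only once, here is a simplified sketch of a balancing pass that skips IRQs already marked unmanaged. Everything in it is hypothetical (`struct irq_state`, `balance_pass()`, `fake_apply()`, the `UNMANAGED` value), and for brevity it treats only `EINVAL` and `EIO` as permanent; the real irqbalance logic is more involved.

```c
#include <errno.h>
#include <stddef.h>

#define UNMANAGED 1u /* hypothetical flag value */

struct irq_state {
	int irq;
	unsigned int flags;
};

/* Hypothetical callback: tries to set affinity, returns 0 or an errno. */
typedef int (*apply_fn)(int irq);

/* One balancing pass; returns the number of warning lines it would log. */
int balance_pass(struct irq_state *irqs, size_t n, apply_fn apply)
{
	int warnings = 0;

	for (size_t i = 0; i < n; i++) {
		if (irqs[i].flags & UNMANAGED)
			continue; /* given up earlier: no attempt, no log line */

		int err = apply(irqs[i].irq);
		if (err == 0)
			continue;

		warnings++; /* "Cannot change IRQ N affinity: ..." */
		if (err == EINVAL || err == EIO) {
			irqs[i].flags |= UNMANAGED;
			warnings++; /* "IRQ N affinity is now unmanaged" */
		}
	}
	return warnings;
}

/* Fake for illustration: IRQ 26 fails with EIO, IRQ 57 with EINVAL. */
int fake_apply(int irq)
{
	if (irq == 26)
		return EIO;
	if (irq == 57)
		return EINVAL;
	return 0;
}
```

With IRQs 26 and 57 failing as in the log above, the first pass produces four warning lines and later passes produce none.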

Thanks for your help, @hnyman. In previous OpenWrt versions there were no log messages about irqbalance behaviour.

So it will log these messages from now on (no problem for me)?

Thanks,
