I have a gigabit uplink and was enjoying an even spread of the softirq load across all CPUs on my router(s). I noticed that this went away after upgrading one of them to 21.02, and it appears to still be gone in 22.03. None of the recommended IRQ balancing solutions (irqbalance, SMP affinity, etc.) accomplish the same thing. That is, in 19.07, the interrupts of a single IRQ were spread across all available cores. The clearest way to see this is with htop. On a b1300 with 19.07, I would see nearly 900 Mbit/s with all four cores nearly maxed. With 21.02 or 22.03, I see a single core maxed and bandwidth pegged at about 250 Mbit/s. The oddest thing is that /proc/irq/default_smp_affinity is set to 'f', which should include all 4 cores. However, I cannot alter /proc/irq/default_smp_affinity or any of the /proc/irq/*/smp_affinity files.
I suspect that what we have in 21.02 and 22.03 is more efficient on a multi-core system with multiple IRQs, and irrelevant for most OpenWrt deployments on single-core routers. However, for those of us trying to eke out every last packet per second on a multi-core router, this is not ideal.
I'm hoping to learn of a runtime configuration, or even a kernel build configuration, that will bring back the old behavior. I've spent considerable time searching for solutions but have yet to find anyone even discussing this change.
This might help: https://openwrt.org/docs/guide-user/advanced/load_balancing_-_tuning_smp_irq
Have you tried enabling packet steering in Network -> Interfaces -> Global network options?
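For the CLI-inclined, that checkbox maps to a UCI option. A config sketch, assuming 21.02 or later where `network.globals.packet_steering` is available:

```shell
# Enable packet steering (same as the LuCI checkbox under
# Network -> Interfaces -> Global network options):
uci set network.globals.packet_steering='1'
uci commit network
/etc/init.d/network restart
```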
I do love 'may help or hinder network speed'. Those are the only 2 options....
packet_steering does not appear to make any difference. All softirqs are hitting CPU0.
The load balancing page is helpful, but unfortunately I've already tried the things there. Most of those configs show how to direct different IRQs to different CPUs, whereas what I'm after is the even distribution of all IRQs across all CPUs without any specific steering, like 19.07 did.
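For what it's worth, OpenWrt's packet steering option works by writing CPU masks into the kernel's RPS sysfs knobs, so you can also set them by hand. A sketch, assuming a 4-core box and a WAN interface named eth0 (adjust both to your hardware):

```shell
# Build a hex mask covering CPUs 1-3: (1 << ncpus) - 2 sets bits 1..3
# and leaves bit 0 (CPU0) clear, so the CPU taking the hard IRQ stays free.
ncpus=4
mask=$(printf '%x' $(( (1 << ncpus) - 2 )))   # "e"

# Apply it to every receive queue of the interface (skip if absent):
for q in /sys/class/net/eth0/queues/rx-*/rps_cpus; do
    [ -e "$q" ] && echo "$mask" > "$q"
done
```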
It's VERY unlikely you had four Pi cores lit at or below gigabit line rate, because unless they were 64-byte packets that's simply not possible.
That being the case, without getting into the peculiarities of USB-based interfaces and how they are bridged to the CPU, one of the issues is likely that you installed irqbalance but didn't actually turn it on - the latter being a necessary step.
Sorry for any confusion, but the scenario you reference was a GL.iNet b1300. Also ARM, but not a Pi. I'm not looking for what irqbalance does, but for what 19.07 did, which seemed to automatically spread all of the IRQ load across all CPUs. I'm starting to think that the difference is entirely within the Linux kernel itself.
Here's the interrupt spread after a fresh boot and one speedtest. In this test it has the default network setup and is just doing NAT (no PPPoE), so the max speed was up around 900 Mbit/s, but 100% of the load was on CPU0:
b1300:~# cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
26: 12488 6314 8944 69521 GIC-0 20 Level arch_timer
30: 96373 0 0 0 GIC-0 270 Level bam_dma
31: 2518504 0 0 0 GIC-0 127 Level 78b5000.spi
32: 0 0 0 0 GIC-0 239 Level bam_dma
33: 8 0 0 0 GIC-0 139 Level msm_serial0
50: 16 0 0 0 GIC-0 200 Level ath10k_ahb
67: 15 0 0 0 GIC-0 201 Level ath10k_ahb
68: 34268 0 0 0 GIC-0 97 Edge edma_eth_tx0
69: 1 0 0 0 GIC-0 98 Edge edma_eth_tx1
70: 36599 0 0 0 GIC-0 99 Edge edma_eth_tx2
71: 6 0 0 0 GIC-0 100 Edge edma_eth_tx3
72: 366 0 0 0 GIC-0 101 Edge edma_eth_tx4
73: 24 0 0 0 GIC-0 102 Edge edma_eth_tx5
74: 10 0 0 0 GIC-0 103 Edge edma_eth_tx6
75: 2 0 0 0 GIC-0 104 Edge edma_eth_tx7
76: 194 0 0 0 GIC-0 105 Edge edma_eth_tx8
77: 11 0 0 0 GIC-0 106 Edge edma_eth_tx9
78: 20 0 0 0 GIC-0 107 Edge edma_eth_tx10
79: 0 0 0 0 GIC-0 108 Edge edma_eth_tx11
80: 113 0 0 0 GIC-0 109 Edge edma_eth_tx12
81: 10 0 0 0 GIC-0 110 Edge edma_eth_tx13
82: 27 0 0 0 GIC-0 111 Edge edma_eth_tx14
83: 0 0 0 0 GIC-0 112 Edge edma_eth_tx15
84: 40186 0 0 0 GIC-0 272 Edge edma_eth_rx0
86: 45529 0 0 0 GIC-0 274 Edge edma_eth_rx2
88: 23243 0 0 0 GIC-0 276 Edge edma_eth_rx4
90: 35559 0 0 0 GIC-0 278 Edge edma_eth_rx6
100: 1 0 0 0 msmgpio 5 Edge keys
101: 0 0 0 0 msmgpio 63 Edge keys
102: 0 0 0 0 GIC-0 164 Level xhci-hcd:usb1
103: 0 0 0 0 GIC-0 168 Level xhci-hcd:usb3
IPI0: 0 0 0 0 CPU wakeup interrupts
IPI1: 0 0 0 0 Timer broadcast interrupts
IPI2: 16528 398273 1163267 71042 Rescheduling interrupts
IPI3: 1222 75416 163610 8413 Function call interrupts
IPI4: 0 0 0 0 CPU stop interrupts
IPI5: 4766 1446 1780 1500 IRQ work interrupts
IPI6: 0 0 0 0 completion interrupts
Err: 0
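To make dumps like this easier to read, a small awk one-liner can total the per-CPU counts for just the ethernet IRQs. A sketch: columns 2-5 correspond to CPU0-CPU3 on this 4-core box, and the pattern matches the b1300's edma driver names (adjust for other drivers):

```shell
# Sum per-CPU interrupt counts for all edma_eth_* lines in /proc/interrupts
awk '/edma_eth/ { for (c = 2; c <= 5; c++) sum[c] += $c }
     END { for (c = 2; c <= 5; c++) printf "CPU%d: %d\n", c - 2, sum[c] }' /proc/interrupts
```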
The vast majority of the interrupts did not occur on a network interface.
For anyone with the same situation finding this thread, I think I have it figured out.
CONCLUSION: I'm not sure if it was /etc/hotplug.d/net/20-smp-tune or something else in 19.07, but it appears that while there was no IRQ SMP magic, all 8 network IRQs were evenly spread across the 4 CPUs in 19.07. Both 21.02 and 22.03 by default (and even with packet_steering enabled) do not accomplish this. However, by playing around with different values in /proc/irq/??/smp_affinity_list, speedtesting, and checking the distribution in /proc/interrupts, I was able to come up with a boot script that does a decent job of spreading the IRQ love across all CPUs:
# rebalance IRQ/CPU distribution for b1300:
for i in 70 86; do
    echo '1-3' > /proc/irq/$i/smp_affinity_list
done
for i in 68 72 84; do
    echo '2-3' > /proc/irq/$i/smp_affinity_list
done
for i in 76 88 90 31; do
    echo '3-3' > /proc/irq/$i/smp_affinity_list
done
# verify the assignments took effect:
for i in 70 86 68 72 84 76 88 90; do
    cat /proc/irq/$i/smp_affinity_list
done
YMMV, but /proc/interrupts will tell the story, and you can assign different IRQs according to your needs. I'm using smp_affinity_list because it was easier to think of the correct values, and while '2-3' technically means the IRQ can use CPUs 2 and 3, 99% of the interrupts land on CPU2 (the first value in the range). Some IRQs can only use CPU0, so you'll have to figure out which, if any, of those you have and move the others onto CPUs 1-3.
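One last note on the two interfaces: smp_affinity_list takes human-readable CPU lists like '2-3', while smp_affinity takes a hex bitmask with one bit per CPU. If you ever need the mask form, the conversion is just a shift (cpu_to_mask is a helper name I made up for this sketch):

```shell
# Convert a single CPU number to the hex bitmask that
# /proc/irq/<n>/smp_affinity expects (one bit per CPU).
cpu_to_mask() {
    printf '%x\n' $(( 1 << $1 ))
}

cpu_to_mask 0   # prints 1 -> CPU0 only
cpu_to_mask 3   # prints 8 -> CPU3 only
```

To have the rebalancing survive reboots, the loops above can go in /etc/rc.local (before the final 'exit 0'), which OpenWrt runs at the end of boot.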