For general Linux performance testing on multi-CPU systems, you might want to look at per-core utilization, split between user/system and specifically irq/sirq (parts of system time). Often a single core is 100% utilized by IRQ handling and becomes the system bottleneck, while the other cores sit mostly idle.
I've used mpstat and top to get this. For example, this is my single-CPU router doing ~240 Mbit/s and choking on software IRQs (sirq, last column) at the peak of a test against fast.com:
# while sleep 2; do top -n1 -b |grep "CPU"|grep -vE "grep|PPID"; done
CPU: 0.0% usr 0.0% sys 0.0% nic 100% idle 0.0% io 0.0% irq 0.0% sirq
CPU: 0.0% usr 0.0% sys 0.0% nic 83.3% idle 0.0% io 0.0% irq 16.6% sirq
CPU: 0.0% usr 0.0% sys 0.0% nic 91.6% idle 0.0% io 0.0% irq 8.3% sirq
CPU: 6.6% usr 0.0% sys 0.0% nic 13.3% idle 0.0% io 0.0% irq 80.0% sirq
CPU: 0.0% usr 8.3% sys 0.0% nic 0.0% idle 0.0% io 0.0% irq 91.6% sirq
CPU: 0.0% usr 0.0% sys 0.0% nic 0.0% idle 0.0% io 0.0% irq 100% sirq
CPU: 0.0% usr 0.0% sys 0.0% nic 90.9% idle 0.0% io 0.0% irq 9.0% sirq
CPU: 0.0% usr 0.0% sys 0.0% nic 100% idle 0.0% io 0.0% irq 0.0% sirq
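If the router's top lacks the sirq column, the same number can probably be derived from /proc/stat directly. The following is only a minimal sketch, assuming the standard Linux /proc/stat column order (name, user, nice, system, idle, iowait, irq, softirq, steal, ...). The function name softirq_pct and the snapshot file paths are hypothetical, not from any existing tool.

```shell
# Hypothetical helper: given two /proc/stat snapshots (files), print the
# percentage of time each "cpu" line spent in softirq between them.
# Assumed column order: name user nice system idle iowait irq softirq steal ...
softirq_pct() {
  LC_ALL=C awk '
    /^cpu/ {
      # total time = user through steal (fields 2..9)
      total = 0
      for (i = 2; i <= 9 && i <= NF; i++) total += $i
      if (NR == FNR) {              # first file: remember the baseline
        t0[$1] = total; s0[$1] = $8
      } else if ($1 in t0) {        # second file: compute the delta
        dt = total - t0[$1]
        if (dt > 0) printf "%s %.1f\n", $1, 100 * ($8 - s0[$1]) / dt
      }
    }
  ' "$1" "$2"
}

# Usage on a live system (the aggregate "cpu" line is printed as well):
#   cat /proc/stat > /tmp/a; sleep 2; cat /proc/stat > /tmp/b
#   softirq_pct /tmp/a /tmp/b
```

Taking the difference of two snapshots matters because the counters in /proc/stat are cumulative since boot; a single read only gives a long-term average.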
Check whether your router's top supports the -1 option to show the per-CPU distribution; with it you might be able to pinpoint the bottleneck. For example, on this 4-core x64 Linux box one core (Cpu3) handles nearly all the software IRQs, spending up to 17.6% of its time on them (si, second-to-last column):
$ while sleep 2; do top -n1 -b -1 |grep "%Cpu"|grep -vE "grep|PPID";echo; done
%Cpu0 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 5.6 us, 5.6 sy, 0.0 ni, 83.3 id, 0.0 wa, 5.6 hi, 0.0 si, 0.0 st
%Cpu0 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 5.9 us, 0.0 sy, 0.0 ni, 94.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 6.2 us, 6.2 sy, 0.0 ni, 87.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 0.0 us, 0.0 sy, 0.0 ni, 93.3 id, 0.0 wa, 0.0 hi, 6.7 si, 0.0 st
%Cpu0 : 6.2 us, 12.5 sy, 0.0 ni, 81.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 0.0 us, 0.0 sy, 0.0 ni, 82.4 id, 0.0 wa, 0.0 hi, 17.6 si, 0.0 st
%Cpu0 : 6.7 us, 0.0 sy, 0.0 ni, 93.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 11.8 us, 5.9 sy, 0.0 ni, 82.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 0.0 us, 0.0 sy, 0.0 ni, 82.4 id, 0.0 wa, 0.0 hi, 17.6 si, 0.0 st
%Cpu0 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 5.9 us, 11.8 sy, 0.0 ni, 82.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 5.9 us, 11.8 sy, 0.0 ni, 82.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 0.0 us, 0.0 sy, 0.0 ni, 87.5 id, 0.0 wa, 0.0 hi, 12.5 si, 0.0 st
%Cpu0 : 0.0 us, 6.2 sy, 0.0 ni, 93.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 0.0 us, 0.0 sy, 0.0 ni, 94.4 id, 0.0 wa, 0.0 hi, 5.6 si, 0.0 st
%Cpu2 : 5.6 us, 16.7 sy, 0.0 ni, 77.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 0.0 us, 0.0 sy, 0.0 ni, 88.2 id, 0.0 wa, 0.0 hi, 11.8 si, 0.0 st
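Scanning many samples by eye gets tedious, so the peak si value per core can be pulled out with a small filter. This is a rough sketch, assuming the "%CpuN :" line format shown above; the function name peak_si is hypothetical, and other top versions may format the line differently.

```shell
# Hypothetical helper: reads "%CpuN : ..." lines (as printed by top -b -1)
# on stdin and prints the highest softirq ("si") percentage seen per core.
peak_si() {
  LC_ALL=C awk '
    /^%Cpu[0-9]+/ {
      cpu = $1                          # e.g. "%Cpu3"
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^si,?$/) {            # the value precedes its "si," label
          v = $(i - 1)
          sub(/^.*,/, "", v)            # handles fused fields like "hi,100.0"
          v += 0
          if (!(cpu in peak) || v > peak[cpu]) peak[cpu] = v
        }
      }
    }
    END { for (c in peak) printf "%s %.1f\n", c, peak[c] }
  ' | LC_ALL=C sort
}

# Usage: collect five samples two seconds apart, then summarize.
#   for i in 1 2 3 4 5; do top -n1 -b -1; sleep 2; done | peak_si
```

A core whose peak sits far above the others is the one handling most of the interrupt load.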