Comparative Throughput Testing Including NAT, SQM, WireGuard, and OpenVPN

From general Linux performance testing on multi-CPU systems: it's worth looking at per-core utilization, split between user/system and specifically irq/sirq (kernel time spent handling hardware and software interrupts). Often a single core is 100% busy with IRQ handling and becomes the system bottleneck, while the other cores sit mostly idle.

I've used mpstat and top to get these numbers. For example, this is my single-CPU router doing ~240 Mbit/s and choking on software IRQs (sirq, last column) at the peak of a test against fast.com:

# while sleep 2; do top -n1 -b |grep "CPU"|grep -vE "grep|PPID"; done
CPU:  0.0% usr  0.0% sys  0.0% nic  100% idle  0.0% io  0.0% irq  0.0% sirq
CPU:  0.0% usr  0.0% sys  0.0% nic 83.3% idle  0.0% io  0.0% irq 16.6% sirq
CPU:  0.0% usr  0.0% sys  0.0% nic 91.6% idle  0.0% io  0.0% irq  8.3% sirq
CPU:  6.6% usr  0.0% sys  0.0% nic 13.3% idle  0.0% io  0.0% irq 80.0% sirq
CPU:  0.0% usr  8.3% sys  0.0% nic  0.0% idle  0.0% io  0.0% irq 91.6% sirq
CPU:  0.0% usr  0.0% sys  0.0% nic  0.0% idle  0.0% io  0.0% irq  100% sirq
CPU:  0.0% usr  0.0% sys  0.0% nic 90.9% idle  0.0% io  0.0% irq  9.0% sirq
CPU:  0.0% usr  0.0% sys  0.0% nic  100% idle  0.0% io  0.0% irq  0.0% sirq
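
If the top on your router is too limited, roughly the same irq/sirq split can be derived from the kernel counters in /proc/stat. This is only a sketch: the function name irq_share and the snapshot file paths are my own, and it assumes the usual field order (user nice system idle iowait irq softirq steal) described in proc(5).

```shell
#!/bin/sh
# Sketch: per-core irq/softirq share between two saved /proc/stat snapshots.
# irq_share is a hypothetical helper name, not an existing tool.
# Assumed field layout: cpuN user nice system idle iowait irq softirq steal ...
irq_share() {
    # $1 = earlier snapshot file, $2 = later snapshot file
    awk '
        /^cpu[0-9]/ {
            # Sum user..steal ($2..$9) to get total ticks for this core
            total = 0
            for (i = 2; i <= 9; i++) total += $i
            if (FNR == NR) {
                # First file: remember the baseline counters per core
                t[$1] = total; hi[$1] = $7; si[$1] = $8
                next
            }
            # Second file: report the irq and softirq share of elapsed ticks
            dt = total - t[$1]
            if (dt > 0)
                printf "%s irq %.1f%% sirq %.1f%%\n", $1,
                    100 * ($7 - hi[$1]) / dt, 100 * ($8 - si[$1]) / dt
        }
    ' "$1" "$2"
}
```

To use it, save two snapshots a couple of seconds apart while the speed test is running (cat /proc/stat > /tmp/stat.a; sleep 2; cat /proc/stat > /tmp/stat.b) and then run irq_share /tmp/stat.a /tmp/stat.b.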

Check whether the top on your router supports -1 to show the per-CPU breakdown; with it you may be able to pinpoint the bottleneck. For example, on this 4-core x64 Linux box a single core (Cpu3) handles almost all of the IRQs, spending up to 17.6% of that core on software IRQs (si, second-to-last column):

$ while sleep 2; do top -n1 -b -1 |grep "%Cpu"|grep -vE "grep|PPID";echo; done
%Cpu0  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  5.6 us,  5.6 sy,  0.0 ni, 83.3 id,  0.0 wa,  5.6 hi,  0.0 si,  0.0 st

%Cpu0  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  5.9 us,  0.0 sy,  0.0 ni, 94.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  6.2 us,  6.2 sy,  0.0 ni, 87.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  0.0 us,  0.0 sy,  0.0 ni, 93.3 id,  0.0 wa,  0.0 hi,  6.7 si,  0.0 st

%Cpu0  :  6.2 us, 12.5 sy,  0.0 ni, 81.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  0.0 us,  0.0 sy,  0.0 ni, 82.4 id,  0.0 wa,  0.0 hi, 17.6 si,  0.0 st

%Cpu0  :  6.7 us,  0.0 sy,  0.0 ni, 93.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  : 11.8 us,  5.9 sy,  0.0 ni, 82.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  0.0 us,  0.0 sy,  0.0 ni, 82.4 id,  0.0 wa,  0.0 hi, 17.6 si,  0.0 st

%Cpu0  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  5.9 us, 11.8 sy,  0.0 ni, 82.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  5.9 us, 11.8 sy,  0.0 ni, 82.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  0.0 us,  0.0 sy,  0.0 ni, 87.5 id,  0.0 wa,  0.0 hi, 12.5 si,  0.0 st

%Cpu0  :  0.0 us,  6.2 sy,  0.0 ni, 93.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  0.0 us,  0.0 sy,  0.0 ni, 94.4 id,  0.0 wa,  0.0 hi,  5.6 si,  0.0 st
%Cpu2  :  5.6 us, 16.7 sy,  0.0 ni, 77.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :  0.0 us,  0.0 sy,  0.0 ni, 88.2 id,  0.0 wa,  0.0 hi, 11.8 si,  0.0 st
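
To see which core the network receive work actually lands on, the cumulative softirq counters in /proc/softirqs can be read per CPU. A minimal sketch, assuming the standard layout (a header row of CPU names, then one row per softirq type such as NET_RX) and that the columns are numbered from CPU0 in order; the helper name softirq_per_cpu is my own.

```shell
#!/bin/sh
# Sketch: print the per-CPU counters for one softirq type.
# softirq_per_cpu is a hypothetical helper name, not an existing tool.
softirq_per_cpu() {
    # $1 = softirq name (e.g. NET_RX), $2 = file to read (default /proc/softirqs)
    awk -v name="$1:" '
        # Rows look like "NET_RX:  <count cpu0>  <count cpu1> ..."
        $1 == name {
            for (i = 2; i <= NF; i++) printf "cpu%d %s\n", i - 2, $i
        }
    ' "${2:-/proc/softirqs}"
}
```

Running softirq_per_cpu NET_RX before and after a speed test and comparing the counters should show whether one core is taking nearly all of the receive softirqs, which is the pattern visible on Cpu3 above.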