Help confirm/fix High CPU Soft IRQ with many connections/speedtest.net (20.02.2 x64)

opkg update
opkg install htop
htop
*F2 for settings
*check "Detailed CPU time". Soft IRQ color is magenta - the pinkish one.
*F10 when done

*run the iperf3 command below with 100 connections..
*from another SSH session inside OpenWRT and keep an eye on htop,
*does one CPU core max out?

iperf3 -P 100 -c queen.cisp.co.za

*now run the same command from a host behind OpenWRT and monitor htop on OpenWRT.. does any CPU core max out now?

*Does anyone use x86 or x64 and have the same issue?
*If someone behind OpenWRT uses speedtest.net or iperf3 -P100, does one router CPU core max out at ~95% Soft IRQ?

*If the same iperf3 -P100 test is run from OpenWRT itself, the CPUs do not break a sweat.

*What could be the issue here? iptables involved?

*How to configure OpenWRT so the CPUs do not break a sweat when clients behind the router open a few hundred connections - even a simple speedtest.net run?
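*optional: to confirm the load really is network soft IRQ and to see which core it lands on, the standard /proc counters can be watched while the test runs (plain Linux, nothing OpenWRT-specific assumed):

watch -n 1 cat /proc/softirqs   # NET_RX / NET_TX counters per CPU
cat /proc/interrupts            # which CPU the NIC queues interrupt on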

####### More background info..
I'm running 20.02.2 x64, self-compiled for a few PCIe Realtek cards.. those drivers are not involved in the VirtualBox tests though. It is also built for mdadm RAID1 booting, which should not influence CPU usage under network load. Can SELinux influence networking speeds?

Running in VirtualBox with virtio for all network interfaces - to validate, pre-create configs and test before replacing pfSense on real hardware later.

Noticed a weird thing.

iperf3 from or to the router gets about 10Gbps and all 8 cores of OpenWRT are almost idle, around 5% occasionally.
This I thought was superb, as pfSense can't even handle virtio drivers properly.. it gets sub-100Mbps speeds with the same setup.

Via a client VirtualBox VM that gets DHCP from the OpenWRT VirtualBox VM.. the client goes to speedtest.net.. one core on OpenWRT gets maxed out on every download test. During the upload part of speedtest the cores are around 10-15%.

This seemed extremely abnormal, as I knew iperf3 got 10Gbps and the CPU cores did not break a sweat.

The OpenWRT VirtualBox VM gets 8 cores of a Ryzen CPU and the outside internet connection is 100Mbps. Via VirtualBox the guest behind OpenWRT gets 60Mbps download, while on the host the real connection speed is around 91Mbps, so OpenWRT as a router with one core maxed does cut off some download speed.. not upload speed though, which is around 94Mbps both on the host and on the guest behind OpenWRT (and no core maxes out there).
This seemed worse than the pfSense VirtualBox VM (with e1000 drivers, not virtio.. no such issue.. it did not reach 1Gbps with iperf3, but the 100Mbit internet tests were fine, no cores overloading).

What did not affect the issue, enabled or disabled in any combination (UCI toggles sketched after the list):

  • Packet Steering
  • Software flow offloading
  • irqbalance (with blank config just enabled/disabled)
  • installing/uninstalling mwan3
  • setting firewall config file back to default
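For reference, how the first two of those LuCI checkboxes map to UCI (a sketch assuming the standard option names; '1' = enabled, '0' = disabled):

uci set network.globals.packet_steering='1'
uci set firewall.@defaults[0].flow_offloading='1'
uci commit network && uci commit firewall
/etc/init.d/network restart && /etc/init.d/firewall restart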

So I started to suspect maybe it is not bandwidth but connection count.
So I tested iperf3 -P 100 to an internet host from OpenWRT itself, and it worked fine, no cores sweating.
When I tested the same from behind the router.. a VirtualBox client using OpenWRT as its router.. OpenWRT had one CPU core maxed again with Soft IRQ, magenta in htop.

I'm out of ideas on what to try next - any ideas are welcome.

Install the package sysstat.

To trace CPU usage -

pidstat -T TASK 2 (which snapshots every 2 seconds)

To log the results (name the file whatever you want) -

pidstat -T TASK 2 | tee -a CPUtrace.txt

CTRL-C to end the trace.

Log file will be in the root directory.
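mpstat from the same sysstat package is also useful here - its %soft column shows soft IRQ time per CPU, which is the magenta part of the htop bars (again, name the log file whatever you want):

mpstat -P ALL 2 | tee -a softirq_trace.txt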


I had issues with a virtual x86_64 with a 1 Gbps connection (I would only get about 200 Mbps). There are known issues with virtual network cards:

This is a known problem: [Solved] X86_64 regular traffic slow but Wireguard wow


At least my Ivy Bridge Celeron 1037U with two Intel 82574L (e1000e) ethernet cards remains pretty bored (400/200 MBit/s FTTH, plain ethernet/DHCP, SQM/cake); it doesn't even clock up all the way (stays at ~800 MHz).


Thank you for the replies.
It's good to know this is probably a virtualization-only issue, but I can't test on hardware yet :-/ and there is a possibility of needing a virtualized router/firewall in the future as well, so I keep digging.

More testing.

So today I changed OpenWRT and its client vm adapter type to Intel PRO/1000 MT Desktop
(Virtualbox version is 6.1.30)
Booted all up.
Checked stuff..
Packet steering Enabled
Software flow offloading Enabled

  • iperf3 -P 100 -c queen.cisp.co.za

  • speedtest.net

  • Both results: 94Mbps down (with speedtest, also up),
    2 cores at ~20%
    1 core at ~10%
    a few at 4%
    Hm, this seems OK-ish I guess..
    No more 100% or 80% on 1 core.
    All load still shown as Soft IRQ.

I read the post [lleachii] sent and applied the following to all VM interfaces (the same thing as one loop is sketched below):
ethtool -K ethX tso off
ethtool -K ethX sg off
ethtool -K ethX generic-receive-offload off
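The same three toggles as one loop (the interface names eth0-eth2 are an assumption here; adjust to whatever ip link shows):

for i in eth0 eth1 eth2; do ethtool -K "$i" tso off sg off gro off; done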

Ran same test, got similar results.

Also did iperf3 between the OpenWRT VM and its client VM.
2.86 Gbps.. CPU cores don't sweat at all.
(Not 10 Gbps as with virtio interfaces, but the Intel driver should emulate a 1G card, so how it reaches 2.8G is weird?)

With -P 100, aka a hundred connections.. 2 cores kinda max out.
SUM 1.42 Gbps.
So the problem still exists with many connections AND high bandwidth, but the 100Mbps internet connection works fine with those Intel VirtualBox adapters.

So I closed the VMs and changed all interfaces back to virtio.
Same test.

iperf3 between the OpenWRT VM and its client VM.
Single connection: 2 cores around 50%, which I guess seems fine for 11Gbps ^-^
For 100 connections..
Some cores at 70%.
This does not seem fine, as the last SUM is 447Mbps - below the emulated Intel 1Gb interface's performance.
For this speed, this CPU load seems too much.

Made pidstat -T TASK 2 captures for the worst cases.
Not sure how to interpret them: with 2-second snapshots, does %CPU=2 actually mean 100%?

There is still the case that if you run iperf3 -P100 with the same virtio drivers from the router's SSH shell itself.. then the load is fine (many cores around 10+%) and the speed is also 91-95Mbps.
So it's not directly the virtio driver's fault, but the path that the same type of iperf3 connections take inside OpenWRT? Maybe iptables is involved? Is the forward path (client traffic LAN to WAN) somehow slow with virtio drivers, specifically with many connections?
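To see whether conntrack or a particular forward rule is the hot spot, these counters can be watched during the test (standard netfilter paths, nothing custom assumed):

cat /proc/sys/net/netfilter/nf_conntrack_count   # connections tracked right now
cat /proc/sys/net/netfilter/nf_conntrack_max     # conntrack table limit
iptables -L FORWARD -v -n                        # per-rule packet/byte counters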

If the client VM sends the same iperf3 -P100 to the internet via OpenWRT,
they are connected via "VirtualBox Host-Only Ethernet Adapter" as the LAN adapter on OpenWRT and on the client VM.

And if OpenWRT does the same test directly, it connects via its own localhost interface, I guess, to the WAN interface, which is a VirtualBox "Bridged Adapter".. THEN CPU LOAD IS OK.
I mean the client also accesses the internet via the same WAN "Bridged Adapter", but must first go through the OpenWRT LAN adapter, which is the "VirtualBox Host-Only Ethernet Adapter".

PS: also tried those with virtio adapters:
ethtool -K ethX tso off
ethtool -K ethX sg off
ethtool -K ethX generic-receive-offload off
Didn't change this behaviour.

It seems better to use the virtual Intel adapter option in VirtualBox - slightly more "stable"-ish results.
Not sure what else to try; I would very much like 10+Gbit internal virtual interface speeds, but cannot accept sub-100Mbps multi-connection CPU problems to get them.

Btw, if iperf3 runs against localhost inside the OpenWRT VM, then the speeds are:
-P100: 48.1 Gbits/sec (2 cores maxed)
1 connection: 59.1 Gbits/sec (1 core maxed, 1 core at 60%)
So the issue is not raw CPU power, I guess.
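The localhost test itself is nothing special - roughly this, with the server in one SSH session and the client in another:

iperf3 -s
iperf3 -c 127.0.0.1 -P 100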

Could VMware or other virtualization software work better on Windows 10?

I haven't tested OpenWrt (and the SpeedTest.net software license doesn't allow)...but yes...I've personally received 4 Gbps+ (~ connection's max) on an x86_64 VM on an ESXi host...I'm not sure about VMware Workstation Player.

I thought that a host-only adapter could not reach the Internet. Somehow you found it does though, which must be by having the host NAT and route to its Internet connection (*). That may not be very optimized.

  • This could be investigated with traceroute (example below).
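For example, from the client VM (8.8.8.8 is just a placeholder destination; any internet host works) - an extra hop belonging to the Windows host right after the OpenWrt LAN address would confirm the host is routing/NATing the traffic:

traceroute 8.8.8.8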

The bridged adapter is closer to what would happen running a dedicated bare metal machine with an Ethernet port.

To really benchmark stuff you should set up another X86 box (preferably bare metal) running an iperf3 server to simulate the Internet.
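Something like this (the address is a placeholder for the bench box, which would sit on the WAN side of the OpenWrt VM):

iperf3 -s                       # on the bench box simulating "the Internet"
iperf3 -c 192.0.2.10 -P 100     # on a client behind OpenWrt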

I think ESXi is like Xen, it's "lower level" than KVM or VirtualBox.
Could also just be that the ESXi host is not using Windows networking at any level ^^
Good to know.

https://docs.oracle.com/en/virtualization/virtualbox/6.0/user/network_hostonly.html
"When host-only networking is used, Oracle VM VirtualBox creates a new software interface on the host which then appears next to your existing network interfaces."

It seems to work similar to a Linux bridge..
On the host side there is the VirtualBox Host-Only interface.
On the VirtualBox side there is eth0 on the client and on the OpenWRT VM (the LAN interface for OpenWRT).
So in total 3 connections on the same "bridge".

On the Windows (host) side I set a static IP on that interface.. removed gateway and DNS and set only IP and subnet, just to reach SSH/webgui on the OpenWRT VM.
So Windows doesn't use it for "internet" access, but I could.. btw I did try (removed the gateway from the real interface as well) and got the same result: 1 core maxes out on OpenWRT if the host accesses the internet via it.
It still used the same interface to access OpenWRT though, the difference being it's a Windows, not Linux, client ^^

So I installed VMware Player after hours for "personal use" :wink:
Very similar setup: it creates VMnet1 on the Windows side, like VirtualBox created the "VirtualBox Host-Only interface".
On the VM side the "host-only" option did not work, so I chose custom and "VMnet1 (Host-only)".

So, results with VMware Player 16

Between OpenWRT vm and client vm

*iperf3 default, aka 1 connection..
6.2Gbps, 1 CPU core was at 100%.
Not good compared to VirtualBox virtio, which can do 11Gbps with no cores sweating.
But better than the VirtualBox Intel adapter, which had 2.8Gbps but low CPU usage.

*iperf3 -P100
4.2Gbps
CPU load on all 4 cores around 60%, quite high.
"Very good" considering VirtualBox virtio got 400Mbps with most cores at 70%,
and the VirtualBox Intel option got 1.42Gbps with 2 cores maxed out.

Internet test.. client vm via OpenWRT vm

*speedtest.net
2 cores at 13%, not 100%, and speed 95Mbps up and down.
Seems as good as the VirtualBox Intel adapters.
With VirtualBox virtio adapters, one core always maxes out.

*iperf3 -P100 -p 17001 -c queen.cisp.co.za
1 core at 40%, 3 cores at 25%, SUM receiver 94.3Mbps.
CPU utilization seems higher than with the VirtualBox Intel adapters, but no cores max out as with the virtio drivers.

Never imagined I would discover such a cacophony in the performance of basic virtual interfaces.