Hi,
I have two APU2s, configured as access points and connected via 1 Gb Ethernet.
Testing with iperf3 shows 900 Mbit/s when sending data from the APU2, which is fair enough.
But when receiving, it only achieves 300 Mbit/s.
It should be noted that when sending, CPU usage goes to 100% on a single core (the box has 4 cores), but when receiving it never goes over 30% on any single core.
There is a similar thread but no solution there either: APU2c4 with 19.07 poor ethernet performance
Running fresh OpenWrt 22.03.0, but the boxes had the same issue with earlier releases, too.
Any hints?
Thanks,
shpokas
I ran more or less the same tests last year on 2 APU2 boards with clients behind them (i.e. iperf3 was launched on 2 computers that were connected behind 2 APU2 OpenWrt boxes), like this:
PC --- APU2 (firewall, nat, vpn) --- APU2 (firewall, nat, vpn) --- PC
My test was made to measure VPN performance, not ethernet performance, but I also measured ethernet performance.
My results (from memory) are that Ethernet performance is fine at 970+ Mbit/s with low CPU use (when iperf3 is NOT running on the APU box itself), OpenVPN performance is about 300 Mbit/s with one CPU core maxed out (OpenVPN is not multicore), and WireGuard performance is about 700 Mbit/s.
I know my results are not what you are asking for, but I'd say that for a real-life scenario, where traffic is not generated or terminated on the firewall itself, the APU2 works fine.
I'd also suppose that the issue you are seeing is something related to how iperf3 works and its use of a single core.
But why would the command iperf3 -c 10.26.0.1 -R
work differently from iperf3 -c 10.26.0.1?
The first one instructs the iperf3 server to send data (reverse mode); the second sends data to the iperf3 server.
And I don't have anything in between; both the iperf3 server and the APU2 are on the same network segment, 10.26.0.0/24, with the firewall disabled on the APU2.
I don't know for sure, but maybe it has something to do with the number of interrupts per second? Something that is not directly related to CPU time gets maxed out and becomes a bottleneck.
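As a rough way to check that theory (a sketch, untested on an APU2; plain POSIX shell and a standard /proc/interrupts layout assumed), the interrupt counters can be summed twice, one second apart:

```shell
#!/bin/sh
# Sum every numeric counter in /proc/interrupts (the per-CPU columns that
# follow the IRQ name), then sample again after one second and print the rate.
count_irqs() {
    awk 'NR > 1 { for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) sum += $i }
         END { print sum + 0 }' /proc/interrupts
}

a=$(count_irqs)
sleep 1
b=$(count_irqs)
echo "interrupts/sec: $((b - a))"
```

Running this once while idle and once during an iperf3 test would show whether the receive direction really generates a disproportionate interrupt load.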
Look at the second row from the top of the "top" command output, where the totals are shown. What are the totals while running the tests? (Run a long test and allow about 5 seconds after starting it for the values to stabilize before reading them.)
One way to quickly assess the load is to calculate 100 - %idle, as some tools do not report irq/sirq.
However, with a multicore router, looking at busybox top's aggregate load numbers (combined for all CPUs) is not that helpful; htop (an installable package should be available) reports load data for each CPU individually and can be configured to also show sirq.
85% idle on a four-core CPU can mean anything from each core loaded at 15% to a single core loaded at 60%. Even that would not be alarming, but it illustrates that top's aggregate single load percentage is not the best tool to detect/diagnose issues that might be caused by overload of individual CPUs.
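If htop is not installed, per-core load can be computed by hand as 100 - %idle from two /proc/stat samples. A sketch (assuming the standard user/nice/system/idle/iowait/irq/softirq field order; iowait is counted as idle here):

```shell
#!/bin/sh
# Sample /proc/stat twice, one second apart, then print per-core busy
# percentage, i.e. 100 - %idle over the sampling interval.
cat /proc/stat > /tmp/stat.1
sleep 1
cat /proc/stat > /tmp/stat.2
awk 'FNR == NR { if ($1 ~ /^cpu[0-9]/) { idle[$1] = $5 + $6; busy[$1] = $2 + $3 + $4 + $7 + $8 } next }
     $1 ~ /^cpu[0-9]/ {
         di = ($5 + $6) - idle[$1]
         db = ($2 + $3 + $4 + $7 + $8) - busy[$1]
         if (di + db > 0) printf "%s: %d%% busy\n", $1, 100 * db / (di + db)
     }' /tmp/stat.1 /tmp/stat.2
```

This makes a single overloaded core stand out immediately, which the aggregate top line hides.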
I already said in my first post that individual single-core usage never goes over 30% when receiving data.
I would love to see 100% usage on a single core because that gives 900 MBits/sec when iperf data is sent to server.
Well, what are the alternatives to iperf3? It seems this tool dominates the testing space, but I have already run into a bug with an old version of iperf3 that was distributed with Debian 9.
Are you running iperf on the APU? That is a meaningful test, albeit not a test of the APU's capabilities as a router; for that you are better off moving the iperf server and client to different computers (just make sure these devices are not directly connected by a switch).
Yes, you did; that is why I was surprised to see plain top output with one aggregate CPU line.
For htop, you can and should enable detailed reporting for the CPU bars so you can see sirq immediately (this can be done somewhere in the configuration). I expect this not to change the picture, but it will give a better feel for where the CPU spends its time.
I booted a SystemRescue CD and got somewhat opposite results.
Now sending is slower than receiving, which runs at almost full wire speed.
I also made sure the same interface is used in both OpenWrt and SystemRescue.
So it is clearly something in the OS, but it is not clear what exactly.
Well, I found the problem here - a BIOS setting caused this issue.
"PCIe power management features" - when enabled, it causes a significant throughput reduction, not only for wired but for wireless transmission as well.
Not sure if this is a bug or a feature, but - DO NOT enable it!
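For anyone wanting to confirm from a running system whether ASPM (one of the features that BIOS option controls) is active, a sketch like this may help - assuming a Linux kernel built with CONFIG_PCIEASPM, and lspci from pciutils if installed:

```shell
#!/bin/sh
# Report the kernel's global ASPM policy, if the sysfs knob exists.
if [ -r /sys/module/pcie_aspm/parameters/policy ]; then
    echo "ASPM policy: $(cat /sys/module/pcie_aspm/parameters/policy)"
else
    echo "ASPM policy knob not present (kernel built without CONFIG_PCIEASPM?)"
fi
# Per-device ASPM link state, if lspci is available (root shows full detail).
if command -v lspci > /dev/null 2>&1; then
    lspci -vv 2>/dev/null | grep -E 'ASPM .*(Enabled|Disabled)'
fi
```

The active policy in the first line is shown in brackets, e.g. "[default] performance powersave".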
I actually started by going back to old BIOSes because I had run out of ideas, and at the same time I sort of remembered that I had seen better throughput before.
The old BIOSes did not show the problem, so I was determined to find which BIOS introduced it - until I found that it's not actually the BIOS itself, but a setting in it.
It should be noted that for the APU2, a BIOS upgrade resets customizations to their default values; that helped a little, too.
PCIe power management features - enables/disables PCI Express power management features like ASPM, Common Clock, and Clock Power Management (if supported by PCI Express endpoints). Enabling this option will reduce the power consumption at the cost of a performance drop of Ethernet controllers and WiFi cards.
How did you get into the APU2 BIOS via minicom? I tried disabling macros in minicom and using F10, and minicom seems not to pass F10 through to serial, or coreboot isn't responding to it.
My hardware is an APU2, but the BIOS says during boot:
PC Engines apu1
coreboot build 20183108
BIOS version v4.8.0.3