We are seeing this issue on the Arcadyan AW1000 as well, which uses the IPQ807x SoC.
Watching htop, every time we run an Ookla speedtest, core0 spikes to 100% while the other three cores sit nearly idle, which we believe is the bottleneck. The speed can't go beyond 700 Mbps.
To find out what was maxing out core0, we looked at /proc/interrupts and found that xhci-hcd:usb3 is the culprit.
To verify this, we moved the xhci-hcd:usb3 IRQ to another core, for example core1, like so:
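For anyone who wants to repeat the lookup, here is a minimal sketch of pulling the IRQ number out of /proc/interrupts. The function name irq_of is made up for this post, and the optional second argument (a file to read instead of /proc/interrupts) is only there so it can be tried on a saved copy.

```shell
#!/bin/sh
# irq_of NAME [FILE]
# Print the IRQ number of every line that mentions NAME.
# FILE defaults to /proc/interrupts (hypothetical helper, not part of any tool).
irq_of() {
    awk -F: -v name="$1" '
        index($0, name) {
            irq = $1            # text before the first colon is the IRQ number
            gsub(/ /, "", irq)  # the column is right-aligned, so strip the padding
            print irq
        }' "${2:-/proc/interrupts}"
}
```

Running irq_of usb3 on the router should print the same number that the grep/awk/sed pipeline in this post produces.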
echo 2 > "/proc/irq/$(grep usb3 /proc/interrupts | awk -F: '{print $1}' | sed 's/^ *//')/smp_affinity"
Indeed, core1 now maxes out at 100% each time we run an Ookla speedtest. Interestingly, even if we echo f in the command above, which should allow the IRQ to run on all four cores, xhci-hcd:usb3 is still serviced by only one core.
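A note on the values written to smp_affinity: it is a hexadecimal bitmask in which bit N stands for core N, so 1 is core0, 2 is core1, 4 is core2, 8 is core3 and f is all four. The sketch below only does that arithmetic; cpu_mask and all_cpus_mask are names invented for this post. As far as we can tell, the mask only says where the interrupt is allowed to run, and the interrupt controller may still deliver it to a single core from that set, which would match what we see with f.

```shell
#!/bin/sh
# cpu_mask N: print the hex smp_affinity mask that selects only core N.
cpu_mask() {
    printf '%x\n' $((1 << $1))
}

# all_cpus_mask COUNT: print the hex mask that selects cores 0..COUNT-1.
all_cpus_mask() {
    printf '%x\n' $(((1 << $1) - 1))
}
```

For example, cpu_mask 1 prints 2 (the value used above for core1) and all_cpus_mask 4 prints f.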
In conclusion, we believe that the usb3 driver (we are not sure if this is the right term) does not spread its interrupt handling across multiple cores, which is what caps the speed below 700 Mbps.
As a workaround, we wrote this script to increase the speed until somebody can fix the usb3 driver at the kernel level.
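#!/bin/sh
# Script by Abi Darwish
# Pins the xhci-hcd:usb3 IRQ to core1 (mask 2) and every other IRQ to core0 (mask 1).
INTERRUPT=$(ls /proc/irq/ | sed '/default/d')
# IRQ number of the USB3 host controller, with all leading spaces stripped
USB3_NUMBER=$(grep usb3 /proc/interrupts | awk -F: '{print $1}' | sed 's/^ *//')
for I in ${INTERRUPT}; do
    if [ "${I}" = "${USB3_NUMBER}" ]; then
        echo 2 > "/proc/irq/${I}/smp_affinity" 2>/dev/null
    else
        # Some IRQs cannot be moved, so errors are silenced
        echo 1 > "/proc/irq/${I}/smp_affinity" 2>/dev/null
    fi
    # Print the resulting mask for each IRQ
    printf "%-10s" "${I}:"
    cat "/proc/irq/${I}/smp_affinity"
done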
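To check whether the script above had any effect, one option is to look at the per-core counters of the usb3 IRQ before and after a speedtest. This is only a sketch: irq_counts is a made-up helper name, and it assumes the usual /proc/interrupts layout with a CPU0 CPU1 ... header row. The counters are cumulative since boot, so what matters is which column grows between two runs.

```shell
#!/bin/sh
# irq_counts IRQ [FILE]
# Print the per-core interrupt counts for IRQ as one space-separated line.
# FILE defaults to /proc/interrupts (hypothetical helper for illustration).
irq_counts() {
    awk -v irq="$1" '
        NR == 1 { ncpu = NF; next }          # header row: one field per core
        $1 == (irq ":") {
            out = $2                         # count for core0
            for (c = 3; c <= ncpu + 1; c++)  # counts for the remaining cores
                out = out " " $c
            print out
        }' "${2:-/proc/interrupts}"
}
```

Calling irq_counts with the usb3 IRQ number before and after a test run should show the second column (core1) growing once the workaround is applied.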