Have you got the firmware loading onto the NSS cores? It's here: https://github.com/qca/nss-firmware and if you look at the stock Netgear firmware (or the QCA-NSS repos), there's an init script that loads it into the CPU.
Looks like you're getting closer than I ever did, well done!
The interrupts "gp_timer" and 2x "ath10k_pci" are also active on CPU1.
I also took a quick look at the individual interrupts and noticed that some of them have an smp_affinity of "3". What does that mean? Are there 4 CPUs?
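On the smp_affinity question: the value is a hexadecimal bitmask in which bit n stands for CPU n, so "3" (binary 11) means the interrupt may run on CPU0 and CPU1, not that there are 4 CPUs. A minimal sketch of decoding such a mask follows; mask_to_cpus is a hypothetical helper name, and it assumes a single hex word without the comma-separated groups that larger systems use.

```shell
#!/bin/sh
# Decode an smp_affinity hex bitmask into the CPU numbers it allows.
# mask_to_cpus is a hypothetical helper, not part of OpenWrt or irqbalance.
mask_to_cpus() {
	local mask=$((0x$1))   # smp_affinity values are hexadecimal
	local cpu=0
	local out=""
	while [ "$mask" -gt 0 ]; do
		# Bit n set means CPU n may service the interrupt
		if [ $((mask & 1)) -eq 1 ]; then
			out="$out${out:+ }$cpu"
		fi
		mask=$((mask >> 1))
		cpu=$((cpu + 1))
	done
	echo "$out"
}
```

For example, `mask_to_cpus 3` lists CPUs 0 and 1, while `mask_to_cpus 2` lists only CPU 1, the second core.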
It was an adaptation of the @dissent1 script. Yeah, I forgot to change those lines; I've since done it but also forgot to post it here. Sorry! Nice, @bouwew, that you got to that point.
Here is the corrected script, with the labels eth0 and eth1.
If I have time I'll try to create a service script for the irqbalance --oneshot option, so we can execute it from the web front end and add it as a service like set_cpu_affinity.
#!/bin/sh /etc/rc.common
# First start irqbalance with the --oneshot option.
# Then try to manually pin both eth interfaces and wifi0 to core 2 if they are not balanced correctly.
# System -> Startup -> Local Startup
# /usr/sbin/irqbalance --oneshot --debug > /var/log/irqbalance.log
# /etc/init.d/set_cpu_affinity
START=99
# Usage: set_irq_affinity <name> <hex mask>
# The mask is a CPU bitmask: 1 = CPU0, 2 = CPU1 (the second core), 3 = both.
set_irq_affinity() {
	local name="$1"
	local val="$2"
	local irq
	case "$name" in
	wifi0)
		# First wifi interrupt line
		irq=$(grep -E -m1 'ath10k_ahb|qcom-pcie-msi' /proc/interrupts | cut -d: -f1 | tail -n1 | tr -d ' ')
		;;
	wifi1)
		# Second wifi interrupt line
		irq=$(grep -E -m2 'ath10k_ahb|qcom-pcie-msi' /proc/interrupts | cut -d: -f1 | tail -n1 | tr -d ' ')
		;;
	eth0)
		# Last of up to three lines matching eth0
		irq=$(grep -E -m3 'eth0' /proc/interrupts | cut -d: -f1 | tail -n1 | tr -d ' ')
		;;
	eth1)
		# Last of up to three lines matching eth1
		irq=$(grep -E -m3 'eth1' /proc/interrupts | cut -d: -f1 | tail -n1 | tr -d ' ')
		;;
	*)
		# Any other name: first matching line
		irq=$(grep -m 1 "$name" /proc/interrupts | cut -d: -f1 | sed 's, *,,')
		;;
	esac
	# Stop here if nothing matched, instead of writing to /proc/irq//smp_affinity
	[ -n "$irq" ] || { echo "$name irq not found."; return 1; }
	echo "$val" > "/proc/irq/$irq/smp_affinity"
}
start() {
	. /lib/functions.sh
	set_irq_affinity eth0 2
	set_irq_affinity eth1 2
	set_irq_affinity wifi0 2
}
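To check whether the script had any effect, the current masks can be read back from /proc. The following is only a sketch: show_irq_affinity is a hypothetical helper, and its second and third arguments exist solely so it can be pointed at copies of /proc/interrupts and /proc/irq for testing.

```shell
#!/bin/sh
# Print "<irq> <smp_affinity mask>" for every interrupt line matching a pattern.
# show_irq_affinity is a hypothetical helper, not part of the script above.
show_irq_affinity() {
	local pattern="$1"
	local interrupts="${2:-/proc/interrupts}"
	local irqdir="${3:-/proc/irq}"
	local irq
	grep -E "$pattern" "$interrupts" | cut -d: -f1 | tr -d ' ' | while read -r irq; do
		# Skip rows without a per-IRQ directory (for example IPI or error counters)
		[ -r "$irqdir/$irq/smp_affinity" ] || continue
		echo "$irq $(cat "$irqdir/$irq/smp_affinity")"
	done
}
```

Running `show_irq_affinity 'eth0|eth1|ath10k'` after the init script should show a mask of 2 for the interrupts it pinned, assuming those names appear in /proc/interrupts on your build.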
Yes, the firmware loaded. I extracted the firmware binaries from the latest Netgear firmware image. But the driver doesn't seem to work right. Still trying to find out what's wrong. Also, I only managed to find the device tree config for the first core. The second core is still not active.
Sounds like you're making excellent progress - from memory it's a tough slog getting it all working, and it's going to be a pain keeping it all up to date, but you're doing impressive work - and I'll be tracking it closely!
If you're interested, do try it out. I didn't commit the firmware binaries, as I thought I didn't have the rights to post them online, until I looked again at the README at the link you posted earlier.
I'm still testing the driver at this stage, so I'm currently copying the files manually into my router's overlay partition. I will automate it once it's ready for use, although I don't know how long that will take.
@fantom-x
I'm using master, it's a mix of hnyman's and escalade's builds. And, yes, isolcpus=1 is active.
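A quick way to confirm that isolcpus really is active is to look for it in the kernel command line. This is a small sketch; isolcpus_value is a hypothetical helper, and the optional argument is only there so it can read a saved copy of /proc/cmdline.

```shell
#!/bin/sh
# Print the value of the isolcpus= boot parameter, or nothing if it is absent.
# isolcpus_value is a hypothetical helper name.
isolcpus_value() {
	local cmdline="${1:-/proc/cmdline}"
	# Split the command line into one argument per line, then keep the isolcpus value
	tr ' ' '\n' < "$cmdline" | sed -n 's/^isolcpus=//p'
}
```

Called without arguments on the router, it should print 1 when isolcpus=1 was passed at boot.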
Now that I come to think of it, escalade's build includes 2 sets of updates, made by dissent1, that are still present as pull-requests to master: https://github.com/openwrt/openwrt/pull/669 and https://github.com/openwrt/openwrt/pull/632
Maybe they are affecting my results somehow.
More experimenting to do next weekend.
BTW, any chance (and time left also :-D) that you can update the irqbalance package from version 1.2 to 1.3? If not, I'll try to do it myself, as the new version includes some optimization of platform device IRQ detection.
No, I will not be doing that upgrade. irqbalance currently requires the external glib2 library, which is large, so the version upgrade would increase the installed size considerably.
See https://github.com/Irqbalance/irqbalance/issues/40
Having full glib2 as dependency would increase size (with dependencies) by a megabyte.
One option might be to statically link the needed glib2 library parts, but I have not tried that.
I can agree with the probable answer to that question: none.
Better to do it manually with the affinity script; at least I get better results that way. The spikes went down but are still there, although they were never really annoying.
While I was testing the NSS drivers I thought I'd simulate the latency issue. There seems to be a correlation between the spikes and wireless network activity. Try disabling both wireless interfaces and see if you still see the spikes.
As for the NSS drivers, I've made a wee bit more progress, I think. I found the 2nd NSS core's details in Netgear's firmware source code. Still trying to work out the exact values by trial and error, though. The NSS driver seems to be able to offload WiFi traffic as well, so if the latency spikes are linked to WiFi, the NSS driver may help reduce them.
I still experience the spikes with both the 5GHz and 2.4GHz networks disabled. My spikes are relatively infrequent: one ~80ms spike every 3-4 minutes maybe.
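To compare spike frequency between setups (wireless on or off, different affinity masks), it can help to log only the slow replies from a long-running ping. The sketch below is one way to do that; ping_spikes is a hypothetical helper, the 50 ms default threshold is arbitrary, and it assumes the usual "time=12.3 ms" field in the ping output.

```shell
#!/bin/sh
# Read ping output on stdin and print only replies slower than a threshold (ms).
# ping_spikes is a hypothetical helper; the default of 50 ms is an arbitrary choice.
ping_spikes() {
	local threshold="${1:-50}"
	awk -v limit="$threshold" '
		/time=/ {
			for (i = 1; i <= NF; i++) {
				if ($i ~ /^time=/) {
					# Strip the "time=" prefix and force a numeric value
					rtt = substr($i, 6) + 0
					if (rtt > limit) print
				}
			}
		}'
}
```

Usage would be something like `ping 192.168.1.1 | ping_spikes 50`, where the address is just an example for the router's LAN IP.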