I am planning to push the ipq40xx split patch next week. This will move ipq40xx to v4.14.
I have no plans to invest any time in ipq806x in the near future, and it will remain on v4.9 for the time being.
Hi, rookie here. My ISP uses VLAN 500. I'm currently using just one VLAN, VLAN 500, with CPU(0), CPU(1), and LAN 1 2 3 4 all untagged and WAN tagged. With this setup my ping time is about 2 ms better than with the original 2 VLANs, which put CPU(1) through LAN 4 (all untagged) in VLAN 1, and CPU(0) untagged plus WAN tagged in VLAN 500.
Will it cause any problems? And I wonder why they split it into 2 VLANs in the first place.
@fantom-x
Good job on solving the latency issue!
Would you mind sharing your "recipe" on how to get the latency-spikes down?
So us noobs can benefit as well?
Well, the key is to compile your own firmware with the custom kernel boot parameter isolcpus=1. That takes CPU1 out of the scheduler's pool, so ordinary tasks are no longer scheduled on it. Then use the script a few posts above to move the network IRQs to CPU1. I also lower the priority of things like collectd, nlbwmon, uhttpd, etc., which have no business running at default priority.
In the end CPU1 is used exclusively for the network interrupts while CPU0 runs everything else. I have not seen either overloaded yet.
The most difficult step is to compile your firmware. I can share mine if that helps.
Thank you for your summary! I know how to create my own image, thanks to @hnyman and @escalade. I also know what the set_cpu_affinity script does.
Unfortunately, setting a custom kernel boot parameter and lowering the priority of collectd, nlbwmon, and uhttpd are completely new to me. Can you please provide some details on how to do this?
That will be complicated if you're new to building firmware. The first thing I'd try is to start in this thread.
hnyman made it easy to compile your own image, but you also need to change a parameter (isolcpus) with the make menuconfig command.
To change the priority you need to use the nice command. I advise changing the service scripts /etc/init.d/uhttpd, collectd, and nlbwmon.
From memory:
make kernel_menuconfig
Then look for Boot Parameters and then there is an item to set kernel parameters. Type isolcpus=1 there and rebuild.
Once you install your image, run cat /proc/cmdline to see this new parameter.
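For anyone who wants to script that check, here is a minimal sketch. The helper name get_isolcpus is made up for illustration and is not part of OpenWrt; it only looks for an isolcpus= token in whatever command line string it is given.

```shell
#!/bin/sh
# get_isolcpus CMDLINE
# Print the value of the isolcpus= parameter found in a kernel command
# line string. Prints nothing and returns 1 if the parameter is absent.
# (Hypothetical helper name, used here for illustration only.)
get_isolcpus() {
    for arg in $1; do
        case "$arg" in
            isolcpus=*) echo "${arg#isolcpus=}"; return 0 ;;
        esac
    done
    return 1
}

# On the router:
#   get_isolcpus "$(cat /proc/cmdline)" || echo "isolcpus is not set"
```

If the call prints 1, the parameter made it into the running kernel.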
Deal with the IRQs now.
Then you need to edit some startup files under /etc/init.d/ (collectd, nlbwmon, uhttpd) by adding nice -n 19 to the line that starts the process. If you get into trouble there, I will post more details in a few hours once I get to my computer.
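The init script edit can also be done with sed instead of by hand. This is only a sketch: add_nice is a hypothetical helper name, and it assumes the script starts its process with a line of the form "procd_set_param command <program> ...", which may not hold for every package. Keep a backup of each file before trying it.

```shell
#!/bin/sh
# add_nice FILE
# Insert "nice -n 19" right after "procd_set_param command" in FILE.
# Lines that already contain "procd_set_param command nice -n" are
# skipped, so running it twice does not stack two nice prefixes.
# (Hypothetical helper; assumes the procd line format described above.)
add_nice() {
    sed -i '/procd_set_param command nice -n/!s/procd_set_param command /procd_set_param command nice -n 19 /' "$1"
}

# On the router:
#   cp /etc/init.d/uhttpd /root/uhttpd.bak
#   add_nice /etc/init.d/uhttpd && /etc/init.d/uhttpd restart
```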
Great, thanks for the detailed explanation! With this extra info I will manage to create a functioning build.
Some more details, as promised, are below. The procedure is manual, so extra attention is warranted. Any screw-ups are not my fault.
- Start with @hnyman's build and only continue once you can build and deploy it.
- Add a custom kernel boot parameter either via make kernel_menuconfig / Boot options / Default kernel command string : isolcpus=1 or by modifying this config file by adding one line:
grep isolcpus target/linux/ipq806x/config-4.9
CONFIG_CMDLINE="isolcpus=1"
- Build and deploy the image, then check that the new config is active:
cat /proc/cmdline
isolcpus=1
- Move wifi0, eth0, and eth1 to CPU1 and verify that it actually worked. I leave wifi1 on CPU0 and I do not care much about the 2.4GHz clients. Verify that the numbers in the CPU1 column are increasing.
cat /proc/interrupts | egrep "eth|qcom-pcie-msi|CPU0"
           CPU0       CPU1
 97:       8296   57872293     GIC-0  67 Edge      qcom-pcie-msi
 98:   16322413          0     GIC-0  89 Edge      qcom-pcie-msi
100:       1069   26137176     GIC-0 255 Level     eth0
101:        511    9101453     GIC-0 258 Level     eth1
- Add nice -n 19 to the services that should be running in the background. They will be running on CPU0, but they have absolutely no business running at default priority.
grep nice /etc/init.d/*
/etc/init.d/collectd: procd_set_param command nice -n 19 /usr/sbin/collectd -f
/etc/init.d/nlbwmon: procd_set_param command nice -n 19 "$PROG"
/etc/init.d/uhttpd: procd_set_param command nice -n 19 "$UHTTPD_BIN" -f
- Restart these services or just reboot. Verify that the change took effect (look for SN in the ps output, where N means nice, or use htop to see that they are running nicer):
ps -w | egrep "collectd|nlbwmon|uhttpd"
 1652 root      3256 SN   /usr/sbin/uhttpd
 1892 root      4104 SN   /usr/sbin/collectd -f
 2039 root      1460 SN   /usr/sbin/nlbwmon
- Stop all services that you do not use. Here is what I do, but you may have use for some of them.
/etc/init.d/etherwake disable
/etc/init.d/etherwake stop
/etc/init.d/miniupnpd disable
/etc/init.d/miniupnpd stop
/etc/init.d/odhcpd disable
/etc/init.d/odhcpd stop
/etc/init.d/vsftpd disable
/etc/init.d/vsftpd stop
- Reboot just in case.
- Share your results. I for one am curious whether this works for others.
- For the super adventurous among us, run the following lines and add them to /etc/rc.local. This CPU takes 100 us to switch frequencies, which is quite long.
echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo "performance" > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
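The two echo lines above can also be written as a loop over every CPU that exposes a cpufreq directory. This is a sketch only: set_governor is a made-up helper name, and its optional second argument is there just so the loop can be tried against a scratch directory instead of the real sysfs tree.

```shell
#!/bin/sh
# set_governor GOVERNOR [CPU_DIR]
# Write GOVERNOR to every cpuN/cpufreq/scaling_governor file under
# CPU_DIR (default: /sys/devices/system/cpu).
# (Hypothetical helper name, for illustration only.)
set_governor() {
    gov_name=$1
    cpu_base=${2:-/sys/devices/system/cpu}
    for gov_file in "$cpu_base"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$gov_file" ] || continue   # glob matched nothing
        echo "$gov_name" > "$gov_file"
    done
}

# In /etc/rc.local (above a trailing "exit 0", if your file has one):
#   set_governor performance
```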
nice catch! i'll test it thanks
I managed to make a little more progress on the NSS cores. I think I've successfully activated one of the two NSS cores, but it doesn't seem to improve throughput or CPU utilisation. The output of /proc/interrupts is shown below:
CPU0 CPU1
16: 31840 175695 GIC 18 Edge gp_timer
18: 33 0 GIC 51 Edge qcom_rpm_ack
19: 0 0 GIC 53 Edge qcom_rpm_err
20: 0 0 GIC 54 Edge qcom_rpm_wakeup
26: 0 0 GIC 241 Edge 29000000.sata
27: 26 135106 GIC 67 Edge qcom-pcie-msi
28: 23 148231 GIC 89 Edge qcom-pcie-msi
29: 175596 0 GIC 202 Edge adm_dma
30: 257827 0 GIC 255 Level
31: 0 553977 GIC 258 Level
32: 0 0 GIC 130 Level bam_dma
33: 0 0 GIC 128 Level bam_dma
34: 760205 0 GIC 245 Level nss
41: 2 0 msmgpio 6 Edge gpio-keys
89: 2 0 msmgpio 54 Edge gpio-keys
100: 2 0 msmgpio 65 Edge gpio-keys
104: 0 0 PCI-MSI 0 Edge aerdrv
105: 26 135106 PCI-MSI 1 Edge ath10k_pci
137: 0 0 PCI-MSI 0 Edge aerdrv
138: 23 148231 PCI-MSI 1 Edge ath10k_pci
170: 12 0 GIC 184 Level msm_serial0
171: 2 0 GIC 187 Level 1a280000.spi
172: 0 0 GIC 142 Level xhci-hcd:usb1
173: 0 0 GIC 237 Level xhci-hcd:usb3
IPI0: 0 0 CPU wakeup interrupts
IPI1: 0 0 Timer broadcast interrupts
IPI2: 5618 7556 Rescheduling interrupts
IPI3: 0 0 Function call interrupts
IPI4: 6190 40574 Single function call interrupts
IPI5: 0 0 CPU stop interrupts
IPI6: 2 0 IRQ work interrupts
IPI7: 0 0 completion interrupts
Err: 0
IRQ 30 & 31 used to be for the qca-nss-gmac driver, but once I loaded the qca-nss-drv driver, both IRQs stopped incrementing but a new one (IRQ34) appeared.
I guess the next step will probably be to get the userland program running, so that it can start to delegate traffic to the NSS cores.
The qca-nss-drv driver needed additional device tree and Linux kernel clock driver support, which I adapted from Chromium OS for kernel 3.14. Some portions of the clock driver had to be disabled because they could not be compiled. The NSS core driver support still needs work. I only found the device tree config for the first NSS core.
Does anyone know how to extract device tree information from the existing Netgear firmware? I managed to extract the contents, but the compiled device tree details do not seem to be available in the firmware. The Netgear firmware probably contains the device tree details for the second NSS core.
If anyone is interested in trying out a R7800 build with one NSS core activated, please let me know.
I'll check the source code into GitHub in a while.
Hi there,
I tried to set the affinity of the eth interrupts manually (@hnyman builds 6394 and 6420): default_smp_affinity=1 and the ethernet IRQs to 2, to lessen spikes. It works fine for a while, but after some minutes the affinities for IRQ 100 and 101 are garbled. Which process is resetting the IRQ affinity? I haven't found a clue yet.
BTW: With the settings above (I don't care about wifi) the latency of my preferred ping target drops from 21-22 ms to 18-19 ms, and spikes drop from 60-80 ms (max 100 ms) to about 30 ms. Spikes are also less frequent.
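A sketch of how those affinity settings can be applied by interface name rather than by hard-coded IRQ number. This is not the set_cpu_affinity script from this thread; irq_for and pin_irq are hypothetical helper names. The mask is a hex CPU bitmask, so 1 means CPU0 and 2 means CPU1.

```shell
#!/bin/sh
# irq_for NAME [INTERRUPTS_FILE]
# Print the IRQ number of the first line whose last field is NAME.
# INTERRUPTS_FILE defaults to /proc/interrupts.
irq_for() {
    awk -v name="$1" '$NF == name { sub(":", "", $1); print $1; exit }' \
        "${2:-/proc/interrupts}"
}

# pin_irq NAME MASK
# Write the hex CPU mask MASK to the smp_affinity file of the IRQ
# called NAME. Returns 1 if no such IRQ is listed.
pin_irq() {
    irq_num=$(irq_for "$1")
    [ -n "$irq_num" ] || { echo "no IRQ named $1" >&2; return 1; }
    echo "$2" > "/proc/irq/$irq_num/smp_affinity"
}

# On the router, as root:
#   echo 1 > /proc/irq/default_smp_affinity   # default mask: CPU0
#   pin_irq eth0 2                            # eth0 -> CPU1
#   pin_irq eth1 2                            # eth1 -> CPU1
```

Looking the number up by name avoids surprises if the IRQ numbering differs between builds.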
Are you running irqbalance? Run the command below to check:
ps -w | grep irqbalance
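A slightly more script-friendly variant of that check, as a sketch. The helper name irqbalance_running is made up, and the /etc/init.d/irqbalance path in the usage comment assumes irqbalance was installed as a package with its own init script.

```shell
#!/bin/sh
# irqbalance_running
# Read "ps -w" output on stdin and succeed if some line mentions
# irqbalance. Writing the pattern as [i]rqbalance keeps it from
# matching the command line of the grep process itself.
# (Hypothetical helper name, for illustration only.)
irqbalance_running() {
    grep -q '[i]rqbalance'
}

# On the router:
#   if ps -w | irqbalance_running; then
#       /etc/init.d/irqbalance stop      # assumed init script path
#       /etc/init.d/irqbalance disable
#   fi
```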
Sh... the bloody obvious! Shame on me!
Yes, irqbalance was still active. A leftover from some trials several months ago.
Thanks!
There is some NSS driver source in U-Boot, if you want it.
Sounds like maybe this setup fantom-x found should be the default for this router.
There has not been a single independent confirmation that this setup is working for anyone else but me yet.
@fantom-x
I've now made my own build.
While running the command cat /proc/interrupts | egrep "eth|qcom-pcie-msi|CPU0", I noticed that the numbers in the CPU1 column were 0 and stayed 0.
I noticed there is a mistake in your set_cpu_affinity script: in the eth0) and eth1) branches, the text says $irq_wifi1.
After changing these to $irq_eth0 and $irq_eth1 respectively, it seems to work.
More info after some testing.
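A stuck CPU1 column like that can be caught with two snapshots of /proc/interrupts taken a few seconds apart. The sketch below assumes a two-core system, so the CPU1 count is the third field of each line; cpu1_count and cpu1_increasing are hypothetical helper names.

```shell
#!/bin/sh
# cpu1_count NAME FILE
# Print the CPU1 column (third field) of the first line in FILE, a
# saved copy of /proc/interrupts, whose last field is NAME.
cpu1_count() {
    awk -v name="$1" '$NF == name { print $3; exit }' "$2"
}

# cpu1_increasing NAME BEFORE AFTER
# Succeed only if the CPU1 count for NAME is larger in the AFTER
# snapshot than in the BEFORE snapshot.
cpu1_increasing() {
    n_before=$(cpu1_count "$1" "$2")
    n_after=$(cpu1_count "$1" "$3")
    [ -n "$n_before" ] && [ -n "$n_after" ] && [ "$n_after" -gt "$n_before" ]
}

# On the router, with some traffic flowing:
#   cat /proc/interrupts > /tmp/irq.before; sleep 10
#   cat /proc/interrupts > /tmp/irq.after
#   for n in eth0 eth1; do
#       cpu1_increasing "$n" /tmp/irq.before /tmp/irq.after \
#           && echo "$n: CPU1 is counting" || echo "$n: CPU1 is stuck"
#   done
```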
No, I am not the author of that script.
You can use this version of the script: Netgear R7800 exploration (IPQ8065, QCA9984)
UPDATE: Oh, I see now. I have decided to try that script and you are correct, there is an issue in that script just as you described.