Netgear R7800 exploration (IPQ8065, QCA9984)

I've been running this combo for a little while. It's a bit more aggressive: irqbalance enabled, global packet steering enabled, and the CPU spends about 97% of its time at 800 MHz under light home usage. The interrupts seem pretty well balanced. (If it doesn't make sense or is stupid, I'm open to feedback.)
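For anyone who wants to check a residency figure like that on their own unit, here is a minimal sketch. It assumes the kernel exposes cpufreq statistics in a time_in_state file (lines of "<freq_khz> <ticks>"); whether a given OpenWrt build has that enabled is an assumption, and the function name freq_residency is made up for illustration.

```shell
# Print the share of time spent at each CPU frequency.
# $1: path to a cpufreq time_in_state file (lines of "<freq_khz> <ticks>").
freq_residency() {
    awk '{ freq[NR] = $1; ticks[NR] = $2; total += $2 }
         END {
             if (total == 0) exit 1
             for (i = 1; i <= NR; i++)
                 printf "%d MHz: %.1f%%\n", freq[i] / 1000, 100 * ticks[i] / total
         }' "$1"
}

# Usage on the router (path assumed, needs cpufreq stats in the kernel):
# freq_residency /sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state
```

The counters run from boot, so the numbers also include the time before rc.local raised scaling_min_freq.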


root@OpenWrt:~# uname -a
Linux OpenWrt 5.4.50 #0 SMP Thu Jul 9 11:01:20 2020 armv7l GNU/Linux

root@OpenWrt:~# cat /etc/rc.local
# Put your custom commands here that should be executed once
# the system init finished. By default this file does nothing.

echo 800000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq
echo 800000 > /sys/devices/system/cpu/cpu1/cpufreq/scaling_min_freq
echo 20 > /sys/devices/system/cpu/cpufreq/ondemand/up_threshold
echo 60 > /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor
echo 1000000 > /sys/devices/system/cpu/cpufreq/ondemand/sampling_rate

echo 1 > /sys/devices/virtual/net/br-lan/queues/rx-0/rps_cpus
echo 0 > /sys/devices/virtual/net/eth0.2/queues/rx-0/rps_cpus
echo 1 > /sys/devices/virtual/net/eth1.1/queues/rx-0/rps_cpus
echo 0 > /sys/devices/virtual/net/ifb4eth0.2/queues/rx-0/rps_cpus
echo 1 > /sys/devices/virtual/net/lo/queues/rx-0/rps_cpus

exit 0


root@OpenWrt:~# cat /proc/interrupts
           CPU0       CPU1       
 16:    3852601    2178341     GIC-0  18 Edge      gp_timer
 18:       1452       6737     GIC-0  51 Edge      qcom_rpm_ack
 19:          0          0     GIC-0  53 Edge      qcom_rpm_err
 20:          0          0     GIC-0  54 Edge      qcom_rpm_wakeup
 26:          0          0     GIC-0 241 Level     ahci[29000000.sata]
 27:          0          0     GIC-0 210 Edge      tsens_interrupt
 30:     188226       3681     GIC-0 202 Level     adm_dma
 31:   11681556   11504074     GIC-0 255 Level     eth0
 32:   12039598   17363567     GIC-0 258 Level     eth1
 33:          0          0     GIC-0 130 Level     bam_dma
 34:          0          0     GIC-0 128 Level     bam_dma
 36:          0          0   PCI-MSI   0 Edge      aerdrv
 38:          0          0   PCI-MSI 134217728 Edge      aerdrv
 39:         13          0     GIC-0 184 Level     msm_serial0
 40:          2          0   msmgpio   6 Edge      keys
 41:          2          0   msmgpio  54 Edge      keys
 42:          2          0   msmgpio  65 Edge      keys
 43:          0          0     GIC-0 142 Level     xhci-hcd:usb1
 44:          0          0     GIC-0 237 Level     xhci-hcd:usb3
 45:         38          0   PCI-MSI 524288 Edge      ath10k_pci
 46:         32          0   PCI-MSI 134742016 Edge      ath10k_pci
IPI0:          0          0  CPU wakeup interrupts
IPI1:          0          0  Timer broadcast interrupts
IPI2:     252551     787520  Rescheduling interrupts
IPI3:    5776514    3873354  Function call interrupts
IPI4:          0          0  CPU stop interrupts
IPI5:      91678     158242  IRQ work interrupts
IPI6:          0          0  completion interrupts

How are you getting almost no interrupts on IRQ 45 and 46? Are you not using wifi?
From mine:

31:          4    3841293     GIC-0 255 Level     eth0
32:        237   52518910     GIC-0 258 Level     eth1
...
45:   37160470          0   PCI-MSI 524288 Edge      ath10k_pci
46:   16937144          0   PCI-MSI 134742016 Edge      ath10k_pci

I chose to just move IRQ 31 and 32 to CPU1; I am not running irqbalance.
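Pinning those two IRQs by hand could look roughly like this. It is only a sketch: the helper names cpu_mask and pin_irq are made up, the optional third argument exists purely so the function can be tried against a scratch directory, and the IRQ numbers are the ones from the /proc/interrupts output in this thread.

```shell
# Convert a zero-based CPU index to the hex bitmask used by smp_affinity.
cpu_mask() {
    printf '%x\n' $((1 << $1))
}

# Pin one IRQ to one CPU. $1: IRQ number, $2: CPU index,
# $3: optional root of the irq tree (defaults to /proc/irq).
pin_irq() {
    cpu_mask "$2" > "${3:-/proc/irq}/$1/smp_affinity"
}

# Usage on the router: move eth0 (IRQ 31) and eth1 (IRQ 32) to CPU1.
# pin_irq 31 1
# pin_irq 32 1
```

If irqbalance is running it may rewrite these masks, so this only makes sense with irqbalance disabled.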

That was my main router (wifi turned off). Not sure whether splitting it by IRQ or the way I did it is better. This is what it looks like on an R7800 AP with the same settings (the goal was to split the work: wifi separate from other functions):


root@OpenWrt:~# cat /proc/interrupts
           CPU0       CPU1       
 16:   10475230    3262708     GIC-0  18 Edge      gp_timer
 18:        185        296     GIC-0  51 Edge      qcom_rpm_ack
 19:          0          0     GIC-0  53 Edge      qcom_rpm_err
 20:          0          0     GIC-0  54 Edge      qcom_rpm_wakeup
 26:          0          0     GIC-0 241 Level     ahci[29000000.sata]
 27:          0          0     GIC-0 210 Edge      tsens_interrupt
 30:     187232         56     GIC-0 202 Level     adm_dma
 31:         30      42940     GIC-0 255 Level     eth0
 32:         99    5316328     GIC-0 258 Level     eth1
 33:          0          0     GIC-0 130 Level     bam_dma
 34:          0          0     GIC-0 128 Level     bam_dma
 36:          0          0   PCI-MSI   0 Edge      aerdrv
 38:          0          0   PCI-MSI 134217728 Edge      aerdrv
 39:         13          0     GIC-0 184 Level     msm_serial0
 40:          2          0   msmgpio   6 Edge      keys
 41:          2          0   msmgpio  54 Edge      keys
 42:          2          0   msmgpio  65 Edge      keys
 43:          0          0     GIC-0 142 Level     xhci-hcd:usb1
 44:          0          0     GIC-0 237 Level     xhci-hcd:usb3
 45:   15412016          0   PCI-MSI 524288 Edge      ath10k_pci
 46:   33585658          0   PCI-MSI 134742016 Edge      ath10k_pci
IPI0:          0          0  CPU wakeup interrupts
IPI1:          0          0  Timer broadcast interrupts
IPI2:     155188    2828194  Rescheduling interrupts
IPI3:     920797    6022548  Function call interrupts
IPI4:          0          0  CPU stop interrupts
IPI5:     136970     120890  IRQ work interrupts
IPI6:          0          0  completion interrupts
Err:          0

I don't believe ash will process this right but I know bash will:

find /sys/devices/ -name xps_cpus | while read -r f; do echo 1 > "$f"; done
find /sys/devices/ -name rps_cpus | while read -r f; do echo 2 > "$f"; done

That covers all devices in a generic way.

Have you tested this transmission vs receiving split compared to your settings:


devices/virtual/net/br-lan/queues/rx-0/rps_cpus = 3
devices/virtual/net/eth0.2/queues/rx-0/rps_cpus = 3
devices/virtual/net/eth1.1/queues/rx-0/rps_cpus = 3
devices/virtual/net/ifb4eth0.2/queues/rx-0/rps_cpus = 3
devices/virtual/net/lo/queues/rx-0/rps_cpus = 3
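Since the values 0, 1, 2 and 3 get thrown around a lot here: rps_cpus and xps_cpus are hex CPU bitmasks, so on this dual-core SoC 1 is CPU0, 2 is CPU1, 3 is both, and 0 turns steering off for that queue. A small decoder sketch (the function name mask_to_cpus is made up, and the default of 2 CPUs is an assumption matching the IPQ8065):

```shell
# List the CPUs selected by a hex rps_cpus/xps_cpus mask.
# $1: hex mask without 0x prefix, $2: number of CPUs to check (default 2).
mask_to_cpus() {
    mask=$((0x$1))
    ncpu=${2:-2}
    cpu=0
    out=""
    while [ "$cpu" -lt "$ncpu" ]; do
        if [ $(( (mask >> cpu) & 1 )) -eq 1 ]; then
            out="${out:+$out }cpu$cpu"
        fi
        cpu=$((cpu + 1))
    done
    echo "${out:-none}"
}

# Usage on the router:
# mask_to_cpus "$(cat /sys/devices/virtual/net/br-lan/queues/rx-0/rps_cpus)"
```

Read that way, the "echo 0" lines in the rc.local earlier in the thread leave RPS disabled on those interfaces rather than selecting CPU0.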

anybody having problems with setting MTU on eth0? this used to work but now seems not to.

@Ansuel i guess the DSA patches never made it into master, do you have a patch that changes R7800 to use DSA instead of swconfig?

i never actually proposed dsa for ipq806x.. i'm waiting and following the progress of dsa support in openwrt (some emails on the mailing list)
The fun part is that i've been using dsa instead of swconfig on my r7800 for about a year and a half
I'm taking my time to push a big pr with some changes to the patches, and to try to propose dsa now that the mvebu platform has switched to dsa, even with the problem of converting the config from swconfig to dsa

Give me some minutes to check if i still have the dts with dsa config

@facboy if you want i can tell you what to change in the dts to use dsa...


&mdio0 {
	status = "okay";

	pinctrl-0 = <&mdio0_pins>;
	pinctrl-names = "default";

	switch@16 {
		compatible = "qca,qca8337";
		#address-cells = <1>;
		#size-cells = <0>;

		reg = <0x16>;

		ports {
			#address-cells = <1>;
			#size-cells = <0>;
			port@0 {
				reg = <0>;
				label = "cpu";
				ethernet = <&gmac1>;
				phy-mode = "rgmii-id";

				fixed-link {
					speed = <1000>;
					full-duplex;
				};
			};

			port@1 {
				reg = <1>;
				label = "lan1";
			};

			port@2 {
				reg = <2>;
				label = "lan2";
			};

			port@3 {
				reg = <3>;
				label = "lan3";
			};

			port@4 {
				reg = <4>;
				label = "lan4";
			};

			port@5 {
				reg = <5>;
				label = "wan";
			};

			/*
			port@6 {
				reg = <6>;
				label = "cpu";
				ethernet = <&gmac2>;
				phy-mode = "sgmii";

				fixed-link {
					speed = <1000>;
					full-duplex;
				};
			};*/
		};
	};
};

(this is for 5.4. also i can't remember if rgmii-id is needed or not... if you want to do some testing, compare the speed with phy-mode set to rgmii-id and to rgmii)

and you should remove CONFIG_AR8216_PHY=y from the target config

(also the board.d needs to change)

can i find the board.d changes in old posts?

stupid q, i can't brick the router with this, right? i don't have serial :O. it will be recoverable with tftp?

totally recoverable with tftp... also the router will still be accessible over wifi, so i would advise trying this with a sysupgrade and then logging in to change the config (remove the switch sections from the network config and add lan1 lan2 lan3 lan4 to br-lan)

and change eth0 in the wan interface to wan
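A rough sketch of those config changes as a uci batch script. It only prints the commands so they can be reviewed first. Assumptions: the interface sections are named lan, wan and wan6 as in a default config, the build still uses "option ifname", and the port names match the DTS labels; the function name dsa_network_batch is made up. The old switch and switch_vlan sections are not touched here and still have to be removed by hand.

```shell
# Print (not apply) a uci batch script for the swconfig -> DSA change.
# $1: LAN port names, $2: WAN port name (defaults assume the DTS labels).
dsa_network_batch() {
    lan_ports=${1:-lan1 lan2 lan3 lan4}
    wan_port=${2:-wan}
    printf "set network.lan.ifname='%s'\n" "$lan_ports"
    printf "set network.wan.ifname='%s'\n" "$wan_port"
    # Assumption: a wan6 interface exists and shares the WAN device.
    printf "set network.wan6.ifname='%s'\n" "$wan_port"
    printf 'commit network\n'
}

# Review the output first, then apply on the router with:
# dsa_network_batch | uci batch
```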

02_network in board.d

#!/bin/sh
#
# Copyright (c) 2015 The Linux Foundation. All rights reserved.
# Copyright (c) 2011-2015 OpenWrt.org
#

. /lib/functions/uci-defaults.sh
. /lib/functions/system.sh

board_config_update

board=$(board_name)

case "$board" in
buffalo,wxr-2533dhp |\
compex,wpq864 |\
netgear,d7800 |\
netgear,r7500 |\
netgear,r7500v2 |\
qcom,ipq8064-ap148 |\
tplink,vr2600v)
	ucidef_add_switch "switch0" \
		"1:lan" "2:lan" "3:lan" "4:lan" "6@eth1" "5:wan" "0@eth0"
	;;
qcom,ipq8064-ap161)
	ucidef_set_interface_lan "eth1 eth2"
	ucidef_add_switch "switch0" \
		"0:lan" "1:lan" "2:lan" "3u@eth1" "6:wan" "4u@eth0"
	;;
linksys,ea8500)
	hw_mac_addr=$(mtd_get_mac_ascii devinfo hw_mac_addr)
	ucidef_add_switch "switch0" \
		"0@eth0" "1:lan" "2:lan" "3:lan" "4:lan" "5:wan"
	ucidef_set_interface_macaddr "lan" "$hw_mac_addr"
	ucidef_set_interface_macaddr "wan" "$hw_mac_addr"
	;;
nec,wg2600hp)
	ucidef_add_switch "switch0" \
		"2:lan" "3:lan" "4:lan" "5:lan" "6@eth1" "1:wan" "0@eth0"
	;;
netgear,r7800 |\
tplink,c2600)
	ucidef_set_interfaces_lan_wan "lan1 lan2 lan3 lan4" "wan"
	;;
qcom,ipq8064-db149)
	ucidef_set_interface_lan "eth1 eth2 eth3"
	ucidef_add_switch "switch0" \
		"1:lan" "2:lan" "3:lan" "4:lan" "6u@eth1" "5:wan" "0u@eth0"
	;;
zyxel,nbg6817)
	hw_mac_addr=$(mtd_get_mac_ascii 0:APPSBLENV ethaddr)
	ucidef_add_switch "switch0" \
		"1:lan" "2:lan" "3:lan" "4:lan" "6@eth1" "5:wan" "0@eth0"
	ucidef_set_interface_macaddr "lan" "$(macaddr_add $hw_mac_addr 2)"
	ucidef_set_interface_macaddr "wan" "$(macaddr_add $hw_mac_addr 3)"
	;;
*)
	echo "Unsupported hardware. Network interfaces not initialized"
	;;
esac

board_config_flush

exit 0

k thx. not sure how i will test the speed, i only have one pc with ethernet, the laptop has to use a rubbish usb ethernet adapter that crashes periodically lol.

if you want to test, i would advise you to check out the nss project... we have made lots of progress and now have something stable that offloads all the traffic and uses the cores

yes, i'm watching the thread. was trying to stay more or less on master, too busy for tinkering atm...of course then i decided to try this DSA thing :joy_cat:.

well it works, seems to be running ok on DSA. unfortunately i still can't set the MTU on eth0, now it 'works' on the 'wan' interface (gets set to 1508) and PPPoE negotiates a 1500 MTU, but then nothing actually responds. eth0 still shows MTU as 1500.
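For reference, the arithmetic behind the 1508/1500/1492 numbers, as a small sketch. It assumes plain IPv4 with no options and the usual 8 bytes of PPPoE overhead; the function names are made up.

```shell
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet.
# $1: path MTU in bytes. Subtracts 20 (IPv4 header) + 8 (ICMP header).
max_ping_payload() {
    echo $(( $1 - 20 - 8 ))
}

# Ethernet MTU needed to carry a given PPP MTU over PPPoE.
# $1: desired PPP MTU. Adds 6 (PPPoE header) + 2 (PPP protocol ID).
pppoe_ether_mtu() {
    echo $(( $1 + 6 + 2 ))
}

# Usage:
# pppoe_ether_mtu 1500      # Ethernet MTU for a 1500-byte PPP MTU
# max_ping_payload 1500     # payload size to pass to ping -s
```

So a 1500-byte PPP MTU only works if every Ethernet hop underneath, including the CPU port, really passes 1508-byte frames.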

You don't really have eth0 anymore
You have wan and lan1......
I can't tell whether wan or lan traffic works

i assume eth0 represents the cpu port (if that's what it's called?) as everything is wan@eth0, lan1@eth0 etc.

this is the output of ifconfig...is it normal for wan and lan1, lan2, etc to have the same hw-address? anyway if i set the wan MTU to 1508 then wan gets a 1508 MTU, eth0 stays at 1500, pppoe-wan becomes 1500 but then ICMP > 1492 disappears down a black hole and the public web becomes unusable. I didn't specifically check traffic on the LAN actually, but ICMP > 1492 from the router to the public web also fails.

          inet addr:192.168.1.1  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::9e3d:cfff:feef:3a48/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2057450 errors:0 dropped:12321 overruns:0 frame:0
          TX packets:5812416 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:667008470 (636.1 MiB)  TX bytes:7928823930 (7.3 GiB)

eth0      Link encap:Ethernet  HWaddr <SAME>
          inet6 addr: fe80::9e3d:cfff:feef:3a48/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:7225195 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4155054 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:4267840137 (3.9 GiB)  TX bytes:2786618066 (2.5 GiB)
          Interrupt:31

ifb4wan   Link encap:Ethernet  HWaddr E6:98:CA:97:7A:58
          inet6 addr: fe80::e498:caff:fe97:7a58/64 Scope:Link
          UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
          RX packets:5269168 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5269168 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:32
          RX bytes:7324516052 (6.8 GiB)  TX bytes:7324516052 (6.8 GiB)

lan1      Link encap:Ethernet  HWaddr <SAME>
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

lan2      Link encap:Ethernet  HWaddr <SAME>
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:460 (460.0 B)

lan3      Link encap:Ethernet  HWaddr <SAME>
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1199185 errors:0 dropped:9 overruns:0 frame:0
          TX packets:1783441 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:421936444 (402.3 MiB)  TX bytes:2037052781 (1.8 GiB)

lan4      Link encap:Ethernet  HWaddr <SAME>
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1045 errors:0 dropped:0 overruns:0 frame:0
          TX packets:148573 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:93157 (90.9 KiB)  TX bytes:33392767 (31.8 MiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:15131 errors:0 dropped:0 overruns:0 frame:0
          TX packets:15131 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1292124 (1.2 MiB)  TX bytes:1292124 (1.2 MiB)

pppoe-wan Link encap:Point-to-Point Protocol
          inet addr:  P-t-P:  Mask:255.255.255.255
          UP POINTOPOINT RUNNING NOARP MULTICAST  MTU:1492  Metric:1
          RX packets:5870497 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2126512 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:3
          RX bytes:7865027991 (7.3 GiB)  TX bytes:651666947 (621.4 MiB)

wan       Link encap:Ethernet  HWaddr <SAME>
          inet addr:192.168.0.100  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::9e3d:cfff:feef:3a48/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:6023690 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2233403 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:8025114178 (7.4 GiB)  TX bytes:709906441 (677.0 MiB)

is anyone else getting opp/regulator dmesg bootup crashdumps on recent master (non-fatal)? Rebasing my dts ... probably missed something...

sample
[    0.108748] qcom-pcie 1b500000.pci: 1b500000.pci supply vdda not found, using dummy regulator
[    0.108873] qcom-pcie 1b500000.pci: 1b500000.pci supply vdda_phy not found, using dummy regulator
[    0.108979] qcom-pcie 1b500000.pci: 1b500000.pci supply vdda_refclk not found, using dummy regulator
[    0.109276] qcom-pcie 1b500000.pci: host bridge /soc/pci@1b500000 ranges:
[    0.109301] qcom-pcie 1b500000.pci: Parsing ranges property...
[    0.109346] qcom-pcie 1b500000.pci:    IO 0x0fe00000..0x0fefffff -> 0x0fe00000
[    0.109386] qcom-pcie 1b500000.pci:   MEM 0x08000000..0x0fdfffff -> 0x08000000
[    0.236285] qcom-pcie 1b500000.pci: PCIe controller is not set to bridge type (hdr_type: 0xff)!
[    0.236315] qcom-pcie 1b500000.pci: cannot initialize host
[    0.236393] ------------[ cut here ]------------
[    0.236432] WARNING: CPU: 0 PID: 1 at drivers/regulator/core.c:2044 _regulator_put.part.3+0x17c/0x180
[    0.236444] Modules linked in:
[    0.236463] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.50 #0
[    0.236473] Hardware name: Generic DT based system
[    0.236509] [<c030f954>] (unwind_backtrace) from [<c030b96c>] (show_stack+0x14/0x20)
[    0.236531] [<c030b96c>] (show_stack) from [<c0959420>] (dump_stack+0x94/0xa8)
[    0.236564] [<c0959420>] (dump_stack) from [<c031e7c0>] (__warn+0xb4/0xd0)
[    0.236587] [<c031e7c0>] (__warn) from [<c031e82c>] (warn_slowpath_fmt+0x50/0x90)
[    0.236612] [<c031e82c>] (warn_slowpath_fmt) from [<c067069c>] (_regulator_put.part.3+0x17c/0x180)
[    0.236637] [<c067069c>] (_regulator_put.part.3) from [<c06706d0>] (regulator_put+0x30/0x48)
[    0.236658] [<c06706d0>] (regulator_put) from [<c0670718>] (regulator_bulk_free+0x30/0x4c)
[    0.236681] [<c0670718>] (regulator_bulk_free) from [<c06a4c28>] (release_nodes+0x1b0/0x204)
[    0.236711] [<c06a4c28>] (release_nodes) from [<c06a09c0>] (really_probe+0x124/0x37c)
[    0.236735] [<c06a09c0>] (really_probe) from [<c06a10c4>] (device_driver_attach+0x6c/0x74)
[    0.236756] [<c06a10c4>] (device_driver_attach) from [<c06a112c>] (__driver_attach+0x60/0xd0)
[    0.236777] [<c06a112c>] (__driver_attach) from [<c069ecbc>] (bus_for_each_dev+0x6c/0x9c)
[    0.236797] [<c069ecbc>] (bus_for_each_dev) from [<c069fe28>] (bus_add_driver+0x1dc/0x1ec)
[    0.236820] [<c069fe28>] (bus_add_driver) from [<c06a1740>] (driver_register+0x84/0x11c)
[    0.236845] [<c06a1740>] (driver_register) from [<c03027a4>] (do_one_initcall+0x90/0x1fc)
[    0.236869] [<c03027a4>] (do_one_initcall) from [<c0c00fe4>] (kernel_init_freeable+0x1c8/0x274)
[    0.236895] [<c0c00fe4>] (kernel_init_freeable) from [<c0970b40>] (kernel_init+0x8/0x114)
[    0.236916] [<c0970b40>] (kernel_init) from [<c03010e8>] (ret_from_fork+0x14/0x2c)
[    0.236929] Exception stack(0xdd43bfb0 to 0xdd43bff8)
[    0.236954] bfa0:                                     00000000 00000000 00000000 00000000
[    0.236992] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    0.237023] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[    0.237046] ---[ end trace b6f315b519952b2c ]---

ah well i got it working in the end. i tried backporting some dsa stuff from 5.7 but then i couldn't set the MTU at all...some missing op definitions i think that 5.4 must bypass or uses defaults for.

anyway, the qca8k driver does not enable jumbo frames during initialization (which the old AR8327 driver does). i patched the qca8k driver and now the 'old' MTU override from UCI works.
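A quick way to confirm which interfaces actually picked up the new MTU after a change like this is to read it back from sysfs. A minimal sketch (the function name list_mtus is made up; the optional argument is only there so it can be pointed at a test directory):

```shell
# Print "<interface> <mtu>" for each interface under a sysfs net directory.
# $1: optional directory (defaults to /sys/class/net).
list_mtus() {
    dir=${1:-/sys/class/net}
    for path in "$dir"/*; do
        [ -r "$path/mtu" ] || continue
        echo "${path##*/} $(cat "$path/mtu")"
    done
}

# Usage on the router:
# list_mtus
```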

can you tell me what you patched... i can try to propose them upstream if they are not too hacky... also i read somewhere that the mtu was changeable but i think that this feature got dropped in the dsa driver