Netgear R7800 exploration (IPQ8065, QCA9984)

I don’t have anything useful regarding your crashes.

Did you see any performance improvement or better balancing from dividing the load between your CPUs this way?

What do your interrupts look like? For example:


root@OpenWrt:~# cat /proc/interrupts
           CPU0       CPU1       
 16:      78919     119167     GIC-0  18 Edge      gp_timer
 18:       1513        884     GIC-0  51 Edge      qcom_rpm_ack
 19:          0          0     GIC-0  53 Edge      qcom_rpm_err
 20:          0          0     GIC-0  54 Edge      qcom_rpm_wakeup
 26:          0          0     GIC-0 241 Level     ahci[29000000.sata]
 27:          0          0     GIC-0 210 Edge      tsens_interrupt
 30:     188381         50     GIC-0 202 Level     adm_dma
 31:     163072      71718     GIC-0 255 Level     eth0
 32:      72078     219517     GIC-0 258 Level     eth1
 33:          0          0     GIC-0 130 Level     bam_dma
 34:          0          0     GIC-0 128 Level     bam_dma
 36:          0          0   PCI-MSI   0 Edge      aerdrv
 38:          0          0   PCI-MSI 134217728 Edge      aerdrv
 39:          9          0     GIC-0 184 Level     msm_serial0
 40:          2          0   msmgpio   6 Edge      keys
 41:          2          0   msmgpio  54 Edge      keys
 42:          2          0   msmgpio  65 Edge      keys
 43:          0          0     GIC-0 142 Level     xhci-hcd:usb1
 44:          0          0     GIC-0 237 Level     xhci-hcd:usb3
 45:         32          0   PCI-MSI 524288 Edge      ath10k_pci
 46:         31          0   PCI-MSI 134742016 Edge      ath10k_pci
IPI0:          0          0  CPU wakeup interrupts
IPI1:          0          0  Timer broadcast interrupts
IPI2:      30697      13469  Rescheduling interrupts
IPI3:      63283      66232  Function call interrupts
IPI4:          0          0  CPU stop interrupts
IPI5:      69398      83086  IRQ work interrupts
IPI6:          0          0  completion interrupts
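To compare how the load is split between the two cores, the dump above can be turned into per-CPU shares. This is only a minimal sketch in Python: the helper names `parse_interrupts` and `cpu_share` are made up for illustration, and `SAMPLE` is a handful of rows copied from the output above (plus an `Err:` row, which has a single counter and is skipped).

```python
from typing import Dict, List, Tuple

# A few rows copied from the dump above; on a router you would read
# /proc/interrupts instead.
SAMPLE = """\
           CPU0       CPU1
 31:     163072      71718     GIC-0 255 Level     eth0
 32:      72078     219517     GIC-0 258 Level     eth1
 45:         32          0   PCI-MSI 524288 Edge      ath10k_pci
IPI2:      30697      13469  Rescheduling interrupts
Err:          0
"""


def parse_interrupts(text: str) -> Tuple[List[str], Dict[str, Tuple[List[int], str]]]:
    """Return (cpu_names, {irq_label: (per_cpu_counts, description)})."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()
    rows: Dict[str, Tuple[List[int], str]] = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts: List[int] = []
        for tok in fields[:len(cpus)]:
            if not tok.isdigit():
                break
            counts.append(int(tok))
        if len(counts) != len(cpus):
            continue  # rows without one counter per CPU (e.g. "Err:") are skipped
        rows[label.strip()] = (counts, " ".join(fields[len(cpus):]))
    return cpus, rows


def cpu_share(counts: List[int]) -> List[float]:
    """Fraction of an IRQ's total handled by each CPU (all zeros if idle)."""
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


if __name__ == "__main__":
    cpus, rows = parse_interrupts(SAMPLE)
    for label, (counts, desc) in rows.items():
        shares = ", ".join(
            f"{cpu}={share:.0%}" for cpu, share in zip(cpus, cpu_share(counts))
        )
        print(f"{label:>5}  {desc:<28}  {shares}")
```

Taking two snapshots some time apart and subtracting the counts would show the current distribution rather than the totals since boot.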

@robimarko To keep the nss topic clear...

Looking at the code (something I also noticed earlier), the reset code is totally missing. In the original QCA code, the first thing the driver does on init is assert the GPIO reset. For some reason this is missing in the upstream code. Still, I can't understand why merely declaring it triggers the reset.

Hm, as far as I understand, any PCIe driver should reset any device it finds.
At least that's what I have seen done, so without the perst GPIO you can't even find the device.

reset-gpio is a generic Linux property that is usually handled by the subsystem core.

I may have stumbled onto the cause. The pcie-qcom.c driver file for 5.4 has this line:

pcie->reset = devm_gpiod_get_optional(dev, "perst", GPIOD_OUT_HIGH);

For 4.14, it’s this instead:

pcie->reset = devm_gpiod_get_optional(dev, "perst", GPIOD_OUT_LOW);

I’ll test it out later and see if this is the root cause of pcie bus not resetting properly with the 5.4 kernel.

Well, that would cause the issues, since they changed the default polarity and I doubt the DTSI in OpenWrt was updated to match.
You can simply change the polarity in the OpenWrt DTSI to test.
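For anyone following along, the interaction between the driver flag and the DTSI polarity can be modelled in a few lines. This is a simplified Python sketch, not kernel code. It assumes the usual gpiod convention that `GPIOD_OUT_HIGH` / `GPIOD_OUT_LOW` set the logical (asserted / deasserted) value and that a `GPIO_ACTIVE_LOW` flag in the device tree inverts it on the pin. The constants for the two `GPIOD_OUT_*` flags are reduced to 0/1 here, and `physical_level` is a made-up helper.

```python
# Device tree polarity flags (values as in dt-bindings/gpio/gpio.h).
GPIO_ACTIVE_HIGH = 0
GPIO_ACTIVE_LOW = 1

# Simplified stand-ins for the gpiod request flags: only the initial
# logical value is modelled (assumption, the real flags are bit masks).
GPIOD_OUT_LOW = 0   # logical 0, line deasserted
GPIOD_OUT_HIGH = 1  # logical 1, line asserted


def physical_level(logical: int, dt_flags: int) -> int:
    """Electrical level on the pin for a given logical value and DT polarity."""
    active_low = bool(dt_flags & GPIO_ACTIVE_LOW)
    return logical ^ int(active_low)


if __name__ == "__main__":
    # PERST# is active low on the PCIe side: pin low means "held in reset".
    for init_name, init in (("GPIOD_OUT_LOW", GPIOD_OUT_LOW),
                            ("GPIOD_OUT_HIGH", GPIOD_OUT_HIGH)):
        for pol_name, pol in (("GPIO_ACTIVE_HIGH", GPIO_ACTIVE_HIGH),
                              ("GPIO_ACTIVE_LOW", GPIO_ACTIVE_LOW)):
            pin = physical_level(init, pol)
            state = "device held in reset" if pin == 0 else "device out of reset"
            print(f"{init_name:<15} + {pol_name:<17} -> pin={pin} ({state})")
```

Under these assumptions, flipping the flag in the driver and flipping the polarity in the DTSI have the same effect on the pin at probe time, so only one of the two should be changed for a test.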

Yeah, intending to test it like that.

Wtf testing right now... anyway in the linux-msm code it's HIGH so???


No luck: changing GPIOD_OUT_HIGH to GPIOD_OUT_LOW gives the same problem. Another test would still be good.

Probably better to change it in the dtsi file. Not sure if the polarity is changed in more than one location in the code.

Anyway, I found the solution...
The proposed fix consists of deleting the perst definition and adding a reset gpio.
The problem is that the driver doesn't do anything with a reset-gpio, so the real solution is removing the perst...

Now I remember that in the original code there is a special implementation for the perst GPIO that, in short, defines only ONE GPIO for a particular SoC...

I tried removing the perst GPIO and noticed that pci0 fails to load but pci1 actually loads correctly...
Defining the perst GPIO only for pci0 makes them both load correctly.

This would explain the special implementation in the original code...

(I wonder whether the GPIO is connected to all the PCIe lines?)

(Also, in newer kernels the GPIO property name changed from perst-gpios to perst-gpio, and since the perst is optional, the PCI port was never actually reset.)


The related original code (what a mess)

	if (machine_is_ipq806x_rumi3()) {
		rst[0] = rst[1] = rst[2] = -1;
		pwr[0] = pwr[1] = pwr[2] = -1;
		no_vreg[0] = no_vreg[1] = no_vreg[2] = 1;
	}

	if (machine_is_ipq806x_db147()) {
		rst[1] = -1;
		pwr[1] = -1;
		no_vreg[1] = 1;
		rst[2] = PCIE_1_RST_GPIO;
		pwr[2] = PCIE_1_PWR_EN_GPIO;
	}

	if (machine_is_ipq806x_ap148() || machine_is_r7600() ||
		machine_is_ipq806x_ap148_1xx()) {
		rst[2] = -1;
		pwr[2] = -1;
		no_vreg[2] = 1;
		msm_pcie_platform_data[1].force_gen1 = 1;
	}

@robimarko any idea how to identify whether a board is based on db147, ap148 or rumi3?
The R7800 looks like db147, since pci1 doesn't require a GPIO reset (and in db147 that is set to -1, so disabled).
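The comparison can be written down as a small lookup. This is a rough Python sketch that only re-encodes the `rst[]` overrides from the excerpt above. It assumes every slot starts out with a valid reset GPIO (the defaults are not shown in the excerpt), it assumes pci0/pci1 correspond to `rst[0]`/`rst[1]`, and the names `BOARD_RST_OVERRIDES`, `rst_table` and `boards_matching` are invented for illustration.

```python
from typing import Dict, List

ENABLED = "gpio"  # placeholder for "some valid GPIO number" (assumed default)
DISABLED = -1     # same marker the vendor code uses for "no reset GPIO"

# rst[] overrides per board, transcribed from the vendor excerpt.
# "ap148" stands for the whole ap148 / r7600 / ap148_1xx branch.
BOARD_RST_OVERRIDES: Dict[str, Dict[int, object]] = {
    "rumi3": {0: DISABLED, 1: DISABLED, 2: DISABLED},
    "db147": {1: DISABLED, 2: "PCIE_1_RST_GPIO"},
    "ap148": {2: DISABLED},
}


def rst_table(board: str) -> List[object]:
    """Reset GPIO per PCIe slot after applying the board's overrides."""
    rst: List[object] = [ENABLED] * 3
    for slot, value in BOARD_RST_OVERRIDES[board].items():
        rst[slot] = value
    return rst


def boards_matching(observed: Dict[int, bool]) -> List[str]:
    """Boards whose slots need (True) or lack (False) a reset GPIO as observed."""
    return sorted(
        board
        for board in BOARD_RST_OVERRIDES
        if all((rst_table(board)[slot] != DISABLED) == needed
               for slot, needed in observed.items())
    )


if __name__ == "__main__":
    # Observation from the tests in this thread: pci0 needs perst, pci1 does not.
    print(boards_matching({0: True, 1: False}))
```

This only narrows the candidates down by the reset wiring; the board identifier in the stock DTB would still be the authoritative answer.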

I reported the same crash.
https://bugs.openwrt.org/index.php?do=details&task_id=3204

The easiest way would be to simply extract the DTB used by the stock firmware; you will have the board identifier there.
You can also take a look at the stock bootlog, but sometimes the full revision is missing from the model there.

          CPU0       CPU1
 16:    3025902    3055042     GIC-0  18 Edge      gp_timer
 18:        441          0     GIC-0  51 Edge      qcom_rpm_ack
 19:          0          0     GIC-0  53 Edge      qcom_rpm_err
 20:          0          0     GIC-0  54 Edge      qcom_rpm_wakeup
 26:          0          0     GIC-0 241 Level     ahci[29000000.sata]
 27:          0          0     GIC-0 210 Edge      tsens_interrupt
 30:     305431      34438     GIC-0 202 Level     adm_dma
 31:    7243987    8450144     GIC-0 255 Level     eth0
 32:     162856    1385119     GIC-0 258 Level     eth1
 33:          0          0     GIC-0 130 Level     bam_dma
 34:          0          0     GIC-0 128 Level     bam_dma
 36:          0          0   PCI-MSI   0 Edge      aerdrv
 38:          0          0   PCI-MSI 134217728 Edge      aerdrv
 39:          8          0     GIC-0 184 Level     msm_serial0
 40:          2          0   msmgpio   6 Edge      keys
 41:          2          0   msmgpio  54 Edge      keys
 42:          2          0   msmgpio  65 Edge      keys
 43:         14          0     GIC-0 142 Level     xhci-hcd:usb1
 44:          0          0     GIC-0 237 Level     xhci-hcd:usb3
 45:    9240190          0   PCI-MSI 524288 Edge      ath10k_pci
 46:   15014243          0   PCI-MSI 134742016 Edge      ath10k_pci
IPI0:          0          0  CPU wakeup interrupts
IPI1:          0          0  Timer broadcast interrupts
IPI2:     455879    3907609  Rescheduling interrupts
IPI3:     899281    1361270  Function call interrupts
IPI4:          0          0  CPU stop interrupts
IPI5:        899        847  IRQ work interrupts
IPI6:          0          0  completion interrupts
Err:          0

That's what it looks like after a little under 24 hours.

Not doable, since we would have to check every target we have :frowning:

Yeah, well, there is no other way.

Hey, not sure if this is the proper way to ask things, but I just got hold of a Nighthawk AC2600 (R7800) and was wondering whether I should run OpenWrt or stock (V1.0.2.68). I am a pretty basic user, so I am just looking for the best performance I can get without much manual work. Is the OpenWrt firmware much better for performance than the stock version?

ATM stock seems to have better performance.

Whatever performance is meant ... I see ...best. Ok, I agree ;- )

Expect no more than 50% of throughput you can get from the stock firmware. If throughput is what you are after, then stay away.

Security is much better though, and latency as well, so it all depends on how much bandwidth he needs. If his internet connection is <200 Mbit/s, OpenWrt will be just fine.

Agreed, but you are missing the time and effort required to get everything running: even setting up a guest Wi-Fi is not for the fainthearted. The poster came here with a single criterion: performance. It is clear that there is no understanding of what running an OpenWrt router means or why one would do it. A Google WiFi router is a better choice here.
