Netgear R7800 exploration (IPQ8065, QCA9984)

How hot does your device run?
cut -c1-2 /sys/devices/virtual/thermal/*/temp
Mine normally stays under 60C though I can see some of the temps hit 61-62C occasionally on hot days. I run it locked at 1.7GHz w/ performance governor.

Not that hot, I guess:

$ cut -c1-2 /sys/devices/virtual/thermal/*/temp
58
57
57
55
57
56
58
58
58
53
59
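
The `cut -c1-2` one-liner keeps only the first two characters of each reading, so it gives whole degrees only while the raw value has five digits (e.g. 58000). A minimal sketch that labels each zone and converts the value instead, assuming the usual sysfs layout where each `thermal_zone*` directory holds a `temp` file in millidegrees Celsius. The function name `show_temps` and its optional directory argument are made up for illustration.

```shell
# show_temps [DIR]: print "<zone> <degrees C>" for every thermal zone.
# DIR defaults to the sysfs path used in the thread; the argument exists
# only so the function can be pointed at a copy of the tree.
show_temps() {
    dir="${1:-/sys/devices/virtual/thermal}"
    for f in "$dir"/thermal_zone*/temp; do
        [ -r "$f" ] || continue              # glob did not match, or unreadable
        milli=$(cat "$f" 2>/dev/null)
        [ -n "$milli" ] || continue          # some zones refuse to be read
        zone=$(basename "$(dirname "$f")")
        # Assumption: the kernel reports millidegrees Celsius (58000 = 58 C).
        echo "$zone $((milli / 1000))"
    done
}
```

Calling `show_temps` with no argument on the router should list one line per zone, which makes it easier to tell which sensor is the one creeping past 60C.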

@facboy do you have some strange service in your home? I don't remember having this type of problem. It could also be related to enabling jumbo frames and changing the MTU. Can you try disabling that?

yeah i did try disabling the MTU override and it didn't seem to make a difference. i swapped out the upstream modem and it seems to have been more stable, so perhaps it was a hardware fault with the upstream modem.

bit strange though, the modem itself never crashed as the DSL link had been up 15 days.

For those who get better results with the old/mainline firmware, the 3.9.0.2 line of firmware keeps getting updates but the 3.10 does not:
https://github.com/kvalo/ath10k-firmware/commits/master/QCA9984/hw1.0

Does anyone understand what these versions mean? 3.9.0.2 must be popular with some device maker, e.g. UniFi APs or similar?

I can attempt to test; however, there have been issues with "wifi speed" and packet loss when multiple clients are putting the wifi network under load (reported here and I think I observe similar symptoms here). I'm hoping today's patch from @nbd will resolve that so I'd like to test with and without your patch from the current head of master (via a cherry pick) if that works. If not, please let me know.

EDIT: It builds, loads, and is running (I verified that a few of the changes in your commit are in the build dir). I'll let it go for a day or so.

I still see the wifi issue I observed before this build for multiple clients putting the wifi network under load (so @nbd's patch must be unrelated to that issue); however, irtt/netperf for only one wifi client at a time seem fine (at least the same as before).

EDIT1: After 24 hr up with this commit on the r7500v2 configured as an AP only, I haven't observed a significant difference from prior builds.

git log

74f3dac4f057    Ansuel Smith    Sat Aug 8 16:50:04 2020 +0200   ipq806x: replace pci patchset with upstream version
17d16e093f4a    Josef Schlehofer        Fri Aug 21 17:17:47 2020 +0200  curl: update to version 7.72.0
9c5128854eaf    Felix Fietkau   Fri Aug 21 20:44:25 2020 +0200  kernel: backport a fix for a regression that broke IRQ affinity on ARM
010682067b65    Felix Fietkau   Fri Aug 21 18:06:50 2020 +0200  mac80211: add missing return code checks in AQL improvements
...
r7500v2 # uname -a
Linux r7500v2 5.4.59 #0 SMP Sat Aug 22 03:12:38 2020 armv7l GNU/Linux
r7500v2 # uptime
 08:44:04 up 1 day, 37 min,  load average: 0.00, 0.05, 0.06

@Ansuel any idea what's happening here?

  1. pcie 1b5 won't init (no wireless)... since the first bulk dtsi integration on master...
  2. same dts ( method ) on early 5.4 worked fine... 4.19 has always been fine...

so something has been removed from that pcie-1b5 node / related gpio's or made more specific?

do you have this with initramfs image or with normal bootup?

If i remember correctly it could be that some gpio reset has been removed (wrongly?)

the warnings are not related... it's just pcie that is not reset(/init) correctly.

Also, is this on an r7800?

yes, i've tested with initramfs... it's a rt2600ac which is very close to the nbg6817...

one big difference is the past ( early 5.4 and 4.19 )...

gpio 33/64 is not used for leds ( something else ) > gpio 6+7+8 are...

26 is not used for gpio either...

But I do see that the numbers are now switched around a little?... perhaps the relevant new number for the above 33/64/26 is now defined as something in the new dtsi?

In the past i did not have to touch the pcie nodes at all, a copy exactly of what is in the nbg6817 always worked...

did you test without an initramfs image? pci is broken because uboot skips the pci init when booting with bootm.

Funny enough I'm backtracking what is causing this and trying to fix this right now.

yeah i've tested with and without initramfs... i think it might have something to do with the dtsi pcie0-2 < gpio 3/63 etc. numbering changes... perst-gpio/reset-gpio or the mdio0 pinmuxes

i.e.; similar areas on 4.19

		pcie0: pci@1b500000 {
			perst-gpios = <&qcom_pinmux 3 GPIO_ACTIVE_LOW>;

		pcie2: pci@1b900000 {
			perst-gpios = <&qcom_pinmux 63 GPIO_ACTIVE_LOW>;

mdio > gpio0 vs gpio0 + gpio1

etc. etc.

Can you confirm that by using the old dtsi instead of the patch everything works as expected?

so remove

rm target/linux/ipq806x/patches-5.4/083-ipq8064-dtsi-additions.patch

then

cat target/linux/ipq806x/files-4.19/arch/arm/boot/dts/qcom-ipq8065-rt2600ac.dts > target/linux/ipq806x/files-5.4/arch/arm/boot/dts/qcom-ipq8065-rt2600ac.dts

won't there be a missing ipq8064.dtsi include then?

you need to restore the ipq8064 dtsi BEFORE the additions patch

yup... thought so... will see what happens... thanks for the help... :100:

i still can't understand why removing the perst-gpio makes pcie work in initramfs

nope... too complicated for me with everything moved around... if it were just a dts + dtsi I probably could have helped a little... patches / includes / skeletons jumping around the place is way over my head...

i'm content knowing that it's more of a general thing and not something that is unique to my board... :smiley:

try to remove the gpio reset from the dts (by directly editing in the build dir) and try to compile an image... (make sure the make process doesn't repatch the linux dir)

static void ipq_pci_gpio_fixup(void)
{
	unsigned int machid;
	/* get machine type from SMEM and set in env */
	machid = gd->bd->bi_arch_number;

	gpio_func_data_t *gpio_0 = gboard_param->pcie_cfg[0].pci_rst_gpio;
	gpio_func_data_t *gpio_1 = gboard_param->pcie_cfg[1].pci_rst_gpio;
	gpio_func_data_t *gpio_2 = gboard_param->pcie_cfg[2].pci_rst_gpio;

	if (machid == MACH_TYPE_IPQ806X_RUMI3) {
		gpio_0->gpio = -1;
		gpio_1->gpio = -1;
		gpio_2->gpio = -1;
	} else if (machid == MACH_TYPE_IPQ806X_DB147) {
		gpio_1->gpio = -1;
		gpio_2->gpio = PCIE_1_RST_GPIO;
	} else if ((machid == MACH_TYPE_IPQ806X_AP148) ||
				(machid == MACH_TYPE_IPQ806X_AP148_1XX )) {
		gpio_2->gpio = -1;
	}
}

This function handles the pci reset gpio...

Anyway, some news about the pcie problem... i'm investigating it and i found what bootipq does beyond a plain bootm...

This is the code

void board_pci_deinit()
{
	int i;
	pcie_params_t 		*cfg;
	gpio_func_data_t	*gpio_data;

	for (i = 0; i < PCI_MAX_DEVICES; i++) {
		cfg = &gboard_param->pcie_cfg[i];
		gpio_data = cfg->pci_rst_gpio;

		if (gpio_data->gpio != -1)
			gpio_tlmm_config(gpio_data->gpio, 0, GPIO_INPUT,
					GPIO_NO_PULL, GPIO_2MA, GPIO_OE_DISABLE);
		writel(0x7d, cfg->pcie_rst);
		writel(1, cfg->parf + PCIE20_PARF_PHY_CTRL);
		pcie_clock_shutdown(cfg->pci_clks);
	}

	ipq_wifi_pci_power_disable();
}

I tested the 2 writel calls but no luck, still the same error. But i noticed something else...

The ipq_wifi_pci_power_disable call... all day I thought the function used the reset gpio (the one we have declared in the dts) BUT i noticed that it's using other gpios...

from the header file i have this

#define PCIE_RST_GPIO           3
#define PCIE_1_RST_GPIO         48
#define PCIE_2_RST_GPIO         63
#define WIFI_PCIE_1_POWER_GPIO	9
#define WIFI_PCIE_2_POWER_GPIO	26

and the function in uboot first asserts/deasserts the gpio for the power and then asserts the gpio for the reset...

and the power gpio code is missing from our driver... the problem is that in the ipq8064 dtsi gpio 9 and 26 are tied to the leds mux, and on the r7800 gpio 9 is used for power and 26 is used for esata activity so... A BIG WTF...

Also interesting... looking at the code, the netgear uboot source completely lacks the power gpio???

9 is 5g activity led for me... power (led) for nbg...
26 is 5g led on nbg
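
Since gpio 9 and 26 mean different things on each board, it may help to check what the running kernel has actually claimed them for. A minimal sketch, assuming debugfs is mounted at /sys/kernel/debug and the kernel exposes the generic pinctrl `pinmux-pins` file, whose lines start with `pin <number> (<name>):`. The function name `pin_owners` and its arguments are hypothetical, and the text after the colon differs between kernel versions.

```shell
# pin_owners FILE PIN...: print the pinmux-pins line for each given pin.
pin_owners() {
    file="$1"; shift
    for pin in "$@"; do
        # Lines look like "pin 9 (gpio9): <owner> ..."; the trailing space
        # in the pattern keeps "pin 9" from also matching "pin 90".
        grep "^pin $pin " "$file" || echo "pin $pin: not listed"
    done
}
```

On the router this could be run as `for f in /sys/kernel/debug/pinctrl/*/pinmux-pins; do pin_owners "$f" 9 26; done`, comparing the output between a working 4.19 build and a failing 5.4 build.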

maybe the pins were ported ok... i thought i saw some changes around compatible/s ... just guessing but my regulator started spitting errors prior to the pcie... so something like initializing the clocks / regulator @ 10000000???

yeah, i've got the uboot source... oem uses bootipq... i used it a few times initially, but bootm was working ok for me...