Netgear R7800 exploration (IPQ8065, QCA9984)

An update on my attempt to get kernel 5.4 working, haha.

The watchdog driver is broken (this is the cause: https://github.com/torvalds/linux/commit/36375491a439565402d1cb2cf12955c11f2ed5a6#diff-6d265cdf741d7c24a8dcb8f5b9523e39)
The PCIe driver is broken (cause: https://github.com/torvalds/linux/commit/0b24134f7888175c9638e6fd1900e23e44fc172f#diff-b9f99e11e7520bb90ac922c0208df20c)
These 2 problems can be solved by reverting these 2 patches.
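
A minimal sketch of how the two reverts could be turned into patch files, assuming a local clone of the mainline kernel that contains both commits. The helper name `make_revert_patch` and the output file names are made up for illustration, and whether the reverse diffs apply cleanly on top of 5.4 is not verified here.

```shell
# Hypothetical helper: write the reverse diff of one commit to a patch file.
# $1 = path to a kernel git clone, $2 = commit hash, $3 = output patch file
make_revert_patch() {
    repo="$1"; commit="$2"; out="$3"
    # Diffing from the commit back to its parent yields the reverted change.
    git -C "$repo" diff "$commit" "$commit^" > "$out"
}

# Example (the clone path and file names are assumptions):
# make_revert_patch ~/linux 36375491a439565402d1cb2cf12955c11f2ed5a6 revert-watchdog.patch
# make_revert_patch ~/linux 0b24134f7888175c9638e6fd1900e23e44fc172f revert-pcie.patch
```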

LEDs and buttons are broken, as more checks were added to the driver and a GPIO can no longer be redefined.
gpio-button-hotplug is broken.
The tsense driver is broken (I already reworked it a little, but it's not enough, as a lot changed in the tsense common driver).

As I'm really stupid, I'm testing all of this with DSA and not with the normal driver, so... Ethernet communication is ALSO broken (the driver registers, but I can't communicate with the router).


impressive that you even figured out the causes already. Your hard work is much appreciated!

More progress... I found out why the LEDs and buttons are broken.
In 5.4, qcom-ipq8064-v1.0.dtsi defines the leds and buttons nodes, and every dtsi includes v1.0.

The devs will decide what to do about this. For now I will just """clear""" it by providing an empty file. This way we override the one the kernel provides.

About tsense: the driver is fixed and now everything works.


Still trying to find out why the eth connection doesn't work... Could be me, my image being too basic, or some unlucky commits that broke DHCP communication.


Impressive, I'm happy to see things happening for this device.
I guess one thing we will never see supported is MU-MIMO... but the OpenWrt support is still top notch.

I'm starting to think that the eth traffic problem is related to the generic patches for 5.4 (as they are still WIP)... I will try to ask the devs on IRC.

I started experiencing frequent WiFi (ath10k-ct) crashes in the last couple of weeks (master), but it happens without restart as well. These two below are from two consecutive wifi restarts. Is it stable for everyone else in master?

kern.err kernel: [33857.368031] ath10k_pci 0001:01:00.0: firmware crashed! (guid 673512c0-69f4-45db-9b7b-c387e90471ff)
kern.warn kernel: [33857.449673] ath10k_pci 0001:01:00.0: in crash-regs-harder
kern.warn kernel: [33858.694189] ath10k_pci 0001:01:00.0: in crash-regs-harder, firmware did not provide indicator: 0xdeadbeef
kern.warn kernel: [33858.961116] ath10k_pci 0001:01:00.0: cannot restart a device that hasn't been started

kern.err kernel: [35356.204932] ath10k_pci 0001:01:00.0: firmware crashed! (guid 4904a29b-bdd8-49da-940d-fe5009126fc5)
kern.warn kernel: [35356.286681] ath10k_pci 0001:01:00.0: in crash-regs-harder
kern.warn kernel: [35357.527703] ath10k_pci 0001:01:00.0: in crash-regs-harder, firmware did not provide indicator: 0xdeadbeef
kern.warn kernel: [35357.781011] ath10k_pci 0001:01:00.0: cannot restart a device that hasn't been started
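
The crash lines above can be pulled out of a longer log with something like the following. This is only a sketch, assuming the log format shown in this post; the function name `list_fw_crashes` is made up for illustration.

```shell
# Hypothetical helper: print "<uptime-seconds> <guid>" for every ath10k
# "firmware crashed!" line read from stdin; all other lines are dropped.
list_fw_crashes() {
    sed -n 's/.*\[ *\([0-9.]*\)\].*firmware crashed! (guid \([0-9a-f-]*\)).*/\1 \2/p'
}

# Example on the router:
# logread | list_fw_crashes
```

Going by the timestamps, the two crashes quoted here (33857 s and 35356 s of uptime) are roughly 25 minutes apart.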

UPDATE: Full log for one of the crashes:

kern.err kernel: [35356.204932] ath10k_pci 0001:01:00.0: firmware crashed! (guid 4904a29b-bdd8-49da-940d-fe5009126fc5)                            
kern.info kernel: [35356.204970] ath10k_pci 0001:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe                
kern.info kernel: [35356.212809] ath10k_pci 0001:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 0                                    
kern.info kernel: [35356.225044] ath10k_pci 0001:01:00.0: firmware ver 10.4b-ct-9984-fW-012-6acc9b999 api 5 features mfp,peer-flow-ctrl,txstatus-n
kern.info kernel: [35356.237998] ath10k_pci 0001:01:00.0: board_file api 2 bmi_id 0:2 crc32 85498734                                
kern.info kernel: [35356.259630] ath10k_pci 0001:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal pre-cal-file max-sta 32 raw 0 hwcrypto 1                                    
kern.warn kernel: [35356.278781] ath10k_pci 0001:01:00.0: failed to get memcpy hi address for firmware address 4: -16         
kern.err kernel: [35356.278810] ath10k_pci 0001:01:00.0: failed to read firmware dump area: -16                                                
kern.warn kernel: [35356.286681] ath10k_pci 0001:01:00.0: in crash-regs-harder                                                
kern.warn kernel: [35357.527703] ath10k_pci 0001:01:00.0: in crash-regs-harder, firmware did not provide indicator: 0xdeadbeef
kern.err kernel: [35357.527733] ath10k_pci 0001:01:00.0: Copy Engine register dump:                                                          
kern.err kernel: [35357.536321] ath10k_pci 0001:01:00.0: [00]: 0x0004a000 3735928559 3735928559 3735928559 3735928559       
kern.err kernel: [35357.541980] ath10k_pci 0001:01:00.0: [01]: 0x0004a400 3735928559 3735928559 3735928559 3735928559                          
kern.err kernel: [35357.551064] ath10k_pci 0001:01:00.0: [02]: 0x0004a800 3735928559 3735928559 3735928559 3735928559       
kern.err kernel: [35357.559936] ath10k_pci 0001:01:00.0: [03]: 0x0004ac00 3735928559 3735928559 3735928559 3735928559       
kern.err kernel: [35357.568778] ath10k_pci 0001:01:00.0: [04]: 0x0004b000 3735928559 3735928559 3735928559 3735928559       
kern.err kernel: [35357.577649] ath10k_pci 0001:01:00.0: [05]: 0x0004b400 3735928559 3735928559 3735928559 3735928559       
kern.err kernel: [35357.586490] ath10k_pci 0001:01:00.0: [06]: 0x0004b800 3735928559 3735928559 3735928559 3735928559             
kern.err kernel: [35357.595332] ath10k_pci 0001:01:00.0: [07]: 0x0004bc00 3735928559 3735928559 3735928559 3735928559       
kern.err kernel: [35357.604203] ath10k_pci 0001:01:00.0: [08]: 0x0004c000 3735928559 3735928559 3735928559 3735928559       
kern.err kernel: [35357.612986] ath10k_pci 0001:01:00.0: [09]: 0x0004c400 3735928559 3735928559 3735928559 3735928559       
kern.err kernel: [35357.621916] ath10k_pci 0001:01:00.0: [10]: 0x0004c800 3735928559 3735928559 3735928559 3735928559       
kern.err kernel: [35357.630757] ath10k_pci 0001:01:00.0: [11]: 0x0004cc00 3735928559 3735928559 3735928559 3735928559       
kern.warn kernel: [35357.639635] ath10k_pci 0001:01:00.0: failed to get memcpy hi address for firmware address 8: -28       
kern.err kernel: [35357.648467] ath10k_pci 0001:01:00.0: failed to dump debug log area: -28                                                      
kern.warn kernel: [35357.657311] ath10k_pci 0001:01:00.0: failed to get memcpy hi address for firmware address 72: -28      
kern.warn kernel: [35357.663595] ath10k_pci 0001:01:00.0: failed to get memcpy hi address for firmware address 72: -28      
kern.warn kernel: [35357.672681] ath10k_pci 0001:01:00.0: failed to get memcpy hi address for firmware address 76: -28      
kern.warn kernel: [35357.681550] ath10k_pci 0001:01:00.0: failed to get memcpy hi address for firmware address 76: -28      
kern.warn kernel: [35357.690394] ath10k_pci 0001:01:00.0: failed to read firmware RAM BSS memory from 4291136 (48848 B): -28
kern.warn kernel: [35357.699233] ath10k_pci 0001:01:00.0: failed to read firmware ROM BSS memory from 4197376 (12552 B): -28
kern.warn kernel: [35357.709185] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 0, skipped old beacon      
kern.warn kernel: [35357.717847] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 0, skipped old beacon      
kern.warn kernel: [35357.725126] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 0, skipped old beacon      
kern.warn kernel: [35357.732360] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 0, skipped old beacon      
kern.warn kernel: [35357.739704] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 0, skipped old beacon      
kern.warn kernel: [35357.747015] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 0, skipped old beacon      
kern.warn kernel: [35357.754294] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 0, skipped old beacon      
kern.warn kernel: [35357.761584] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 0, skipped old beacon      
kern.warn kernel: [35357.768886] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 0, skipped old beacon      
kern.warn kernel: [35357.781011] ath10k_pci 0001:01:00.0: cannot restart a device that hasn't been started

I also got a hard crash the other night (using imagebuilder with 19.07.0 final). This happened up until 18.06.2, if I remember correctly. I had fewer crashes after that as well (and the router always restarted).

With 19.07.0 final everything seemed more or less stable before it suddenly hard locked. (Sometimes I have problems reaching the LuCI GUI, but it seems less of a problem than with the 18 series.)

Just switched out the ct firmware. If the legacy firmware is stable, then I'm sticking with it until the ct is stable beyond any doubt... which may never happen :smiley:

This is a recent development for me. 19.07 and master as of a couple of weeks ago were stable.

It used to be possible to fall back to the non-ct firmware in case of wifi issues, but I just tried that and wifi crashes every few seconds. So back to the ct one now...

I used the imagebuilder, removed the ct ones and added the legacy ones. It's too soon to tell, but my impression is that everything is rock solid now. (On 19.07 final atm.)

There's a new beta ct-firmware out that you guys can try out. The changelog says that it fixes a crash: Jan 16, 2020: Fix crash that is probably related to AP rekey problem.

You can find the firmwares here: https://www.candelatech.com/downloads/ath10k-9984-10-4b/ath10k-fw-beta/

The one without htt would be firmware-5-ct-full-community.bin, and the one with htt is firmware-5-ct-full-htt-mgt-community.bin.

And you can install the firmware in the following way:


How do I know if I am using htt or not?

The current file in that location is firmware-5.bin; shouldn't the new one have the same name?

If you're currently running ath10k-ct firmware you can check dmesg. Search for htt-mgt-CT. If you find it, you're running the htt firmware, as I am:

Tue Jan 14 22:05:32 2020 kern.info kernel: [ 14.050561] ath10k_pci 0000:01:00.0: firmware ver 10.4b-ct-9984-fH-013-dd670a2c7 api 5 features mfp,peer-flow-ctrl,txstatus-noack,wmi-10.x-CT,ratemask-CT,regdump-CT,txrate-CT,flush-all-CT,pingpong-CT,ch-regs-CT,nop-CT,htt-mgt-CT,set-special-CT,tx-rc-CT,cust-stats-CT,txrate2-CT,beacon-cb-CT,wmi-block-ack-CT,wmi-bcn-rc-CT crc32 7ca1f8c0
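
The same check can be scripted. A minimal sketch, assuming the only thing looked at is whether `htt-mgt-CT` appears in the feature list of a "firmware ver" line. The function name `fw_variant` is made up, and anything without that feature is reported as "non-htt", so it does not tell ct and vanilla firmware apart.

```shell
# Hypothetical helper: classify one "firmware ver" kernel log line.
# Prints "htt" when htt-mgt-CT is advertised, "non-htt" otherwise.
fw_variant() {
    case "$1" in
        *htt-mgt-CT*) echo "htt" ;;
        *)            echo "non-htt" ;;
    esac
}

# Example on the router, one result per radio:
# dmesg | grep "firmware ver" | while IFS= read -r line; do fw_variant "$line"; done
```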

In the end it doesn't really matter all that much which of the firmwares you choose. I believe the default is non-htt, but I have always run the htt variant.

You're running vanilla ath10k it seems. When you're running ath10k-ct firmware the file is named ct-firmware-5.bin. I haven't tried renaming the file to something else, so I don't know whether it will cause any issues.

Looks like mine (the default) is not HTT

dmesg | grep "firmware ver"
[   14.010998] ath10k_pci 0000:01:00.0: firmware ver 10.4b-ct-9984-fW-012-6acc9b999 api 5 features mfp,peer-flow-ctrl,txstatus-noack,wmi-10.x-CT,ratemask-CT,regdump-CT,txrate-CT,flush-all-CT,pingpong-CT,ch-regs-CT,nop-CT,set-special-CT,tx-rc-CT,cust-stats-CT,txrate2-CT,beacon-cb-CT,wmi-block-ack-CT,wmi-bcn-rc-CT crc32 2f261949
[   23.350753] ath10k_pci 0001:01:00.0: firmware ver 10.4b-ct-9984-fW-012-6acc9b999 api 5 features mfp,peer-flow-ctrl,txstatus-noack,wmi-10.x-CT,ratemask-CT,regdump-CT,txrate-CT,flush-all-CT,pingpong-CT,ch-regs-CT,nop-CT,set-special-CT,tx-rc-CT,cust-stats-CT,txrate2-CT,beacon-cb-CT,wmi-block-ack-CT,wmi-bcn-rc-CT crc32 2f261949
dmesg | grep "firmware ver" | grep htt

I am running the default CT firmware and file name in the package is firmware-5.bin.

opkg files ath10k-firmware-qca9984-ct
Package ath10k-firmware-qca9984-ct (2019-10-03-d622d160-1) is installed on root and has the following files:
/lib/firmware/ath10k/QCA9984/hw1.0/firmware-5.bin
/lib/firmware/ath10k/QCA9984/hw1.0/board-2.bin

What is the benefit of htt?

https://lists.openwrt.org/pipermail/openwrt-devel/2018-March/011596.html

Ok, just name the new firmware the same. Most likely it will work just fine.

EDIT: I just tried renaming the firmware from ct-firmware-5.bin to firmware-5.bin and everything works just fine. I also checked the Makefile of the firmware, and as you can see the htt variant is named ct-firmware-5.bin and the non-htt variant firmware-5.bin:

define Package/ath10k-firmware-qca9984-ct/install
	$(INSTALL_DIR) $(1)/lib/firmware/ath10k/QCA9984/hw1.0
	$(INSTALL_DATA) \
		$(PKG_BUILD_DIR)/QCA9984/hw1.0/board-2.bin \
		$(1)/lib/firmware/ath10k/QCA9984/hw1.0/board-2.bin
	$(INSTALL_DATA) \
		$(DL_DIR)/$(call CT_FIRMWARE_FILE,QCA9984) \
		$(1)/lib/firmware/ath10k/QCA9984/hw1.0/firmware-5.bin
endef
define Package/ath10k-firmware-qca9984-ct-htt/install
	$(INSTALL_DIR) $(1)/lib/firmware/ath10k/QCA9984/hw1.0
	$(INSTALL_DATA) \
		$(PKG_BUILD_DIR)/QCA9984/hw1.0/board-2.bin \
		$(1)/lib/firmware/ath10k/QCA9984/hw1.0/board-2.bin
	$(INSTALL_DATA) \
		$(DL_DIR)/$(call CT_FIRMWARE_FILE_HTT,QCA9984) \
		$(1)/lib/firmware/ath10k/QCA9984/hw1.0/ct-firmware-5.bin
endef


Is the below going to work fine, or do I need another command or a reboot to make the firmware active?

wget https://www.candelatech.com/downloads/ath10k-9984-10-4b/ath10k-fw-beta/firmware-5-ct-non-commercial-full-htt-mgt.bin
cp /lib/firmware/ath10k/QCA9984/hw1.0/firmware-5.bin ~
mv firmware-5-ct-non-commercial-full-htt-mgt.bin /lib/firmware/ath10k/QCA9984/hw1.0/firmware-5.bin
wifi down
wifi up

BTW thank you for that. I have been using 802.11r for a few days, and HTT seems to be required for it.
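
The replacement step in the commands above could be made a little more defensive, for example so that a failed download never ends up as the active firmware. This is only a sketch: the function name `swap_firmware` is made up, it works on a file that has already been downloaded, and it does not answer whether `wifi down` / `wifi up` is enough to load the new firmware.

```shell
# Hypothetical helper: replace a firmware file only if the new file is non-empty.
# $1 = downloaded file, $2 = destination path
swap_firmware() {
    new="$1"; dest="$2"
    if [ ! -s "$new" ]; then
        echo "refusing to install missing or empty file: $new" >&2
        return 1
    fi
    # Copy to a temporary name first so a failed copy never leaves a
    # half-written firmware file at the destination.
    cp "$new" "$dest.tmp" && mv "$dest.tmp" "$dest"
}

# Example (the source file name stands for whichever firmware was downloaded):
# swap_firmware ./downloaded-firmware.bin /lib/firmware/ath10k/QCA9984/hw1.0/firmware-5.bin
```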

I can't comment on whether you will need to reboot or not, but you should use the community firmware instead of the non-commercial.

If I understand correctly based on https://www.candelatech.com/ath10k-10.1.php as long as I am not using it for anything commerce related I should be fine. Or?

The non-commercial firmware from Candela Technologies does support multiple station vifs connecting to a single AP (really, it supports rx-software-crypt, which is the enabling feature). The non-commercial firmware is NOT freely available. It is restricted to non-commercial use unless you arrange a commercial-use license with Candela Technologies. Contact sales@candelatech.com for additional information on this topic.

I guess so, but don't quote me on that. It's just that the community variant is what's used by OpenWrt, and that's why I pointed it out.

Copying the old firmware to a "safe place" is quite unnecessary, as the original will be in /rom in any case. That way you just store an extra copy of it on the overlay...
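
Getting the original back from /rom could look roughly like this. A minimal sketch, assuming the usual layout where /rom holds the read-only image; the function name `restore_from_rom` and its optional second and third arguments are made up (they exist mainly so the helper can be tried out against scratch directories).

```shell
# Hypothetical helper: copy a file from the read-only image back over the
# modified copy.
# $1 = absolute path of the file
# $2 = rom mount point (default: /rom)
# $3 = root prefix (default: empty, i.e. the live filesystem)
restore_from_rom() {
    path="$1"; rom="${2:-/rom}"; root="${3:-}"
    if [ ! -f "$rom$path" ]; then
        echo "no original found at $rom$path" >&2
        return 1
    fi
    cp "$rom$path" "$root$path"
}

# Example on the router:
# restore_from_rom /lib/firmware/ath10k/QCA9984/hw1.0/firmware-5.bin
```

After restoring, the wifi still has to pick up the file again; the thread leaves open whether `wifi down` / `wifi up` is enough or a reboot is needed.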