Netgear R7800 exploration (IPQ8065, QCA9984)


#476

The r4750 "official" build (Aug 30) and the r4723, r4767, and r4773 hnyman builds have the MAC error problem on my router. The MAC error occurred so frequently on r4773 that the web GUI was useless. I had to use TFTP to restore r3498 (4.4 kernel), and all is well again. I'm beginning to think that there may be a timing "hole" or "window of vulnerability" when using the Atheros Wi-Fi drivers with the 4.9 kernel.


#477

Oddly enough, I have similar issues on 2.4 GHz only: a broken GUI with wrong styles and buttons (it does look like frame corruption), while 5 GHz is perfectly fine.


#478

Please post your ath10k init log. There could be different QCA9984 revisions in the same model (for different countries?) that require different board data files, as @chunkeey suspects.

edit: @hnyman, does your 3498 build include the cal -> pre-cal fix? If yes, then it's not board data after all.


#479

Yes. My 17.01 build includes the calibration data reading fix.
But official 17.01 releases do not.


#480

BusyBox v1.25.1 () built-in shell (ash)

     _________
    /        /\      _    ___ ___  ___
   /  LE    /  \    | |  | __|   \| __|
  /    DE  /    \   | |__| _|| |) | _|
 /________/  LE  \  |____|___|___/|___|                      lede-project.org
 \        \   DE /
  \    LE  \    /  -----------------------------------------------------------
   \  DE    \  /    Reboot (17.01-SNAPSHOT, r3498-dc8392f6a1)
    \________\/    -----------------------------------------------------------

root@R7800RT1:~# dmesg|grep ath
[ 22.521630] ath10k_pci 0000:01:00.0: enabling device (0140 -> 0142)
[ 22.521740] ath10k_pci 0000:01:00.0: enabling bus mastering
[ 22.522304] ath10k_pci 0000:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[ 23.094867] ath10k_pci 0000:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[ 23.094935] ath10k_pci 0000:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 1
[ 23.110276] ath10k_pci 0000:01:00.0: firmware ver 10.4-3.4-00082 api 5 features no-p2p,mfp,peer-flow-ctrl,btcoex-param,allows-mesh-bcast crc32 f301de65
[ 25.419779] ath10k_pci 0000:01:00.0: board_file api 2 bmi_id 0:1 crc32 751efba1
[ 31.271638] ath10k_pci 0000:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal pre-cal-file max-sta 512 raw 0 hwcrypto 1
[ 31.345670] ath: EEPROM regdomain: 0x0
[ 31.345688] ath: EEPROM indicates default country code should be used
[ 31.345702] ath: doing EEPROM country->regdmn map search
[ 31.345718] ath: country maps to regdmn code: 0x3a
[ 31.345731] ath: Country alpha2 being used: US
[ 31.345742] ath: Regpair used: 0x3a
[ 31.352198] ath10k_pci 0001:01:00.0: enabling device (0140 -> 0142)
[ 31.352326] ath10k_pci 0001:01:00.0: enabling bus mastering
[ 31.352922] ath10k_pci 0001:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[ 31.532623] ath10k_pci 0001:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[ 31.532658] ath10k_pci 0001:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 1
[ 31.544246] ath10k_pci 0001:01:00.0: firmware ver 10.4-3.4-00082 api 5 features no-p2p,mfp,peer-flow-ctrl,btcoex-param,allows-mesh-bcast crc32 f301de65
[ 33.823970] ath10k_pci 0001:01:00.0: board_file api 2 bmi_id 0:2 crc32 751efba1
[ 39.685447] ath10k_pci 0001:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal pre-cal-file max-sta 512 raw 0 hwcrypto 1
[ 39.815945] ath: EEPROM regdomain: 0x0
[ 39.815966] ath: EEPROM indicates default country code should be used
[ 39.815981] ath: doing EEPROM country->regdmn map search
[ 39.816001] ath: country maps to regdmn code: 0x3a
[ 39.816016] ath: Country alpha2 being used: US
[ 39.816029] ath: Regpair used: 0x3a

===================================================================

Here's another dump using different firmware.

BusyBox v1.26.2 () built-in shell (ash)

     _________
    /        /\      _    ___ ___  ___
   /  LE    /  \    | |  | __|   \| __|
  /    DE  /    \   | |__| _|| |) | _|
 /________/  LE  \  |____|___|___/|___|                      lede-project.org
 \        \   DE /
  \    LE  \    /  -----------------------------------------------------------
   \  DE    \  /    Reboot (SNAPSHOT, r4767-9adfeccd84)
    \________\/    -----------------------------------------------------------

root@R7800RT1:~# dmesg|grep ath
[ 19.464311] ath10k_pci 0000:01:00.0: enabling device (0140 -> 0142)
[ 19.464396] ath10k_pci 0000:01:00.0: enabling bus mastering
[ 19.464848] ath10k_pci 0000:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[ 19.643682] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:01:00.0.bin failed with error -2
[ 19.643732] ath10k_pci 0000:01:00.0: Falling back to user helper
[ 36.978538] ath10k_pci 0000:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[ 36.978594] ath10k_pci 0000:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 1
[ 36.996727] ath10k_pci 0000:01:00.0: firmware ver 10.4-3.4-00082 api 5 features no-p2p,mfp,peer-flow-ctrl,btcoex-param,allows-mesh-bcast crc32 f301de65
[ 39.282406] ath10k_pci 0000:01:00.0: board_file api 2 bmi_id 0:1 crc32 751efba1
[ 45.131241] ath10k_pci 0000:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal pre-cal-file max-sta 512 raw 0 hwcrypto 1
[ 45.211861] ath: EEPROM regdomain: 0x0
[ 45.211871] ath: EEPROM indicates default country code should be used
[ 45.211878] ath: doing EEPROM country->regdmn map search
[ 45.211889] ath: country maps to regdmn code: 0x3a
[ 45.211897] ath: Country alpha2 being used: US
[ 45.211904] ath: Regpair used: 0x3a
[ 45.217376] ath10k_pci 0001:01:00.0: enabling device (0140 -> 0142)
[ 45.217498] ath10k_pci 0001:01:00.0: enabling bus mastering
[ 45.218036] ath10k_pci 0001:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[ 45.413744] ath10k_pci 0001:01:00.0: Direct firmware load for ath10k/pre-cal-pci-0001:01:00.0.bin failed with error -2
[ 45.413784] ath10k_pci 0001:01:00.0: Falling back to user helper
[ 45.676852] ath10k_pci 0001:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[ 45.676895] ath10k_pci 0001:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 1
[ 45.691371] ath10k_pci 0001:01:00.0: firmware ver 10.4-3.4-00082 api 5 features no-p2p,mfp,peer-flow-ctrl,btcoex-param,allows-mesh-bcast crc32 f301de65
[ 47.966629] ath10k_pci 0001:01:00.0: board_file api 2 bmi_id 0:2 crc32 751efba1
[ 53.816432] ath10k_pci 0001:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal pre-cal-file max-sta 512 raw 0 hwcrypto 1
[ 53.902735] ath: EEPROM regdomain: 0x0
[ 53.902746] ath: EEPROM indicates default country code should be used
[ 53.902752] ath: doing EEPROM country->regdmn map search
[ 53.902765] ath: country maps to regdmn code: 0x3a
[ 53.902775] ath: Country alpha2 being used: US
[ 53.902783] ath: Regpair used: 0x3a

=====================================================================

I see the error -2 on DD-WRT firmwares and so far nothing horrible has happened to my routers.
Is the "user helper" loading the pre-cal data or are the Atheros drivers assuming built-in defaults?

Update 1
After searching the all-wise, all-knowing internet, I can now answer my own question.
Error -2 (ENOENT, "no such file or directory") means the kernel could not find the pre-cal data file by direct lookup, so it called upon a user-space helper, which found it: the later "cal pre-cal-file" entry in the same log shows that the pre-cal data was used. It seems strange that the 4.9 kernel had trouble with this, but the 4.4 kernel didn't.
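To check this on other builds without reading the whole log, the two relevant messages can be matched up per PCI device. This is only a rough sketch that assumes the log wording shown in the dumps above; the function name precal_status is made up for illustration.

```shell
#!/bin/sh
# precal_status (hypothetical helper): reads `dmesg` text on stdin and prints,
# for each ath10k PCI device, whether the direct pre-cal load failed with
# error -2 and whether the driver later reported "cal pre-cal-file".
precal_status() {
    awk '
        # Return the PCI address that follows the "ath10k_pci" token.
        function dev(   i, d) {
            for (i = 1; i < NF; i++)
                if ($i == "ath10k_pci") { d = $(i + 1); sub(/:$/, "", d); return d }
            return "unknown"
        }
        # Remember devices in the order they first appear.
        function note(d) { if (!(d in seen)) { seen[d] = 1; order[++n] = d } }

        index($0, "Direct firmware load for ath10k/pre-cal") && /failed with error -2/ {
            d = dev(); note(d); failed[d] = 1
        }
        / cal pre-cal-file / {
            d = dev(); note(d); used[d] = 1
        }
        END {
            for (i = 1; i <= n; i++) {
                d = order[i]
                printf "%s direct_load=%s precal=%s\n", d,
                    (d in failed) ? "failed" : "no-error-logged",
                    (d in used) ? "used" : "not-reported"
            }
        }
    '
}

# Usage on the router:
#   dmesg | precal_status
```

On the r4767 log above this should report direct_load=failed together with precal=used for both radios, which is the harmless fallback case.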

  • Magnetron1.1

#481

This is a great post from another thread on issues with this device when using the 4.9 kernel. I'm just bringing it over, but the credit all goes to Magnetron1.1:

Following @LuisGC's idea about using ethtool, I was able to overcome the MAC error problem with the r4773 build.
Install ethtool (opkg install ethtool) and experiment with the br-lan interface since it is the center point for most of the other interfaces.

First list how this interface is configured before changing anything:

root@R7800RT1:~# ethtool -k br-lan

Features for br-lan:
rx-checksumming: off [fixed]
tx-checksumming: on
	tx-checksum-ipv4: off [fixed]
	tx-checksum-ip-generic: on
	tx-checksum-ipv6: off [fixed]
	tx-checksum-fcoe-crc: off [fixed]
	tx-checksum-sctp: off [fixed]
scatter-gather: on
	tx-scatter-gather: on
	tx-scatter-gather-fraglist: off [requested on]
tcp-segmentation-offload: on
	tx-tcp-segmentation: on
	tx-tcp-ecn-segmentation: on
	tx-tcp-mangleid-segmentation: on
	tx-tcp6-segmentation: on
udp-fragmentation-offload: off [requested on]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: on [fixed]
netns-local: on [fixed]
tx-gso-robust: off [requested on]
tx-fcoe-segmentation: off [requested on]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: on
tx-ipxip6-segmentation: on
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-sctp-segmentation: off [requested on]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: on
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]
hw-tc-offload: off [fixed]

Notice how most of the tx settings are "on". Try setting them "off".

root@R7800RT1:~# ethtool --offload br-lan tx off

Actual changes:
tx-checksumming: off
	tx-checksum-ip-generic: off
tcp-segmentation-offload: off
	tx-tcp-segmentation: off [requested on]
	tx-tcp-ecn-segmentation: off [requested on]
	tx-tcp-mangleid-segmentation: off [requested on]
	tx-tcp6-segmentation: off [requested on]

Now re-run the "tree /" test (which fails every time for me on r4xxx builds).

root@R7800RT1:~# tree /
The entire tree is displayed followed by
12302 directories, 61309 files

Although I took a "shotgun" approach, so far these changes haven't had much impact on performance, but could possibly be refined.
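One possible way to refine the shotgun approach is to list only the features that are currently on and not marked [fixed], then turn them off one at a time and re-run the "tree /" test after each. The filter below is a sketch that assumes the "ethtool -k" output format shown above; togglable_on is a made-up name.

```shell
#!/bin/sh
# togglable_on (hypothetical helper): reads `ethtool -k <iface>` output on
# stdin and prints the names of features that are "on" and not "[fixed]".
togglable_on() {
    awk -F': *' '
        NF >= 2 && $2 ~ /^on/ && $2 !~ /\[fixed\]/ {
            name = $1
            gsub(/^[ \t]+/, "", name)   # drop the indentation of sub-features
            print name
        }
    '
}

# Usage on the router:
#   ethtool -k br-lan | togglable_on
# Then, as an experiment, switch off one listed feature at a time, e.g.
#   ethtool -K br-lan tx-tcp-segmentation off
# (assumes this ethtool build accepts the long feature names it prints).
```

If a single feature turns out to be responsible, only that one would need to go into the startup script instead of the blanket "tx off".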

Unfortunately, the ethtool changes won't survive a router power-cycle or reboot.
To fix that, in LuCI go to System => Startup and insert the ethtool command in the box under Local Startup.

# Put your custom commands here that should be executed once
# the system init finished. By default this file does nothing.
ethtool --offload br-lan tx off
exit 0

These changes should hold unless the router's firmware is overwritten, in which case ethtool will need to be reinstalled.

That's the good news; now for the bad news which has nothing to do with ethtool.
Has anyone noticed that the r4xxx builds (4.9 kernel) have poor Wi-Fi performance? Install iperf3 (opkg install iperf3) and test with one of my octacore HP laptops configured as an iperf3 server (iperf3 -s) on the 5GHz radio. Router is still on r4773. The router antennas are 4 feet from the laptop's Wi-Fi dongle.

root@R7800RT1:~# iperf3 -c 192.168.16.121

Connecting to host 192.168.16.121, port 5201
[  5] local 192.168.16.1 port 44032 connected to 192.168.16.121 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.25 MBytes  27.2 Mbits/sec   45   14.1 KBytes       
[  5]   1.00-2.00   sec  3.15 MBytes  26.4 Mbits/sec   50   11.3 KBytes       
[  5]   2.00-3.00   sec  3.29 MBytes  27.6 Mbits/sec   48   29.7 KBytes       
[  5]   3.00-4.00   sec  3.76 MBytes  31.5 Mbits/sec   42   7.07 KBytes       
[  5]   4.00-5.00   sec  3.37 MBytes  28.3 Mbits/sec   49   7.07 KBytes       
[  5]   5.00-6.00   sec  2.73 MBytes  22.9 Mbits/sec   42   8.48 KBytes       
[  5]   6.00-7.00   sec  2.49 MBytes  20.9 Mbits/sec   44   7.07 KBytes       
[  5]   7.00-8.00   sec  2.86 MBytes  24.0 Mbits/sec   42   14.1 KBytes       
[  5]   8.00-9.00   sec  3.36 MBytes  28.1 Mbits/sec   46   7.07 KBytes       
[  5]   9.00-10.00  sec  2.98 MBytes  25.0 Mbits/sec   35   9.90 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  31.2 MBytes  26.2 Mbits/sec  443             sender
[  5]   0.00-10.00  sec  31.1 MBytes  26.1 Mbits/sec                  receiver

iperf Done.

Notice the low throughput and the high retransmit (Retr) count.
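When comparing several builds, it can help to reduce each run to its two summary lines. A small sketch, assuming the iperf3 text output format shown in this post (other iperf3 versions may print it differently); iperf_summary is a made-up name.

```shell
#!/bin/sh
# iperf_summary (hypothetical helper): reads iperf3 client output on stdin and
# prints one line each for the "sender" and "receiver" totals. The retransmit
# count is appended for the sender line when iperf3 printed one.
iperf_summary() {
    awk '
        $NF == "sender" || $NF == "receiver" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /bits\/sec$/) {
                    line = $NF " " $(i - 1) " " $i
                    if ($NF == "sender" && $(i + 1) ~ /^[0-9]+$/)
                        line = line " retr=" $(i + 1)
                    print line
                    break
                }
            }
        }
    '
}

# Usage on the router (address taken from the test above):
#   iperf3 -c 192.168.16.121 | iperf_summary
```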

What tcp congestion algorithms are on this router?

root@R7800RT1:~# sysctl -a 2>/dev/null|grep congestion
net.ipv4.tcp_allowed_congestion_control = cubic reno
net.ipv4.tcp_available_congestion_control = cubic reno
net.ipv4.tcp_congestion_control = cubic

So "cubic" is the default; try "reno".

root@R7800RT1:~# iperf3 -c 192.168.16.121 -C reno

Connecting to host 192.168.16.121, port 5201
[  5] local 192.168.16.1 port 44040 connected to 192.168.16.121 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  3.99 MBytes  33.5 Mbits/sec   54   15.6 KBytes       
[  5]   1.00-2.00   sec  3.73 MBytes  31.3 Mbits/sec   49   9.90 KBytes       
[  5]   2.00-3.00   sec  3.73 MBytes  31.3 Mbits/sec   55   11.3 KBytes       
[  5]   3.00-4.00   sec  3.85 MBytes  32.3 Mbits/sec   47   11.3 KBytes       
[  5]   4.00-5.00   sec  3.36 MBytes  28.1 Mbits/sec   54   8.48 KBytes       
[  5]   5.00-6.00   sec  3.42 MBytes  28.7 Mbits/sec   52   24.0 KBytes       
[  5]   6.00-7.00   sec  3.54 MBytes  29.7 Mbits/sec   51   15.6 KBytes       
[  5]   7.00-8.00   sec  3.17 MBytes  26.6 Mbits/sec   58   2.83 KBytes       
[  5]   8.00-9.00   sec  3.79 MBytes  31.8 Mbits/sec   54   18.4 KBytes       
[  5]   9.00-10.00  sec  3.73 MBytes  31.3 Mbits/sec   50   7.07 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  36.3 MBytes  30.5 Mbits/sec  524             sender
[  5]   0.00-10.00  sec  36.1 MBytes  30.3 Mbits/sec                  receiver

iperf Done.

Higher throughput, but also a higher retransmit count.

Now try a wired connection.

root@R7800RT1:~# iperf3 -c 192.168.16.196

Connecting to host 192.168.16.196, port 5201
[  5] local 192.168.16.1 port 38042 connected to 192.168.16.196 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   101 MBytes   847 Mbits/sec    0    595 KBytes       
[  5]   1.00-2.01   sec   110 MBytes   922 Mbits/sec    0    622 KBytes       
[  5]   2.01-3.01   sec  97.5 MBytes   813 Mbits/sec    0    822 KBytes       
[  5]   3.01-4.01   sec   101 MBytes   855 Mbits/sec    0    822 KBytes       
[  5]   4.01-5.00   sec   101 MBytes   852 Mbits/sec    0    933 KBytes       
[  5]   5.00-6.01   sec   112 MBytes   936 Mbits/sec    0    933 KBytes       
[  5]   6.01-7.00   sec   101 MBytes   855 Mbits/sec    0    984 KBytes       
[  5]   7.00-8.00   sec   100 MBytes   842 Mbits/sec    0    984 KBytes       
[  5]   8.00-9.01   sec  78.8 MBytes   654 Mbits/sec    0   1.07 MBytes       
[  5]   9.01-10.00  sec   109 MBytes   920 Mbits/sec    0   1.07 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1013 MBytes   849 Mbits/sec    0             sender
[  5]   0.00-10.00  sec  1013 MBytes   849 Mbits/sec                  receiver

iperf Done. <== Very good.

===========================================================================

Re-install r3498 build (4.4 kernel) and iperf3.

root@R7800RT1:~# cat /etc/os-release|grep -i rel
LEDE_RELEASE="LEDE Reboot 17.01-SNAPSHOT r3498-dc8392f6a1"

root@R7800RT1:~# uname -mrvos
Linux 4.4.83 #0 SMP Sun Aug 27 16:31:45 2017 armv7l GNU/Linux

root@R7800RT1:~# iperf3 -c 192.168.16.121

Connecting to host 192.168.16.121, port 5201
[  4] local 192.168.16.1 port 42076 connected to 192.168.16.121 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  17.3 MBytes   145 Mbits/sec    0    411 KBytes       
[  4]   1.00-2.00   sec  19.1 MBytes   161 Mbits/sec    0    544 KBytes       
[  4]   2.00-3.00   sec  17.8 MBytes   149 Mbits/sec    0    632 KBytes       
[  4]   3.00-4.00   sec  18.3 MBytes   154 Mbits/sec    0    740 KBytes       
[  4]   4.00-5.00   sec  17.5 MBytes   146 Mbits/sec    0    740 KBytes       
[  4]   5.00-6.00   sec  18.1 MBytes   152 Mbits/sec    0    740 KBytes       
[  4]   6.00-7.00   sec  16.7 MBytes   140 Mbits/sec    0    861 KBytes       
[  4]   7.00-8.01   sec  18.7 MBytes   156 Mbits/sec    0   1.05 MBytes       
[  4]   8.01-9.00   sec  17.9 MBytes   151 Mbits/sec    0   1.22 MBytes       
[  4]   9.00-10.00  sec  17.0 MBytes   142 Mbits/sec    0   1.29 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec   178 MBytes   150 Mbits/sec    0             sender
[  4]   0.00-10.00  sec   176 MBytes   147 Mbits/sec                  receiver

iperf Done.

Now that's more like it.

Try the "reno" algorithm

root@R7800RT1:~# iperf3 -c 192.168.16.121 -C reno

Connecting to host 192.168.16.121, port 5201
[  4] local 192.168.16.1 port 42080 connected to 192.168.16.121 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  22.0 MBytes   184 Mbits/sec    0   2.76 MBytes       
[  4]   1.00-2.01   sec  19.7 MBytes   165 Mbits/sec    0   3.14 MBytes       
[  4]   2.01-3.00   sec  20.0 MBytes   168 Mbits/sec    0   4.96 MBytes       
[  4]   3.00-4.05   sec  20.0 MBytes   160 Mbits/sec    0   6.01 MBytes       
[  4]   4.05-5.00   sec  19.4 MBytes   172 Mbits/sec    0   6.01 MBytes       
[  4]   5.00-6.00   sec  19.9 MBytes   167 Mbits/sec    0   6.01 MBytes       
[  4]   6.00-7.00   sec  19.5 MBytes   164 Mbits/sec    0   6.01 MBytes       
[  4]   7.00-8.00   sec  18.0 MBytes   151 Mbits/sec    0   6.01 MBytes       
[  4]   8.00-9.00   sec  19.5 MBytes   163 Mbits/sec    0   6.01 MBytes       
[  4]   9.00-10.00  sec  19.9 MBytes   167 Mbits/sec    0   6.01 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec   198 MBytes   166 Mbits/sec    0             sender
[  4]   0.00-10.00  sec   195 MBytes   163 Mbits/sec                  receiver

iperf Done.

Here "reno" has better transfer than the default "cubic".
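The -C option only affects that one iperf3 run. To change the router's default, the sysctl shown earlier can be written instead of read. The sketch below uses made-up helper names (cc_available, use_cc) and checks the list of available algorithms first. Keep in mind that this setting only applies to TCP connections that start or end on the router itself, like these iperf3 tests; traffic that is merely forwarded uses whatever algorithm the end hosts run.

```shell
#!/bin/sh
# cc_available ALGO [LIST] (hypothetical helper): succeeds if ALGO is in LIST.
# LIST defaults to the kernel's own list of available algorithms.
cc_available() {
    list="${2:-$(cat /proc/sys/net/ipv4/tcp_available_congestion_control 2>/dev/null)}"
    case " $list " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# use_cc ALGO (hypothetical helper): make ALGO the default until next reboot.
use_cc() {
    cc_available "$1" || { echo "not available: $1" >&2; return 1; }
    sysctl -w net.ipv4.tcp_congestion_control="$1"
}

# Usage on the router:
#   use_cc reno
```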

Now try a wired connection.

root@R7800RT1:~# iperf3 -c 192.168.16.196

Connecting to host 192.168.16.196, port 5201
[  4] local 192.168.16.1 port 52298 connected to 192.168.16.196 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec   112 MBytes   942 Mbits/sec    0    478 KBytes       
[  4]   1.00-2.00   sec   111 MBytes   935 Mbits/sec    0    478 KBytes       
[  4]   2.00-3.00   sec   107 MBytes   897 Mbits/sec    0    499 KBytes       
[  4]   3.00-4.00   sec   112 MBytes   942 Mbits/sec    0    499 KBytes       
[  4]   4.00-5.00   sec   107 MBytes   895 Mbits/sec    0    522 KBytes       
[  4]   5.00-6.00   sec   112 MBytes   941 Mbits/sec    0    522 KBytes       
[  4]   6.00-7.00   sec   106 MBytes   892 Mbits/sec    0    522 KBytes       
[  4]   7.00-8.01   sec   108 MBytes   894 Mbits/sec    0    522 KBytes       
[  4]   8.01-9.00   sec   106 MBytes   897 Mbits/sec    0    588 KBytes       
[  4]   9.00-10.00  sec   112 MBytes   941 Mbits/sec    0    588 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec  1.07 GBytes   918 Mbits/sec    0             sender
[  4]   0.00-10.00  sec  1.07 GBytes   916 Mbits/sec                  receiver

iperf Done. <== Very good.

=========================================================================

These are my results, but YMMV.
Magnetron1.1


Build for Netgear R7800
#482

The interesting part here is that both 17.01 and master share the same compat wireless 2017.01.31, and in 17.01 @hnyman has applied the calibration fix and newer wireless firmware. There's not a big difference between the branches in the wireless drivers as far as general mac80211 and ath10k are concerned, so maybe this issue lies somewhere outside of the wireless driver?
Another point is that different people with the same hardware revision get different results: I have only 2.4 GHz corrupted, while @Magnetron1.1 experiences it on both bands.


#483

Just a wild guess, but could these commits be somehow related?
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/drivers/net/wireless/ath/ath10k/htt_rx.c?h=v4.9.47&id=9e19e13261423eeb4398177001daa874c2128aa4
that is followed by
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=v4.9.47&id=2f38c3c01de945234d23dd163e3528ccb413066d


#484

Is anyone testing @hnyman's latest build that includes all 3 fixes?


#485

Yes. I'm still having MAC errors which can be fixed using ethtool and still experiencing poor Wi-Fi performance on the r4786 build. Are the 2 patches you mentioned yesterday incorporated in this build?


#486

I’m talking about lede-r4773-12930fc045-20170831-rxring-buffersize-pcie-test
I don’t know if @hnyman included those patches into 4786, but anyway it’s not about corrupted frames, it’s about rx ring buffer corruption.

Meanwhile, I've been talking to a DD-WRT dev who had a similar issue 2 years ago. He insisted on trying these changes, even though we are experiencing the problem on wireless:
http://svn.dd-wrt.com/changeset/28281
http://svn.dd-wrt.com/changeset/28287

Judging by the changes, it does seem like a packet alignment issue, so maybe reverting these is also worth a try as a second option:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=v4.9.47&id=9e19e13261423eeb4398177001daa874c2128aa4
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=v4.9.47&id=2f38c3c01de945234d23dd163e3528ccb413066d


#487

I've been using LEDE firmwares for about a month and don't recall ever seeing the "rx ring buffer corruption" problem. On the r4773 build, I have seen where the ath10k_pci driver reported a firmware crash on the 5GHz radio, produced a firmware register dump, and several unsuccessful restarts on the radio. I think an iperf3 test was in progress at the time and hung. Rebooting the router solved the problem.

Hopefully, @brainslayer's two patches (or their equivalent) will find their way into LEDE.


#488

Nope, I’ve found out that these similar changes are already in our device tree, so that’s not the case...


#489

For people looking for ethernet performance: backport this patch. It gives me ~20 Mbps on iperf. Not much, but I'll take it.

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c?h=next-20170905&id=6ad20165d376fa07919a70e4f43dfae564601829


#490

Hi. I'm trying to get back to stock Netgear firmware from LEDE using TFTP and I'm getting stuck. Using the instructions provided here, I was able to get the router to switch to TFTP mode (I see the power LED flashing white), and I can ping the router at 192.168.1.1. When I try to transfer the Netgear firmware img file, my TFTP client in Windows shows that the flash has happened successfully, but I don't see the router restarting. If I manually restart the router, it goes back to LEDE. I'm not sure how to fix this issue. Can someone please help? My LEDE configuration is also screwed up and it's not able to connect to the internet. At this point, I just want to go back to stock.


#491

If you can still access LEDE, either from the LuCI GUI or from the console, you can easily clear the faulty config and return the router to the default config it has right after a flash. Just select reset in LuCI, or issue the 'firstboot' command on the console and reboot. That should clear the config.

Regarding TFTP, it has worked like a charm for me several times. Tftp completion is just the completion of the file transfer, but the actual flashing starts then and takes 1-2 minutes. The router should boot automatically.

Possible errors:

  • You have the wrong image file and the flashing routine discards it
  • Your file transfer fails, e.g. you do not have a fixed IP address on the PC
  • You have not really triggered the TFTP mode correctly. A flashing white light sounds correct, but this is still a possibility

I have used the tftp2 GUI tool from the DD-WRT site.


#492

I recently bought a TP-Link C2600 with IPQ8064.
I wanted to use it with my 250/20 Mb/s connection with SQM, yet for some reason this router can't properly utilise this connection. I've seen reports in this thread from other users saying something similar.

I also came across this post: Best Wireless AC router for LEDE?
If it's true then I'm hugely disappointed.

Users in this thread also indicated problems with SQM:
Post 386
Post 422

Can anyone advise me on what I should do? I bought it from Amazon, so I have 30 days to return it.


#493

I don't have Windows, but in case it's helpful, here is the exact command I ran in Linux using tftp-hpa to flash stock firmware:

tftp -v -m binary 192.168.1.1 -c put R7800-V1.0.2.32.img

I got the file by downloading and unpacking this zip archive.

For the R7800, it takes a couple of minutes after the tftp succeeds for the router to flash and reboot the new firmware, so make sure you wait.
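For anyone scripting this on Linux, the possible errors mentioned earlier in the thread (wrong or missing image file, PC unable to reach the router) can be checked before the upload starts. This is just a sketch: the helper names are made up, the ping options assume a Linux iputils-style ping, and the PC is assumed to already have a fixed address in 192.168.1.x.

```shell
#!/bin/sh
# Hypothetical pre-flight wrapper around the tftp-hpa command shown above.
# Assumption: the PC already has a static address in the router's subnet,
# set with something like `ip addr add 192.168.1.10/24 dev eth0`
# (the interface name and host address are examples only).
ROUTER_IP="${ROUTER_IP:-192.168.1.1}"

# image_ok FILE: succeed only if FILE is a regular, non-empty file.
image_ok() {
    [ -f "$1" ] && [ -s "$1" ]
}

# router_reachable: one ping with a 2-second timeout.
router_reachable() {
    ping -c 1 -W 2 "$ROUTER_IP" >/dev/null 2>&1
}

# flash_image FILE: run both checks, then upload FILE in binary mode.
flash_image() {
    image_ok "$1" || { echo "image missing or empty: $1" >&2; return 1; }
    router_reachable || { echo "no ping reply from $ROUTER_IP" >&2; return 2; }
    tftp -v -m binary "$ROUTER_IP" -c put "$1"
}

# Usage, with the router already in TFTP mode:
#   flash_image R7800-V1.0.2.32.img
```

A successful upload only means the file was transferred; as noted above, the router still needs a couple of minutes to flash and reboot.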

One weird thing I noticed is that flashing firmware by tftp doesn't completely wipe the device. In particular, when I did this there was still an old password on the router. So it seems holding down reset while booting is only for entering tftp mode, not for actually resetting the router. However, once the router has booted the new firmware, you can go back to factory settings by holding the reset button for 7-10 seconds.

I really like this aspect of the R7800, which essentially makes it very hard to brick once you've figured out the particulars of your tftp client. Note that the Netgear page suggests putting in a username and password, which confused me because my tftp client doesn't do authentication, but it turns out you don't need that at all.


#494

Try building a version with dissent1's fastpath patch applied.


#495

Was anyone successful in doing this? I'm looking for some performance statistics on exactly this. I rely on my router too much for day-to-day operation, so binge flashing is out of the question for me... :wink: