Netgear R7800 exploration (IPQ8065, QCA9984)

Yeah, seems so

Unfortunately, it doesn't compile for me:

/lede1701/build_dir/target-arm_cortex-a15+neon-vfpv4_musl-1.1.16_eabi/linux-ipq806x/shortcut-fe/sfe_ipv4.c:1366:5: error: 'struct sk_buff' has no member named 'fast_forwarded'

Does that look familiar?

Edit: I see that this error occurs because the patches in hack-4.4 haven't been applied. I didn't manually copy the patches to the patches-4.4 directory, which I should have done. I'll leave this post here anyway, even if it shows my utter lack of understanding... :frowning:

Edit2: It doesn't seem to affect my performance issue. If anything, it actually got worse (though that could be just a coincidence, as I didn't test much). Ah well, thanks for suggesting it anyway.

@avx @mroek @steom
You might be interested in testing a revert of the ath10k buffer reduction that was made in master in March. That might help with the performance issues.

The background is that the ath10k buffer size reduction was introduced a bit sneakily into ipq806x with a commit improving support for QCA4019. (The commit title talks about QCA4019 but does not mention that ath10k buffers get reduced for all chips):
https://git.lede-project.org/?p=source.git;a=commit;h=cc189c0b7fa015978b04bb663a75b1da726376b5

I tried to start a discussion about that change later, but it got no traction, as there was no real proof that the buffer reduction caused significant harm. If there were proof, the change might hopefully be reverted.

I have made an R7800 test build from the current master that reverts the ath10k buffer size reductions. It is downloadable from my build directory:

  • revert buffer size: lede-r4694-e7373e489d-20170811-ath10k-buffer-test
  • normal: lede-r4694-e7373e489d-20170811

PS. If anybody wants to try the same in their own master build, it is just a matter of deleting these two patches that were introduced by that commit:

package/kernel/mac80211/patches/960-0010-ath10k-limit-htt-rx-ring-size.patch
package/kernel/mac80211/patches/960-0011-ath10k-limit-pci-buffer-size.patch
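
For anyone scripting their own build, the deletion above can be sketched as a small helper. This is only an illustration, assuming it is pointed at the top of a LEDE source checkout; the function name `remove_buffer_patches` is made up here, and only the two patch paths come from the post above. A plain `rm` of the two files does the same thing.

```python
from pathlib import Path

# Patch directory and file names as quoted in the post above.
PATCH_DIR = Path("package/kernel/mac80211/patches")
PATCHES = (
    "960-0010-ath10k-limit-htt-rx-ring-size.patch",
    "960-0011-ath10k-limit-pci-buffer-size.patch",
)


def remove_buffer_patches(tree_root="."):
    """Delete the two ath10k buffer-limit patches under tree_root.

    Returns the names of the patches that were actually removed, so a
    second run on the same tree returns an empty list.
    """
    removed = []
    for name in PATCHES:
        patch = Path(tree_root) / PATCH_DIR / name
        if patch.is_file():
            patch.unlink()
            removed.append(name)
    return removed


if __name__ == "__main__":
    # Assumes the current directory is the top of the source tree.
    for name in remove_buffer_patches():
        print("removed", name)
```

After removing the patches, the mac80211 package has to be rebuilt for the change to take effect.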

I'll test it some time during the weekend, but I'm skeptical as to whether it will fix the issues. In my case, even just making changes to the 5GHz wifi settings would randomly crash the router completely (causing it to reboot). The buffer changes would most likely only affect stability while doing transfers, and shouldn't matter much when just poking around in the settings.

I couldn't wait, so I tested it just now. Bad news though: performance on wifi is still abysmal. I did the same test as before, and download speed was 20-30 Mbit/s on 5 GHz wifi. Upload speed was actually quite OK (better than before, and on par with stable), but only once. When I repeated the test, something went wrong just as the upload was about to start. The router didn't crash, but the phone lost wifi connectivity and the upload was aborted. The log had this:

Fri Aug 11 21:51:33 2017 kern.warn kernel: [ 273.360252] ath10k_pci 0000:01:00.0: rx ring became corrupted: -5

So as far as I'm concerned, wifi is useless in master, both with and without those two patches.

I posted a new thread about the multicast performance issues I'm seeing, and I would appreciate it if anyone could help me diagnose that issue. Everything is now working correctly (after I fixed the bug with the query messages), except for the performance issue where the router either drops or reorders the multicast UDP packets.

Hi,
I have installed the latest hnyman build (r4694) with virtually all default settings and then scanned my system with the ShieldsUp service:
https://www.grc.com/x/ne.dll?bh0bkyd2
I got the following results:

NO PORTS were found to be OPEN.
Ports found to be STEALTH were: 25, 80, 135, 137, 138, 139, 445, 543
Other than what is listed above, all ports are CLOSED.
TruStealth: FAILED
- NOT all tested ports were STEALTH,
- NO unsolicited packets were received,
- A PING REPLY (ICMP Echo) WAS RECEIVED.

Please advise: is this state safe enough, or should I close or hide those ports according to their recommendations?

Just follow this:

This has nothing to do with the R7800, but with firewalls in general. So, wrong discussion thread...

You already have all ports closed (or dropping traffic). No traffic gets through.

You might read the wiki discussion about stealth "DROP" versus closed "REJECT":
https://lede-project.org/docs/user-guide/firewall_configuration#implications_of_drop_vs_reject

@hnyman
Hi, when you upload new builds to your Dropbox, where can I see what has changed compared with the previous version?

Is it in the *-status.txt file?

Usually there are no changes from me, just the global changes in the main sources and in feeds like LuCI and packages. You need to check the changelogs in those repos.

Hi hnyman, I have been following you from the beginning with the WNDR3700v2, and have now decided to follow along with the R7800 as well, which I bought a few days ago.
I want to thank you for the great work you are doing on this router.
I'm successfully compiling your build for R7800 LEDE snapshots and upgraded the firmware from stock to LEDE without problems.
Unfortunately, I'm having problems with the LEDs. The 2 GHz and 5 GHz radios are wrongly driving the wifi on/off and WPS LEDs instead of the right ones.
I read the thread, and at the beginning it was said that your build has a workaround for this problem, but it does not seem to have worked for me. I followed all the steps for setting up the build environment.
Can you please explain how to apply the workaround so that it will stay there each time I build a new release?
Thanks in advance for your kind help.
EDIT: I misunderstood the workaround. I realized that the 2 GHz and 5 GHz LEDs are still not supported by the current drivers.

Good that you figured it out.
Sadly, the proper wifi LEDs in the R7800 still can't be controlled by the open-source ath10k drivers, so I use the wifi on/off and WPS LEDs as a workaround to have at least some wifi activity indication.

@hnyman @mroek @tetsuo55 I wonder whether this fixes the rx ring buffer corruption:
https://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git/commit/?h=pending&id=f35a7f91f66af528b3ee1921de16bea31d347ab0

Interesting. Can we get this added to the master tree?

@Magnetron1.1
Could you post the syslog output of the ath10k wireless card initialisation? Maybe there are different revisions.

@hnyman
Hi, I downloaded "R7800-lede-r4751-4b3ffecf2b-20170828-1827-sqfs-sysupgrade.tar" from your Dropbox, but 5 GHz wireless still has problems.

[ 59.884126] br-lan: port 2(wlan0) entered blocking state
[ 59.889192] br-lan: port 2(wlan0) entered forwarding state
[ 1990.348762] ath10k_pci 0000:01:00.0: rx ring became corrupted: -5
[ 4939.372771] device wlan0 left promiscuous mode
[ 4939.372871] br-lan: port 2(wlan0) entered disabled state
[ 4944.432613] ath10k_pci 0000:01:00.0: failed to flush transmit queue (skip 0 ar-state 1): 0
[ 4944.473604] ath10k_pci 0000:01:00.0: peer-unmap-event: unknown peer id 1
[ 4944.473649] ath10k_pci 0000:01:00.0: peer-unmap-event: unknown peer id 1
[ 4944.479421] ath10k_pci 0000:01:00.0: peer-unmap-event: unknown peer id 1
[ 4944.818420] ath10k_pci 0000:01:00.0: firmware crashed! (uuid 3fb1a044-2ae6-4e78-93fc-efa57c4eb515)
[ 4944.818456] ath10k_pci 0000:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[ 4944.826344] ath10k_pci 0000:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 1
[ 4944.838678] ath10k_pci 0000:01:00.0: firmware ver 10.4-3.4-00082 api 5 features no-p2p,mfp,peer-flow-ctrl,btcoex-param,allows-mesh-bcast crc32 f301de65
[ 4944.844935] ath10k_pci 0000:01:00.0: board_file api 2 bmi_id 0:1 crc32 751efba1
[ 4944.857842] ath10k_pci 0000:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal pre-cal-file max-sta 512 raw 0 hwcrypto 1
[ 4944.877168] ath10k_pci 0000:01:00.0: failed to get memcpy hi address for firmware address 4: -16
[ 4944.877192] ath10k_pci 0000:01:00.0: failed to read firmware dump area: -16
[ 4944.885076] ath10k_pci 0000:01:00.0: Copy Engine register dump:
[ 4944.891698] ath10k_pci 0000:01:00.0: [00]: 0x0004a000 3735928559 3735928559 3735928559 3735928559
[ 4944.897665] ath10k_pci 0000:01:00.0: [01]: 0x0004a400 3735928559 3735928559 3735928559 3735928559
[ 4944.906692] ath10k_pci 0000:01:00.0: [02]: 0x0004a800 3735928559 3735928559 3735928559 3735928559
[ 4944.915533] ath10k_pci 0000:01:00.0: [03]: 0x0004ac00 3735928559 3735928559 3735928559 3735928559
[ 4944.924405] ath10k_pci 0000:01:00.0: [04]: 0x0004b000 3735928559 3735928559 3735928559 3735928559
[ 4944.933247] ath10k_pci 0000:01:00.0: [05]: 0x0004b400 3735928559 3735928559 3735928559 3735928559
[ 4944.942046] ath10k_pci 0000:01:00.0: [06]: 0x0004b800 3735928559 3735928559 3735928559 3735928559
[ 4944.950960] ath10k_pci 0000:01:00.0: [07]: 0x0004bc00 3735928559 3735928559 3735928559 3735928559
[ 4944.959800] ath10k_pci 0000:01:00.0: [08]: 0x0004c000 3735928559 3735928559 3735928559 3735928559
[ 4944.968672] ath10k_pci 0000:01:00.0: [09]: 0x0004c400 3735928559 3735928559 3735928559 3735928559
[ 4944.977516] ath10k_pci 0000:01:00.0: [10]: 0x0004c800 3735928559 3735928559 3735928559 3735928559
[ 4944.986385] ath10k_pci 0000:01:00.0: [11]: 0x0004cc00 3735928559 3735928559 3735928559 3735928559
[ 4945.035173] ath10k_pci 0000:01:00.0: cannot restart a device that hasn't been started
[ 4951.251155] ath10k_pci 0000:01:00.0: received tx completion for invalid msdu_id: 7
[ 4951.251182] ath10k_pci 0000:01:00.0: received tx completion for invalid msdu_id: 1
[ 4951.257687] ath10k_pci 0000:01:00.0: received tx completion for invalid msdu_id: 2
[ 4951.265217] ath10k_pci 0000:01:00.0: received tx completion for invalid msdu_id: 8
[ 4951.272777] ath10k_pci 0000:01:00.0: received tx completion for invalid msdu_id: 9
[ 4951.280269] ath10k_pci 0000:01:00.0: received tx completion for invalid msdu_id: 11
[ 4951.287897] ath10k_pci 0000:01:00.0: received tx completion for invalid msdu_id: 12
[ 4951.295426] ath10k_pci 0000:01:00.0: received tx completion for invalid msdu_id: 14
[ 4951.303087] ath10k_pci 0000:01:00.0: received tx completion for invalid msdu_id: 15
[ 4951.310663] ath10k_pci 0000:01:00.0: received tx completion for invalid msdu_id: 16
[ 4951.474607] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 4951.498492] br-lan: port 2(wlan0) entered blocking state
[ 4951.498544] br-lan: port 2(wlan0) entered disabled state
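
A side note on the Copy Engine register dump above: the repeated value 3735928559 is printed in decimal. Converted to 32-bit hex it is 0xdeadbeef, a well-known filler pattern, so these are probably not real register contents (that reading is my assumption, though it would fit the "failed to read firmware dump area" line). The helper name `as_hex32` is just for illustration:

```python
def as_hex32(value):
    """Format an integer as a zero-padded 32-bit hexadecimal string."""
    return f"0x{value & 0xFFFFFFFF:08x}"


# Value repeated in every Copy Engine register row of the dump.
print(as_hex32(3735928559))  # prints 0xdeadbeef
```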

@hnyman do you have the buffer sizes restored in that build?
@tetsuo55 has been running my build for more than 7 days without issues already, whereas before it lasted 1-2 days until a crash.

@hnyman
Hi hnyman, I'm running "R7800-lede-r4767-9adfeccd84-20170830-2114-sqfs-sysupgrade". Configuring the 5 GHz wireless with option htmode 'VHT80' leads to a crash, but VHT40 is OK.

cat /proc/version
Linux version 4.9.45 (perus@ub1704) (gcc version 5.4.0 (LEDE GCC 5.4.0 r4767-9adfeccd84) ) #0 SMP Wed Aug 30 17:35:56 2017

cat /etc/config/wireless
config wifi-device 'radio0'
option type 'mac80211'
option hwmode '11a'
option path 'soc/1b500000.pci/pci0000:00/0000:00:00.0/0000:01:00.0'
option country 'CN'
option channel '44'
option htmode 'VHT80'

config wifi-iface 'default_radio0'
option device 'radio0'
option network 'lan'
option mode 'ap'
option ssid 'Virtual-5G'
option encryption 'psk-mixed'
option key 'xxxxxxxx'
option wps_pushbutton '0'
option macfilter 'allow'
list maclist '48:3C:0C:7F:C0:00'
list maclist '00:23:24:F8:2F:2C'
list maclist '88:70:8C:49:04:F0'
list maclist '54:25:EA:97:C4:29'

config wifi-device 'radio1'
option type 'mac80211'
option hwmode '11g'
option path 'soc/1b700000.pci/pci0001:00/0001:00:00.0/0001:01:00.0'
option country 'CN'
option channel '6'
option htmode 'HT40'

config wifi-iface 'default_radio1'
option device 'radio1'
option network 'lan'
option mode 'ap'
option ssid 'Virtual'
option encryption 'psk-mixed'
option key 'xxxxxxxx'
option wps_pushbutton '0'
option macfilter 'allow'
list maclist 'D4:97:0B:8D:93:74'
list maclist 'A0:04:60:11:79:DA'
list maclist '88:70:8C:49:04:F0'
list maclist '48:3C:0C:7F:C0:00'
list maclist '54:25:EA:97:C4:29'
list maclist '10:0B:A9:22:4C:C4'
list maclist '8C:91:09:FA:1D:A7'
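
When comparing reports like this one, it can help to pull just the per-radio settings (band, channel, htmode) out of a pasted /etc/config/wireless. The following is a rough sketch, not an official UCI parser: the function name `radio_settings` is made up, and it only assumes the simple `config` / `option` line layout seen in the config above.

```python
import shlex


def radio_settings(uci_text):
    """Collect the 'option' values of each wifi-device section.

    Returns a dict such as {"radio0": {"htmode": "VHT80", ...}}.
    Sections of other types (e.g. wifi-iface) and 'list' lines are skipped.
    """
    radios = {}
    current = None
    for raw in uci_text.splitlines():
        parts = shlex.split(raw)  # strips the single quotes around values
        if not parts:
            continue
        if parts[0] == "config":
            current = None
            if len(parts) >= 3 and parts[1] == "wifi-device":
                current = radios.setdefault(parts[2], {})
        elif parts[0] == "option" and current is not None and len(parts) >= 3:
            current[parts[1]] = parts[2]
    return radios


sample = """\
config wifi-device 'radio0'
option hwmode '11a'
option channel '44'
option htmode 'VHT80'

config wifi-iface 'default_radio0'
option device 'radio0'
"""
for name, opts in radio_settings(sample).items():
    print(name, opts.get("hwmode"), opts.get("channel"), opts.get("htmode"))
```

Note that `shlex.split` will raise an error on values containing an unbalanced quote, so this is only suitable for quick checks of simple configs.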

[14305.000415] br-lan: port 3(wlan0) entered blocking state
[14305.005797] br-lan: port 3(wlan0) entered forwarding state
[14315.775700] ------------[ cut here ]------------
[14315.775752] WARNING: CPU: 0 PID: 0 at net/core/dev.c:5214 net_rx_action+0x11c/0x2a8
[14315.779389] Modules linked in: pppoe ppp_async pptp pppox ppp_mppe ppp_generic iptable_nat ipt_REJECT ipt_MASQUERADE xt_time xt_tcpudp xt_tcpmss xt_statistic xt_state xt_recent xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_esp xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_HL xt_DSCP xt_CLASSIFY usbserial slhc nf_reject_ipv4 nf_nat_rtsp nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_log_ipv4 nf_defrag_ipv4 nf_conntrack_rtsp nf_conntrack_rtcache nf_conntrack_netlink iptable_mangle iptable_filter ipt_ah ipt_ECN ip_tables crc_ccitt fuse sch_cake act_skbedit act_mirred em_u32 cls_u32 cls_tcindex cls_flow cls_route cls_fw sch_tbf sch_htb sch_hfsc sch_ingress ath10k_pci ath10k_core ath mac80211
[14315.856637] cfg80211 compat ledtrig_usbport xt_set ip_set_list_set ip_set_hash_netiface ip_set_hash_netport ip_set_hash_netnet ip_set_hash_net ip_set_hash_netportnet ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink ip6t_NPT ip6t_MASQUERADE nf_nat_masquerade_ipv6 ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 nf_nat nf_conntrack ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables x_tables msdos ip_gre gre ifb sit tunnel4 ip_tunnel tun vfat fat hfsplus cifs nls_utf8 nls_iso8859_15 nls_iso8859_1 nls_cp850 nls_cp437 nls_cp1250 sha1_generic md5 md4 usb_storage leds_gpio xhci_plat_hcd xhci_pci xhci_hcd dwc3 dwc3_of_simple ohci_platform ohci_hcd phy_qcom_dwc3 ahci ehci_platform sd_mod ahci_platform libahci_platform libahci libata scsi_mod ehci_hcd gpio_button_hotplug ext4 jbd2 mbcache exfat crc32c_generic
[14315.943362] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.45 #0
[14315.943622] Hardware name: Generic DT based system
[14315.949390] [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[14315.954238] [] (show_stack) from [] (dump_stack+0x7c/0x9c)
[14315.962133] [] (dump_stack) from [] (__warn+0xbc/0xec)
[14315.969157] [] (__warn) from [] (warn_slowpath_null+0x1c/0x24)
[14315.976016] [] (warn_slowpath_null) from [] (net_rx_action+0x11c/0x2a8)
[14315.983576] [] (net_rx_action) from [] (__do_softirq+0xd0/0x204)
[14315.991817] [] (__do_softirq) from [] (irq_exit+0x94/0x104)
[14315.999810] [] (irq_exit) from [] (__handle_domain_irq+0x90/0xb4)
[14316.006835] [] (__handle_domain_irq) from [] (gic_handle_irq+0x50/0x94)
[14316.014826] [] (gic_handle_irq) from [] (__irq_svc+0x6c/0x90)
[14316.022967] Exception stack(0xc0763f60 to 0xc0763fa8)
[14316.030628] 3f60: 00000001 00000000 00000000 c021a420 00000000 c0762000 c0764fe4 00000001
[14316.035668] 3f80: c075ea30 00000000 c0763fb8 00000001 00000000 c0763fb0 c020f510 c020f514
[14316.043807] 3fa0: 60000013 ffffffff
[14316.051970] [] (__irq_svc) from [] (arch_cpu_idle+0x2c/0x38)
[14316.055273] [] (arch_cpu_idle) from [] (cpu_startup_entry+0xe8/0x198)
[14316.062912] [] (cpu_startup_entry) from [] (start_kernel+0x36c/0x3f0)
[14316.071090] ---[ end trace 19de91d94220c248 ]---
[14316.080276] ath10k_pci 0000:01:00.0: rx ring became corrupted: -5
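
Since the same two symptoms keep showing up in the logs quoted in this thread, here is a minimal sketch for picking them out of a saved kernel log together with their uptime stamps. The marker strings are copied from the logs above; the function name, the labels and the regular expression are my own assumptions about the line layout, not anything provided by ath10k itself.

```python
import re

# Matches e.g. "[ 1990.348762] ath10k_pci 0000:01:00.0: <message>"
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+ath10k_pci\s+\S+:\s+(.*)")

# Message fragments taken from the logs in this thread, with made-up labels.
MARKERS = {
    "rx ring became corrupted": "rx_ring_corrupted",
    "firmware crashed": "firmware_crashed",
}


def find_ath10k_failures(log_text):
    """Return a list of (uptime_seconds, label) for matching ath10k lines."""
    events = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue
        for needle, label in MARKERS.items():
            if needle in match.group(2):
                events.append((float(match.group(1)), label))
    return events


sample = """\
[ 1990.348762] ath10k_pci 0000:01:00.0: rx ring became corrupted: -5
[ 4939.372771] device wlan0 left promiscuous mode
[ 4944.818420] ath10k_pci 0000:01:00.0: firmware crashed! (uuid 3fb1a044-2ae6-4e78-93fc-efa57c4eb515)
"""
for uptime, label in find_ath10k_failures(sample):
    print(f"{uptime:10.2f}s  {label}")
```

Because the pattern is searched anywhere in the line, it should also cope with syslog-style lines that carry a date and "kern.warn kernel:" prefix before the bracketed uptime.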